README

Deploy a Kubernetes cluster with Ansible.

Summary

This repo builds a generic Kubernetes cluster based on dynamic inventory generated by https://gitlab.com/ska-telescope/sdi/heat-cluster, which in turn relies on ansible collections from https://gitlab.com/ska-telescope/sdi/systems-common-roles .

Check out cluster-k8s (this repo) and pull the dependent collections with:

git clone git@gitlab.com:ska-telescope/sdi/cluster-k8s.git
cd cluster-k8s
make install

The build then needs an Ansible vars file that describes the cluster to be built, and a path to write out the dynamic inventory describing the nodes in the cluster.

Ansible vars for the cluster

In order to define the architecture of the cluster, one needs to describe, at a minimum, the loadbalancer, master and worker nodes that the cluster will be composed of.

dev_cluster_vars.yml describes a cluster containing 1 loadbalancer (there is always exactly 1), 1 master and 1 worker. Masters can be scaled out in odd numbers; workers can be any number. Use this file as a starting point for defining and creating a new cluster.

Certificate Management

Although a predefined certificate key is included in dev_cluster_vars.yml, it is suggested that you generate your own k8s_certificate_key with kubeadm alpha certs certificate-key.
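
For example, on a machine with a matching version of kubeadm installed (newer kubeadm releases have promoted this command out of the alpha group, i.e. kubeadm certs certificate-key):

kubeadm alpha certs certificate-key
# copy the printed key into your cluster vars file as k8s_certificate_key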

For more information on the subject read Certificate Management with kubeadm.

Ansible inventory file

Creating a new cluster with this playbook will add the entries generated by heat-cluster, describing the new machines, to inventory_dev_cluster.

These entries can then be used to add the necessary SSH keys for access to the newly created cluster using distribute-ssh-keys (more on that later).

Setting your OpenStack environment

First you will need to install the OpenStack client and openstacksdk in order to be able to create the VMs on OpenStack (via heat-cluster):

sudo apt install python-openstackclient
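
On more recent Ubuntu releases the package is named python3-openstackclient; alternatively, the client and openstacksdk can be installed with pip, for example:

# either route should provide the openstack CLI plus the SDK used by the playbooks
sudo apt install python3-openstackclient
# or, inside a virtualenv:
pip install python-openstackclient openstacksdk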

Then, download your OpenStack RC File (v3) from the OpenStack Dashboard into your cluster-k8s working directory.

Change the name of the *-openrc.sh file for easier usage:

mv *-openrc.sh openrc.sh

Then source openrc.sh and test connectivity (it will prompt for a password; use your OpenStack access password):

source ./openrc.sh
openstack server list

The default values of the OpenStack environment variables (set by the openrc.sh file) are:

OS_AUTH_URL: "http://192.168.93.215:5000/v3/"
OS_PROJECT_ID: "988f3e60e7834335b3187512411d9072"
OS_PROJECT_NAME: "system-team"
OS_USER_DOMAIN_NAME: "Default"
OS_PROJECT_DOMAIN_ID: "default"
OS_REGION_NAME: "RegionOne"
OS_INTERFACE: "public"
OS_IDENTITY_API_VERSION: 3

The variables with no defaults are:

  • OS_USERNAME: the OpenStack username of the user who will access the OpenStack API

  • OS_PASSWORD: the OpenStack password of the user who will access the OpenStack API

Also, you need to make sure that the OpenStack account used has the stack_owner role as well as the member (sometimes _member_) role.
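
One way to verify this, after sourcing openrc.sh (assuming your account is allowed to read role assignments), is:

openstack role assignment list --user "$OS_USERNAME" --project "$OS_PROJECT_NAME" --names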

Referencing custom configuration

To avoid having to specify the CLUSTER_KEYPAIR, PRIVATE_VARS and INVENTORY_FILE variables every time a make command is issued, create a PrivateRules.mak file in the cluster-k8s root directory defining these three variables (adjust them to your specific purpose):

CLUSTER_KEYPAIR=your-openstack-key-name
PRIVATE_VARS=./dev_cluster_vars.yml
INVENTORY_FILE=./inventory_dev_cluster

Note: your-openstack-key-name should be the one defined under OpenStack Key Pairs.
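
If you are unsure of the key pair name, you can list the key pairs registered for your account (after sourcing openrc.sh):

openstack keypair list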

Running the build

The build process is broken into two phases:

  • create the cluster of VMs, and do the common install steps

  • build the HAProxy loadbalancer and the Kubernetes Cluster

This can be run simply with:

make build

Or, if you didn't define PrivateRules.mak, with:

make build CLUSTER_KEYPAIR=your-openstack-key-name PRIVATE_VARS=./dev_cluster_vars.yml INVENTORY_FILE=./inventory_dev_cluster

You can also break make build into its three constituent steps:

# create the VMs and do the common install
make build_nodes
# build the loadbalancer
make build_haproxy
# build the Kubernetes cluster
make build_k8s
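
The Makefile also provides a check target for the node phase (listed in the help output below), which checks the nodes built by heat-cluster:

make check_nodes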

Customising Namespaces, Limit Range and Resource Quotas

The created cluster will have the default Limit Range and Resource Quotas defined in the systems-common-roles/systems_k8s role.

To override the default values, fill in the variables in the file specified by the PRIVATE_VARS variable (dev_cluster_vars.yml in our example), or run make apply_resource_quotas at any time, provided the cluster has been created.

You can also run this target with only the extractvalues tag (i.e. make apply_resource_quotas TAGS=extractvalues) so that only the CSV and JSON files are created, without creating namespaces or applying quotas.
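
For example:

# create the namespaces and apply the limit ranges / resource quotas on an existing cluster
make apply_resource_quotas
# only generate resources.csv and resources.json, without creating namespaces or applying quotas
make apply_resource_quotas TAGS=extractvalues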

Note: These variables are commented out so as not to conflict with the default variables.

Variables:

  • chart_url and chart_dir: Used as described below (How to use) to define the Helm chart

  • charts_namespaces: Namespaces to be created

  • charts_quotas_*: Default total values of the resources to be applied to the namespaces. These values are overridden if a chart is defined with the variables above.

  • memory_format: Memory size format. Set this to true if you want the memory values in the CSV file to be human readable. It is false by default, to allow easy calculations on the values.

How to use

If the chart_url variable is defined, the values are extracted from the Helm chart at that URL. chart_url can be either a git repository or a packaged Helm chart.

If the Helm chart is in a git repository, then chart_dir (the top-level folder of the chart) should be set to indicate where to find the chart.

The Helm chart is then templated with the --dependency-update flag set, so that dependencies are covered as well.

Total values are extracted, and a CSV file (named resources.csv) covering all the apps is created in the playbook directory (unless current_dir is defined), together with a JSON file (named resources.json) defining the total values.
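
A purely illustrative sketch of how these variables might be filled in (in the file referenced by PRIVATE_VARS) is shown below; the URL, path, namespace and the exact shape of charts_namespaces and the charts_quotas_* values are hypothetical:

# hypothetical example values only
chart_url: https://gitlab.com/example/my-chart.git   # git repository or packaged Helm chart
chart_dir: charts/my-app                             # only needed when chart_url is a git repository
charts_namespaces:
  - my-namespace
# charts_quotas_*: total resource values applied to the namespaces
memory_format: false                                 # true for human-readable memory values in resources.csv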

Deploy the SSH access keys on the newly created cluster

After creating the cluster, only the user who issued the build commands specified above will have access to it.

In order for a group of users (usually your team) to be able to log in to the various VMs that were created, their respective SSH keys need to be distributed to those VMs.

To do this, one needs to use the functionality provided by the distribute-ssh-keys SKA repository.

It is beyond the scope of this README to explain all the functionality of the distribute-ssh-keys repo, but for this specific purpose one needs to accomplish the following steps:

1: Modify the inventory-ssh-keys in the distribute-ssh-keys repo

When issuing the build command(s) specified above, the file specified by the INVENTORY_FILE variable (inventory_dev_cluster in our example) is automatically updated with the newly created VMs:

[cluster:children]
k8s_dev_cluster_loadbalancer
k8s_dev_cluster_master
k8s_dev_cluster_worker

[k8s_dev_cluster_loadbalancer]
k8s-dev-cluster-loadbalancer-0 ansible_host=192.168.93.137 docker_vol_diskid="5fca38ce-3edc-4fe8-b" data_vol_diskid="11c94e16-8482-47bc-a"  data2_vol_diskid=""

...

[k8s_dev_cluster_master]
k8s-dev-cluster-master-0 ansible_host=192.168.93.106 docker_vol_diskid="0b53dc74-5709-46c8-b" data_vol_diskid="30e1928c-3bfd-43c4-b"  data2_vol_diskid=""
k8s-dev-cluster-master-1 ansible_host=192.168.93.130 docker_vol_diskid="e8c4b3c3-8166-4cc6-b" data_vol_diskid="ce5116d7-3c87-40af-9"  data2_vol_diskid=""
k8s-dev-cluster-master-2 ansible_host=192.168.93.24 docker_vol_diskid="cd976a69-d37b-4492-b" data_vol_diskid="b48fa69c-37e1-4a18-9"  data2_vol_diskid=""

...

[k8s_dev_cluster_worker]
k8s-dev-cluster-worker-0 ansible_host=192.168.93.125 docker_vol_diskid="e75f9f0f-ccaa-46a9-a" data_vol_diskid="d98e0d88-8b57-47c7-9"  data2_vol_diskid=""
k8s-dev-cluster-worker-1 ansible_host=192.168.93.119 docker_vol_diskid="db5236ce-4769-47bb-8" data_vol_diskid="76b5aba6-af41-4416-b"  data2_vol_diskid=""
k8s-dev-cluster-worker-2 ansible_host=192.168.93.85 docker_vol_diskid="92f1f155-2c6c-494d-a" data_vol_diskid="3eeef438-f6cd-44a4-a"  data2_vol_diskid=""

...

# Specific roles for cluster deployment assignments
[cluster_nodes:children]
k8s_dev_cluster_loadbalancer
k8s_dev_cluster_master
k8s_dev_cluster_worker

In this particular example, you need to add these newly created machines to the inventory-ssh-keys file in the distribute-ssh-keys repo by creating a new section similar to the following:

# K8s dev cluster - k8s-dev-cluster
[k8s_dev_cluster_loadbalancer]
k8s-dev-cluster-loadbalancer-0 ansible_host=192.168.93.137

[k8s_dev_cluster_master]
k8s-dev-cluster-master-0 ansible_host=192.168.93.106
k8s-dev-cluster-master-1 ansible_host=192.168.93.130
k8s-dev-cluster-master-2 ansible_host=192.168.93.24

[k8s_dev_cluster_worker]
k8s-dev-cluster-worker-0 ansible_host=192.168.93.125
k8s-dev-cluster-worker-1 ansible_host=192.168.93.119
k8s-dev-cluster-worker-2 ansible_host=192.168.93.85

[k8s_dev_cluster:children]
k8s_dev_cluster_loadbalancer
k8s_dev_cluster_master
k8s_dev_cluster_worker

[k8s_dev_cluster:vars]
ansible_python_interpreter=python3
ansible_user=ubuntu

2: Add your keys to the ssh_key_vars.yml in the distribute-ssh-keys repo

You then need to specify which keys will be distributed among the new nodes. To do so, add a new entry to the ssh_key_vars.yml file, like so:

# Bruno
  - ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDI5IHBq3DUh97aWzSAlBFov5FaNtgut6oW9QvZ7NRFplhskKqze57xcWOkL0y22n1Rao3bqZBkC4cjwm5x0kqAN+nsDRK/PB5blAUB8DJEJXz19py3pVb3BML2PBHYN+p/wqCNoKu2n22grmYphVnY5rjjgW4K4Y8AkBa8vv4YzyFvXrPRp4GD3THelkM7YsgnlZsU/QHw7rxOtWRTpeM4U4ZVdLmHWG45L1x/FjFvnsLSMGipRIaY0Oy/bC+29MCGRFZYrojSmy8KOPxjEBmwyrRe8ooDRtMtkMLQDCD8baidLnIv2yM+BdZXHBXh22f4YHJMakoPwC3n57o5x/NUAjn+0DRdgmjhimQFdh6Untd1kAnfQOd6kl/IMKyiJBZd8HacVF5XdIn9kc6l98d05uqZM71noHusHlvLKv0dtbCCM7myNmIPotT4albEJGAv22+siN8awLxepOYOBFL5sOsEWl6HnISziHUDIXmoe2qX6j0hFE9YH1ZM8sBALWx2+4sRJAljArItOj7+I0KqxUnAOAn4/KmACaH7mKiBLtq0wwW8xxoibLWFCX8C3VIFS3GOWrQU/Q49XSC3RNqqc7VgW1xZjYYb3BELT5tPMJbH3hKilfrg5NqWPj/2TDFDS6Za1QhIPbyMSxvfkf8usAqLVRW7nF4Q+UcazVg2nQ== jb.morgado@gmail.com bruno

Warning: This must be done locally; only the System Team should update the distribute-ssh-keys repo on GitLab.

3: Distribute the new keys

Now the only thing left is to distribute the keys among the nodes you added to the inventory-ssh-keys file. In our case, since we defined the k8s_dev_cluster:children group with reference to our new machines, we can just issue:

make add NODES=k8s_dev_cluster

You should now be able to log in to any of the new nodes, for example using the loadbalancer IP address from our example (192.168.93.137):

ssh 192.168.93.137
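
If your local user name differs from the one on the VMs, specify the remote user explicitly; the inventory above sets ansible_user=ubuntu, so for example:

ssh ubuntu@192.168.93.137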

Destroying the nodes

Caution: The following command deletes everything that was installed and destroys the whole cluster specified by the PRIVATE_VARS variable.

If you want to delete the whole stack, issue the make clean command.
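
As with make build, the variables can be supplied on the command line if PrivateRules.mak is not defined, for example:

make clean PRIVATE_VARS=./dev_cluster_vars.yml INVENTORY_FILE=./inventory_dev_cluster CLUSTER_KEYPAIR=your-openstack-key-name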

Help

Run make to get the help:

make targets:
Makefile:build                 Build nodes, haproxy and k8s
Makefile:build_charts          Build charts
Makefile:build_common          apply the common roles
Makefile:build_docker          apply the docker roles
Makefile:build_haproxy         Build haproxy
Makefile:build_k8s             Build k8s
Makefile:build_nodes           Build nodes based on heat-cluster
Makefile:check_nodes           Check nodes based on heat-cluster
Makefile:clean_k8s             clean k8s cluster
Makefile:clean_nodes           destroy the nodes - CAUTION THIS DELETES EVERYTHING!!!
Makefile:help                  show this help.
Makefile:install               Install dependent ansible collections
Makefile:lint                  Lint check playbook
Makefile:reinstall             reinstall collections
Makefile:test                  Smoke test for new created cluster
Makefile:vars                  Variables

make vars (+defaults):
Makefile:ANSIBLE_USER          centos## ansible user for the playbooks
Makefile:CLUSTER_KEYPAIR       piers-engage## key pair available on openstack to be put in the created VMs
Makefile:COLLECTIONS_PATHS     ./collections
Makefile:CONTAINERD            true
Makefile:DEBUG                 false
Makefile:EXTRA_VARS ?=
Makefile:IGNORE_NVIDIA_FAIL    false
Makefile:INVENTORY_FILE        ./inventory_k8s##inventory file to be generated
Makefile:NVIDIA                false
Makefile:PRIVATE_VARS          ./centos_vars.yml##template variable for heat-cluster
Makefile:TAGS ?=

Extending the Kubernetes Cluster

From time to time we may want to add additional nodes to a Kubernetes cluster. To do this we need to follow a number of steps (using the example of the syscore cluster):

  • all steps must be performed from the bifrost host. Use tmux so that a dropped connection does not interrupt a long-running build.

  • source your login for the OpenStack cluster

  • ensure that your PrivateRules.mak file contains the entries:

## Production
PRIVATE_VARS = k8s_system_core_vars.yml
INVENTORY_FILE = ./inventory_k8s_system_core
CLUSTER_KEYPAIR = <your key pair>

  • update your local ansible collections with make reinstall

  • login to k8s-syscore-master-0 and generate a new certificate key using kubeadm alpha certs certificate-key

  • amend the cluster instance vars file k8s_system_core_vars.yml to include the k8s_certificate_key and to increment the number of workers (genericnode_worker.num_nodes); see the sketch after this list. DO NOT ALTER the flavor, image or name: IT WILL NOT WORK!!!

  • make build_nodes

  • Check new nodes have been added via the OpenStack API - resize if required

  • make build_k8s

  • Check the progress by running something like watch kubectl get nodes -o wide - see the nodes come in and switch to Ready.

  • Once the nodes are up and running and integrated in the cluster, we now add logging

  • Switch to a checked out version of cluster-elasticstack

  • Amend the inventory_logging file to include the additional nodes declared in cluster-k8s/inventory_k8s_system_core

  • copy /etc/kubernetes/admin.conf from master-0 to the same file on all the new worker nodes.

  • run make logging NODES=k8s_syscore_worker to add the logging to the new workers

  • Switch to a checked out version of deploy-gitlab-runners

  • Amend the inventory_runners file to include the additional nodes declared in cluster-k8s/inventory_k8s_system_core

  • run make docker to prepare the worker nodes for the gitlab docker instance - note: it is possible that the TASK [restart docker-for-gitlab] will fail for existing nodes as they will be in use, but the restart must work for the new nodes.

  • run make label_nodes to add the ci-runner label to new worker nodes

  • Ensure that the worker nodes are not tainted; if necessary remove the taint, e.g. kubectl taint node k8s-syscore-worker-NN node-role.kubernetes.io/master=true:NoSchedule- (the trailing - removes the taint). Check with kubectl describe nodes k8s-syscore-worker-NN.

  • Switch to a checked out version of deploy-prometheus

  • run make node-exporter INVENTORY_FILE=../cluster-k8s/inventory_k8s_system_core NODES=k8s-syscore-worker-## to install the node_exporter (replace ## with the new worker number)

  • distribute ssh keys for the new nodes

  • Finally, don’t forget to commit all the changes to cluster-k8s, deploy-gitlab-runners and cluster-elasticstack
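
As referenced in the vars amendment step above, the change to k8s_system_core_vars.yml amounts to something like the following sketch (the key names are taken from this README; the exact surrounding structure comes from heat-cluster, so treat this as illustrative only):

# new certificate key generated on k8s-syscore-master-0
k8s_certificate_key: "<output of kubeadm alpha certs certificate-key>"

genericnode_worker:
  num_nodes: ...      # increment by the number of workers being added
  flavor: ...         # DO NOT change
  image: ...          # DO NOT change
  name: ...           # DO NOT change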