Highly Available Kubernetes Control Plane Installation Using Ansible

A highly available Kubernetes cluster can be installed with two different approaches. For a fault-tolerant cluster it is recommended to run 3 or 5 master nodes, because a majority (ideally at least 51%) of all master nodes should be available at any time. With 3 masters, if one of them goes down for any reason, 2 master nodes are still in place to handle the requests.


1. Two master nodes: if one goes down, 1 master node is left to serve the traffic.

In this scenario, we are left with 1/2 = 50% of the masters, which is below the 51% majority.

2. Three master nodes: one goes down in an unplanned situation.

In this scenario, we are left with 2/3 = 66% of the masters, which is still a majority.
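The arithmetic behind these two scenarios can be sketched in a few lines of shell. This is only an illustration, assuming the usual majority rule (quorum is half of the members, rounded down, plus one). The function names quorum and tolerated_failures are hypothetical helpers and not part of any Kubernetes tool.

```shell
#!/bin/sh
# quorum N: smallest majority of N members (integer division rounds down)
quorum() {
    echo $(( $1 / 2 + 1 ))
}

# tolerated_failures N: members that can fail while a majority remains
tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

# Print the figures for 1 to 5 master nodes
for n in 1 2 3 4 5; do
    echo "masters=$n quorum=$(quorum "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

With 2 masters the number of tolerated failures is 0, and with 3 masters it is 1, which is why 3 or 5 masters are recommended.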

The two approaches for a highly available cluster are:

1. With stacked control plane nodes. This approach requires less infrastructure. The etcd members and control plane nodes are co-located.

2. With an external etcd cluster. This approach requires more infrastructure. The control plane nodes and etcd members are separated.

In this tutorial, I install the stacked control plane with 2 master nodes; the same procedure applies to 3 nodes as well.

Step1: In our previous tutorials we took care of the system prerequisites, so we now install the kubelet and kubeadm packages on all target machines and kubectl on the master nodes.


# Run the snippet below to create the playbook, then apply it with ansible-playbook as shown.

cat <<'EOF' > k8s.yaml
- name: Install k8s packages
  hosts: master,worker
  become: yes
  tasks:
    - name: Install kubelet and kubeadm on all machines
      # yum is assumed here (the package name format suggests it);
      # use the package module of your distribution
      yum:
        name:
          - kubelet
          - kubeadm
        state: present
        update_cache: true

    - name: start kubelet
      service:
        name: kubelet
        enabled: yes
        state: started

    - name: print the kubeadm and kubelet version
      shell: kubeadm version && kubelet --version
      register: versionout

    - debug:
        msg: "{{ versionout }}"

- name: Install kubectl on master nodes
  hosts: master
  become: yes
  tasks:
    - name: install kubectl
      yum:
        name: kubectl-1.16.0
        state: present
        allow_downgrade: yes
EOF

$ ansible-playbook -i inventory k8s.yaml --ask-become-pass

Step2: Now we can initialize the control plane using the installed kubeadm.

# Run the snippet below to create the playbook, then apply it with ansible-playbook as shown.

cat <<'EOF' > k8s-kubeadm.yaml
- name: control plane init
  become: yes
  hosts: master1
  vars:
    portcheck:
      - 6443
      - 10259
  tasks:
    - name: Kubeadm init on master
      # pod_network_cidr has no value in the original snippet; set it to the
      # pod CIDR your CNI manifest expects, for example with -e pod_network_cidr=...
      shell: >
        kubeadm init
        --control-plane-endpoint {{ hostvars['haproxy'].ansible_host }}:6443
        --upload-certs
        --apiserver-advertise-address {{ hostvars['master1'].ansible_host }}
        --pod-network-cidr {{ pod_network_cidr }}
      register: kubeadmout

    - name: check the control plane ports
      shell: netstat -anp | grep {{ item }}
      ignore_errors: yes
      with_items: "{{ portcheck }}"
      register: netstat

    # grep exits with 0 when it finds the port in the netstat output
    - debug:
        msg: "ports are already in use"
      when: netstat.results[0].rc == 0 and netstat.results[1].rc == 0

    - name: save the kubeadm init output on the Ansible host
      local_action:
        module: copy
        content: "{{ kubeadmout.stdout }}"
        dest: ./token

    - name: create the .kube home folder
      shell: |
        mkdir -p $HOME/.kube
        sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
        sudo chown $(id -u):$(id -g) $HOME/.kube/config

    - name: Install flannel CNI
      command: kubectl apply -f https://raw.githubusercontent.com/vamsi1967/k8s-HA-ansible-install/master/flannel-cni.yaml
      register: cniout

    - name: save the CNI output on the Ansible host
      local_action:
        module: copy
        content: "{{ cniout.stdout }}"
        dest: ./cnioutput
EOF

$ ansible-playbook -i inventory k8s-kubeadm.yaml --ask-become-pass
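The port check in the playbook relies on the exit status of grep, and a plain grep for 6443 would also match a port such as 16443. A slightly stricter check could look like the sketch below. It is only an illustration: port_listening is a hypothetical helper, it reads netstat -ltn style lines from standard input so it can be tried without a cluster, and the sample lines are made up.

```shell
#!/bin/sh
# port_listening PORT: reads `netstat -ltn`-style lines on stdin and exits 0
# when the local address (4th column) ends in ":PORT", otherwise exits 1.
port_listening() {
    awk -v port="$1" '$4 ~ (":" port "$") { found = 1 } END { exit !found }'
}

# Made-up sample input: sshd on port 22 and an API server on port 6443
sample='tcp   0   0 0.0.0.0:22   0.0.0.0:*   LISTEN
tcp6  0   0 :::6443      :::*        LISTEN'

for port in 6443 10259; do
    if printf '%s\n' "$sample" | port_listening "$port"; then
        echo "port $port is in use"
    else
        echo "port $port is free"
    fi
done
```

On a real node the input would come from the command itself, for example netstat -ltn | port_listening 6443.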

Step3: Now we join the other control plane nodes to the leader master node.

# Run the snippet below to create the playbook, then apply it with ansible-playbook as shown.

cat <<'EOF' > k8s-controlplane-join.yaml
- hosts: master1
  become: yes
  gather_facts: no
  tasks:
    - name: generate the join command
      shell: kubeadm token create --print-join-command
      register: join_command

    - name: set join command fact
      set_fact:
        join_command: "{{ join_command.stdout_lines[0] }}"

    - name: generate the control plane cert key
      shell: kubeadm init phase upload-certs --upload-certs
      register: kubeadm_cert_key

    - name: register the cert key
      set_fact:
        # the third output line is expected to hold the certificate key
        control_plane_certkey: "{{ kubeadm_cert_key.stdout_lines[2] }}"

- hosts: master2
  become: yes
  tasks:
    - name: join control-plane nodes
      shell: "{{ hostvars['master1'].join_command }} --control-plane --certificate-key {{ hostvars['master1'].control_plane_certkey }}"
      register: joinmaster

    - debug:
        msg: "{{ joinmaster }}"

    - name: save the join output on the Ansible host
      local_action:
        module: copy
        content: "{{ joinmaster.stdout }}"
        dest: ./join_controlplane_node
EOF

$ ansible-playbook -i inventory k8s-controlplane-join.yaml --ask-become-pass
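The playbook picks the certificate key by its position in the command output (stdout_lines[2]). A more defensive alternative, sketched below, selects the line that looks like a key instead. This assumes the key is printed on its own line as 64 lowercase hex characters, as the key in the sample output of this tutorial is. extract_cert_key is a hypothetical helper, and the two [upload-certs] lines in the sample text are illustrative.

```shell
#!/bin/sh
# extract_cert_key: reads the output of
# `kubeadm init phase upload-certs --upload-certs` on stdin and prints the
# last line that consists of exactly 64 lowercase hex characters.
extract_cert_key() {
    grep -E '^[0-9a-f]{64}$' | tail -n 1
}

# Illustrative sample text; the key is the one shown in this tutorial's join output
sample='[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key:
85c15dd88a157a03e1f873244a9fdc58d27a2efff84ade62ec21529375119188'

printf '%s\n' "$sample" | extract_cert_key
```

On master1 the same function could be fed directly: kubeadm init phase upload-certs --upload-certs | extract_cert_key.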

Sample Output:

[kubeadmin@ansible-master k8s-HA-ansible-install]$ ansible-playbook -i inventory k8s-controlplane-join.yaml --ask-become-pass

BECOME password:

PLAY [master1] *************************************************************************************************************************

TASK [generate the join command] *******************************************************************************************************
changed: [master1]
TASK [set join command fact] ***********************************************************************************************************
ok: [master1] 
TASK [generate the control plane cert key] *********************************************************************************************
changed: [master1]
TASK [register the cert key] ***********************************************************************************************************
ok: [master1]
PLAY [master2] *************************************************************************************************************************
TASK [Gathering Facts] *****************************************************************************************************************
ok: [master2]
TASK [join control-plane nodes] ********************************************************************************************************
changed: [master2]
TASK [debug] ***************************************************************************************************************************
ok: [master2] => {
    "msg": {
        "changed": true,
        "cmd": "kubeadm join --token qzp6f7.mjzjwfc5tyx7vmkz     --discovery-token-ca-cert-hash sha256:787bcf56aed6588bf052e8c3be0453fa4fe3f6cd06249f10669364d4940cd97d  --control-plane --certificate-key 85c15dd88a157a03e1f873244a9fdc58d27a2efff84ade62ec21529375119188",
        "delta": "0:01:17.932220",
        "end": "2020-12-02 14:10:47.127348",
        "failed": false,
        "rc": 0,
        "start": "2020-12-02 14:09:29.195128",
        "stderr": "\t[WARNING IsDockerSystemdCheck]: detected \"cgroupfs\" as the Docker cgroup driver. The recommended driver is \"systemd\". Please follow the guide at https://kubernetes.io/docs/setup/cri/",
        "stderr_lines": [
            "\t[WARNING IsDockerSystemdCheck]: detected \"cgroupfs\" as the Docker cgroup driver. The recommended driver is \"systemd\". Please follow the guide at https://kubernetes.io/docs/setup/cri/"
        ],
        "stdout": "[preflight] Running pre-flight checks\n[preflight] Reading configuration from the cluster...\n[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'\n[preflight] Running pre-flight checks before initializing the new control plane instance\n[preflight] Pulling images required for setting up a Kubernetes cluster\n[preflight] This might take a minute or two, depending on the speed of your internet connection\n[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'\n[download-certs] Downloading the certificates in Secret \"kubeadm-certs\" in the \"kube-system\" Namespace\n[certs] Using certificateDir folder \"/etc/kubernetes/pki\"\n[certs] Generating \"apiserver-etcd-client\" certificate and key\n[certs] Generating \"etcd/server\" certificate and key\n[certs] etcd/server serving cert is signed for DNS names [localhost master2] and IPs [ ::1]\n[certs] Generating \"etcd/peer\" certificate and key\n[certs] etcd/peer serving cert is signed for DNS names [localhost master2] and IPs [ ::1]\n[certs] Generating \"etcd/healthcheck-client\" certificate and key\n[certs] Generating \"apiserver\" certificate and key\n[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local master2] and IPs []\n[certs] Generating \"apiserver-kubelet-client\" certificate and key\n[certs] Generating \"front-proxy-client\" certificate and key\n[certs] Valid certificates and keys now exist in \"/etc/kubernetes/pki\"\n[certs] Using the existing \"sa\" key\n[kubeconfig] Generating kubeconfig files\n[kubeconfig] Using kubeconfig folder \"/etc/kubernetes\"\n[kubeconfig] Writing \"admin.conf\" kubeconfig file\n[kubeconfig] Writing \"controller-manager.conf\" kubeconfig file\n[kubeconfig] Writing \"scheduler.conf\" kubeconfig file\n[control-plane] Using manifest folder \"/etc/kubernetes/manifests\"\n[control-plane] Creating static Pod manifest for \"kube-apiserver\"\n[control-plane] Creating static Pod manifest for \"kube-controller-manager\"\n[control-plane] Creating static Pod manifest for \"kube-scheduler\"\n[check-etcd] Checking that the etcd cluster is healthy\n[kubelet-start] Writing kubelet configuration to file \"/var/lib/kubelet/config.yaml\"\n[kubelet-start] Writing kubelet environment file with flags to file \"/var/lib/kubelet/kubeadm-flags.env\"\n[kubelet-start] Starting the kubelet\n[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...\n[etcd] Announced new etcd member joining to the existing etcd cluster\n[etcd] Creating static Pod manifest for \"etcd\"\n[etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s\n[upload-config] Storing the configuration used in ConfigMap \"kubeadm-config\" in the \"kube-system\" Namespace\n[mark-control-plane] Marking the node master2 as control-plane by adding the label \"node-role.kubernetes.io/master=''\"\n[mark-control-plane] Marking the node master2 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]\n\nThis node has joined the cluster and a new control plane instance was created:\n\n* Certificate signing request was sent to apiserver and approval was received.\n* The Kubelet was informed of the new secure connection details.\n* Control plane (master) label and taint were applied to the new node.\n* The Kubernetes control plane instances scaled up.\n* A new etcd member was added to the local/stacked etcd cluster.\n\nTo start administering your cluster from this node, you need to run the following as a regular user:\n\n\tmkdir -p $HOME/.kube\n\tsudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config\n\tsudo chown $(id -u):$(id -g) $HOME/.kube/config\n\nRun 'kubectl get nodes' to see this node join the cluster.",
        "stdout_lines": [
            "[preflight] Running pre-flight checks",
            "[preflight] Reading configuration from the cluster...",
            "[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'",
            "[preflight] Running pre-flight checks before initializing the new control plane instance",
            "[preflight] Pulling images required for setting up a Kubernetes cluster",
            "[preflight] This might take a minute or two, depending on the speed of your internet connection",
            "[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'",
            "[download-certs] Downloading the certificates in Secret \"kubeadm-certs\" in the \"kube-system\" Namespace",
            "[certs] Using certificateDir folder \"/etc/kubernetes/pki\"",
            "[certs] Generating \"apiserver-etcd-client\" certificate and key",
            "[certs] Generating \"etcd/server\" certificate and key",
            "[certs] etcd/server serving cert is signed for DNS names [localhost master2] and IPs [ ::1]",
            "[certs] Generating \"etcd/peer\" certificate and key",
            "[certs] etcd/peer serving cert is signed for DNS names [localhost master2] and IPs [ ::1]",
            "[certs] Generating \"etcd/healthcheck-client\" certificate and key",
            "[certs] Generating \"apiserver\" certificate and key",
            "[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local master2] and IPs []",
            "[certs] Generating \"apiserver-kubelet-client\" certificate and key",
            "[certs] Generating \"front-proxy-client\" certificate and key",
            "[certs] Valid certificates and keys now exist in \"/etc/kubernetes/pki\"",
            "[certs] Using the existing \"sa\" key",
            "[kubeconfig] Generating kubeconfig files",
            "[kubeconfig] Using kubeconfig folder \"/etc/kubernetes\"",
            "[kubeconfig] Writing \"admin.conf\" kubeconfig file",
            "[kubeconfig] Writing \"controller-manager.conf\" kubeconfig file",
            "[kubeconfig] Writing \"scheduler.conf\" kubeconfig file",
            "[control-plane] Using manifest folder \"/etc/kubernetes/manifests\"",
            "[control-plane] Creating static Pod manifest for \"kube-apiserver\"",
            "[control-plane] Creating static Pod manifest for \"kube-controller-manager\"",
            "[control-plane] Creating static Pod manifest for \"kube-scheduler\"",
            "[check-etcd] Checking that the etcd cluster is healthy",
            "[kubelet-start] Writing kubelet configuration to file \"/var/lib/kubelet/config.yaml\"",
            "[kubelet-start] Writing kubelet environment file with flags to file \"/var/lib/kubelet/kubeadm-flags.env\"",
            "[kubelet-start] Starting the kubelet",
            "[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...",
            "[etcd] Announced new etcd member joining to the existing etcd cluster",
            "[etcd] Creating static Pod manifest for \"etcd\"",
            "[etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s",
            "[upload-config] Storing the configuration used in ConfigMap \"kubeadm-config\" in the \"kube-system\" Namespace",
            "[mark-control-plane] Marking the node master2 as control-plane by adding the label \"node-role.kubernetes.io/master=''\"",
            "[mark-control-plane] Marking the node master2 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]",
            "This node has joined the cluster and a new control plane instance was created:",
            "* Certificate signing request was sent to apiserver and approval was received.",
            "* The Kubelet was informed of the new secure connection details.",
            "* Control plane (master) label and taint were applied to the new node.",
            "* The Kubernetes control plane instances scaled up.",
            "* A new etcd member was added to the local/stacked etcd cluster.",
            "To start administering your cluster from this node, you need to run the following as a regular user:",
            "\tmkdir -p $HOME/.kube",
            "\tsudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config",
            "\tsudo chown $(id -u):$(id -g) $HOME/.kube/config",
            "Run 'kubectl get nodes' to see this node join the cluster."
        ]
    }
}

PLAY RECAP *****************************************************************************************************************************
master1                    : ok=4    changed=2    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
master2                    : ok=3    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

Validation: Let us validate from the leader control plane node by checking cluster-info and the availability of all master nodes.

1. ssh kubeadmin@master1
2. kubectl cluster-info
3. kubectl get nodes -owide
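The last check can also be scripted. The sketch below counts the master nodes that report Ready. It assumes the column layout of kubectl get nodes --no-headers (NAME STATUS ROLES AGE VERSION) and the role name master used by the node label in this tutorial's join output. count_ready_masters is a hypothetical helper, and the sample lines, including the node name worker1, are made up.

```shell
#!/bin/sh
# count_ready_masters: reads `kubectl get nodes --no-headers` output on stdin
# and prints how many nodes have STATUS exactly "Ready" and a ROLES column
# containing "master".
count_ready_masters() {
    awk '$2 == "Ready" && $3 ~ /master/ { n++ } END { print n + 0 }'
}

# Made-up sample lines for illustration
sample='master1   Ready      master   10m   v1.16.0
master2   Ready      master   2m    v1.16.0
worker1   NotReady   <none>   1m    v1.16.0'

printf '%s\n' "$sample" | count_ready_masters
```

On master1 the function could be fed with live data: kubectl get nodes --no-headers | count_ready_masters.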

Conclusion: Our control plane is now ready, and it is time to join all the worker nodes to it.

