Kubernetes HA Cluster Setup

Installation of Kubernetes

The objective is to deploy a highly available Kubernetes cluster across three control plane nodes, ensuring fault tolerance and scalability.

Overview

The setup involves installing Kubernetes on three control plane nodes using kubeadm as the installation tool and containerd as the container runtime. The installation process is automated with Ansible, requiring familiarity with Ansible playbooks and scripts.

Prerequisites

Before proceeding with the installation, ensure the following:

  1. Ansible Knowledge:
    Familiarity with Ansible and writing Ansible playbooks is necessary, as the installation process is automated using Ansible scripts.
  2. Prepared Installation Scripts:
    The installation scripts for Kubernetes deployment are pre-written and available as Ansible playbooks.
  3. System Requirements:
    • Three physical or virtual nodes with at least two network interfaces each.
    • Each node should meet Kubernetes hardware requirements (CPU, RAM, and disk).
  4. Networking Configuration:
    Ensure proper network setup for both control plane communication and pod networking.

Implementation Details

  1. Installation Tool:
    • Use kubeadm for initializing and configuring the Kubernetes cluster.
  2. Container Runtime:
    • containerd is used as the container runtime to manage containers efficiently.

Next Steps

  1. Prepare the nodes with the required dependencies (e.g., loading the required kernel modules, and either disabling swap or configuring the kubelet not to fail when swap is enabled); a sketch of typical preparation commands follows this list.
  2. Use the Ansible scripts to automate the installation and configuration of kubeadm, containerd, and necessary Kubernetes components.
  3. Initialize the cluster on one control plane node and join the remaining nodes to form a multi-master setup.
  4. Verify the cluster’s health and ensure all control plane nodes are operational.
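
A minimal sketch of the node preparation that the Ansible playbook is expected to automate (shown here for manual reference; adjust package names and paths to your distribution):

# Load the kernel modules required by containerd and Kubernetes networking
sudo modprobe overlay
sudo modprobe br_netfilter
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF

# Apply the sysctl settings Kubernetes relies on
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                 = 1
EOF
sudo sysctl --system

# Either disable swap, or keep it enabled and set --fail-swap-on=false as described later
sudo swapoff -a
sudo sed -i '/ swap / s/^/#/' /etc/fstab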

The installation process uses a pre-configured Ansible playbook located in the DevOps-Automation-Scripts repository. The playbook automates the installation and configuration of Kubernetes on the target nodes.

Steps for Installation

1. Prepare the Environment

  1. Clone the Repository
    Clone the repository containing the installation playbook (the playbooks live under ansible_scripts, browsable at https://github.com/DevOps-Model/DevOps-Automation-Scripts/tree/main/ansible_scripts) and change into that directory:

    git clone https://github.com/DevOps-Model/DevOps-Automation-Scripts.git
    cd DevOps-Automation-Scripts/ansible_scripts
    
  2. Locate the Playbook
    The Kubernetes installation playbook is located at:

    install_k8s.yaml
    

2. Update the Inventory File

  • Open the inventory file referenced in the playbook (hosts file).
  • Update it with the IP addresses or hostnames of the target nodes. For example:
    [kubernetes_nodes]
    node1 ansible_host=192.168.1.10 ansible_user=root
    node2 ansible_host=192.168.1.11 ansible_user=root
    node3 ansible_host=192.168.1.12 ansible_user=root
    

3. Update the Ansible Configuration

  • Open the ansible.cfg file.
  • Modify the configuration to match your environment. Key sections to update include:
    • inventory: Path to your updated hosts file.
    • remote_user: User for accessing target nodes.
    • private_key_file: Path to your private SSH key (if applicable).

4. Set Up Passwordless SSH Access

  • Ensure the target nodes can be accessed without a password using SSH keys:
    ssh-keygen -t rsa
    ssh-copy-id root@<target-node-IP>
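  • Optionally, confirm that Ansible can reach all target nodes before running the playbook (a hedged check, assuming the inventory file is named hosts as referenced above):
    ansible -i hosts all -m ping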
    

5. Run the Ansible Playbook

  • Execute the playbook to install Kubernetes:
    ansible-playbook install_k8s.yaml
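  • If the inventory path is not set in ansible.cfg, you can pass it explicitly; a syntax check beforehand is also harmless (a hedged example, assuming the inventory file is named hosts):
    ansible-playbook install_k8s.yaml --syntax-check
    ansible-playbook -i hosts install_k8s.yaml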
    

6. Verify Installation

  • After the playbook completes, check the Kubernetes installation:
    systemctl status containerd
    systemctl status kubelet
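  • You can also confirm the Kubernetes binaries were installed and report their versions (a hedged check):
    kubeadm version
    kubectl version --client
    kubelet --version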
    

7. Managing Swap in Kubernetes

By default, Kubernetes requires swap to be disabled for proper functioning of the kubelet service. However, if disabling swap is not feasible, you can configure Kubernetes to allow swap by modifying the kubelet configuration.

Steps to Configure Kubernetes with Swap Enabled
  1. Edit the Kubelet Configuration File

    • Open the kubelet default configuration file located at:

      /etc/default/kubelet
      
    • Add the following argument to allow the kubelet to run even when swap is enabled:

      KUBELET_EXTRA_ARGS="--fail-swap-on=false"
      
  2. Restart the Kubelet Service

    • Apply the changes by restarting the kubelet service:
      systemctl restart kubelet
      
  3. Verify the Kubelet Status

    • Check that the kubelet is running successfully:
      systemctl status kubelet
      

    Note: until the cluster is initialized with kubeadm, the kubelet service typically remains in the activating state and restarts in a loop; it becomes active once kubeadm init (or kubeadm join) completes on the node.

Configuring containerd to Use a Custom Data Directory

In Kubernetes environments, containerd serves as the container runtime and stores all images and runtime data in its default directory, typically located at /var/lib/containerd. If the default directory does not have sufficient storage, you can configure containerd to use a custom data directory.

  • The new directory should have sufficient disk space to handle container images and runtime data.
  • If there is existing data in /var/lib/containerd that you wish to preserve, consider manually migrating it to the new directory before restarting the service.
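
If /etc/containerd/config.toml does not exist yet, or if you need to migrate existing data, a hedged sketch of the typical commands is:

# Generate a default configuration file if one is not already present
containerd config default | sudo tee /etc/containerd/config.toml

# Optional: migrate existing images and runtime data to the new directory
# (the /data3/containerd path matches the example used below)
sudo systemctl stop containerd
sudo rsync -aHAX /var/lib/containerd/ /data3/containerd/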

Steps to Reconfigure containerd

  1. Edit the containerd Configuration File

    • Open the configuration file using a text editor:
      sudo nano /etc/containerd/config.toml
      
  2. Modify the Data Directory Path

    • Locate the root parameter in the configuration file and change it to your desired directory. For example:
      root = "/data3/containerd"
      
    • Replace /data3/containerd with a path on a disk or partition that has adequate storage.
  3. Restart the containerd Service

    • Reload the systemd daemon and restart the containerd service to apply the changes:
      sudo systemctl daemon-reload
      sudo systemctl restart containerd
      
  4. Verify the New Configuration

    • Confirm that containerd is running and using the new directory:
      sudo systemctl status containerd
      

By reconfiguring containerd in this way, you can ensure it operates effectively on systems with custom storage requirements.

Configuring the Load Balancer

In this setup, the load balancer is a dedicated bare-metal CPU node (172.21.0.63) and is not part of the Kubernetes cluster nodes. The load balancer is configured using HAProxy to distribute traffic between the control plane nodes.

Step 1: Install HAProxy

  1. Install HAProxy on the Load Balancer Node
    Run the following command to install HAProxy on the node (172.21.0.63):
    sudo apt update
    sudo apt install haproxy -y
    

Step 2: Configure HAProxy

  1. Edit the HAProxy Configuration File
    Open the configuration file:

    sudo nano /etc/haproxy/haproxy.cfg
    
  2. Replace the File Content with the Following Configuration:

    global
      maxconn 1028
      daemon
      user haproxy
      group haproxy
    
    defaults
      timeout connect 5000ms
      timeout client 50000ms
      timeout server 50000ms
    
    frontend fe-apiserver
      bind 0.0.0.0:6443
      mode tcp
      option tcplog
      default_backend be-apiserver
      http-request set-header X-Forwarded-Proto https
    
    backend be-apiserver
      mode tcp
      option tcplog
      option tcp-check
      balance roundrobin
      default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 maxqueue 256 weight 100
      server master1 172.21.0.20:6443 check
      server master2 172.21.0.32:6443 check
      server master3 172.21.0.36:6443 check
      #server master4 172.21.1.41:6443 check
    
    listen stats
      bind *:9500
      mode http
      stats enable
      stats uri /stats
      stats auth haproxy:1!Qhaproxy
    

    Explanation of Key Sections:

    • Frontend (fe-apiserver):
      • Listens on port 6443 and forwards API server traffic to the backend servers.
      • Note: the http-request set-header line has no effect here because the frontend runs in mode tcp (HAProxy applies http-request rules only in HTTP mode); it can be removed without changing behavior.
    • Backend (be-apiserver):
      • Distributes traffic to the control plane nodes using round-robin load balancing.
      • Each control plane node is checked for availability using TCP health checks.

Step 3: Restart and Verify HAProxy

  1. Restart the HAProxy Service
    Apply the configuration changes by restarting the HAProxy service:

    sudo systemctl restart haproxy
    
  2. Verify HAProxy Status
    Check if HAProxy is running correctly:

    sudo systemctl status haproxy
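  3. Optional Additional Checks
    Validate the configuration file and confirm HAProxy is listening on the expected ports (a hedged check):

    sudo haproxy -c -f /etc/haproxy/haproxy.cfg
    sudo ss -ltnp | grep haproxy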
    

Initializing the Kubernetes Cluster

Before initializing the Kubernetes cluster, ensure that the load balancer is configured to facilitate communication between the three control plane nodes. This ensures high availability and load balancing for the API server.

Step 1: Configure the Load Balancer

  1. Add an entry to the /etc/hosts file on all nodes to resolve the load balancer’s hostname (IP address first, then the hostname):
    172.21.0.63  k8s-cluster01-api
    
    Here:
    • k8s-cluster01-api: The hostname of the load balancer.
    • 172.21.0.63: The IP address of the load balancer.
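
    A hedged one-liner to append this entry on each node:

    echo "172.21.0.63  k8s-cluster01-api" | sudo tee -a /etc/hosts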

Step 2: Initialize the Cluster with kubeadm

Use the kubeadm init command to initialize the Kubernetes control plane on the first node:

kubeadm init --control-plane-endpoint="k8s-cluster01-api:6443" \
--upload-certs \
--apiserver-advertise-address=172.21.0.20 \
--pod-network-cidr=10.244.0.0/16
  • --control-plane-endpoint="k8s-cluster01-api:6443"

    • Specifies the endpoint through which all control plane nodes communicate.
    • The hostname (k8s-cluster01-api) resolves to the load balancer’s IP (172.21.0.63) to ensure high availability.
    • :6443 denotes the default port for the Kubernetes API server.
  • --upload-certs

    • Uploads the certificates needed for establishing communication between control plane nodes.
    • These certificates will be used when joining additional control plane nodes to the cluster.
  • --apiserver-advertise-address=172.21.0.20

    • Specifies the IP address on which the API server will advertise itself.
    • This should be the IP of the current node being initialized as part of the control plane.
  • --pod-network-cidr=10.244.0.0/16

    • Configures the CIDR range for the pod network.
    • This example uses 10.244.0.0/16, which is required for the Flannel network plugin. Adjust the CIDR range if using a different CNI plugin.

The sample output, including the generated join commands, is:

kubeadm join k8s-cluster01-api:6443 --token xiysz5.xt9t78i6r2aluwhr \
        --discovery-token-ca-cert-hash sha256:0d23b55054fc27496b72353d262a9436f5f2bbdfefcb61cfe8f8bac77a4e49ca \
        --control-plane --certificate-key a415c31b6c1762d89d2f3c46a22b430b5635106f83d2538172c7681ea57239b6

Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join k8s-cluster01-api:6443 --token xiysz5.xt9t78i6r2aluwhr \
        --discovery-token-ca-cert-hash sha256:0d23b55054fc27496b72353d262a9436f5f2bbdfefcb61cfe8f8bac77a4e49ca 

Step 3: Post-Initialization Steps

  1. After the initialization, the command output will include instructions to:

    • Save the join command for worker nodes.
    • Use the kubeconfig file to interact with the cluster.
  2. To make kubectl commands available for your user, run:

    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config
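  3. Optionally, confirm that kubectl can reach the API server through the load balancer endpoint (a hedged check):

    kubectl cluster-info
    kubectl get nodes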
    

Install and Configure the Calico CNI Plugin

Calico is a popular CNI (Container Network Interface) plugin for Kubernetes that provides network connectivity and network policy management. Below are the steps to install and configure Calico using the Tigera operator and custom resources.

Step 1: Install the Calico Operator

To install the Calico operator in your cluster, which manages Calico’s lifecycle, use the following command:

kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.29.1/manifests/tigera-operator.yaml

The operator itself is installed in the tigera-operator namespace; it then deploys the Calico components into the calico-system namespace.
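
To confirm the operator is running, you can check its pod (a hedged verification):

kubectl get pods -n tigera-operator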

Step 2: Download the Custom Resources for Calico

After installing the operator, download the custom resources necessary to configure Calico networking. Run the following curl command to fetch the resources:

curl https://raw.githubusercontent.com/projectcalico/calico/v3.29.1/manifests/custom-resources.yaml -O

This file contains the necessary custom resources that will configure Calico’s network settings, including IP address management (CIDR blocks for pod networking).

Step 3: Update the Pod Network CIDR

Before applying the custom resources, you need to modify the CIDR block for pod networking so that it matches the pod network CIDR (10.244.0.0/16) used during kubeadm init:

  1. Open the custom-resources.yaml file in a text editor, such as vi:

    vi custom-resources.yaml
    
  2. Find the cidr field under spec.calicoNetwork.ipPools. Change it from the default 192.168.0.0/16 to 10.244.0.0/16 (or your preferred CIDR block for pod networking):

    cidr: 10.244.0.0/16
    
  3. Save and close the file.
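
For reference, after the change the relevant portion of custom-resources.yaml typically looks like the excerpt below (field names and defaults can vary slightly between Calico versions, so treat this as a sketch rather than the exact file contents):

apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  calicoNetwork:
    ipPools:
    - blockSize: 26
      cidr: 10.244.0.0/16
      encapsulation: VXLANCrossSubnet
      natOutgoing: Enabled
      nodeSelector: all()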

Step 4: Apply the Custom Resources

Once you’ve modified the custom-resources.yaml file, apply it to your cluster to configure Calico:

kubectl create -f custom-resources.yaml

This command will create the necessary resources and configuration to set up Calico as the CNI plugin for your Kubernetes cluster.

Step 5: Verify the Calico Installation

After applying the resources, you can verify that Calico has been installed and is running correctly by checking the status of the Calico pods:

watch kubectl get pods -n calico-system

You should see the Calico node pods starting up. A successful installation will show output similar to the following:

NAME                                      READY   STATUS    RESTARTS   AGE
calico-kube-controllers-85469dc84-sb5jr   1/1     Running   0          5d20h
calico-node-fwwqb                         1/1     Running   0          5d20h
calico-node-qtr4n                         1/1     Running   0          5d20h
calico-node-wcrqg                         1/1     Running   0          5d20h
calico-typha-7f88d5764c-cjzvs             1/1     Running   0          5d20h
calico-typha-7f88d5764c-vh5cj             1/1     Running   0          5d20h
csi-node-driver-4q8md                     2/2     Running   0          5d20h
csi-node-driver-kchsw                     2/2     Running   0          5d20h
csi-node-driver-mgcvt                     2/2     Running   0          5d20h

The calico-node pods should be in the Running state, indicating that Calico is properly installed and managing network connectivity.

With Calico installed and configured, your Kubernetes cluster is now equipped with networking capabilities, including support for network policies to control pod-to-pod communication.

Step 6: Verify Node Status

Once the Calico CNI plugin is installed and the necessary resources have been created, run the following command to check the status of the nodes in your Kubernetes cluster:

kubectl get nodes

The output should show that all your nodes are in the Ready state, indicating that they are successfully registered and able to communicate within the cluster:

NAME           STATUS   ROLES           AGE   VERSION
master-node1   Ready    control-plane   10m   v1.23.0
master-node2   Ready    control-plane   10m   v1.23.0
master-node3   Ready    control-plane   10m   v1.23.0

Step 7: Verify Pod Status (Including CoreDNS)

Next, check the status of the pods in the kube-system namespace to ensure that all the necessary system pods, especially the CoreDNS pods, are running properly. Run the following command:

kubectl get pods -n kube-system

You should see that all the key system pods, including the coredns pods, are in the Running and Ready state. A successful output should look something like this:

NAME                READY   STATUS    RESTARTS   AGE
coredns-abc123      1/1     Running   0          3m
coredns-def456      1/1     Running   0          3m
kube-proxy-xyz456   1/1     Running   0          5m

Ensure that the CoreDNS pods are listed and their READY column shows 1/1, which means they are functioning correctly. If any of the pods are not in a Running state or are not Ready, further troubleshooting may be required.

By running the above commands and verifying the output, you can confirm that the Calico CNI plugin is properly configured, the nodes are in a Ready state, and the essential system pods, including CoreDNS, are up and running.

Step 8: Join Additional Control Plane Nodes to the Cluster

After initializing the Kubernetes cluster on the first control plane node, you can join the other control plane nodes to the cluster. Use the following kubeadm join command to add them:

kubeadm join k8s-cluster01-api:6443 --token <your-token> \
        --discovery-token-ca-cert-hash sha256:<your-hash> \
        --control-plane --certificate-key <your-certificate-key>
  • Replace <your-token> with the token generated during the initialization step.
  • Replace <your-hash> with the sha256 discovery token CA certificate hash.
  • Replace <your-certificate-key> with the certificate key from the --upload-certs output. The --control-plane flag is what makes the node join as a control plane node instead of a worker.

The actual command used for the existing cluster is:

  kubeadm join k8s-cluster01-api:6443 --token xiysz5.xt9t78i6r2aluwhr \
        --discovery-token-ca-cert-hash sha256:0d23b55054fc27496b72353d262a9436f5f2bbdfefcb61cfe8f8bac77a4e49ca \
        --control-plane --certificate-key a415c31b6c1762d89d2f3c46a22b430b5635106f83d2538172c7681ea57239b6

If you do not have the kubeadm join command or if you’ve lost the token/hash, you can regenerate the necessary information on the initial control plane node:

kubeadm token create --print-join-command

This prints a join command containing a fresh token and the discovery token CA certificate hash. Note that the printed command on its own joins a node as a worker; to join a node as an additional control plane node, also regenerate a certificate key with kubeadm init phase upload-certs --upload-certs and append --control-plane --certificate-key <certificate-key> to the printed command, as in the sketch below.
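
A hedged sketch of the full sequence for joining an additional control plane node (placeholders shown in angle brackets):

# On the first control plane node: print the basic join command (token + CA cert hash)
kubeadm token create --print-join-command

# On the first control plane node: re-upload the control plane certificates and print a fresh certificate key
sudo kubeadm init phase upload-certs --upload-certs

# On the node being added as a control plane node: combine the two outputs
sudo kubeadm join k8s-cluster01-api:6443 --token <token> \
        --discovery-token-ca-cert-hash sha256:<hash> \
        --control-plane --certificate-key <certificate-key>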

Step 9: Verify Node Status After Joining

Once the other control plane nodes have successfully joined the cluster, run the following command again to check the status of all the nodes:

kubectl get nodes

All the control plane nodes should now be listed in the Ready state.

By following these steps, you can join additional control plane nodes to your Kubernetes cluster and ensure that they are properly integrated and ready to handle workloads.

Step 10: Wait for Pods to Reach Running and Ready State

After successfully joining all the control plane nodes to the cluster, it’s important to wait for the system pods, including Calico and CoreDNS pods, to reach the Running and Ready state. This ensures that the Kubernetes cluster is fully operational and the networking is correctly configured.

  1. Verify Pods in calico-system Namespace:

    Run the following command to check the status of the Calico pods:

    watch kubectl get pods -n calico-system
    

    The calico-node pods should eventually show a status of Running and Ready. The output should look like:

    NAME                READY   STATUS    RESTARTS   AGE
    calico-node-xyz123  1/1     Running   0          5m
    
  2. Verify Pods in kube-system Namespace:

    Similarly, verify the status of the system pods in the kube-system namespace by running:

    watch kubectl get pods -n kube-system
    

    Make sure all essential pods, especially coredns, are in the Running and Ready state. The expected output might look like:

    NAME                READY   STATUS    RESTARTS   AGE
    coredns-abc123      1/1     Running   0          5m
    coredns-def456      1/1     Running   0          5m
    kube-proxy-xyz456   1/1     Running   0          5m
    

    All pods should show READY as 1/1, indicating they are running as expected.

Step 11: Verify Cluster Health

Once all the pods are in the Running and Ready state, it’s a good practice to check the overall health of the Kubernetes cluster:

kubectl get componentstatus

You should see the status of key components reported as Healthy. Note that kubectl get componentstatus lists the scheduler, controller manager, and etcd, and is deprecated in recent Kubernetes releases; a hedged alternative using the API server health endpoints is shown below.
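
These API server health endpoints aggregate the individual health checks and work on current Kubernetes versions (a hedged alternative, assuming kubectl is already configured for the cluster):

kubectl get --raw='/readyz?verbose'
kubectl get --raw='/livez?verbose'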

By following these steps, you can ensure that all control plane nodes are properly joined to the cluster and the system pods are running as expected.

Step 12: Run a Test NGINX Pod

To verify that your Kubernetes cluster is fully functional, you can deploy a test NGINX pod. This will help ensure that the cluster is able to schedule and run workloads properly.

  1. Create the NGINX Pod:

    Run the following command to create an NGINX pod in the default namespace:

    kubectl run nginx --image=nginx --restart=Never
    

    This command creates a standalone pod named nginx (rather than a Deployment) using the official NGINX image; with --restart=Never, the container is not restarted automatically when it exits.

  2. Verify the Pod Status:

    To check the status of the NGINX pod, run:

    kubectl get pods -o wide
    

    The output should show the nginx pod in the Running and Ready state:

    NAME    READY   STATUS    RESTARTS   AGE
    nginx   1/1     Running   0          1m
    

    The READY column should show 1/1, indicating that the pod is running and all containers inside the pod are ready.

  3. Delete the Test NGINX Pod (Optional):

    If you no longer need the test pod, you can delete it by running:

    kubectl delete pod nginx
    

    This will remove the nginx pod from the cluster.

By following these steps, you can confirm that your Kubernetes cluster is able to run and manage pods successfully.

Managing Network Complexity in Kubernetes Cluster

To avoid the complexity of network setup caused by multiple IP ranges, custom IP rules were configured for seamless communication within the Kubernetes cluster. The network setup in the environment is as follows:

  • 172.21.x.x: Used for bridged network.
  • 10.10.x.x: Used for the host-only network for KVM (Kernel-based Virtual Machine).
  • 192.168.x.x: Used for Airtel network connectivity.

Given this complexity, Kubernetes pods were picking up addresses and routes from different networks, which created networking issues. To resolve this, we configured the Kubernetes pod network CIDR (10.244.0.0/16) and applied a custom IP rule to ensure proper routing within the cluster.

Custom IP Rule Configuration

The following custom rule was added to the server to allow Kubernetes pods to communicate correctly:

from all to 10.244.0.0/16 lookup main

This rule ensures that traffic destined for the Kubernetes pod network (10.244.0.0/16) is routed properly through the main routing table, enabling seamless communication within the cluster.
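
A hedged sketch of how such a rule can be added and inspected with iproute2 (the preference value 100 is an arbitrary example, and the rule does not persist across reboots unless added to your network configuration):

sudo ip rule add to 10.244.0.0/16 lookup main pref 100

ip rule show
ip route get 10.244.1.10   # example pod IP; should now resolve via the main routing table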

With this configuration, the Kubernetes cluster was successfully able to manage network complexities and ensure reliable pod communication.

Resolving Calico CNI Issues:

After setting up the Kubernetes cluster with Calico as the CNI, an issue was identified where the calico-node-* pods were not Ready. The root cause was that BGP (Border Gateway Protocol) sessions between the nodes were not being established, so the BIRD routing daemon inside the pods reported itself as not ready for node communication.

Steps to Resolve:

To resolve this, the Calico node address autodetection configuration was modified to ensure proper communication between nodes:

  1. Run the following command to edit the Calico installation configuration:

    kubectl edit installation default -n calico-system
    
  2. Locate the nodeAddressAutodetectionV4 field (under spec.calicoNetwork) and change it to detect the virbr.* interfaces:

      nodeAddressAutodetectionV4:
        interface: virbr.*
    
    
  3. Check that the calico-node-X pods are Running and Ready:

    kubectl get pods -n calico-system
    

    The expected output is:

    NAME                                      READY   STATUS    RESTARTS   AGE
    calico-kube-controllers-85469dc84-sb5jr   1/1     Running   0          5d23h
    calico-node-fwwqb                         1/1     Running   0          5d22h
    calico-node-qtr4n                         1/1     Running   0          5d23h
    calico-node-wcrqg                         1/1     Running   0          5d22h
    calico-typha-7f88d5764c-cjzvs             1/1     Running   0          5d22h
    calico-typha-7f88d5764c-vh5cj             1/1     Running   0          5d23h
    csi-node-driver-4q8md                     2/2     Running   0          5d23h
    csi-node-driver-kchsw                     2/2     Running   0          5d22h
    csi-node-driver-mgcvt                     2/2     Running   0          5d23h
    
  4. Create five NGINX replicas as a test, then ping pods that are scheduled on other Kubernetes nodes to make sure cross-node pod communication works properly (see the sketch after this section).

This change ensures that the Calico nodes can correctly detect the virtual bridge interfaces (virbr.*) on each node, enabling proper communication and BGP establishment.
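
A hedged sketch of the cross-node connectivity test mentioned above (deployment and pod names are illustrative):

kubectl create deployment nginx-test --image=nginx --replicas=5
kubectl get pods -o wide -l app=nginx-test    # note the node and pod IP of each replica

# From a temporary pod, ping a replica running on a different node (busybox provides ping)
kubectl run pingtest --image=busybox --restart=Never -it --rm -- ping -c 3 <pod-IP-on-another-node>

kubectl delete deployment nginx-test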