The objective is to deploy a highly available Kubernetes cluster across three control plane nodes, ensuring fault tolerance and scalability.
The setup installs Kubernetes on the three control plane nodes using kubeadm as the installation tool and containerd as the container runtime. The installation is automated with Ansible, so familiarity with Ansible playbooks and scripts is assumed.
Before proceeding with the installation, ensure the following:
kubeadm is available for initializing and configuring the Kubernetes cluster.
containerd is used as the container runtime to manage containers efficiently.
The target nodes can install kubeadm, containerd, and the necessary Kubernetes components.
The installation process uses a pre-configured Ansible playbook located in the DevOps-Automation-Scripts repository. The playbook automates the installation and configuration of Kubernetes on the target nodes.
Clone the Repository
Clone the repository containing the installation playbook and change into the Ansible scripts directory:
git clone https://github.com/DevOps-Model/DevOps-Automation-Scripts.git
cd DevOps-Automation-Scripts/ansible_scripts
Locate the Playbook
The Kubernetes installation playbook is located at:
install_k8s.yaml
Update the Inventory
Add the target nodes to the Ansible inventory (the hosts file):
[kubernetes_nodes]
node1 ansible_host=192.168.1.10 ansible_user=root
node2 ansible_host=192.168.1.11 ansible_user=root
node3 ansible_host=192.168.1.12 ansible_user=root
Configure ansible.cfg
Review the following settings in the ansible.cfg file:
inventory: Path to your updated hosts file.
remote_user: User for accessing target nodes.
private_key_file: Path to your private SSH key (if applicable).
Set Up SSH Access
Generate an SSH key pair and copy the public key to each target node:
ssh-keygen -t rsa
ssh-copy-id root@<target-node-IP>
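Optionally, run a quick connectivity check before launching the playbook. This is a sketch that assumes the inventory file is named hosts and sits in the current directory:
# Ping every node in the kubernetes_nodes group over SSH via Ansible's ping module
ansible -i hosts kubernetes_nodes -m ping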
Run the Playbook
Execute the playbook to install and configure Kubernetes on all target nodes:
ansible-playbook install_k8s.yaml
Verify the Installation
On each node, confirm that the container runtime and the kubelet service are installed and running:
systemctl status containerd
systemctl status kubelet
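As an additional check, you can confirm which versions were installed; this is a quick sketch, and the exact versions depend on the playbook:
# Report the installed tool versions on a node
kubeadm version
kubelet --version
kubectl version --client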
By default, Kubernetes requires swap to be disabled for proper functioning of the kubelet service. However, if disabling swap is not feasible, you can configure Kubernetes to allow swap by modifying the kubelet configuration.
Edit the Kubelet Configuration File
Open the kubelet default configuration file located at:
/etc/default/kubelet
Add the following argument to allow the kubelet to run even when swap is enabled:
KUBELET_EXTRA_ARGS="--fail-swap-on=false"
Restart the Kubelet Service
systemctl restart kubelet
Verify the Kubelet Status
systemctl status kubelet
At this point the kubelet may still show an activating (auto-restart) status; this is expected until the node is initialized or joined with kubeadm, after which it settles into the active (running) state.
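If disabling swap is an option after all, a common approach looks like the sketch below; the sed pattern is an assumption about how the swap entry appears in /etc/fstab, so review the file before and after editing:
# Turn swap off for the current boot
sudo swapoff -a
# Comment out swap entries in /etc/fstab so swap stays off after a reboot
sudo sed -i '/\sswap\s/ s/^/#/' /etc/fstab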
In Kubernetes environments, containerd serves as the container runtime and stores all images and runtime data in its default directory, typically /var/lib/containerd. If the default directory does not have sufficient storage, you can configure containerd to use a custom data directory.
Note: If there is existing data under /var/lib/containerd that you wish to preserve, consider manually migrating it to the new directory before restarting the service.
Edit the containerd Configuration File
sudo nano /etc/containerd/config.toml
Modify the Data Directory Path
Locate the root parameter in the configuration file and change it to your desired directory. For example:
root = "/data3/containerd"
Replace /data3/containerd with a path on a disk or partition that has adequate storage.
Restart the containerd Service
Restart the containerd service to apply the changes:
sudo systemctl daemon-reload
sudo systemctl restart containerd
Verify the New Configuration
Confirm that containerd is running and using the new directory:
sudo systemctl status containerd
By reconfiguring containerd in this way, you can ensure it operates effectively on systems with custom storage requirements.
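To double-check that the new data directory is actually in effect, the following sketch can help; it assumes the example path /data3/containerd used above:
# Print the effective root directory from containerd's merged configuration
sudo containerd config dump | grep '^root'
# New image and runtime data should now accumulate under the custom directory
sudo ls /data3/containerd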
In this setup, the load balancer is a dedicated bare-metal CPU node (172.21.0.63) and is not part of the Kubernetes cluster nodes. The load balancer is configured using HAProxy to distribute traffic between the control plane nodes.
Install HAProxy
Run the following commands on the load balancer node (172.21.0.63):
sudo apt update
sudo apt install haproxy -y
Edit the HAProxy Configuration File
Open the configuration file:
sudo nano /etc/haproxy/haproxy.cfg
Replace the File Content with the Following Configuration:
global
    maxconn 1028
    daemon
    user haproxy
    group haproxy

defaults
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms

frontend fe-apiserver
    bind 0.0.0.0:6443
    mode tcp
    option tcplog
    default_backend be-apiserver
    http-request set-header X-Forwarded-Proto https

backend be-apiserver
    mode tcp
    option tcplog
    option tcp-check
    balance roundrobin
    default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 maxqueue 256 weight 100
    server master1 172.21.0.20:6443 check
    server master2 172.21.0.32:6443 check
    server master3 172.21.0.36:6443 check
    #server master4 172.21.1.41:6443 check

listen stats
    bind *:9500
    mode http
    stats enable
    stats uri /stats
    stats auth haproxy:1!Qhaproxy
Explanation of Key Sections:
Frontend (fe-apiserver): Listens on port 6443 and forwards API requests to the backend servers; it also sets the X-Forwarded-Proto header for HTTPS traffic.
Backend (be-apiserver): Uses TCP health checks and round-robin balancing to distribute API traffic across the three control plane nodes (master1, master2, master3).
Restart the HAProxy Service
Apply the configuration changes by restarting the HAProxy service:
sudo systemctl restart haproxy
Verify HAProxy Status
Check if HAProxy is running correctly:
sudo systemctl status haproxy
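For an extra check, you can validate the configuration syntax and confirm the listening ports; this sketch assumes the iproute2 ss utility is installed on the load balancer:
# Validate the HAProxy configuration file without restarting the service
sudo haproxy -c -f /etc/haproxy/haproxy.cfg
# Confirm HAProxy is listening on the API server port and the stats port
sudo ss -tlnp | grep -E ':6443|:9500'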
Before initializing the Kubernetes cluster, ensure that the load balancer is configured to facilitate communication between the three control plane nodes. This ensures high availability and load balancing for the API server.
Add the following entry to the /etc/hosts file on all nodes so the load balancer's hostname resolves correctly:
172.21.0.63 k8s-cluster01-api
k8s-cluster01-api: The hostname of the load balancer.
172.21.0.63: The IP address of the load balancer.
Use the kubeadm init command to initialize the Kubernetes control plane on the first node:
kubeadm init --control-plane-endpoint="k8s-cluster01-api:6443" \
--upload-certs \
--apiserver-advertise-address=172.21.0.20 \
--pod-network-cidr=10.244.0.0/16
--control-plane-endpoint="k8s-cluster01-api:6443": The hostname (k8s-cluster01-api) resolves to the load balancer's IP (172.21.0.63) to ensure high availability; :6443 denotes the default port for the Kubernetes API server.
--upload-certs: Uploads the control plane certificates to the cluster so that the other control plane nodes can retrieve them when joining.
--apiserver-advertise-address=172.21.0.20: The IP address this node's API server advertises to the other cluster members.
--pod-network-cidr=10.244.0.0/16: Sets the pod network CIDR to 10.244.0.0/16, which is required for the Flannel network plugin. Adjust the CIDR range if using a different CNI plugin.
The sample output is:
kubeadm join k8s-cluster01-api:6443 --token xiysz5.xt9t78i6r2aluwhr \
--discovery-token-ca-cert-hash sha256:0d23b55054fc27496b72353d262a9436f5f2bbdfefcb61cfe8f8bac77a4e49ca \
--control-plane --certificate-key a415c31b6c1762d89d2f3c46a22b430b5635106f83d2538172c7681ea57239b6
Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join k8s-cluster01-api:6443 --token xiysz5.xt9t78i6r2aluwhr \
--discovery-token-ca-cert-hash sha256:0d23b55054fc27496b72353d262a9436f5f2bbdfefcb61cfe8f8bac77a4e49ca
After initialization, the command output also includes instructions for configuring kubectl access and the join commands shown above.
To make kubectl commands available for your user, run:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
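A quick check at this stage confirms the API server responds through the load balancer endpoint; note that the node will typically report NotReady until a CNI plugin is installed in the next steps:
# Show the control plane endpoint the client is talking to
kubectl cluster-info
# List the nodes; expect NotReady until the pod network is installed
kubectl get nodes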
Calico is a popular CNI (Container Network Interface) plugin for Kubernetes that provides network connectivity and network policy management. Below are the steps to install and configure Calico using the operator and custom resources.
To install the Calico operator in your cluster, which manages Calico’s lifecycle, use the following command:
kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.29.1/manifests/tigera-operator.yaml
The manifest deploys the Calico operator into the tigera-operator namespace; the operator then creates and manages the Calico components in the calico-system namespace.
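Before moving on, you can confirm that the operator pod has started; this assumes the default namespace created by the manifest above:
# The operator itself runs in the tigera-operator namespace
kubectl get pods -n tigera-operator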
After installing the operator, download the custom resources necessary to configure Calico networking. Run the following curl command to fetch the resources:
curl https://raw.githubusercontent.com/projectcalico/calico/v3.29.1/manifests/custom-resources.yaml -O
This file contains the necessary custom resources that will configure Calico’s network settings, including IP address management (CIDR blocks for pod networking).
Before applying the custom resources, you need to modify the CIDR block for pod networking:
Open the custom-resources.yaml file in a text editor, such as vi:
vi custom-resources.yaml
In the ipPools section, find the line that defines the pod network cidr. Change it from the default 192.168.0.0/16 to 10.244.0.0/16 (or your preferred CIDR block for pod networking):
cidr: "10.244.0.0/16"
Save and close the file.
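As an alternative to manual editing, the same change can be scripted; this sketch assumes the default CIDR appears exactly once in the file:
# Replace the default pool CIDR and show the result
sed -i 's#192.168.0.0/16#10.244.0.0/16#' custom-resources.yaml
grep -n 'cidr' custom-resources.yaml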
Once you've modified the custom-resources.yaml file, apply it to your cluster to configure Calico:
kubectl create -f custom-resources.yaml
This command will create the necessary resources and configuration to set up Calico as the CNI plugin for your Kubernetes cluster.
After applying the resources, you can verify that Calico has been installed and is running correctly by checking the status of the Calico pods:
watch kubectl get pods -n calico-system
You should see the Calico node pods starting up. A successful installation will show output similar to the following:
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-85469dc84-sb5jr 1/1 Running 0 5d20h
calico-node-fwwqb 1/1 Running 0 5d20h
calico-node-qtr4n 1/1 Running 0 5d20h
calico-node-wcrqg 1/1 Running 0 5d20h
calico-typha-7f88d5764c-cjzvs 1/1 Running 0 5d20h
calico-typha-7f88d5764c-vh5cj 1/1 Running 0 5d20h
csi-node-driver-4q8md 2/2 Running 0 5d20h
csi-node-driver-kchsw 2/2 Running 0 5d20h
csi-node-driver-mgcvt 2/2 Running 0 5d20h
The calico-node pods should be in the Running state, indicating that Calico is properly installed and managing network connectivity.
With Calico installed and configured, your Kubernetes cluster is now equipped with networking capabilities, including support for network policies to control pod-to-pod communication.
Once the Calico CNI plugin is installed and the necessary resources have been created, run the following command to check the status of the nodes in your Kubernetes cluster:
kubectl get nodes
The output should show that all your nodes are in the Ready state, indicating that they are successfully registered and able to communicate within the cluster:
NAME STATUS ROLES AGE VERSION
master-node1 Ready control-plane 10m v1.23.0
master-node2 Ready control-plane 10m v1.23.0
master-node3 Ready control-plane 10m v1.23.0
Next, check the status of the pods in the kube-system namespace to ensure that all the necessary system pods, especially the CoreDNS pods, are running properly. Run the following command:
kubectl get pods -n kube-system
You should see that all the key system pods, including the coredns pods, are in the Running and Ready state. A successful output should look something like this:
NAME READY STATUS RESTARTS AGE
calico-node-xyz123 1/1 Running 0 5m
coredns-abc123 1/1 Running 0 3m
coredns-def456 1/1 Running 0 3m
kube-proxy-xyz456 1/1 Running 0 5m
Ensure that the CoreDNS pods are listed and their READY column shows 1/1, which means they are functioning correctly. If any of the pods are not in a Running state or are not Ready, further troubleshooting may be required.
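If a pod is stuck, the following commands are a common starting point for troubleshooting; the pod name is a placeholder:
# Inspect recent events and container state for the pod
kubectl -n kube-system describe pod <pod-name>
# Review the pod's logs for errors
kubectl -n kube-system logs <pod-name>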
By running the above commands and verifying the output, you can confirm that the Calico CNI plugin is properly configured, the nodes are in a Ready state, and the essential system pods, including CoreDNS, are up and running.
After initializing the Kubernetes cluster on the first control plane node, you can join the other control plane nodes to the cluster. Use the following kubeadm join command to add them:
kubeadm join k8s-cluster01-api:6443 --token <your-token> \
--discovery-token-ca-cert-hash sha256:<your-hash> \
--control-plane --certificate-key <your-certificate-key>
Replace <your-token> with the token generated during the initialization step, <your-hash> with the sha256 discovery token CA certificate hash, and <your-certificate-key> with the certificate key printed by the --upload-certs step.
The actual command for the existing cluster is:
kubeadm join k8s-cluster01-api:6443 --token xiysz5.xt9t78i6r2aluwhr \
--discovery-token-ca-cert-hash sha256:0d23b55054fc27496b72353d262a9436f5f2bbdfefcb61cfe8f8bac77a4e49ca \
--control-plane --certificate-key a415c31b6c1762d89d2f3c46a22b430b5635106f83d2538172c7681ea57239b6
If you do not have the kubeadm join command, or if you've lost the token or hash, you can retrieve the necessary information by running the following command on the initial control plane node:
kubeadm token create --print-join-command
This command generates and prints the full kubeadm join command, including the required token and discovery token hash. For additional control plane nodes, also append --control-plane and a current certificate key; if the uploaded certificates have expired, re-upload them as noted in the initialization output.
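A sketch of regenerating both pieces on the first control plane node, based on the commands referenced in the initialization output above:
# Re-upload the control plane certificates and print a fresh certificate key
sudo kubeadm init phase upload-certs --upload-certs
# Print a fresh join command (token plus discovery hash)
kubeadm token create --print-join-command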
Once the other control plane nodes have successfully joined the cluster, run the following command again to check the status of all the nodes:
kubectl get nodes
All the control plane nodes should now be listed in the Ready state.
By following these steps, you can join additional control plane nodes to your Kubernetes cluster and ensure that they are properly integrated and ready to handle workloads.
After successfully joining all the control plane nodes to the cluster, it's important to wait for the system pods, including the Calico and CoreDNS pods, to reach the Running and Ready state. This ensures that the Kubernetes cluster is fully operational and the networking is correctly configured.
Verify Pods in the calico-system Namespace:
Run the following command to check the status of the Calico pods:
watch kubectl get pods -n calico-system
The calico-node pods should eventually show a status of Running and Ready. The output should look like:
NAMESPACE NAME READY STATUS RESTARTS AGE
calico-system calico-node-xyz123 1/1 Running 0 5m
Verify Pods in the kube-system Namespace:
Similarly, verify the status of the system pods in the kube-system namespace by running:
watch kubectl get pods -n kube-system
Make sure all essential pods, especially coredns, are in the Running and Ready state. The expected output might look like:
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-abc123 1/1 Running 0 5m
kube-system coredns-def456 1/1 Running 0 5m
kube-system calico-node-xyz123 1/1 Running 0 5m
kube-system kube-proxy-xyz456 1/1 Running 0 5m
All pods should show READY as 1/1, indicating they are running as expected.
Once all the pods are in the Running and Ready state, it's a good practice to check the overall health of the Kubernetes cluster:
kubectl get componentstatus
You should see the status of key Kubernetes components, including the scheduler, controller manager, and etcd, reported as Healthy.
By following these steps, you can ensure that all control plane nodes are properly joined to the cluster and the system pods are running as expected.
To verify that your Kubernetes cluster is fully functional, you can deploy a test NGINX pod. This will help ensure that the cluster is able to schedule and run workloads properly.
Create the NGINX Pod:
Run the following command to create an NGINX pod in the default namespace:
kubectl run nginx --image=nginx --restart=Never
This command will create a pod named nginx using the official NGINX image and will ensure that the pod doesn't automatically restart once it finishes (since --restart=Never).
Verify the Pod Status:
To check the status of the NGINX pod, run:
kubectl get pods -o wide
The output should show the nginx pod in the Running and Ready state:
NAME READY STATUS RESTARTS AGE
nginx 1/1 Running 0 1m
The READY column should show 1/1, indicating that the pod is running and all containers inside the pod are ready.
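To go one step further, you can check that NGINX actually answers HTTP requests; this sketch assumes curl is available on the machine where you run kubectl:
# Forward local port 8080 to the pod's port 80
kubectl port-forward pod/nginx 8080:80 &
# Request the NGINX welcome page headers through the forwarded port
curl -I http://127.0.0.1:8080
# Stop the background port-forward
kill %1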
Delete the Test NGINX Pod (Optional):
If you no longer need the test pod, you can delete it by running:
kubectl delete pod nginx
This will remove the nginx pod from the cluster.
By following these steps, you can confirm that your Kubernetes cluster is able to run and manage pods successfully.
To avoid the complexity of network setup caused by multiple IP ranges, custom IP rules were configured for seamless communication within the Kubernetes cluster. The network setup in the environment is as follows:
Given this complexity, Kubernetes pods were assuming different network configurations that created networking issues. To resolve this, we configured the Kubernetes pod network CIDR (10.244.0.0/16) and applied a custom IP rule to ensure proper routing within the cluster.
The following custom rule was added to the server to allow Kubernetes pods to communicate correctly:
from all to 10.244.0.0/16 lookup main
This rule ensures that traffic destined for the Kubernetes pod network (10.244.0.0/16) is routed properly through the main routing table, enabling seamless communication within the cluster.
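For reference, a sketch of adding and verifying such a rule follows; the priority value is an assumption and should be chosen to fit the existing rule order on your servers:
# Add a policy routing rule sending pod-network traffic through the main table
sudo ip rule add from all to 10.244.0.0/16 lookup main priority 100
# Confirm the rule is present
ip rule show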
With this configuration, the Kubernetes cluster was successfully able to manage network complexities and ensure reliable pod communication.
After setting up the Kubernetes cluster with Calico as the CNI, an issue was identified where the calico-node-x pods were not ready. This was caused by BGP (Border Gateway Protocol) sessions not being established and the BIRD daemon not being ready for node communication.
To resolve this, the Calico node address autodetection configuration was modified to ensure proper communication between nodes:
Run the following command to edit the Calico installation configuration:
kubectl edit installation default -n calico-system
Locate the nodeAddressAutodetectionV4 field and change it to detect the virbr.* interfaces:
nodeAddressAutodetectionV4:
  interface: virbr.*
Check that the calico-node-X pods are running and ready:
kubectl get pods -n calico-system
The expected output is
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-85469dc84-sb5jr 1/1 Running 0 5d23h
calico-node-fwwqb 1/1 Running 0 5d22h
calico-node-qtr4n 1/1 Running 0 5d23h
calico-node-wcrqg 1/1 Running 0 5d22h
calico-typha-7f88d5764c-cjzvs 1/1 Running 0 5d22h
calico-typha-7f88d5764c-vh5cj 1/1 Running 0 5d23h
csi-node-driver-4q8md 2/2 Running 0 5d23h
csi-node-driver-kchsw 2/2 Running 0 5d22h
csi-node-driver-mgcvt 2/2 Running 0 5d23h
To test, create 5 NGINX replicas and ping pods that are spun up on another Kubernetes node to make sure cross-node communication works properly, as shown in the sketch below.
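A sketch of this cross-node test; the deployment name is arbitrary, and busybox is used for the ping because it ships with the ping utility:
# Create a test deployment with 5 NGINX replicas
kubectl create deployment nginx-test --image=nginx --replicas=5
# List the pod IPs and the nodes they landed on
kubectl get pods -o wide -l app=nginx-test
# From a throwaway busybox pod, ping a pod IP that is running on a different node
kubectl run pingtest --image=busybox --restart=Never -it --rm -- ping -c 3 <pod-IP-on-another-node>
# Clean up the test deployment afterwards
kubectl delete deployment nginx-test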
This change ensures that the Calico nodes can correctly detect the virtual bridge interfaces (virbr.*) on each node, enabling proper communication and BGP establishment.