Hybrid Kubernetes Cluster: Detailed Setup with Linux and Windows Nodes


Automating Kubernetes Cluster Deployment on Windows with Calico Networking

Introduction

Automating Kubernetes deployments is essential for scaling complex infrastructure: it reduces manual configuration, lowers the risk of human error, and keeps environments consistent. This article describes how to automate the deployment of Kubernetes on Windows nodes with Ansible, using Calico as the networking solution. By the end, you should understand how to build a scalable Kubernetes cluster that spans both Linux and Windows nodes.

Kubernetes is a container orchestration system for managing distributed systems at scale. Calico, an open-source networking and network security solution, gives Kubernetes a flexible, scalable, and secure networking layer. The sections below walk through automating the deployment with Ansible, covering the architectural challenges, bottlenecks, and practical strategies for setting up Kubernetes in a heterogeneous environment.

Prerequisites

Before proceeding with the deployment, it is imperative to meet the following prerequisites:

  • Hardware and Software Requirements: A set of target servers is needed to serve as master and worker nodes. These can be either physical servers or virtual machines, running Windows or Linux. The control machine, responsible for orchestration, should have Ansible installed.
  • Installing Ansible and Dependencies: Ansible must be installed on the control machine, which will be responsible for managing remote nodes. Additionally, Python, as a core dependency, should be installed to facilitate Ansible operations.
  • Environment Setup: Prepare both the master and worker nodes by updating their operating systems, installing essential packages, and configuring their network settings to support node-to-node communication. Ensure that all necessary firewall rules are configured to facilitate Kubernetes network traffic, specifically between the master and worker nodes.
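
For the firewall rules mentioned above, the following is a minimal sketch of a helper that lists the ports each node role has to accept. The function name required_ports is hypothetical; the port numbers follow the Kubernetes "Ports and Protocols" reference, plus UDP 4789 for the VXLAN backend that Calico uses later in this article. The list is a starting point under those assumptions, not an exhaustive rule set.

```shell
#!/bin/bash
# Print the ports a node must accept, one "port/proto" entry per line.
# required_ports is a hypothetical helper name, not part of any tool.
required_ports() {
  case "$1" in
    master)
      # API server, etcd, kubelet, controller-manager, scheduler, VXLAN
      printf '%s\n' 6443/tcp 2379:2380/tcp 10250/tcp 10257/tcp 10259/tcp 4789/udp
      ;;
    worker)
      # kubelet, NodePort services, VXLAN
      printf '%s\n' 10250/tcp 30000:32767/tcp 4789/udp
      ;;
    *)
      echo "unknown role: $1" >&2
      return 1
      ;;
  esac
}
```

If you keep ufw enabled instead of disabling it, the output can be fed to it directly, for example: for p in $(required_ports worker); do sudo ufw allow "$p"; done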

Setting Up the Master Node

Environment Preparation

To initiate the setup of the master node, essential dependencies must be installed. The following script (master.sh) facilitates the installation of Python and Ansible, as well as the creation of necessary users to orchestrate the cluster effectively:

#!/bin/bash
# Install Ansible from its PPA, together with Python 3 and ACL support
sudo apt update
sudo apt install software-properties-common --yes
sudo apt-add-repository --yes --update ppa:ansible/ansible
sudo apt update
sudo apt install python3 ansible -y
sudo apt-get install acl -y

# Create the cluster user with an empty password and passwordless sudo.
# sed appends after line 20 of /etc/sudoers, so the file must have at least 20 lines.
sudo useradd -m -s /bin/bash kube-cluster
sudo passwd kube-cluster -d
sudo sed -i '20a kube-cluster  ALL=(ALL:ALL) NOPASSWD: ALL' /etc/sudoers

# Create the Ansible control user the same way
sudo useradd -m -s /bin/bash ansible-control-panel
sudo passwd ansible-control-panel -d
sudo sed -i '20a ansible-control-panel ALL=(ALL:ALL) NOPASSWD: ALL' /etc/sudoers

# Prepare the SSH and playbook directories (-p: do not fail if they already exist)
sudo mkdir -p /home/ansible-control-panel/.ssh
sudo chmod 700 /home/ansible-control-panel/.ssh
sudo chown -R ansible-control-panel:ansible-control-panel /home/ansible-control-panel/.ssh/
sudo mkdir -p /home/ansible-control-panel/ansible
sudo chown -R ansible-control-panel:ansible-control-panel /home/ansible-control-panel/ansible/

# Generate a key pair without a passphrase and print the public key
sudo -u ansible-control-panel ssh-keygen -f /home/ansible-control-panel/.ssh/id_rsa -t rsa -N ''
sudo cat /home/ansible-control-panel/.ssh/id_rsa.pub # Copy id_rsa.pub for authorized_keys on worker nodes
# Add to authorized_keys in master and worker nodes (can be automated with Ansible)

This script performs the following actions:

  • System Update and Package Installation: Updates the system repositories and installs Python and Ansible, which are essential for running Ansible playbooks.
  • User Creation: Creates two users, kube-cluster and ansible-control-panel, which will manage Kubernetes operations and Ansible orchestration, respectively.
  • Passwordless Sudo Access: Grants passwordless sudo privileges to both users by modifying the /etc/sudoers file.
  • SSH Key Generation: Generates SSH keys for the ansible-control-panel user to enable secure communication between the control machine and the nodes.

Adding Keys and Configurations

  • SSH Keys Setup: Setting up secure SSH keys is critical for establishing trusted communication between nodes. The script provided facilitates the generation of SSH keys for the ansible-control-panel user, which is essential for executing automated tasks across nodes.
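
Copying a public key into authorized_keys by hand is easy to repeat by accident. As a minimal sketch, the hypothetical helper add_authorized_key below appends a key only when it is not already present and sets the permissions sshd expects; the function name and argument order are assumptions made for illustration.

```shell
#!/bin/bash
# Append a public key to an authorized_keys file only if it is not there yet.
# Usage: add_authorized_key "<public-key-line>" /path/to/.ssh/authorized_keys
add_authorized_key() {
  local key="$1" file="$2"
  mkdir -p "$(dirname "$file")"
  chmod 700 "$(dirname "$file")"   # sshd ignores keys in a group/world-writable .ssh
  touch "$file"
  chmod 600 "$file"
  # -x matches the whole line, -F treats the key as a fixed string, not a regex
  if ! grep -qxF -- "$key" "$file"; then
    printf '%s\n' "$key" >> "$file"
  fi
}
```

Run it as the account that owns the file, for example: sudo -u ansible-control-panel bash -c 'source ./keys.sh; add_authorized_key "<worker-public-key>" ~/.ssh/authorized_keys' (keys.sh being whatever file you saved the function in).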

Running Playbooks

With the environment prepared, proceed to execute the Ansible playbooks that automate the configuration of the master node.

Distributing SSH Keys

The addkeys.yml playbook distributes SSH keys across all nodes, enabling secure, automated communication between the master and worker nodes:

- hosts: master
  become: yes
  become_user: root
  vars:
    key: "<worker-public-key>"
  tasks:
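    - name: Add worker's public key to authorized_keys
      lineinfile:
        path: /home/ansible-control-panel/.ssh/authorized_keys
        line: "{{ key }}"
        state: present
        create: yes  # lineinfile fails on a missing file unless create is set
        owner: ansible-control-panel
        group: ansible-control-panel
        mode: '0600'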

This playbook:

  • Hosts Definition: Targets the master node.
  • Privilege Escalation: Uses become to execute tasks with elevated privileges.
  • SSH Key Distribution: Appends the worker node’s public SSH key to the authorized_keys file of the ansible-control-panel user on the master node.

Installing Kubernetes Dependencies

The kube-dependencies.yml playbook installs essential Kubernetes components (kubeadm, kubectl, and kubelet) on the master node. The play targets the linux_new_qa inventory group, so the master node must be a member of that group:

- hosts: linux_new_qa
  become: yes
  become_user: root
  vars:
    kubernetes_version: "1.27.*"

  tasks:
    - name: Update APT packages
      apt:
        update_cache: yes

    - name: Install necessary packages
      apt:
        name:
          - apt-transport-https
          - ca-certificates
          - curl
          - software-properties-common
          - gnupg2
        state: present

    - name: Disable SWAP and UFW
      shell: |
        sudo swapoff -a
        sudo ufw disable

    - name: Configure iptables to see bridged traffic
      shell: |
        cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
        overlay
        br_netfilter
        EOF
        sudo modprobe overlay
        sudo modprobe br_netfilter
        cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
        net.bridge.bridge-nf-call-iptables  = 1
        net.bridge.bridge-nf-call-ip6tables = 1
        net.ipv4.ip_forward                 = 1
        EOF
        sudo sysctl --system

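    - name: Add Kubernetes apt repository
      # The legacy apt.kubernetes.io repository has been shut down; packages are
      # now published on pkgs.k8s.io, with one repository per minor version.
      shell: |
        sudo mkdir -p -m 755 /etc/apt/keyrings
        curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.27/deb/Release.key | sudo gpg --dearmor --yes -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
        echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.27/deb/ /" | sudo tee /etc/apt/sources.list.d/kubernetes.list
        sudo apt update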

    - name: Install kubelet, kubeadm, and kubectl
      apt:
        name:
          - kubelet={{ kubernetes_version }}
          - kubeadm={{ kubernetes_version }}
          - kubectl={{ kubernetes_version }}
        state: present

    - name: Hold Kubernetes packages at current version
      apt:
        name:
          - kubelet
          - kubeadm
          - kubectl
        state: hold

    - name: Install containerd
      shell: |
        wget https://github.com/containerd/containerd/releases/download/v1.6.21/containerd-1.6.21-linux-amd64.tar.gz
        sudo tar Cxzvf /usr/local containerd-1.6.21-linux-amd64.tar.gz
        sudo curl -L https://raw.githubusercontent.com/containerd/containerd/main/containerd.service -o /etc/systemd/system/containerd.service
        sudo systemctl daemon-reload
        sudo systemctl enable --now containerd
        wget https://github.com/opencontainers/runc/releases/download/v1.1.7/runc.amd64
        sudo install -m 755 runc.amd64 /usr/local/sbin/runc
        sudo mkdir -p /etc/containerd
        containerd config default | sudo tee /etc/containerd/config.toml
        sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/g' /etc/containerd/config.toml
        sudo systemctl restart containerd

    - name: Pull necessary Kubernetes images
      command: kubeadm config images pull

This playbook performs several critical steps:

  • System Preparation: Updates packages and installs necessary dependencies.
  • Kernel Module Configuration: Loads kernel modules and sets system parameters required by Kubernetes.
  • Kubernetes Installation: Adds the Kubernetes APT repository and installs kubelet, kubeadm, and kubectl.
  • Container Runtime Installation: Installs containerd and configures it as the container runtime for Kubernetes.
  • Image Pre-pulling: Pulls Kubernetes control plane images to speed up the cluster initialization process.
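
To confirm the kernel parameter step took effect, a small check can be run on the node afterwards. The following is a minimal sketch; check_k8s_sysctl is a hypothetical helper that inspects the file written by the playbook (by default /etc/sysctl.d/k8s.conf) rather than the live kernel state.

```shell
#!/bin/bash
# Return 0 if all three parameters set by the playbook are present with value 1
# in the given sysctl file; otherwise print the offending keys and return 1.
check_k8s_sysctl() {
  local file="${1:-/etc/sysctl.d/k8s.conf}" missing=0 key
  for key in net.bridge.bridge-nf-call-iptables \
             net.bridge.bridge-nf-call-ip6tables \
             net.ipv4.ip_forward; do
    # Allow any amount of whitespace around "=", as in the playbook's heredoc
    if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=[[:space:]]*1[[:space:]]*$" "$file" 2>/dev/null; then
      echo "missing or not 1: $key"
      missing=1
    fi
  done
  return "$missing"
}
```

For the live values, sysctl net.ipv4.ip_forward (and the two bridge keys) shows what the kernel is currently using.
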

Initializing the Kubernetes Control Plane

The master.yml playbook sets up the Kubernetes control plane by initializing the cluster, configuring kubeconfig, and deploying the Calico network plugin:

---
- hosts: master
  become_user: kube-cluster
  become: yes
  vars:
    user: kube-cluster
  tasks:
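    - name: Initialize the cluster
      # kubeadm init needs root, so override the play-level become_user.
      # Note: 192.169.0.0/16 lies outside the RFC 1918 private ranges (Calico's
      # default pool is 192.168.0.0/16); whichever CIDR you use should match the
      # IP pool in the Calico manifest applied below.
      become_user: root
      command: kubeadm init --pod-network-cidr=192.169.0.0/16
      args:
        creates: /etc/kubernetes/admin.conf

    - name: Create .kube directory
      file:
        path: /home/{{ user }}/.kube
        state: directory
        mode: '0755'

    - name: Copy admin.conf to user's kube config
      # admin.conf is readable only by root and already on the master,
      # hence root privileges and remote_src.
      become_user: root
      copy:
        src: /etc/kubernetes/admin.conf
        dest: /home/{{ user }}/.kube/config
        remote_src: yes
        owner: "{{ user }}"
        mode: '0644'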

    - name: Deploy Calico operator
      get_url:
        url: "https://raw.githubusercontent.com/CPUtester5465/ansible-kubernetes-calico/main/tigera-operator.yaml"
        dest: "/home/{{ user }}/tigera-operator.yaml"
      become: yes

    - name: Apply Calico operator
      shell: "kubectl apply -f /home/{{ user }}/tigera-operator.yaml"
      args:
        chdir: "/home/{{ user }}/"

    - name: Deploy Calico node
      get_url:
        url: "https://raw.githubusercontent.com/CPUtester5465/ansible-kubernetes-calico/main/calico-node/calico-node.yaml"
        dest: "/home/{{ user }}/calico-node.yaml"
      become: yes

    - name: Apply Calico node
      shell: "kubectl apply -f /home/{{ user }}/calico-node.yaml"
      args:
        chdir: "/home/{{ user }}/"

    - name: Create repository directory
      become_user: ansible-control-panel
      file:
        path: /home/ansible-control-panel/repo
        state: directory

    - name: Copy kube config to repository
      become_user: ansible-control-panel
      copy:
        src: /home/{{ user }}/.kube/config
        dest: /home/ansible-control-panel/repo
        remote_src: yes  # read the source on the managed node, not on the control machine
        owner: ansible-control-panel
        mode: '0644'

    - name: Generate kubeadm join command
      command: kubeadm token create --print-join-command
      register: join_command_output

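    - name: Create directory for the join command
      file:
        path: /home/{{ user }}/kubeadmtoken
        state: directory
        mode: '0755'

    - name: Write join command to file
      copy:
        content: "{{ join_command_output.stdout }}"
        dest: /home/{{ user }}/kubeadmtoken/kube-connect
        owner: "{{ user }}"
        mode: '0755'

    - name: Copy kubeadm token to repository
      become_user: ansible-control-panel
      copy:
        src: /home/{{ user }}/kubeadmtoken/kube-connect
        dest: /home/ansible-control-panel/repo
        remote_src: yes  # read the source on the managed node, not on the control machine
        owner: ansible-control-panel
        mode: '0755'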

Key actions performed by this playbook include:

  • Cluster Initialization: Uses kubeadm init to set up the Kubernetes control plane.
  • Configuration Setup: Copies the Kubernetes admin configuration to the user’s home directory for easy access.
  • Network Plugin Deployment: Deploys Calico using the operator and node manifests.
  • Token Generation: Creates a kubeadm join token for worker nodes and stores it for distribution.

Setting Up the Worker Nodes

Environment Preparation

Worker nodes need to be set up by installing the necessary dependencies, which is facilitated using the worker.sh script:

#!/bin/bash
sudo apt update
sudo apt install software-properties-common --yes
sudo apt-add-repository --yes --update ppa:ansible/ansible
sudo apt update
sudo apt install python3 ansible -y
sudo apt-get install acl -y

sudo useradd -m -s /bin/bash ansible
sudo passwd ansible -d
sudo mkdir /home/ansible/.ssh
sudo chown ansible:ansible /home/ansible/.ssh
sudo -u ansible ssh-keygen -f /home/ansible/.ssh/id_rsa -t rsa -N ''
sudo useradd -m -s /bin/bash kube-cluster
sudo passwd kube-cluster -d
sudo sed -i '20a kube-cluster  ALL=(ALL:ALL) NOPASSWD: ALL' /etc/sudoers
sudo sed -i '20a ansible  ALL=(ALL:ALL) NOPASSWD: ALL' /etc/sudoers
sudo touch /home/ansible/.ssh/authorized_keys
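# "sudo echo ... >> file" would perform the redirection as the calling user, so use tee
echo "<master-public-key>" | sudo tee -a /home/ansible/.ssh/authorized_keys > /dev/null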
sudo chown ansible:ansible /home/ansible/.ssh/authorized_keys
sudo chmod 600 /home/ansible/.ssh/authorized_keys
sudo cat /home/ansible/.ssh/id_rsa.pub

This script:

  • System Update and Package Installation: Updates the system and installs Python and Ansible.
  • User and SSH Setup: Creates the ansible user, sets up SSH keys, and prepares for secure communication.
  • Privilege Configuration: Grants passwordless sudo access to necessary users.

Connecting to the Master Node

Execute the worker.yml playbook to join the worker nodes to the Kubernetes cluster:

---
- hosts: linux1, linux2
  vars:
    user: kube-cluster
    ipmr: "<master-node-ip>"

  tasks:
    - name: Download kubeadm join command
      become: yes  # become_user has no effect unless become is enabled
      become_user: ansible
      shell: |
        scp -o StrictHostKeyChecking=no ansible-control-panel@{{ ipmr }}:/home/ansible-control-panel/repo/kube-connect /home/ansible/

    - name: Copy kube-connect file
      become: yes
      become_user: "{{ user }}"
      copy:
        src: /home/ansible/kube-connect
        dest: /home/{{ user }}/
        mode: '0755'
        remote_src: yes

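    - name: Join the cluster
      become: yes
      become_user: "{{ user }}"
      # kube-connect holds a bare command without a shebang, so run it through sh
      shell: |
        sudo sh /home/{{ user }}/kube-connect
      args:
        creates: /etc/kubernetes/kubelet.conf  # skip if the node has already joined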

This playbook:

  • File Transfer: Copies the kubeadm join command from the master to the worker node.
  • Cluster Joining: Executes the join command to add the worker node to the cluster.
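
Because the worker executes whatever text the kube-connect file contains, with root privileges, it can be worth checking that the file really holds a join command first. The sketch below uses a hypothetical helper, is_valid_join_command; the pattern assumes the usual single-line output of kubeadm token create --print-join-command and the documented bootstrap token format (six characters, a dot, sixteen characters).

```shell
#!/bin/bash
# Return 0 if the text looks like:
#   kubeadm join <host>:<port> --token <6 chars>.<16 chars> \
#       --discovery-token-ca-cert-hash sha256:<64 hex digits>
is_valid_join_command() {
  local pattern='^kubeadm join [^ ]+:[0-9]+ --token [a-z0-9]{6}\.[a-z0-9]{16} +--discovery-token-ca-cert-hash sha256:[0-9a-f]{64} *$'
  printf '%s\n' "$1" | grep -Eq "$pattern"
}
```

A possible use on the worker, assuming the file locations from the playbook: is_valid_join_command "$(cat /home/kube-cluster/kube-connect)" && sudo sh /home/kube-cluster/kube-connect
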

Configuring Windows Server Nodes

Dependency Installation and Configuration

Windows nodes require additional configuration steps to be compatible with Kubernetes. Use the w1.yml playbook to install necessary Windows features such as Containers and Hyper-V to support containerization on Windows:

---
- name: Install dependencies and configure Kubernetes on Windows Server
  hosts: windows_new_qa
  become_method: runas
  gather_facts: false
  vars:
    ipmr: "<master-node-ip>"

  tasks:
    - name: Set firewall and install Containers feature
      win_shell: |
        Set-NetFirewallProfile -Profile Domain, Public, Private -Enabled false
        Install-WindowsFeature Containers

    - name: Format IP Address for Machine Name
      win_shell: |
        $ipAddress = (Get-NetIPAddress | Where-Object {$_.InterfaceAlias -like "*Ethernet*" -and $_.AddressFamily -eq "IPv4"}).IPAddress
        $formattedIpAddress = "ip-" + $ipAddress.Replace('.', '-')
        Rename-Computer -NewName "$formattedIpAddress" -Force -PassThru

    - name: Reboot the machine
      win_reboot:
        reboot_timeout: 300
        msg: "Rebooting the machine"

    - name: Install containerd
      win_shell: |
        Invoke-WebRequest https://docs.tigera.io/calico/3.25/scripts/Install-Containerd.ps1 -OutFile c:\Install-Containerd.ps1
        [System.Environment]::SetEnvironmentVariable('CNI_BIN_DIR', 'c:\Program Files\containerd\cni\bin')
        [System.Environment]::SetEnvironmentVariable('CNI_CONF_DIR', 'c:\Program Files\containerd\cni\conf')
        c:\Install-Containerd.ps1 -ContainerDVersion 1.6.16 -CNIConfigPath "c:/Program Files/containerd/cni/conf" -CNIBinPath "c:/Program Files/containerd/cni/bin"
        curl.exe -LO https://github.com/kubernetes-sigs/cri-tools/releases/download/v1.26.0/crictl-v1.26.0-windows-amd64.tar.gz
        tar xvf crictl-v1.26.0-windows-amd64.tar.gz
        Move-Item -Path C:\Users\ansible\crictl.exe -Destination "C:\Program Files\containerd\crictl.exe"

    - name: Create necessary directories and download configurations
      win_file:
        path: C:\k
        state: directory

    - name: Copy kubeconfig from master
      win_shell: |
        scp -o StrictHostKeyChecking=no ansible-control-panel@{{ ipmr }}:/home/ansible-control-panel/repo/config C:\Users\ansible\repo
        Move-Item -Path C:\Users\ansible\repo\config -Destination C:\k

    - name: Download and install Calico
      win_shell: |
        curl.exe -LO https://github.com/projectcalico/calico/releases/download/v3.25.0/calicoctl-windows-amd64.exe
        Move-Item -Path calicoctl-windows-amd64.exe -Destination C:\Windows\calicoctl.exe
        Invoke-WebRequest https://github.com/projectcalico/calico/releases/download/v3.25.0/install-calico-windows.ps1 -OutFile c:\install-calico-windows.ps1
        Start-Transcript -Path 'C:\Users\ansible\Desktop\install_calico.log'
        c:\install-calico-windows.ps1 -KubeVersion 1.24.10 -Datastore kubernetes -ServiceCidr 10.96.0.0/16 -DNSServerIPs 10.96.0.10 -CalicoBackend vxlan

This playbook performs the following:

  • Firewall Configuration: Disables firewall profiles to ensure seamless communication.
  • Feature Installation: Installs necessary Windows features for containerization.
  • Machine Renaming: Renames the machine based on its IP address for consistency.
  • Container Runtime Setup: Installs containerd and configures environment variables.
  • Calico Deployment: Downloads and installs Calico for networking.
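
The renaming rule is simple enough to reproduce on the control machine, which helps when you need to predict a Windows node's name from its IP address. The following is a minimal sketch with two hypothetical helpers; it mirrors the "ip-" prefix and dot-to-dash replacement used in the playbook, and flags names longer than 15 characters, the NetBIOS computer name limit.

```shell
#!/bin/bash
# Mirror the playbook's renaming rule: 10.0.0.5 -> ip-10-0-0-5
format_node_name() {
  printf 'ip-%s\n' "$(printf '%s' "$1" | tr '.' '-')"
}

# NetBIOS computer names are limited to 15 characters; return non-zero
# for names that exceed that length.
fits_netbios_limit() {
  [ "${#1}" -le 15 ]
}
```

For example, assuming the kubelet registers the node under its host name, kubectl get node "$(format_node_name 10.0.0.5)" looks up the corresponding node from the master.
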

Installing Kubernetes Services

To start Kubernetes services like kubelet and kube-proxy on Windows nodes, use the following tasks in the w1.yml playbook:

---
- name: Install and run kube services on Windows Server
  hosts: windows_new_qa
  gather_facts: false
  vars:
    ipmr: "<master-node-ip>"  # referenced by the scp task below

  tasks:
    - name: Delete existing kubelet-service.ps1
      win_file:
        path: C:\CalicoWindows\kubernetes\kubelet-service.ps1
        state: absent

    - name: Download kubelet-service script
      win_shell: |
        scp -o StrictHostKeyChecking=no ansible-control-panel@{{ ipmr }}:/home/ansible-control-panel/repo/kubelet-service.ps1 C:\CalicoWindows\kubernetes

    - name: Install kubelet and kube-proxy
      win_shell: |
        C:\CalicoWindows\kubernetes\install-kube-services.ps1
        Start-Service -Name kubelet
        Start-Service -Name kube-proxy

These tasks:

  • Script Management: Ensures the latest kubelet-service.ps1 script is used.
  • Service Installation: Installs and starts the kubelet and kube-proxy services.
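
Deploying Calico on Windows Nodes

The Calico installation in the w1.yml playbook already selects VXLAN as the networking backend (-CalicoBackend vxlan). Before Ansible can run that playbook, each Windows node has to be bootstrapped for remote management, which is what the win-init.ps1 script does:
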
$Username = "ansible"
$Password = "<ansible-user-password>"  # must not be empty: ConvertTo-SecureString rejects an empty string

$SecurePassword = ConvertTo-SecureString $Password -AsPlainText -Force
$Credential = New-Object System.Management.Automation.PSCredential ($Username, $SecurePassword)

Start-Transcript -Path "C:\Users\$Username\Desktop\init_ps.log"
Invoke-WebRequest -Uri 'https://raw.githubusercontent.com/ansible/ansible/stable-2.10/examples/scripts/ConfigureRemotingForAnsible.ps1' -OutFile "C:\Users\$Username\Desktop\ConfigureRemotingForAnsible.ps1"
Start-Process PowerShell -Credential $Credential -ArgumentList "-NoProfile -ExecutionPolicy Bypass -File C:\Users\$Username\Desktop\ConfigureRemotingForAnsible.ps1"
$scriptBlock = {
    Set-ExecutionPolicy Bypass -Scope Process -Force
    iwr https://community.chocolatey.org/install.ps1 -UseBasicParsing | iex
    choco install git -yes
    Import-Module $env:ChocolateyInstall\helpers\chocolateyProfile.psm1
    RefreshEnv
}

Start-Process PowerShell -Credential $Credential -ArgumentList "-NoProfile -ExecutionPolicy Bypass -Command $scriptBlock"
mkdir C:\Users\$Username\.ssh

ssh-keygen -f C:\Users\$Username\.ssh\id_rsa -t rsa -N ''
mkdir C:\Users\$Username\repo

This script:

  • PowerShell Remoting Configuration: Sets up the system for remote management via Ansible.
  • Package Manager Installation: Installs Chocolatey for package management.
  • Git Installation: Installs Git, which may be necessary for pulling configurations.
  • SSH Key Generation: Generates SSH keys for secure communication.
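
Once remoting is configured, the control machine needs an inventory entry that tells Ansible to use WinRM for these hosts. The sketch below generates such a line; windows_inventory_line is a hypothetical helper, and the values assume the HTTPS listener on port 5986 with a self-signed certificate and basic authentication that ConfigureRemotingForAnsible.ps1 sets up by default. Adjust them if your setup differs.

```shell
#!/bin/bash
# Print an INI inventory line for a Windows host reached over WinRM/HTTPS.
# Certificate validation is switched off because the listener certificate
# is assumed to be self-signed.
windows_inventory_line() {
  local name="$1" address="$2" user="${3:-ansible}"
  printf '%s ansible_host=%s ansible_user=%s ansible_connection=winrm ansible_port=5986 ansible_winrm_transport=basic ansible_winrm_server_cert_validation=ignore\n' \
    "$name" "$address" "$user"
}
```

A possible use: { echo '[windows_new_qa]'; windows_inventory_line win1 10.0.0.20; } >> inventory.ini, followed by ansible windows_new_qa -i inventory.ini -m win_ping --ask-pass to confirm connectivity. The host name win1 and the address are placeholders.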

Challenges and Bottlenecks

  • Token Expiry:

    The join token used for adding worker nodes to the cluster has a limited validity period (24 hours by default). If the token expires, generate a new one on the master node with kubeadm token create --print-join-command.

  • Network Configuration:

    Worker nodes must be able to reach the master node. Verify IP reachability, ensure firewall rules are configured correctly, and check DNS settings to mitigate potential issues.

  • Service Start Failure:

    If kubelet or kube-proxy services fail to start, ensure configuration files are properly set up and network paths are accurate.
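
For the token expiry issue, one rough safeguard is to check how old the saved join command is before a worker uses it. The following is a minimal sketch; join_file_is_fresh is a hypothetical helper that uses the file's modification time as a stand-in for the token's age, assuming the default 24-hour lifetime and that the file was written when the token was created.

```shell
#!/bin/bash
# Return 0 if the saved join command is younger than the token lifetime.
# Usage: join_file_is_fresh /path/to/kube-connect [max-age-in-hours]
join_file_is_fresh() {
  local file="$1" max_age_hours="${2:-24}"
  [ -f "$file" ] || return 1
  # find prints the path only if it was modified within the last N minutes
  [ -n "$(find "$file" -mmin "-$((max_age_hours * 60))")" ]
}
```

On the master this could guard a refresh, for example: join_file_is_fresh /home/ansible-control-panel/repo/kube-connect || echo "join command is stale, rerun master.yml". For an authoritative answer, kubeadm token list shows the actual expiry of each token.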

Note: All the code snippets and scripts provided in this article are integral parts of the deployment process. Be sure to customize variables such as IP addresses and user credentials for your environment before running them. This article is still a draft, so expect changes.