Kubeinit / kubeinit

Ansible automation to have a KUBErnetes cluster INITialized as soon as possible...

Home Page: https://www.kubeinit.org

Workloads not working in k8s

ccamacho opened this issue · comments

Describe the bug
Workload Pods are not running

To Reproduce
Steps to reproduce the behavior:

  1. Deploy k8s
  2. Run a sample app
  3. The workloads are not allocated

Expected behavior
Workloads running

Additional context
1- From the controller, run kubectl get nodes; when the cluster has no workers, scheduling fails with this issue:

Events:
  Type     Reason                  Age                   From               Message
  ----     ------                  ----                  ----               -------
  Warning  FailedScheduling        13m                   default-scheduler  0/3 nodes are available: 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.

We need to run the following commands to remove the master taints, because there are no workers in this CI cluster:

kubectl taint node controller-01.k8scluster.kubeinit.local node-role.kubernetes.io/master:NoSchedule-
kubectl taint node controller-02.k8scluster.kubeinit.local node-role.kubernetes.io/master:NoSchedule-
kubectl taint node controller-03.k8scluster.kubeinit.local node-role.kubernetes.io/master:NoSchedule-
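The three commands above can also be generated in a loop; a minimal dry-run sketch (the echo only prints each command, drop it to actually apply the change):

```shell
#!/bin/sh
# Dry-run sketch: emit the taint-removal command for each controller.
# Pipe the output to "sh" (or remove the echo) to actually apply it.
untaint_cmds() {
  for n in controller-01 controller-02 controller-03; do
    echo "kubectl taint node ${n}.k8scluster.kubeinit.local node-role.kubernetes.io/master:NoSchedule-"
  done
}
untaint_cmds
```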

2- After deploying a simple app:

[root@controller-01 ~]# kubectl get pods -l app=nginx
NAME                                READY   STATUS              RESTARTS   AGE
nginx-deployment-66b6c48dd5-2g7cj   0/1     ContainerCreating   0          13m
nginx-deployment-66b6c48dd5-46ntm   0/1     ContainerCreating   0          13m
nginx-deployment-66b6c48dd5-47k9r   0/1     ContainerCreating   0          13m
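For reference, the pod names and the app=nginx label above correspond to a Deployment called nginx-deployment with three replicas; a minimal manifest along those lines (the image tag is an assumption), matching the stock nginx example from the Kubernetes docs:

```yaml
# Minimal sketch of the sample app used above; matches the pod names
# (nginx-deployment-*) and the "app=nginx" label from the kubectl output.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2   # image/tag is an assumption
        ports:
        - containerPort: 80
```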

The pods hang in ContainerCreating because of:

Warning  FailedCreatePodSandBox  55s               kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_nginx-deployment-66b6c48dd5-2g7cj_default_4b860f60-a0ce-4a8a-a160-6651f7416f8c_0(908b72da2ea2e45cbcd48fa082220b65ed92284ebcacd8f79c420f5a2135cebb): error adding pod default_nginx-deployment-66b6c48dd5-2g7cj to CNI network "cbr0": failed to delegate add: failed to set bridge addr: "cni0" already has an IP address different from 10.244.0.1/24
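A common remediation for this kind of mismatch (a sketch, not confirmed as KubeInit's actual fix) is to delete the stale cni0 bridge so the CNI plugin recreates it with the address from the current pod CIDR; to be run as root on each affected node:

```shell
#!/bin/sh
# Sketch only: remove a stale cni0 bridge left over from a previous CNI
# configuration; the plugin recreates it with the current pod CIDR on the
# next pod sandbox. Guarded so it is a no-op when cni0 does not exist.
if ip link show cni0 >/dev/null 2>&1; then
  ip link set cni0 down
  ip link delete cni0
  systemctl restart kubelet   # forces the failing sandboxes to be recreated
fi
```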

After running kubectl edit nodes controller-01.k8scluster.kubeinit.local, the cluster shows:

  labels:
    beta.kubernetes.io/arch: amd64
    beta.kubernetes.io/os: linux
    kubernetes.io/arch: amd64
    kubernetes.io/hostname: controller-01.k8scluster.kubeinit.local
    kubernetes.io/os: linux
    node-role.kubernetes.io/control-plane: ""
    node-role.kubernetes.io/master: ""
    node.kubernetes.io/exclude-from-external-load-balancers: ""
  name: controller-01.k8scluster.kubeinit.local
  resourceVersion: "6786"
  uid: b6933b3c-1f7f-453a-9793-61aff36efdbc
spec:
  podCIDR: 10.244.0.0/24
  podCIDRs:
  - 10.244.0.0/24
status:
  addresses:
  - address: 10.0.0.1
    type: InternalIP
  - address: controller-01.k8scluster.kubeinit.local
    type: Hostname
  allocatable:

Here the podCIDR 10.244.0.0/24 does not match https://github.com/Kubeinit/kubeinit/blob/main/kubeinit/roles/kubeinit_k8s/defaults/main.yml#L26

3- On the first controller, cni0 should (IIRC) have an address within 10.244.0.0/16, but it does not:

ip a
3: cni0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether d2:2c:da:fe:4b:77 brd ff:ff:ff:ff:ff:ff
    inet 10.85.0.1/16 brd 10.85.255.255 scope global cni0
       valid_lft forever preferred_lft forever
    inet6 1100:200::1/24 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::d02c:daff:fefe:4b77/64 scope link 
       valid_lft forever preferred_lft forever
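The check in point 3 can be scripted; a small POSIX-shell sketch (function names are made up) that tests whether an address such as cni0's falls inside the expected 10.244.0.0/16 pod CIDR:

```shell
#!/bin/sh
# Sketch: check whether an IPv4 address falls inside a CIDR block,
# e.g. whether cni0's address is within the expected pod CIDR.

ip_to_int() {
  # Convert a dotted-quad IPv4 address to a 32-bit integer.
  IFS=. read -r a b c d <<EOF
$1
EOF
  echo $(( (a << 24) + (b << 16) + (c << 8) + d ))
}

in_cidr() {
  # in_cidr ADDR NET/PREFIX -> exit 0 if ADDR is inside the block.
  addr=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  prefix=${2#*/}
  mask=$(( (0xffffffff << (32 - prefix)) & 0xffffffff ))
  [ $(( addr & mask )) -eq $(( net & mask )) ]
}

in_cidr 10.244.0.1 10.244.0.0/16 && echo "10.244.0.1 is inside 10.244.0.0/16"
in_cidr 10.85.0.1  10.244.0.0/16 || echo "10.85.0.1 is outside 10.244.0.0/16"
```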

We should create a small role to deploy a simple app and make sure the workloads are able to run.
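As a sketch of that role (all names here are hypothetical, this is not an existing KubeInit role), the tasks could boil down to deploying a throwaway workload and waiting for it to roll out:

```yaml
# Hypothetical tasks for a workload-verification role; role and
# resource names are illustrative only.
- name: Deploy a sample workload
  ansible.builtin.command: >
    kubectl create deployment kubeinit-smoke-test
    --image=nginx --replicas=1

- name: Wait for the workload to be scheduled and running
  ansible.builtin.command: >
    kubectl rollout status deployment/kubeinit-smoke-test --timeout=120s

- name: Clean up the sample workload
  ansible.builtin.command: kubectl delete deployment kubeinit-smoke-test
```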