Workloads not working in k8s
ccamacho opened this issue
Describe the bug
Workload Pods are not running
To Reproduce
Steps to reproduce the behavior:
- Deploy k8s
- Run a sample app
- The workloads are not allocated
Expected behavior
Workloads running
Additional context
1- From the controller, run kubectl get nodes and check whether there are any workers; if there are none, we hit this issue:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 13m default-scheduler 0/3 nodes are available: 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
We need to run the following command because there are no workers in this CI cluster:
kubectl taint node controller-01.k8scluster.kubeinit.local node-role.kubernetes.io/master:NoSchedule-
kubectl taint node controller-02.k8scluster.kubeinit.local node-role.kubernetes.io/master:NoSchedule-
kubectl taint node controller-03.k8scluster.kubeinit.local node-role.kubernetes.io/master:NoSchedule-
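To confirm the taints were actually removed after the commands above, something like this can be run (node names as used in this CI cluster):

```shell
# Check the Taints field on one control-plane node
kubectl describe node controller-01.k8scluster.kubeinit.local | grep -A2 Taints

# Or list the taints of every node at once
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.taints}{"\n"}{end}'
```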
2- After deploying a simple app:
[root@controller-01 ~]# kubectl get pods -l app=nginx
NAME READY STATUS RESTARTS AGE
nginx-deployment-66b6c48dd5-2g7cj 0/1 ContainerCreating 0 13m
nginx-deployment-66b6c48dd5-46ntm 0/1 ContainerCreating 0 13m
nginx-deployment-66b6c48dd5-47k9r 0/1 ContainerCreating 0 13m
Pods hang because of:
Warning FailedCreatePodSandBox 55s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_nginx-deployment-66b6c48dd5-2g7cj_default_4b860f60-a0ce-4a8a-a160-6651f7416f8c_0(908b72da2ea2e45cbcd48fa082220b65ed92284ebcacd8f79c420f5a2135cebb): error adding pod default_nginx-deployment-66b6c48dd5-2g7cj to CNI network "cbr0": failed to delegate add: failed to set bridge addr: "cni0" already has an IP address different from 10.244.0.1/24
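A commonly suggested workaround for the "cni0 already has an IP address different from ..." error is to delete the stale bridge so the CNI plugin recreates it with the expected subnet. This is a hedged sketch of that workaround, not necessarily the fix applied here:

```shell
# Assumption: cni0 is a stale bridge left over from a previous CNI configuration.
# Delete it so the CNI plugin can recreate it with the correct address.
ip link set cni0 down
ip link delete cni0

# Restart kubelet so the pod sandboxes are recreated on the fresh bridge
systemctl restart kubelet
```

The stuck pods may also need to be deleted so the scheduler recreates them.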
After running kubectl edit node controller-01.k8scluster.kubeinit.local, the node shows:
  labels:
    beta.kubernetes.io/arch: amd64
    beta.kubernetes.io/os: linux
    kubernetes.io/arch: amd64
    kubernetes.io/hostname: controller-01.k8scluster.kubeinit.local
    kubernetes.io/os: linux
    node-role.kubernetes.io/control-plane: ""
    node-role.kubernetes.io/master: ""
    node.kubernetes.io/exclude-from-external-load-balancers: ""
  name: controller-01.k8scluster.kubeinit.local
  resourceVersion: "6786"
  uid: b6933b3c-1f7f-453a-9793-61aff36efdbc
spec:
  podCIDR: 10.244.0.0/24
  podCIDRs:
  - 10.244.0.0/24
status:
  addresses:
  - address: 10.0.0.1
    type: InternalIP
  - address: controller-01.k8scluster.kubeinit.local
    type: Hostname
  allocatable:
Here 10.244.0.0/24 does not match https://github.com/Kubeinit/kubeinit/blob/main/kubeinit/roles/kubeinit_k8s/defaults/main.yml#L26
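To compare the per-node podCIDR against what the CNI expects, assuming flannel is deployed with its standard kube-flannel-cfg ConfigMap in kube-system (the ConfigMap name is an assumption), something like:

```shell
# Per-node pod CIDRs allocated by the controller-manager
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.podCIDR}{"\n"}{end}'

# Cluster network flannel was configured with (ConfigMap name assumed)
kubectl get configmap kube-flannel-cfg -n kube-system -o jsonpath='{.data.net-conf\.json}'
```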
3- On the first controller, cni0 should match 10.244.0.0/16 IIRC:
ip a
3: cni0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether d2:2c:da:fe:4b:77 brd ff:ff:ff:ff:ff:ff
inet 10.85.0.1/16 brd 10.85.255.255 scope global cni0
valid_lft forever preferred_lft forever
inet6 1100:200::1/24 scope global
valid_lft forever preferred_lft forever
inet6 fe80::d02c:daff:fefe:4b77/64 scope link
valid_lft forever preferred_lft forever
We should create a small role to deploy a simple app and make sure the workloads are able to run.
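A minimal sketch of what such a smoke-test role could run (the deployment name, replica count, and timeout are placeholders):

```shell
# Deploy a trivial app, wait for it to become ready, then clean up.
kubectl create deployment smoke-test --image=nginx --replicas=3
kubectl rollout status deployment/smoke-test --timeout=120s
kubectl get pods -l app=smoke-test -o wide
kubectl delete deployment smoke-test
```

If the rollout times out, the role would fail the CI job, catching scheduling or CNI problems like the ones above.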
Fixed by: #548