Failed to start ContainerManager
zar3bski opened this issue
All my pods end up stuck in Pending:
kubectl get -n kube-system pods
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-847c8c99d-t4kqm 0/1 Pending 0 164m
calico-node-rmc8z 0/1 Pending 0 164m
coredns-86f78bb79c-gr5bf 0/1 Pending 0 138m
hostpath-provisioner-5c65fbdb4f-bb698 0/1 Pending 0 137m
metrics-server-8bbfb4bdb-vnlcl 0/1 Pending 0 136m
kubernetes-dashboard-7ffd448895-69nf7 0/1 Pending 0 136m
dashboard-metrics-scraper-6c4568dc68-xxh2c 0/1 Pending 0 136m
It looks like this is because the default node is not available:
kubectl get events --all-namespaces | grep -i kubernetes-dashboard-7ffd448895-69nf7
kube-system 10s Warning FailedScheduling pod/kubernetes-dashboard-7ffd448895-69nf7 0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
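The taint named in that event can be confirmed directly on the node (the node is called bifrost further down) with something like:

kubectl describe node bifrost | grep -i taints
# or the raw spec:
kubectl get node bifrost -o jsonpath='{.spec.taints}'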
The node seems to be stuck in a weird loop:
kubectl describe nodes
.....
Normal Starting 37s kubelet, bifrost Starting kubelet.
Warning InvalidDiskCapacity 37s kubelet, bifrost invalid capacity 0 on image filesystem
Normal NodeHasSufficientMemory 37s kubelet, bifrost Node bifrost status is now: NodeHasSufficientMemory
Normal Starting 30s kubelet, bifrost Starting kubelet.
Warning InvalidDiskCapacity 30s kubelet, bifrost invalid capacity 0 on image filesystem
Normal NodeHasSufficientPID 30s kubelet, bifrost Node bifrost status is now: NodeHasSufficientPID
Normal NodeHasNoDiskPressure 30s kubelet, bifrost Node bifrost status is now: NodeHasNoDiskPressure
Normal NodeHasSufficientMemory 30s kubelet, bifrost Node bifrost status is now: NodeHasSufficientMemory
Normal Starting 24s kubelet, bifrost Starting kubelet.
.....
I wonder whether it has something to do with what I found in journalctl:
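For reference, on a MicroK8s install the kubelet logs below can be followed via the snap daemon's systemd unit (unit name assumed from snap's usual snap.<name>.<app> convention):

journalctl -f -u snap.microk8s.daemon-kubelet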
microk8s.daemon-kubelet[1001446]: E0922 12:22:28.089018 1001446 kubelet.go:1765] skipping pod synchronization - container runtime status check may not have completed yet
microk8s.daemon-kubelet[1001446]: I0922 12:22:28.091302 1001446 kubelet_node_status.go:70] Attempting to register node bifrost
microk8s.daemon-kubelet[1001446]: I0922 12:22:28.102758 1001446 kubelet_node_status.go:108] Node bifrost was previously registered
microk8s.daemon-kubelet[1001446]: I0922 12:22:28.102861 1001446 kubelet_node_status.go:73] Successfully registered node bifrost
microk8s.daemon-kubelet[1001446]: I0922 12:22:28.103699 1001446 cpu_manager.go:184] [cpumanager] starting with none policy
microk8s.daemon-kubelet[1001446]: I0922 12:22:28.103734 1001446 cpu_manager.go:185] [cpumanager] reconciling every 10s
microk8s.daemon-kubelet[1001446]: I0922 12:22:28.103766 1001446 state_mem.go:36] [cpumanager] initializing new in-memory state store
microk8s.daemon-kubelet[1001446]: I0922 12:22:28.103994 1001446 state_mem.go:88] [cpumanager] updated default cpuset: ""
microk8s.daemon-kubelet[1001446]: I0922 12:22:28.104003 1001446 state_mem.go:96] [cpumanager] updated cpuset assignments: "map[]"
microk8s.daemon-kubelet[1001446]: I0922 12:22:28.104012 1001446 policy_none.go:43] [cpumanager] none policy: Start
microk8s.daemon-kubelet[1001446]: F0922 12:22:28.104037 1001446 kubelet.go:1296] Failed to start ContainerManager failed to get rootfs info: failed to get device for dir "/var/snap/microk8s/common/var/lib/kubelet": could not find device with major: 0, minor: 28 in cached partitions map
microk8s.daemon-kubelet[1001446]: goroutine 337 [running]:
microk8s.daemon-kubelet[1001446]: k8s.io/kubernetes/vendor/k8s.io/klog/v2.stacks(0xc000010001, 0xc00060d800, 0xfd, 0xfd)
microk8s.daemon-kubelet[1001446]: /build/microk8s/parts/k8s-binaries/build/go/src/github.com/kubernetes/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/klog/v2/klog.go:996 +0xb9
microk8s.daemon-kubelet[1001446]: k8s.io/kubernetes/vendor/k8s.io/klog/v2.(*loggingT).output(0x6ef8140, 0xc000000003, 0x0, 0x0, 0xc0005c0070, 0x6b48993, 0xa, 0x510, 0x0)
.....
How could I fix this? (If that's indeed the reason it fails.)
@zar3bski This looks like the same problem I filed an issue about a couple of days ago. At least in my case, it seems to be because I am using Btrfs.
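A quick way to confirm whether the kubelet directory actually sits on Btrfs (findmnt is part of util-linux on Ubuntu):

findmnt -T /var/snap/microk8s/common/var/lib/kubelet -o TARGET,SOURCE,FSTYPE
# or just the filesystem type:
stat -f -c %T /var/snap/microk8s/common/var/lib/kubelet

Btrfs mounts get anonymous device numbers (major 0), which would line up with the "could not find device with major: 0, minor: 28" failure above.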
@zar3bski But were you using Microk8s 1.19 on Ubuntu 20.04, or an earlier Microk8s? Microk8s 1.19 refuses to work for me, but 1.18 works fine.
Snap probably upgraded my cluster at some point. I'll give it a shot.
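To see which version snap is tracking, and to pin back to a 1.18 channel if that turns out to be the problem, something along these lines should work:

snap list microk8s                                # shows the installed revision and channel
sudo snap refresh microk8s --channel=1.18/stable  # reverts/pins to the 1.18 track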
Thanks @ktsakalozos, that's probably it. Sorry for the noob question, but I am not quite used to snap yet: where should I put the kubelet conf file suggested by #80633 so it is taken into account? Would it be /snap/microk8s/${the_actual_version}/etc?
@zar3bski It looks like this is a feature gate [1] setting. You will need to configure the kubelet with the respective feature gate [2] (--feature-gates=...). The kubelet arguments are placed in /var/snap/microk8s/current/args/kubelet, and after editing that file you will need to restart MicroK8s with microk8s.stop; microk8s.start.
[1] https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/
[2] https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/
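Put together, a minimal sketch of the fix as described above (the echo/tee line is just one way to append to the args file):

echo '--feature-gates=LocalStorageCapacityIsolation=false' | sudo tee -a /var/snap/microk8s/current/args/kubelet
microk8s.stop
microk8s.start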
Adding --feature-gates="LocalStorageCapacityIsolation=false" to /var/snap/microk8s/current/args/kubelet solved the issue. Many thanks @ktsakalozos!
What about you, @WereCatf?
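After the restart, the node should shed the not-ready taint; a quick check:

microk8s.kubectl get nodes   # STATUS should now read Ready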
@ktsakalozos @zar3bski Yes, adding that feature gate seems to work around the issue, and microk8s seems to be running now. It's a rather ugly workaround, but it's better than nothing.
👋 Kubernetes feature gates are only available until the feature goes GA (or is removed); after that, the feature gate is removed and the feature is simply on (or gone). LocalStorageCapacityIsolation is going GA in Kubernetes v1.25.0, but I've requested a kubelet option to enable use cases like this, and there will be a kubelet option, localStorageCapacityIsolation, you can set instead (no idea how to do that with microk8s, sorry).
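For readers on v1.25+ where the gate no longer exists: the replacement is the localStorageCapacityIsolation field of the kubelet's KubeletConfiguration file. A sketch, with the file path and MicroK8s wiring being assumptions (the thread doesn't confirm how MicroK8s loads a kubelet config):

# hypothetical location; the kubelet must be started with --config pointing at this file
cat <<'EOF' | sudo tee /var/snap/microk8s/current/args/kubelet-config.yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
localStorageCapacityIsolation: false
EOF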