Failed to start ContainerManager
zar3bski opened this issue
All my pods end up stuck in Pending:
kubectl get -n kube-system pods
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-847c8c99d-t4kqm 0/1 Pending 0 164m
calico-node-rmc8z 0/1 Pending 0 164m
coredns-86f78bb79c-gr5bf 0/1 Pending 0 138m
hostpath-provisioner-5c65fbdb4f-bb698 0/1 Pending 0 137m
metrics-server-8bbfb4bdb-vnlcl 0/1 Pending 0 136m
kubernetes-dashboard-7ffd448895-69nf7 0/1 Pending 0 136m
dashboard-metrics-scraper-6c4568dc68-xxh2c 0/1 Pending 0 136m
It looks like this is because the default node is not available:
kubectl get events --all-namespaces | grep -i kubernetes-dashboard-7ffd448895-69nf7
kube-system 10s Warning FailedScheduling pod/kubernetes-dashboard-7ffd448895-69nf7 0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
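The taint named in that event can be confirmed directly on the node (the node is called bifrost further down) with something like:

kubectl describe node bifrost | grep -i taints
# or the raw spec:
kubectl get node bifrost -o jsonpath='{.spec.taints}'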
The node seems to be stuck in a weird loop:
kubectl describe nodes
.....
Normal Starting 37s kubelet, bifrost Starting kubelet.
Warning InvalidDiskCapacity 37s kubelet, bifrost invalid capacity 0 on image filesystem
Normal NodeHasSufficientMemory 37s kubelet, bifrost Node bifrost status is now: NodeHasSufficientMemory
Normal Starting 30s kubelet, bifrost Starting kubelet.
Warning InvalidDiskCapacity 30s kubelet, bifrost invalid capacity 0 on image filesystem
Normal NodeHasSufficientPID 30s kubelet, bifrost Node bifrost status is now: NodeHasSufficientPID
Normal NodeHasNoDiskPressure 30s kubelet, bifrost Node bifrost status is now: NodeHasNoDiskPressure
Normal NodeHasSufficientMemory 30s kubelet, bifrost Node bifrost status is now: NodeHasSufficientMemory
Normal Starting 24s kubelet, bifrost Starting kubelet.
.....
I wonder whether it has something to do with what I found in journalctl:
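For reference, on a MicroK8s install the kubelet logs below can be followed via the snap daemon's systemd unit (unit name assumed from snap's usual snap.<name>.<app> convention):

journalctl -f -u snap.microk8s.daemon-kubelet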
microk8s.daemon-kubelet[1001446]: E0922 12:22:28.089018 1001446 kubelet.go:1765] skipping pod synchronization - container runtime status check may not have completed yet
microk8s.daemon-kubelet[1001446]: I0922 12:22:28.091302 1001446 kubelet_node_status.go:70] Attempting to register node bifrost
microk8s.daemon-kubelet[1001446]: I0922 12:22:28.102758 1001446 kubelet_node_status.go:108] Node bifrost was previously registered
microk8s.daemon-kubelet[1001446]: I0922 12:22:28.102861 1001446 kubelet_node_status.go:73] Successfully registered node bifrost
microk8s.daemon-kubelet[1001446]: I0922 12:22:28.103699 1001446 cpu_manager.go:184] [cpumanager] starting with none policy
microk8s.daemon-kubelet[1001446]: I0922 12:22:28.103734 1001446 cpu_manager.go:185] [cpumanager] reconciling every 10s
microk8s.daemon-kubelet[1001446]: I0922 12:22:28.103766 1001446 state_mem.go:36] [cpumanager] initializing new in-memory state store
microk8s.daemon-kubelet[1001446]: I0922 12:22:28.103994 1001446 state_mem.go:88] [cpumanager] updated default cpuset: ""
microk8s.daemon-kubelet[1001446]: I0922 12:22:28.104003 1001446 state_mem.go:96] [cpumanager] updated cpuset assignments: "map[]"
microk8s.daemon-kubelet[1001446]: I0922 12:22:28.104012 1001446 policy_none.go:43] [cpumanager] none policy: Start
microk8s.daemon-kubelet[1001446]: F0922 12:22:28.104037 1001446 kubelet.go:1296] Failed to start ContainerManager failed to get rootfs info: failed to get device for dir "/var/snap/microk8s/common/var/lib/kubelet": could not find device with major: 0, minor: 28 in cached partitions map
microk8s.daemon-kubelet[1001446]: goroutine 337 [running]:
microk8s.daemon-kubelet[1001446]: k8s.io/kubernetes/vendor/k8s.io/klog/v2.stacks(0xc000010001, 0xc00060d800, 0xfd, 0xfd)
microk8s.daemon-kubelet[1001446]: /build/microk8s/parts/k8s-binaries/build/go/src/github.com/kubernetes/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/klog/v2/klog.go:996 +0xb9
microk8s.daemon-kubelet[1001446]: k8s.io/kubernetes/vendor/k8s.io/klog/v2.(*loggingT).output(0x6ef8140, 0xc000000003, 0x0, 0x0, 0xc0005c0070, 0x6b48993, 0xa, 0x510, 0x0)
.....
How could I fix this? (If that's indeed the reason it fails.)
@zar3bski This looks like the same problem I filed an issue about a couple of days ago. At least in my case, it seems to be because I am using Btrfs.
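A quick way to confirm whether the kubelet directory actually sits on Btrfs (findmnt is part of util-linux on Ubuntu):

findmnt -T /var/snap/microk8s/common/var/lib/kubelet -o TARGET,SOURCE,FSTYPE
# or just the filesystem type:
stat -f -c %T /var/snap/microk8s/common/var/lib/kubelet

Btrfs mounts get anonymous device numbers (major 0), which would line up with the "could not find device with major: 0, minor: 28" failure above.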
@zar3bski But were you using Microk8s 1.19 on Ubuntu 20.04, or an earlier Microk8s? Microk8s 1.19 refuses to work for me, but 1.18 works fine.
Snap probably upgraded my cluster at some point. I'll give it a shot.
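To see which version snap is tracking, and to pin back to a 1.18 channel if that turns out to be the problem, something along these lines should work:

snap list microk8s                                # shows the installed revision and channel
sudo snap refresh microk8s --channel=1.18/stable  # reverts/pins to the 1.18 track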
Thanks @ktsakalozos, that's probably it. Sorry for the noob question, but I am not quite used to snap yet: where should I put the kubelet conf file suggested by #80633 so it is taken into account? Would it be /snap/microk8s/${the_actual_version}/etc?
@zar3bski It looks like this is a feature gate [1] setting. You will need to configure the kubelet with the respective feature gate [2] (--feature-gates=...). The kubelet arguments are placed in /var/snap/microk8s/current/args/kubelet, and after editing that file you will need to restart MicroK8s with microk8s.stop; microk8s.start.
[1] https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/
[2] https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/
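Put together, a minimal sketch of the fix as described above (the echo/tee line is just one way to append to the args file):

echo '--feature-gates=LocalStorageCapacityIsolation=false' | sudo tee -a /var/snap/microk8s/current/args/kubelet
microk8s.stop
microk8s.start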
Adding --feature-gates="LocalStorageCapacityIsolation=false" to /var/snap/microk8s/current/args/kubelet solved the issue. Many thanks @ktsakalozos!
What about you, @WereCatf?
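After the restart, the node should shed the not-ready taint; a quick check:

microk8s.kubectl get nodes   # STATUS should now read Ready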
@ktsakalozos @zar3bski Yes, adding that feature gate seems to work around the issue, and microk8s seems to be running now. It's a rather ugly workaround, but it's better than nothing.
👋 Kubernetes feature gates are only available until the feature goes GA (or is removed); after that, the feature gate is removed and the feature is simply on (or gone). LocalStorageCapacityIsolation is going GA in Kubernetes v1.25.0, but I've requested a kubelet option to enable use cases like this, and there will be a kubelet option, localStorageCapacityIsolation, you can set instead (no idea how to do that with microk8s, sorry).
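For readers on v1.25+ where the gate no longer exists: the replacement is the localStorageCapacityIsolation field of the kubelet's KubeletConfiguration file. A sketch, with the file path and MicroK8s wiring being assumptions (the thread doesn't confirm how MicroK8s loads a kubelet config):

# hypothetical location; the kubelet must be started with --config pointing at this file
cat <<'EOF' | sudo tee /var/snap/microk8s/current/args/kubelet-config.yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
localStorageCapacityIsolation: false
EOF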