coreos / coreos-kubernetes

CoreOS Container Linux+Kubernetes documentation & Vagrant installers

Multi-node: GuaranteedUpdate of /registry/minions/<NODE> failed because of a conflict

ashwinp opened this issue

Issue Details:

  • Worker nodes fail to update the node status.
  • kubectl get nodes on the master does not list one or more worker nodes.
  • Issue can be reproduced intermittently.
  • Restarting kubelet and/or kube-apiserver does not help.
  • This isn't a transient failure. The worker nodes are never able to update the status. They never show up in kubectl get nodes.

Setup details:

  • 3 worker nodes, 1 master node, 1 etcd node
  • All nodes run CoreOS-stable-1409.6.0-hvm (ami-00110279)
  • Issue can be reproduced with Kubernetes 1.6.4 as well as 1.7.0.
  • Issue can be reproduced with etcd 3.5.4 as well as 2.7.* (older version).

kubelet on the worker nodes fails to update the worker node status after claiming to have registered successfully:

kubelet-wrapper[1657]: I0804 16:42:15.216223    1657 kubelet_node_status.go:77] Attempting to register node
kubelet-wrapper[1657]: I0804 16:42:15.218882    1657 kubelet_node_status.go:80] Successfully registered node
kubelet-wrapper[1657]: E0804 16:42:25.230766    1657 kubelet_node_status.go:326] Error updating node status, will retry: error getting node "": nodes "" not found
kubelet-wrapper[1657]: E0804 16:42:25.232449    1657 kubelet_node_status.go:326] Error updating node status, will retry: error getting node "": nodes "" not found

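The Kubernetes API server logs show conflicts while the node is being updated in etcd (the PUT requests that return 409), followed a couple of seconds later by a DELETE of the node. Per the user agent in the logs, both the conflicting PUTs and the DELETE come from the node-controller: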

I0804 16:42:15.220414       1 wrap.go:75] GET /api/v1/nodes/ (736.057µs) 200 [[hyperkube/v1.6.4+coreos.0 (linux/amd64) kubernetes/8996efd/node-controller]]
I0804 16:42:15.227137       1 store.go:329] GuaranteedUpdate of /registry/minions/ failed because of a conflict, going to retry
I0804 16:42:15.227245       1 store.go:329] GuaranteedUpdate of /registry/minions/ failed because of a conflict, going to retry

I0804 16:42:15.227280       1 wrap.go:75] GET /api/v1/pods?fieldSelector=spec.nodeName%3D172.0.60.57: (7.793419ms) 200 [[hyperkube/v1.6.4+coreos.0 (linux/amd64) kubernetes/8996efd/node-controller]]
I0804 16:42:15.227314       1 wrap.go:75] PUT /api/v1/nodes/ (6.490089ms) 409 [[hyperkube/v1.6.4+coreos.0 (linux/amd64) kubernetes/8996efd/node-controller]]
I0804 16:42:15.227250       1 wrap.go:75] PATCH /api/v1/nodes/ (6.805385ms) 200 [[hyperkube/v1.6.4+coreos.0 (linux/amd64) kubernetes/8996efd/ttl-controller]]
I0804 16:42:15.228557       1 wrap.go:75] GET /api/v1/nodes/ (708.958µs) 200 [[hyperkube/v1.6.4+coreos.0 (linux/amd64) kubernetes/8996efd/node-controller]]
I0804 16:42:15.228820       1 wrap.go:75] PATCH /api/v1/namespaces/default/events/ (11.479188ms) 200 [[kubelet/v1.6.4+coreos.0 (linux/amd64) kubernetes/8996efd]]
I0804 16:42:15.228837       1 wrap.go:75] PATCH /api/v1/nodes/ (6.707276ms) 200 [[kubelet/v1.6.4+coreos.0 (linux/amd64) kubernetes/8996efd]]
I0804 16:42:15.229323       1 wrap.go:75] PUT /api/v1/nodes/ (406.754µs) 409 [[hyperkube/v1.6.4+coreos.0 (linux/amd64) kubernetes/8996efd/node-controller]]
I0804 16:42:15.230566       1 wrap.go:75] GET /api/v1/nodes/ (719.769µs) 200 [[hyperkube/v1.6.4+coreos.0 (linux/amd64) kubernetes/8996efd/node-controller]]
I0804 16:42:15.232358       1 wrap.go:75] PUT /api/v1/nodes/ (1.469816ms) 200 [[hyperkube/v1.6.4+coreos.0 (linux/amd64) kubernetes/8996efd/node-controller]]
I0804 16:42:15.232840       1 wrap.go:75] PATCH /api/v1/namespaces/default/events/ (3.188002ms) 200 [[kubelet/v1.6.4+coreos.0 (linux/amd64) kubernetes/8996efd]]
I0804 16:42:15.235985       1 wrap.go:75] PATCH /api/v1/namespaces/default/events/ (2.451278ms) 200 [[kubelet/v1.6.4+coreos.0 (linux/amd64) kubernetes/8996efd]]

I0804 16:42:17.732567       1 wrap.go:75] DELETE /api/v1/nodes/ (2.582459ms) 200 [[hyperkube/v1.6.4+coreos.0 (linux/amd64) kubernetes/8996efd/node-controller]]
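
To make it easier to see which client caused the 409s and the DELETE, the access-log lines above can be tallied by client, verb and status. The following is only a rough sketch: the regular expression is inferred from the lines quoted in this issue, not from any documented log format, and `client_name` and `summarize` are hypothetical helper names, not part of Kubernetes.

```python
import re
from collections import Counter

# Pattern inferred from the wrap.go lines quoted in this issue (an assumption,
# not a documented format): VERB PATH[:] (LATENCY) STATUS [[USER-AGENT]]
LINE_RE = re.compile(
    r"wrap\.go:\d+\]\s+(?P<verb>[A-Z]+)\s+(?P<path>\S+?):?\s+"
    r"\((?P<latency>[^)]+)\)\s+(?P<status>\d{3})\s+"
    r"\[\[(?P<agent>.*?)\]\]"
)


def client_name(agent):
    """Return a short client label taken from the user-agent string.

    "hyperkube/... kubernetes/8996efd/node-controller" -> "node-controller"
    "kubelet/... kubernetes/8996efd"                   -> "kubelet"
    """
    binary = agent.split("/", 1)[0]
    tail = agent.split()[-1].split("/")
    return tail[2] if len(tail) > 2 else binary


def summarize(lines):
    """Count requests per (client, verb, status); skip non-request lines."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue  # e.g. the store.go "GuaranteedUpdate" messages
        counts[(client_name(m["agent"]), m["verb"], int(m["status"]))] += 1
    return counts


# A few lines copied from the API server log in this issue.
SAMPLE = """\
I0804 16:42:15.227137       1 store.go:329] GuaranteedUpdate of /registry/minions/ failed because of a conflict, going to retry
I0804 16:42:15.227314       1 wrap.go:75] PUT /api/v1/nodes/ (6.490089ms) 409 [[hyperkube/v1.6.4+coreos.0 (linux/amd64) kubernetes/8996efd/node-controller]]
I0804 16:42:15.227250       1 wrap.go:75] PATCH /api/v1/nodes/ (6.805385ms) 200 [[hyperkube/v1.6.4+coreos.0 (linux/amd64) kubernetes/8996efd/ttl-controller]]
I0804 16:42:15.228837       1 wrap.go:75] PATCH /api/v1/nodes/ (6.707276ms) 200 [[kubelet/v1.6.4+coreos.0 (linux/amd64) kubernetes/8996efd]]
I0804 16:42:15.229323       1 wrap.go:75] PUT /api/v1/nodes/ (406.754µs) 409 [[hyperkube/v1.6.4+coreos.0 (linux/amd64) kubernetes/8996efd/node-controller]]
I0804 16:42:17.732567       1 wrap.go:75] DELETE /api/v1/nodes/ (2.582459ms) 200 [[hyperkube/v1.6.4+coreos.0 (linux/amd64) kubernetes/8996efd/node-controller]]
""".splitlines()

if __name__ == "__main__":
    for (client, verb, status), n in sorted(summarize(SAMPLE).items()):
        print(f"{client:16} {verb:6} {status}  x{n}")
```

Run against the full kube-apiserver log, a tally like this should show whether the node-controller is the only client receiving 409 responses and issuing the DELETE, or whether other clients are involved as well.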