Race condition when scaling up the cluster and 2 or more nodes start at the same time
Tchirana opened this issue · comments
When 2 or more nodes are starting at the same time, a race condition occurs from time to time, and in some cases IPs do not get assigned to nodes. You should first check whether an AddInstanceAddress request already exists and only then start a new assignment.
Here is a snippet from 2 nodes starting at the same time:
kubeip-f26p9 kubeip time="2023-12-06T14:20:15Z" level=debug msg="found 8 available addresses"
kubeip-mr6kt kubeip time="2023-12-06T14:20:18Z" level=debug msg="found 8 available addresses"
AND:
kubeip-g9mxj kubeip time="2023-12-06T14:20:59Z" level=error msg="failed to assign static public IP address xx.xx.xx.xx" func="github.com/doitintl/kubeip/internal/address.(*gcpAssigner).Assign" file="/app/internal/address/gcp.go:250" error="address is already assigned" version=sha-ce43fbb
kubeip-g9mxj kubeip time="2023-12-06T14:20:59Z" level=info msg="adding public IP address to instance" func="github.com/doitintl/kubeip/internal/address.(*gcpAssigner).AddInstanceAddress" file="/app/internal/address/gcp.go:188" accessConfig="&{ 0 compute#accessConfig External IP yy.yy.yy.yy false ONE_TO_ONE_NAT [] []}" version=sha-ce43fbb
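The behavior in the second log pair (assignment fails with "address is already assigned", then a different address is attached) suggests a retry-on-conflict loop over the candidate addresses. A rough sketch of that idea in Go; `assignFirstFree`, the callback type, and the error value are hypothetical illustrations, not kubeip's actual API:

```go
package main

import (
	"errors"
	"fmt"
)

// errAlreadyAssigned stands in for the conflict error seen in the logs.
var errAlreadyAssigned = errors.New("address is already assigned")

// assignFn tries to attach one address and reports a conflict if
// another node claimed it first (hypothetical signature).
type assignFn func(addr string) error

// assignFirstFree walks the candidate list and moves on to the next
// address on conflict instead of failing outright.
func assignFirstFree(addrs []string, assign assignFn) (string, error) {
	for _, a := range addrs {
		err := assign(a)
		if err == nil {
			return a, nil
		}
		if errors.Is(err, errAlreadyAssigned) {
			continue // another node won the race; try the next address
		}
		return "", err // unrelated failure, surface it
	}
	return "", errors.New("no free addresses left")
}

func main() {
	// Simulate one address already taken by a concurrently starting node.
	taken := map[string]bool{"1.1.1.1": true}
	assign := func(a string) error {
		if taken[a] {
			return errAlreadyAssigned
		}
		taken[a] = true
		return nil
	}
	got, err := assignFirstFree([]string{"1.1.1.1", "2.2.2.2"}, assign)
	fmt.Println(got, err) // → 2.2.2.2 <nil>
}
```

This only narrows the window; without coordination two nodes can still pick the same address between the list and the attach call.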
@Tchirana thank you for reporting
Need to implement a distributed mutex; that is planned for the future. For now I added a random sleep (up to 10s) before listing available IP addresses, which should reduce conflicts somewhat.
fixed