GoogleCloudPlatform / guest-agent

bug: `google_set_hostname` changes hostname after reboot

diamonwiggins opened this issue

Cross-posting this from GoogleCloudPlatform/guest-configs#69, since the comments in GoogleCloudPlatform/guest-configs#60 suggest that some guest-configs functionality may be moving into the agent.

Hello 👋. I work for Replicated, where we maintain and support a kubeadm-based Kubernetes installer. We've recently noticed that the hostname of a RHEL 9 GCP instance changes during DHCP lease renewal (generally observed during reboot), which prevents that node from recovering in the K8s cluster.

What we observe is the following:

  1. A new instance is created and its hostname is set to the FQDN of the instance name:

my-node.some.domain.internal

  2. Once the machine is rebooted, or if you run a simple dhclient -r && dhclient to obtain a new lease, the script at /usr/bin/google_set_hostname runs and populates /etc/hosts with:
10.150.15.215 my-node.some.domain.internal my-node  # Added by Google
169.254.169.254 metadata.google.internal  # Added by Google

The interesting thing here is that in order for this code path to be hit, the following conditional must be true:

if [ -n "$new_host_name" ] && [ -n "$new_ip_address" ];

My best guess at the moment is that, for a freshly created instance from a default base image, this is always true, or at least true the first time?

  3. After /etc/hosts is updated, on systems that have NetworkManager installed, the following command is run:
nmcli general hostname "${new_host_name%%.*}"

Before the google_set_hostname script runs, the hostname is the FQDN. Afterwards it's just the short name, which in our case breaks anything that relies on the hostname not changing.
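
To make the effect concrete, here is a minimal sketch of the two steps described above. It is not the actual google_set_hostname script: it only reuses the quoted guard and the `%%.*` expansion, prints what would happen instead of touching /etc/hosts or calling nmcli, and the variable values are hypothetical stand-ins for what the DHCP client would pass to the hook.

```shell
#!/bin/sh
# Sketch only: mimics the behaviour described in this issue without changing the system.
# Hypothetical stand-ins for the values the DHCP client exports to the hook.
new_host_name="my-node.some.domain.internal"
new_ip_address="10.150.15.215"

# Same guard as quoted from the script: both variables must be non-empty.
if [ -n "$new_host_name" ] && [ -n "$new_ip_address" ]; then
  # %%.* strips the longest suffix that starts at the first dot,
  # so everything after the first label of the FQDN is dropped.
  short_name="${new_host_name%%.*}"
  echo "hosts entry: $new_ip_address $new_host_name $short_name"
  echo "hostname would become: $short_name"
fi
```

With an FQDN as input, the expansion leaves only the first label (my-node here), which matches the hostname change we see after the lease renewal.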

This feels like a bug to us. Any guidance is appreciated. Thanks!