canonical / cloud-init

Official upstream for cloud-init: cloud instance initialization

Home Page: https://cloud-init.io/

Interface rename fails when using dhcpcd due to interface being up

asciiprod opened this issue

Bug report

When testing the Ubuntu 24.04 cloud images on Hetzner Cloud, we noticed that boot took a very long time and that, once the system had finished booting, IPv6 was not configured and the interface had not been renamed to eth0.
Upon investigating, we found that dhcpcd is now used instead of isc-dhcp-client. This prevents the interface from being renamed, as it is still up. To test this, we installed isc-dhcp-client and changed the DHCP client preference to it. The instance now boots without problems.
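
The failure mode can be modelled with a small sketch. This is not cloud-init's actual code; `plan_rename` is a hypothetical helper, and it only assumes what the log excerpts further down show: each detected interface carries `up` and `downable` flags, and a rename is refused as "[busy]" when the interface is up and cannot be taken down.

```python
from typing import Dict, List, Tuple


def plan_rename(
    interfaces: Dict[str, dict], mac: str, new_name: str
) -> Tuple[List[tuple], List[str]]:
    """Return (ops, errors) for renaming the interface with `mac` to `new_name`.

    Hypothetical, simplified model of the decision seen in the logs.
    """
    ops: List[tuple] = []
    errors: List[str] = []

    # Find the interface that currently owns the MAC address.
    cur = next((i for i in interfaces.values() if i["mac"] == mac), None)
    if cur is None:
        errors.append(f"no interface with mac={mac}")
        return ops, errors
    if cur["name"] == new_name:
        return ops, errors  # already has the requested name

    if cur["up"]:
        if not cur["downable"]:
            # An interface that is up and cannot be downed is not renamed.
            errors.append(
                f"[busy] Error renaming mac={mac} from {cur['name']} to {new_name}"
            )
            return ops, errors
        ops.append(("down", cur["name"]))

    ops.append(("rename", cur["name"], new_name))
    if cur["up"]:
        ops.append(("up", new_name))
    return ops, errors


# Interface state as reported in the dhcpcd and dhclient log excerpts.
with_dhcpcd = {"enp1s0": {"name": "enp1s0", "mac": "96:00:03:27:7c:46",
                          "up": True, "downable": False}}
with_dhclient = {"enp1s0": {"name": "enp1s0", "mac": "96:00:03:27:81:b7",
                            "up": False, "downable": True}}

busy_ops, busy_errors = plan_rename(with_dhcpcd, "96:00:03:27:7c:46", "eth0")
ok_ops, ok_errors = plan_rename(with_dhclient, "96:00:03:27:81:b7", "eth0")
```

With the dhcpcd state the sketch yields no operations and one "[busy]" error; with the dhclient state it yields a single rename operation, matching the two outcomes in the logs.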

Steps to reproduce the problem

Provide a network config that renames the interface while dhcpcd is used as the DHCP client.
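
For reference, a config of this kind can be written as a Python dict. The structure below is copied from the "applying net config names" log line in this report; `requested_names` is a hypothetical helper (not a cloud-init function) that shows which MAC-to-name pairs such a version 1 config asks for.

```python
# Version 1 network config as it appears in the dhcpcd log excerpt.
NETCFG = {
    "version": 1,
    "config": [
        {
            "type": "physical",
            "mac_address": "96:00:03:27:7c:46",
            "name": "eth0",
            "subnets": [
                {"type": "dhcp"},
                {
                    "type": "static",
                    "address": "2a01:4f9:c011:c12f::1/64",
                    "gateway": "fe80::1",
                    "ipv6": True,
                },
            ],
        }
    ],
}


def requested_names(netcfg: dict) -> list:
    """Collect [mac, name] pairs from the 'physical' entries of a v1 config."""
    pairs = []
    for entry in netcfg.get("config", []):
        if entry.get("type") != "physical":
            continue
        mac, name = entry.get("mac_address"), entry.get("name")
        if mac and name:
            pairs.append([mac.lower(), name])
    return pairs


pairs = requested_names(NETCFG)
print(pairs)
```

Because the kernel names the device enp1s0 while the config asks for eth0 on the same MAC address, applying this config requires a rename.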

Environment details

  • Cloud-init version: 24.1.3-0ubuntu1
  • Operating System Distribution: 24.04 (noble)
  • Cloud provider, platform or installer type: Hetzner Cloud

cloud-init logs

With dhcpcd

2024-03-28 11:34:34,107 - subp.py[DEBUG]: Running command ['/usr/sbin/dhcpcd', '--ipv4only', '--waitip', '--persistent', '--noarp', '--script=/bin/true', 'enp1s0', '-P'] with allowed return codes [0] (shell=False, capture=True)
2024-03-28 11:34:34,109 - util.py[DEBUG]: Reading from /run/dhcpcd/enp1s0-4.pid (quiet=False)
2024-03-28 11:34:34,109 - util.py[DEBUG]: Read 4 bytes from /run/dhcpcd/enp1s0-4.pid
2024-03-28 11:34:34,109 - util.py[DEBUG]: Reading from /proc/637/stat (quiet=True)
2024-03-28 11:34:34,109 - util.py[DEBUG]: Read 304 bytes from /proc/637/stat
2024-03-28 11:34:34,109 - dhcp.py[DEBUG]: killing dhcpcd with pid=637 gid=636
2024-03-28 11:34:34,110 - ephemeral.py[DEBUG]: Received dhcp lease on enp1s0 for 135.181.159.204/255.255.255.255
...
2024-03-28 11:34:34,196 - stages.py[DEBUG]: applying net config names for {'config': [{'type': 'physical', 'mac_address': '96:00:03:27:7c:46', 'name': 'eth0', 'subnets': [{'type': 'dhcp'}, {'type': 'static', 'address': '2a01:4f9:c011:c12f::1/64', 'gateway': 'fe80::1', 'ipv6': True}]}], 'version': 1}
2024-03-28 11:34:34,202 - subp.py[DEBUG]: Running command ['ip', '-4', 'addr', 'show'] with allowed return codes [0] (shell=False, capture=True)
2024-03-28 11:34:34,205 - net[DEBUG]: Detected interfaces {'enp1s0': {'downable': False, 'device_id': '0x0001', 'driver': 'virtio_net', 'mac': '96:00:03:27:7c:46', 'name': 'enp1s0', 'up': True}, 'lo': {'downable': False, 'device_id': None, 'driver': None, 'mac': '00:00:00:00:00:00', 'name': 'lo', 'up': True}}
2024-03-28 11:34:34,205 - net[DEBUG]: unable to do any work for renaming of [['96:00:03:27:7c:46', 'eth0', None, None]]
2024-03-28 11:34:34,205 - stages.py[WARNING]: Failed to rename devices: Failed to apply network config names: [busy] Error renaming mac=96:00:03:27:7c:46 from enp1s0 to eth0
2024-03-28 11:34:34,205 - stages.py[INFO]: Applying network configuration from ds bringup=False: {'config': [{'type': 'physical', 'mac_address': '96:00:03:27:7c:46', 'name': 'eth0', 'subnets': [{'type': 'dhcp'}, {'type': 'static', 'address': '2a01:4f9:c011:c12f::1/64', 'gateway': 'fe80::1', 'ipv6': True}]}], 'version': 1}

With isc-dhcp-client

2024-03-28 11:45:43,406 - subp.py[DEBUG]: Running command ['/usr/sbin/dhclient', '-1', '-v', '-lf', '/run/dhclient.lease', '-pf', '/run/dhclient.pid', '-sf', '/bin/true', 'enp1s0'] with allowed return codes [0] (shell=False, capture=True)
2024-03-28 11:45:43,535 - subp.py[DEBUG]: command ['/usr/sbin/dhclient', '-1', '-v', '-lf', '/run/dhclient.lease', '-pf', '/run/dhclient.pid', '-sf', '/bin/true', 'enp1s0'] took 0.1s to run
2024-03-28 11:45:43,535 - util.py[DEBUG]: All files appeared after 0 seconds: ['/run/dhclient.pid', '/run/dhclient.lease']
2024-03-28 11:45:43,536 - ephemeral.py[DEBUG]: Received dhcp lease on enp1s0 for 135.181.159.160/255.255.255.255
2024-03-28 11:45:43,537 - url_helper.py[DEBUG]: [0/1] open 'http://169.254.169.254/hetzner/v1/metadata/instance-id' with {'url': 'http://169.254.169.254/hetzner/v1/metadata/instance-id', 'stream': False, 'allow_redirects': True, 'method': 'GET', 'timeout': 5.0, 'headers': {'User-Agent': 'Cloud-Init/24.1.2-0ubuntu1'}} configuration
2024-03-28 11:45:43,538 - ephemeral.py[DEBUG]: Attempting setup of ephemeral network on enp1s0 with 135.181.159.160/32 brd 135.181.159.160
2024-03-28 11:45:43,539 - subp.py[DEBUG]: Running command ['ip', '-family', 'inet', 'addr', 'add', '135.181.159.160/32', 'broadcast', '135.181.159.160', 'dev', 'enp1s0'] with allowed return codes [0] (shell=False, capture=True)
2024-03-28 11:45:43,542 - subp.py[DEBUG]: Running command ['ip', '-family', 'inet', 'link', 'set', 'dev', 'enp1s0', 'up'] with allowed return codes [0] (shell=False, capture=True)
2024-03-28 11:45:43,546 - subp.py[DEBUG]: Running command ['ip', '-4', 'route', 'append', '172.31.1.1/32', 'dev', 'enp1s0'] with allowed return codes [0] (shell=False, capture=True)
2024-03-28 11:45:43,549 - subp.py[DEBUG]: Running command ['ip', '-4', 'route', 'append', '0.0.0.0/0', 'via', '172.31.1.1', 'dev', 'enp1s0'] with allowed return codes [0] (shell=False, capture=True)
2024-03-28 11:45:43,571 - subp.py[DEBUG]: Running command ['ip', '-4', 'route', 'del', '0.0.0.0/0', 'via', '172.31.1.1', 'dev', 'enp1s0'] with allowed return codes [0] (shell=False, capture=True)
2024-03-28 11:45:43,575 - subp.py[DEBUG]: Running command ['ip', '-4', 'route', 'del', '172.31.1.1/32', 'dev', 'enp1s0'] with allowed return codes [0] (shell=False, capture=True)
2024-03-28 11:45:43,578 - subp.py[DEBUG]: Running command ['ip', '-family', 'inet', 'link', 'set', 'dev', 'enp1s0', 'down'] with allowed return codes [0] (shell=False, capture=True)
2024-03-28 11:45:43,582 - subp.py[DEBUG]: Running command ['ip', '-family', 'inet', 'addr', 'del', '135.181.159.160/32', 'dev', 'enp1s0'] with allowed return codes [0] (shell=False, capture=True)
2024-03-28 11:45:43,669 - stages.py[DEBUG]: applying net config names for {'config': [{'mac_address': '96:00:03:27:81:b7', 'name': 'eth0', 'subnets': [{'ipv4': True, 'type': 'dhcp'}, {'address': '2a01:4f9:c011:c090::1/64', 'dns_nameservers': ['2a01:4ff:ff00::add:1', '2a01:4ff:ff00::add:2'], 'gateway': 'fe80::1', 'ipv6': True, 'type': 'static'}], 'type': 'physical'}], 'version': 1}
2024-03-28 11:45:43,670 - subp.py[DEBUG]: Running command ['ip', '-6', 'addr', 'show', 'permanent', 'scope', 'global'] with allowed return codes [0] (shell=False, capture=True)
2024-03-28 11:45:43,674 - subp.py[DEBUG]: Running command ['ip', '-4', 'addr', 'show'] with allowed return codes [0] (shell=False, capture=True)
2024-03-28 11:45:43,678 - net[DEBUG]: Detected interfaces {'enp1s0': {'downable': True, 'device_id': '0x0001', 'driver': 'virtio_net', 'mac': '96:00:03:27:81:b7', 'name': 'enp1s0', 'up': False}, 'lo': {'downable': False, 'device_id': None, 'driver': None, 'mac': '00:00:00:00:00:00', 'name': 'lo', 'up': True}}
2024-03-28 11:45:43,678 - net[DEBUG]: achieving renaming of [['96:00:03:27:81:b7', 'eth0', None, None]] with ops [('rename', '96:00:03:27:81:b7', 'eth0', ('enp1s0', 'eth0'))]
2024-03-28 11:45:43,678 - subp.py[DEBUG]: Running command ['ip', 'link', 'set', 'enp1s0', 'name', 'eth0'] with allowed return codes [0] (shell=False, capture=True)
2024-03-28 11:45:43,695 - stages.py[INFO]: Applying network configuration from ds bringup=False: {'config': [{'mac_address': '96:00:03:27:81:b7', 'name': 'eth0', 'subnets': [{'ipv4': True, 'type': 'dhcp'}, {'address': '2a01:4f9:c011:c090::1/64', 'dns_nameservers': ['2a01:4ff:ff00::add:1', '2a01:4ff:ff00::add:2'], 'gateway': 'fe80::1', 'ipv6': True, 'type': 'static'}], 'type': 'physical'}], 'version': 1}
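
The relevant difference between the two excerpts is the "Detected interfaces" line. As a diagnostic sketch, the payload of that line can be parsed with the standard library, assuming it is always a Python literal as in the lines above; `detected_interfaces` is a hypothetical helper name.

```python
import ast

MARKER = "Detected interfaces "


def detected_interfaces(log_line: str) -> dict:
    """Parse the dict that follows 'Detected interfaces ' in a log line."""
    _, _, payload = log_line.partition(MARKER)
    if not payload:
        raise ValueError("not a 'Detected interfaces' line")
    # literal_eval only accepts Python literals, so nothing is executed.
    return ast.literal_eval(payload.strip())


# Line copied from the dhcpcd excerpt.
line = (
    "2024-03-28 11:34:34,205 - net[DEBUG]: Detected interfaces "
    "{'enp1s0': {'downable': False, 'device_id': '0x0001', "
    "'driver': 'virtio_net', 'mac': '96:00:03:27:7c:46', "
    "'name': 'enp1s0', 'up': True}, "
    "'lo': {'downable': False, 'device_id': None, 'driver': None, "
    "'mac': '00:00:00:00:00:00', 'name': 'lo', 'up': True}}"
)

state = detected_interfaces(line)["enp1s0"]
print(state["up"], state["downable"])  # prints: True False
```

Running the same helper on the dhclient line gives `up` as False and `downable` as True, which is the state in which the rename goes through.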

@asciiprod can you please include the full logs from both systems?

"Upon investigating we found that dhcpcd is not terminating fast enough."

As you can see from the logs that you posted, dhcpcd actually ran significantly faster than dhclient, so I don't think that this assumption is accurate.
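
The timing comparison can be checked from the posted timestamps. The sketch below measures, for each client, the span from the "Running command" line to the "Received dhcp lease" line in the excerpts above; `elapsed_seconds` is a hypothetical helper, and the timestamp format is assumed to match those log lines.

```python
from datetime import datetime

LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def elapsed_seconds(start: str, end: str) -> float:
    """Seconds between two cloud-init log timestamps."""
    t0 = datetime.strptime(start, LOG_TIME_FORMAT)
    t1 = datetime.strptime(end, LOG_TIME_FORMAT)
    return (t1 - t0).total_seconds()


# Command start and "Received dhcp lease" timestamps from the excerpts.
dhcpcd = elapsed_seconds("2024-03-28 11:34:34,107", "2024-03-28 11:34:34,110")
dhclient = elapsed_seconds("2024-03-28 11:45:43,406", "2024-03-28 11:45:43,536")

print(f"dhcpcd: {dhcpcd:.3f}s, dhclient: {dhclient:.3f}s")
# prints: dhcpcd: 0.003s, dhclient: 0.130s
```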

@asciiprod #5115 may have a fix if you don't mind testing it

@holmanb Thanks for the quick patch. Works just fine. I have updated the description of the issue to match the actual cause.

2024-03-28 21:41:01,990 - dhcp.py[DEBUG]: killing dhcpcd with pid=629 gid=628
2024-03-28 21:41:01,991 - ephemeral.py[DEBUG]: Received dhcp lease on enp1s0 for 135.181.159.204/255.255.255.255
...
2024-03-28 21:41:02,007 - subp.py[DEBUG]: Running command ['ip', '-family', 'inet', 'link', 'set', 'dev', 'enp1s0', 'down'] with allowed return codes [0] (shell=False, capture=True)
2024-03-28 21:41:02,009 - subp.py[DEBUG]: Running command ['ip', '-family', 'inet', 'addr', 'del', '135.181.159.204/32', 'dev', 'enp1s0'] with allowed return codes [0] (shell=False, capture=True)
...
2024-03-28 21:41:02,069 - networking.py[DEBUG]: net: all expected physical devices present
2024-03-28 21:41:02,069 - stages.py[DEBUG]: applying net config names for {'config': [{'type': 'physical', 'mac_address': '96:00:03:27:7c:46', 'name': 'eth0', 'subnets': [{'type': 'dhcp'}, {'type': 'static', 'address': '2a01:4f9:c011:c12f::1/64', 'gateway': 'fe80::1', 'ipv6': True}]}], 'version': 1}
2024-03-28 21:41:02,071 - subp.py[DEBUG]: Running command ['ip', '-6', 'addr', 'show', 'permanent', 'scope', 'global'] with allowed return codes [0] (shell=False, capture=True)
2024-03-28 21:41:02,073 - subp.py[DEBUG]: Running command ['ip', '-4', 'addr', 'show'] with allowed return codes [0] (shell=False, capture=True)
2024-03-28 21:41:02,075 - net[DEBUG]: Detected interfaces {'lo': {'downable': False, 'device_id': None, 'driver': None, 'mac': '00:00:00:00:00:00', 'name': 'lo', 'up': True}, 'enp1s0': {'downable': True, 'device_id': '0x0001', 'driver': 'virtio_net', 'mac': '96:00:03:27:7c:46', 'name': 'enp1s0', 'up': False}}
2024-03-28 21:41:02,075 - net[DEBUG]: achieving renaming of [['96:00:03:27:7c:46', 'eth0', None, None]] with ops [('rename', '96:00:03:27:7c:46', 'eth0', ('enp1s0', 'eth0'))]
2024-03-28 21:41:02,075 - subp.py[DEBUG]: Running command ['ip', 'link', 'set', 'enp1s0', 'name', 'eth0'] with allowed return codes [0] (shell=False, capture=True)
2024-03-28 21:41:02,090 - stages.py[INFO]: Applying network configuration from ds bringup=False: {'config': [{'type': 'physical', 'mac_address': '96:00:03:27:7c:46', 'name': 'eth0', 'subnets': [{'type': 'dhcp'}, {'type': 'static', 'address': '2a01:4f9:c011:c12f::1/64', 'gateway': 'fe80::1', 'ipv6': True}]}], 'version': 1}
2024-03-28 21:41:02,090 - util.py[DEBUG]: Writing to /run/cloud-init/sem/apply_network_config.once - wb: [644] 24 bytes
2024-03-28 21:41:02,092 - distros[DEBUG]: Selected renderer 'netplan' from priority list: ['netplan', 'eni', 'sysconfig']
2024-03-28 21:41:02,094 - subp.py[DEBUG]: Running command ['netplan', 'info'] with allowed return codes [0] (shell=False, capture=True)
2024-03-28 21:41:02,264 - subp.py[DEBUG]: command ['netplan', 'info'] took 0.1s to run

@asciiprod thanks for reporting and testing!

Fixed in #5115