coreos / afterburn

A one-shot cloud provider agent

Home Page: https://coreos.github.io/afterburn/

azure: fails with ignition+password auth

cgwalters opened this issue

I booted an FCOS VM in Azure with just an Ignition config and --authentication-type password to disable Azure's requirement for SSH keys. The afterburn-sshkeys@core unit then failed:

[root@walters-fcos ~]# systemctl status afterburn-sshkeys@core
● afterburn-sshkeys@core.service - Afterburn (SSH Keys)
   Loaded: loaded (/usr/lib/systemd/system/afterburn-sshkeys@.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Tue 2020-04-28 15:26:00 UTC; 48s ago
  Process: 903 ExecStart=/usr/bin/afterburn ${AFTERBURN_OPT_PROVIDER} --ssh-keys=core (code=exited, status=1/FAILURE)
 Main PID: 903 (code=exited, status=1/FAILURE)

Apr 28 15:26:00 walters-fcos afterburn[903]: Apr 28 15:26:00.092 INFO Fetch successful
Apr 28 15:26:00 walters-fcos afterburn[903]: Error: failed to run
Apr 28 15:26:00 walters-fcos afterburn[903]: Caused by: writing ssh keys
Apr 28 15:26:00 walters-fcos afterburn[903]: Caused by: failed to get certs
Apr 28 15:26:00 walters-fcos afterburn[903]: Caused by: failed to get certificates
Apr 28 15:26:00 walters-fcos afterburn[903]: Caused by: failed to parse uri
Apr 28 15:26:00 walters-fcos afterburn[903]: Caused by: relative URL without a base
Apr 28 15:26:00 walters-fcos systemd[1]: afterburn-sshkeys@core.service: Main process exited, code=exited, status=1/FAILURE
Apr 28 15:26:00 walters-fcos systemd[1]: afterburn-sshkeys@core.service: Failed with result 'exit-code'.
Apr 28 15:26:00 walters-fcos systemd[1]: Failed to start Afterburn (SSH Keys).

Thanks for the report.
I think that means that the XML value for this property is not a full-blown URL.
It would be interesting to dump the content of the goalstate response and see what it actually contains.
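One way to test that hypothesis once the goalstate content is dumped is sketched below. This is illustrative Python only (Afterburn itself is written in Rust), and `is_absolute_url` is a hypothetical helper, not part of Afterburn. It reports whether a property value carries both a scheme and a host, which an empty or placeholder value would not.

```python
from urllib.parse import urlsplit


def is_absolute_url(value: str) -> bool:
    """Return True if value has both a scheme and a network location.

    Hypothetical helper for inspecting goalstate property values;
    an empty string or a bare word fails this check.
    """
    parts = urlsplit(value.strip())
    return bool(parts.scheme) and bool(parts.netloc)
```

A value that fails this check would be consistent with the "relative URL without a base" error in the log above, though the exact value returned by the platform still needs to be confirmed.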

Still seeing this in e.g. openshift/installer#3613

@cgwalters to the best of my knowledge you shouldn't be seeing this in OpenShift, as by design it is not supposed to use cloud SSH keys (unless that design decision changed at some point).

Ah, right: https://gitlab.cee.redhat.com/coreos/redhat-coreos/merge_requests/972

I do suspect we are getting back an empty property value or some other kind of magic marker. If that's indeed the way for the platform to signal "SSH keys are disabled", we should probably gracefully warn and exit without error in that specific case.
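The "warn and exit without error" behavior proposed here could look roughly like the following. This is a minimal Python sketch under stated assumptions, not Afterburn's actual Rust code: `fetch_ssh_keys`, its `fetch` callback, and the logger name are all hypothetical, and whether an empty or absent value really is the platform's signal for "SSH keys are disabled" is still the open question.

```python
import logging
from typing import Callable, List, Optional

log = logging.getLogger("sshkeys-sketch")  # hypothetical logger name


def fetch_ssh_keys(
    certs_endpoint: Optional[str],
    fetch: Callable[[str], List[str]],
) -> List[str]:
    """Return SSH keys, or an empty list when no endpoint is advertised.

    certs_endpoint: the certificates URL from the goalstate, if any.
    fetch: hypothetical callback that downloads keys from that URL.
    """
    if certs_endpoint is None or not certs_endpoint.strip():
        # Assumption: a missing or empty value means keys are disabled,
        # so warn and succeed with no keys instead of failing.
        log.warning("no certificates endpoint in goalstate; skipping SSH keys")
        return []
    return fetch(certs_endpoint)
```

With this shape, a password-auth instance would produce a warning and an empty key list, so the unit would exit successfully instead of failing.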

I just checked on a password-auth instance, and indeed there is no Certificates entry in the Configuration stanza. This is what the XML section looks like there:

<Configuration>
  <HostingEnvironmentConfig>http://168.63.129.16:80/...</HostingEnvironmentConfig>
  <SharedConfig>http://168.63.129.16:80/...</SharedConfig>
  <ExtensionsConfig>http://168.63.129.16:80/...</ExtensionsConfig>
  <FullConfig>http://168.63.129.16:80/...</FullConfig>
  <ConfigName>http://168.63.129.16:80/...</ConfigName>
</Configuration>
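Detecting this case amounts to treating the Certificates element as optional. The Python sketch below is illustrative only (Afterburn is written in Rust and parses the goalstate differently); `certificates_url` is a hypothetical helper, and `SAMPLE` is the stanza above with its elided URLs left as-is.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# The Configuration stanza from a password-auth instance, as shown above.
SAMPLE = """<Configuration>
  <HostingEnvironmentConfig>http://168.63.129.16:80/...</HostingEnvironmentConfig>
  <SharedConfig>http://168.63.129.16:80/...</SharedConfig>
  <ExtensionsConfig>http://168.63.129.16:80/...</ExtensionsConfig>
  <FullConfig>http://168.63.129.16:80/...</FullConfig>
  <ConfigName>http://168.63.129.16:80/...</ConfigName>
</Configuration>"""


def certificates_url(configuration_xml: str) -> Optional[str]:
    """Return the Certificates URL, or None if the entry is absent or empty."""
    root = ET.fromstring(configuration_xml)
    node = root.find("Certificates")  # direct child of <Configuration>
    if node is None or not (node.text or "").strip():
        return None
    return node.text.strip()


print(certificates_url(SAMPLE))  # prints: None
```

Returning `None` instead of an empty string lets the caller tell "no certificates advertised" apart from a malformed URL, which would still be reported as an error.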