test-kitchen / kitchen-digitalocean

A Test Kitchen driver for DigitalOcean

SSH password prompt during converge

jwadolowski opened this issue

Hey,

I recently raised a ticket against droplet_kit, which you use under the hood; however, I think part of the problem may point to a bug in the driver itself.

droplet_kit issue: digitalocean/droplet_kit#80
Similar ticket from last year: #43

TL;DR

$ kitchen converge server-centos-7-0-x64-chef-12 -l debug
<cut>
-----> Creating <server-centos-7-0-x64-chef-12>...
D      digitalocean:name servercentos70x-kuba-nysa-awv1te7
D      digitalocean:imagecentos-7-0-x64
D      digitalocean:size 1gb
D      digitalocean:region fra1
D      digitalocean:ssh_key_ids <SSH_KEY_ID>
D      digitalocean:private_networking true
D      digitalocean:ipv6 false
D      digitalocean:user_data 
D      digitalocean_api_key <API_KEY>
       Digital Ocean instance <DROPLET_ID> created.
D      digitalocean_api_key <API_KEY>
D      digitalocean_api_key <API_KEY>
D      [SSH] opening connection to root@46.101.X.Y<{:user_known_hosts_file=>"/dev/null", :paranoid=>false, :port=>"22", :compression=>true, :compression_level=>6, :keepalive=>true, :keepalive_interval=>60, :timeout=>15}>
D      [SSH] connection failed (#<Timeout::Error: execution expired>)
       Waiting for SSH service on 46.101.X.Y:22, retrying in 3 seconds
D      [SSH] opening connection to root@46.101.X.Y<{:user_known_hosts_file=>"/dev/null", :paranoid=>false, :port=>"22", :compression=>true, :compression_level=>6, :keepalive=>true, :keepalive_interval=>60, :timeout=>15, :user=>"root"}>
D      [SSH] connection failed (#<Timeout::Error: execution expired>)
       Waiting for SSH service on 46.101.X.Y:22, retrying in 3 seconds
D      [SSH] opening connection to root@46.101.X.Y<{:user_known_hosts_file=>"/dev/null", :paranoid=>false, :port=>"22", :compression=>true, :compression_level=>6, :keepalive=>true, :keepalive_interval=>60, :timeout=>15, :user=>"root"}>
D      [SSH] connection failed (#<Errno::ECONNREFUSED: Connection refused - connect(2) for "46.101.X.Y" port 22>)
       Waiting for SSH service on 46.101.X.Y:22, retrying in 3 seconds
D      [SSH] opening connection to root@46.101.X.Y<{:user_known_hosts_file=>"/dev/null", :paranoid=>false, :port=>"22", :compression=>true, :compression_level=>6, :keepalive=>true, :keepalive_interval=>60, :timeout=>15, :user=>"root"}>
root@46.101.X.Y's password:

But at the same time:

$ ssh root@46.101.X.Y
Last login: Thu Mar 10 12:35:53 2016 from x.x.x.x
[root@servercentos70x-kuba-nysa-sjvlc0c ~]#

This happens only on CentOS 7. Unfortunately it's nondeterministic, but it occurs quite frequently.

One more thing: when this prompt pops up, the driver assumes that the droplet wasn't created at all:

$ kitchen list server-centos-7-0-x64-chef-12
Instance                       Driver        Provisioner  Verifier  Transport  Last Action
server-centos-7-0-x64-chef-12  Digitalocean  ChefZero     Busser    Ssh        <Not Created>

I'm seeing the same thing here, but for me it seems permanent, on both CentOS and Ubuntu targets.
The SSH key is present on the droplet, since a separate ssh session works.

$ KITCHEN_YAML=".kitchen.digitalocean.yml" kitchen verify default-ubuntu-1604 -l debug
-----> Starting Kitchen (v1.11.0)
-----> Creating <default-ubuntu-1604>...
D      digitalocean:name defaultubuntu16-xxx-xxx-u2qhulz
D      digitalocean:imageubuntu-16-04-x64
D      digitalocean:size 512mb
D      digitalocean:region nyc1
D      digitalocean:ssh_key_ids SSH_KEYS_ID
D      digitalocean:private_networking true
D      digitalocean:ipv6 false
D      digitalocean:user_data 
D      digitalocean_api_key API_KEY
       Digital Ocean instance <23416149> created.
D      digitalocean_api_key API_KEY
D      digitalocean_api_key API_KEY
D      [SSH] opening connection to root@x.y.165.50<{:user_known_hosts_file=>"/dev/null", :paranoid=>false, :port=>"22", :compression=>false, :compression_level=>0, :keepalive=>true, :keepalive_interval=>60, :timeout=>15}>
D      [SSH] connection failed (#<Net::SSH::ConnectionTimeout: Net::SSH::ConnectionTimeout>)
       Waiting for SSH service on x.y.165.50:22, retrying in 3 seconds
D      [SSH] opening connection to root@x.y.165.50<{:user_known_hosts_file=>"/dev/null", :paranoid=>false, :port=>"22", :compression=>false, :compression_level=>0, :keepalive=>true, :keepalive_interval=>60, :timeout=>15, :user=>"root"}>
D      [SSH] connection failed (#<Errno::ECONNREFUSED: Connection refused - connect(2) for x.y.165.50:22>)
       Waiting for SSH service on x.y.165.50:22, retrying in 3 seconds
D      [SSH] opening connection to root@x.y.165.50<{:user_known_hosts_file=>"/dev/null", :paranoid=>false, :port=>"22", :compression=>false, :compression_level=>0, :keepalive=>true, :keepalive_interval=>60, :timeout=>15, :user=>"root"}>
D      [SSH] connection failed (#<Errno::ECONNREFUSED: Connection refused - connect(2) for x.y.165.50:22>)
       Waiting for SSH service on x.y.165.50:22, retrying in 3 seconds
D      [SSH] opening connection to root@x.y.165.50<{:user_known_hosts_file=>"/dev/null", :paranoid=>false, :port=>"22", :compression=>false, :compression_level=>0, :keepalive=>true, :keepalive_interval=>60, :timeout=>15, :user=>"root"}>
Text will be echoed in the clear. Please install the HighLine or Termios libraries to suppress echoed text.
root@x.y.165.50's password:^CD      [SSH] shutting previous connection root@x.y.165.50<{:user_known_hosts_file=>"/dev/null", :paranoid=>false, :port=>"22", :compression=>false, :compression_level=>0, :keepalive=>true, :keepalive_interval=>60, :timeout=>15, :user=>"root"}>
$ gem list

*** LOCAL GEMS ***

activesupport (5.0.0.1)
addressable (2.3.8)
artifactory (2.3.3)
axiom-types (0.1.1)
bigdecimal (1.2.8)
bundler-unload (1.0.2)
coercible (1.0.0)
concurrent-ruby (1.0.2)
descendants_tracker (0.0.4)
did_you_mean (1.0.0)
droplet_kit (1.4.3)
equalizer (0.0.11)
executable-hooks (1.3.2)
faraday (0.9.2)
gem-wrappers (1.2.7)
i18n (0.7.0)
ice_nine (0.11.2)
io-console (0.4.5)
json (1.8.3)
kartograph (0.2.4)
kitchen-ansible (0.45.2)
kitchen-digitalocean (0.9.5)
kitchen-lxd_cli (2.0.0)
kitchen-sync (2.1.1)
kitchen-vagrant (0.20.0)
kitchen-verifier-serverspec (0.5.2)
minitest (5.8.3)
mixlib-install (1.1.0)
mixlib-shellout (2.2.6)
mixlib-versioning (1.1.0)
multipart-post (2.0.0)
net-scp (1.2.1)
net-sftp (2.1.2)
net-ssh (3.2.0)
net-ssh-gateway (1.2.0)
net-telnet (0.1.1)
power_assert (0.2.6)
psych (2.0.17)
rake (10.4.2)
rdoc (4.2.1)
resource_kit (0.1.5)
rubygems-bundler (1.4.4)
rvm (1.11.3.9)
safe_yaml (1.0.4)
test-kitchen (1.11.0)
test-unit (3.1.5)
thor (0.19.1)
thread_safe (0.3.5)
tzinfo (1.2.2)                     
virtus (1.0.5)

@juju4 make sure to update to test-kitchen 1.11.1, which fixes a bug where 1.11.0 may use password auth when SSH keys are intended.

Still the same:

$ gem update test-kitchen
$ KITCHEN_YAML=".kitchen.digitalocean.yml" kitchen verify default-ubuntu-1604 -l debug
-----> Starting Kitchen (v1.11.1)
-----> Creating <default-ubuntu-1604>...
D      digitalocean:name defaultubuntu16-xxx-xxx-zahqu9t
D      digitalocean:imageubuntu-16-04-x64
D      digitalocean:size 512mb
D      digitalocean:region nyc1
D      digitalocean:ssh_key_ids SSH_KEYS_ID
D      digitalocean:private_networking true
D      digitalocean:ipv6 false
D      digitalocean:user_data 
D      digitalocean_api_key API_KEY
       Digital Ocean instance <23419166> created.
D      digitalocean_api_key API_KEY
D      digitalocean_api_key API_KEY
D      [SSH] opening connection to root@x.y.129.129<{:user_known_hosts_file=>"/dev/null", :paranoid=>false, :port=>"22", :compression=>false, :compression_level=>0, :keepalive=>true, :keepalive_interval=>60, :timeout=>15}>
D      [SSH] connection failed (#<Net::SSH::ConnectionTimeout: Net::SSH::ConnectionTimeout>)
       Waiting for SSH service on x.y.129.129:22, retrying in 3 seconds
D      [SSH] opening connection to root@x.y.129.129<{:user_known_hosts_file=>"/dev/null", :paranoid=>false, :port=>"22", :compression=>false, :compression_level=>0, :keepalive=>true, :keepalive_interval=>60, :timeout=>15, :user=>"root"}>
D      [SSH] connection failed (#<Errno::ECONNREFUSED: Connection refused - connect(2) for x.y.129.129:22>)
       Waiting for SSH service on x.y.129.129:22, retrying in 3 seconds
D      [SSH] opening connection to root@x.y.129.129<{:user_known_hosts_file=>"/dev/null", :paranoid=>false, :port=>"22", :compression=>false, :compression_level=>0, :keepalive=>true, :keepalive_interval=>60, :timeout=>15, :user=>"root"}>
Text will be echoed in the clear. Please install the HighLine or Termios libraries to suppress echoed text.
root@x.y.129.129's password:^CD      [SSH] shutting previous connection root@x.y.129.129<{:user_known_hosts_file=>"/dev/null", :paranoid=>false, :port=>"22", :compression=>false, :compression_level=>0, :keepalive=>true, :keepalive_interval=>60, :timeout=>15, :user=>"root"}>

Do you set a password in your kitchen.yml? If so, try removing it.

Nope

$ egrep -v '(^$|^#)' .kitchen.digitalocean.yml 
---
driver:
  name: digitalocean
driver_config:
  region: nyc1
  size: 512mb
transport:
  name: sftp
provisioner:
  name: ansible_playbook
  roles_path: ../
  hosts: test-kitchen
  ansible_verbose: false
  ansible_verbosity: 3
  ansible_extra_flags: <%= ENV['ANSIBLE_EXTRA_FLAGS'] %>
  ansible_yum_repo: http://dl.fedoraproject.org/pub/epel/7/x86_64/e/epel-release-7-8.noarch.rpm
  require_ansible_omnibus: true
  enable_yum_epel: true
  ansible_connection: ssh
platforms:
  - name: ubuntu-16.04
    driver:
      name: digitalocean
      image: ubuntu-16-04-x64
  - name: ubuntu-14.04
    driver:
      name: digitalocean
      image: ubuntu-14-04-x64
  - name: ubuntu-12.04
    driver:
      name: digitalocean
      image: ubuntu-12-04-x64
  - name: centos-7
    driver:
      name: digitalocean
      image: centos-7-2-x64
  - name: centos-6
    driver:
      name: digitalocean
      image: centos-6-8-x64
suites:
  - name: default
    run_list:
    attributes:

It happens with both transport: sftp and the default transport.

What happens if you mv .ssh/config .ssh/config.backup and run it?

Hello @gregf

The instance got created, but Kitchen could not connect. Could it be an SSH key issue?

$ KITCHEN_YAML=".kitchen.digitalocean.yml" kitchen verify default-ubuntu-1604 -l debug
-----> Starting Kitchen (v1.13.0)
-----> Creating <default-ubuntu-1604>...
D      digitalocean:name defaultubuntu16-aaa-bbb-t9p4kdn
D      digitalocean:imageubuntu-16-04-x64
D      digitalocean:size 512mb
D      digitalocean:region nyc1
D      digitalocean:ssh_key_ids 12345, 6789
D      digitalocean:private_networking true
D      digitalocean:ipv6 false
D      digitalocean:user_data 
D      digitalocean_api_key xxxxxx
       Digital Ocean instance <30410332> created.
D      digitalocean_api_key xxxxxx
D      digitalocean_api_key xxxxxx
D      [SSH] opening connection to root@67.205.143.17<{:user_known_hosts_file=>"/dev/null", :paranoid=>false, :port=>"22", :compression=>false, :compression_level=>0, :keepalive=>true, :keepalive_interval=>60, :timeout=>15}>
D      [SSH] connection failed (#<Net::SSH::ConnectionTimeout: Net::SSH::ConnectionTimeout>)
       Waiting for SSH service on 67.205.143.17:22, retrying in 3 seconds
D      [SSH] opening connection to root@67.205.143.17<{:user_known_hosts_file=>"/dev/null", :paranoid=>false, :port=>"22", :compression=>false, :compression_level=>0, :keepalive=>true, :keepalive_interval=>60, :timeout=>15, :user=>"root"}>
D      [SSH] connection failed (#<Errno::ECONNREFUSED: Connection refused - connect(2) for 67.205.143.17:22>)
       Waiting for SSH service on 67.205.143.17:22, retrying in 3 seconds
D      [SSH] opening connection to root@67.205.143.17<{:user_known_hosts_file=>"/dev/null", :paranoid=>false, :port=>"22", :compression=>false, :compression_level=>0, :keepalive=>true, :keepalive_interval=>60, :timeout=>15, :user=>"root"}>
D      [SSH] connection failed (#<Errno::ECONNREFUSED: Connection refused - connect(2) for 67.205.143.17:22>)
       Waiting for SSH service on 67.205.143.17:22, retrying in 3 seconds
D      [SSH] opening connection to root@67.205.143.17<{:user_known_hosts_file=>"/dev/null", :paranoid=>false, :port=>"22", :compression=>false, :compression_level=>0, :keepalive=>true, :keepalive_interval=>60, :timeout=>15, :user=>"root"}>
D      [SSH] connection failed (#<Errno::ECONNREFUSED: Connection refused - connect(2) for 67.205.143.17:22>)
       Waiting for SSH service on 67.205.143.17:22, retrying in 3 seconds
D      [SSH] opening connection to root@67.205.143.17<{:user_known_hosts_file=>"/dev/null", :paranoid=>false, :port=>"22", :compression=>false, :compression_level=>0, :keepalive=>true, :keepalive_interval=>60, :timeout=>15, :user=>"root"}>
D      [SSH] connection failed (#<Errno::ECONNREFUSED: Connection refused - connect(2) for 67.205.143.17:22>)
       Waiting for SSH service on 67.205.143.17:22, retrying in 3 seconds
^CD      [SSH] shutting previous connection root@67.205.143.17<{:user_known_hosts_file=>"/dev/null", :paranoid=>false, :port=>"22", :compression=>false, :compression_level=>0, :keepalive=>true, :keepalive_interval=>60, :timeout=>15, :user=>"root"}>

I'm exporting the SSH key IDs via DIGITALOCEAN_SSH_KEY_IDS.

@gregf, I just tried that, but it didn't help; I ended up with the same problem.

I made a few more attempts after mv ~/.ssh/config ~/.ssh/config.backup, with the following results:

  • password prompt (as mentioned initially, ssh root@IP works flawlessly even though test-kitchen asks for a password)
  • successful login 2 times in a row
  • password prompt again

EDIT:

Interesting observation: when that prompt pops up and I press ENTER, Test Kitchen tries to log in again, and this time it succeeds:

...
D      [SSH] connection failed (#<Errno::ECONNREFUSED: Connection refused - connect(2) for 138.68.x.y:22>)
       Waiting for SSH service on 138.68.x.y:22, retrying in 3 seconds
D      [SSH] opening connection to root@138.68.x.y<{:user_known_hosts_file=>"/dev/null", :paranoid=>false, :port=>"22", :compression=>false, :compression_level=>0, :keepalive=>true, :keepalive_interval=>60, :timeout=>15, :user=>"root"}>
root@138.68.x.y's password:
root@138.68.x.y's password:
root@138.68.x.y's password:
D      [SSH] connection failed (#<Net::SSH::AuthenticationFailed: Authentication failed for user root@138.68.x.y>)
       Waiting for SSH service on 138.68.x.y:22, retrying in 3 seconds
D      [SSH] opening connection to root@138.68.x.y<{:user_known_hosts_file=>"/dev/null", :paranoid=>false, :port=>"22", :compression=>false, :compression_level=>0, :keepalive=>true, :keepalive_interval=>60, :timeout=>15, :user=>"root"}>
D      [SSH] root@138.68.x.y<{:user_known_hosts_file=>"/dev/null", :paranoid=>false, :port=>"22", :compression=>false, :compression_level=>0, :keepalive=>true, :keepalive_interval=>60, :timeout=>15, :user=>"root"}> (echo '[SSH] Established')
       [SSH] Established
       (ssh ready)

D      digitalocean:create 138.68.x.y
       Finished creating <server-centos-7-0-x64-chef-12> (0m50.24s).
-----> Converging <server-centos-7-0-x64-chef-12>...
$$$$$$ Running legacy converge for 'Digitalocean' Driver
...
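
The retry behaviour visible in the log above (a failed attempt, a short pause, then another attempt) can be sketched roughly as follows. This is only an illustration in plain Ruby: `wait_for_ssh` and `AuthFailed` are hypothetical names, the probe is simulated, and none of it is Test Kitchen's actual implementation.

```ruby
# Hypothetical stand-in for Net::SSH::AuthenticationFailed, so the sketch
# runs without the net-ssh gem.
class AuthFailed < StandardError; end

# Call the block until it returns without raising one of the retryable
# errors, or until max_attempts is reached (then the last error propagates).
def wait_for_ssh(max_attempts: 5, delay: 0, errors: [AuthFailed, Errno::ECONNREFUSED])
  attempts = 0
  begin
    attempts += 1
    yield(attempts)
  rescue *errors
    raise if attempts >= max_attempts
    sleep(delay)
    retry
  end
end

# Simulated droplet: refuses the connection first, then rejects the key
# (for example because it is not installed yet), then accepts the login.
result = wait_for_ssh(max_attempts: 4) do |attempt|
  raise Errno::ECONNREFUSED if attempt == 1
  raise AuthFailed if attempt == 2
  "established on attempt #{attempt}"
end
puts result
```

With a loop like this, an authentication failure caused by a key that is not yet in place looks the same as a refused connection: the next attempt may simply succeed.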

It's been a while, but it seems I've just managed to partially fix this problem.

Test Kitchen v1.16.0 introduced a feature that allows you to enforce SSH key authentication only. Someone else encountered exactly the same problem:

When bootstrapping with cloud-init, there is a small window of time where the public ssh key will not be present, this will cause a login failure and eventually fallback to a password prompt.

Full PR details: test-kitchen/test-kitchen#1141

The feature itself is not documented, but I found this in the source code.

All in all, when the following snippet is present in the .kitchen.yml file:

transport:
  ssh_key_only: true

the password prompt is gone.
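
For context, at the Net::SSH level a key-only setting plausibly boils down to restricting `:auth_methods` to `publickey`, so the client has nothing to fall back to. A minimal sketch under that assumption; `ssh_options_for` is a hypothetical helper, not Test Kitchen's code, and it only builds the options hash without opening a connection:

```ruby
# Hypothetical helper: turn a transport config hash into options that could
# be passed to Net::SSH.start. Only builds the hash; no connection is opened.
def ssh_options_for(config)
  opts = {
    port: config.fetch(:port, 22),
    timeout: config.fetch(:timeout, 15),
  }
  opts[:keys] = Array(config[:ssh_key]) if config[:ssh_key]
  # Assumption: key-only mode means offering public key auth and nothing else,
  # so a rejected key ends in an error instead of a password prompt.
  opts[:auth_methods] = ["publickey"] if config[:ssh_key_only]
  opts
end

p ssh_options_for(ssh_key_only: true, ssh_key: "~/.ssh/id_rsa")
```

Without `ssh_key_only`, the hash carries no `:auth_methods` entry, which leaves the client free to try password authentication once the key is rejected.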

Unfortunately, it didn't solve the problem completely, as one of my droplets got stuck in a way that indicates something went wrong with cloud-init during droplet creation:

Waiting for SSH service on 207.154.A.B:22, retrying in 3 seconds
Waiting for SSH service on 207.154.A.B:22, retrying in 3 seconds
Waiting for SSH service on 207.154.A.B:22, retrying in 3 seconds
Waiting for SSH service on 207.154.A.B:22, retrying in 3 seconds
Waiting for SSH service on 207.154.A.B:22, retrying in 3 seconds
Waiting for SSH service on 207.154.A.B:22, retrying in 3 seconds
Waiting for SSH service on 207.154.A.B:22, retrying in 3 seconds
Waiting for SSH service on 207.154.A.B:22, retrying in 3 seconds
Waiting for SSH service on 207.154.A.B:22, retrying in 3 seconds
...
Waiting for SSH service on 207.154.A.B:22, retrying in 3 seconds
Waiting for SSH service on 207.154.A.B:22, retrying in 3 seconds
Waiting for SSH service on 207.154.A.B:22, retrying in 3 seconds
Waiting for SSH service on 207.154.A.B:22, retrying in 3 seconds
Waiting for SSH service on 207.154.A.B:22, retrying in 3 seconds
Waiting for SSH service on 207.154.A.B:22, retrying in 3 seconds
Waiting for SSH service on 207.154.A.B:22, retrying in 3 seconds
Waiting for SSH service on 207.154.A.B:22, retrying in 3 seconds
$$$$$$ [SSH] connection failed, terminating (#<Net::SSH::AuthenticationFailed: Authentication failed for user root@207.154.A.B>)

I tried to log in by hand, but it keeps failing:

$ ssh 207.154.A.B -o StrictHostKeyChecking=no
root@207.154.A.B's password:

$ ssh 207.154.A.B -o StrictHostKeyChecking=no -o PreferredAuthentications=publickey
Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).

Can the kitchen-digitalocean driver enforce the ssh_key_only transport setting globally? If not, it should at least be mentioned in the README, or a warning should be displayed when it's not turned on.