Retry SSH connection on Errno::ECONNABORTED
In some cases the SSH connection may be aborted while waiting
for setup. This change adds aborted connections (Errno::ECONNABORTED)
to the list of exceptions that are retried while waiting for the
connection to become available.

Fixes hashicorp#8520
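
The retry behavior described in the message can be sketched roughly as follows. This is a minimal illustration, not Vagrant's actual `retryable` implementation: `with_retries` and `RETRY_EXCEPTIONS` are hypothetical names, and the list is shortened (it leaves out `Net::SSH::Disconnect` so the sketch runs without the net-ssh gem).

```ruby
require "timeout"

# Hypothetical, shortened stand-in for the list of retryable errors.
RETRY_EXCEPTIONS = [
  Errno::ECONNABORTED,
  Errno::ECONNREFUSED,
  Errno::ECONNRESET,
  Timeout::Error
].freeze

# Hypothetical helper: runs the block and retries it when one of the
# exceptions in `on` is raised, up to `tries` attempts in total.
# Once the attempts are used up, the last exception is re-raised.
def with_retries(tries:, on:, sleep_for: 0)
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue *on
    raise if attempts >= tries
    sleep(sleep_for)
    retry
  end
end

# Usage: the first two attempts are aborted, the third one succeeds.
result = with_retries(tries: 5, on: RETRY_EXCEPTIONS) do |attempt|
  raise Errno::ECONNABORTED if attempt < 3
  "connected on attempt #{attempt}"
end
puts result
```

Before this change, an `Errno::ECONNABORTED` would not have matched the list, so it would have propagated on the first failure instead of being retried.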
chrisroberts committed Apr 25, 2017
1 parent 5fa23c4 commit 2acded1
Showing 3 changed files with 32 additions and 15 deletions.
4 changes: 4 additions & 0 deletions lib/vagrant/errors.rb
@@ -664,6 +664,10 @@ class SSHConnectionRefused < VagrantError
   error_key(:ssh_connection_refused)
 end
 
+class SSHConnectionAborted < VagrantError
+  error_key(:ssh_connection_aborted)
+end
+
 class SSHConnectionReset < VagrantError
   error_key(:ssh_connection_reset)
 end
36 changes: 21 additions & 15 deletions plugins/communicators/ssh/communicator.rb
@@ -25,6 +25,20 @@ class Communicator < Vagrant.plugin("2", :communicator)
 PTY_DELIM_END = "bccbb768c119429488cfd109aacea6b5-pty"
 # Marker for start of regular command output
 CMD_GARBAGE_MARKER = "41e57d38-b4f7-4e46-9c38-13873d338b86-vagrant-ssh"
+# These are the exceptions that we retry because they represent
+# errors that are generally fixed from a retry and don't
+# necessarily represent immediate failure cases.
+SSH_RETRY_EXCEPTIONS = [
+  Errno::EACCES,
+  Errno::EADDRINUSE,
+  Errno::ECONNABORTED,
+  Errno::ECONNREFUSED,
+  Errno::ECONNRESET,
+  Errno::ENETUNREACH,
+  Errno::EHOSTUNREACH,
+  Net::SSH::Disconnect,
+  Timeout::Error
+]
 
 include Vagrant::Util::ANSIEscapeCodeRemover
 include Vagrant::Util::Retryable
@@ -81,6 +95,8 @@ def wait_for_ready(timeout)
   message = "Connection refused."
 rescue Vagrant::Errors::SSHConnectionReset
   message = "Connection reset."
+rescue Vagrant::Errors::SSHConnectionAborted
+  message = "Connection aborted."
 rescue Vagrant::Errors::SSHHostDown
   message = "Host appears down."
 rescue Vagrant::Errors::SSHNoRoute
@@ -350,24 +366,10 @@ def connect(**opts)
 # Connect to SSH, giving it a few tries
 connection = nil
 begin
-  # These are the exceptions that we retry because they represent
-  # errors that are generally fixed from a retry and don't
-  # necessarily represent immediate failure cases.
-  exceptions = [
-    Errno::EACCES,
-    Errno::EADDRINUSE,
-    Errno::ECONNREFUSED,
-    Errno::ECONNRESET,
-    Errno::ENETUNREACH,
-    Errno::EHOSTUNREACH,
-    Net::SSH::Disconnect,
-    Timeout::Error
-  ]
-
   timeout = 60
 
   @logger.info("Attempting SSH connection...")
-  connection = retryable(tries: opts[:retries], on: exceptions) do
+  connection = retryable(tries: opts[:retries], on: SSH_RETRY_EXCEPTIONS) do
     Timeout.timeout(timeout) do
       begin
         # This logger will get the Net-SSH log data for us.
@@ -426,6 +428,10 @@ def connect(**opts)
   # This is raised if we failed to connect the max number of times
   # due to an ECONNRESET.
   raise Vagrant::Errors::SSHConnectionReset
+rescue Errno::ECONNABORTED
+  # This is raised if we failed to connect the max number of times
+  # due to an ECONNABORTED
+  raise Vagrant::Errors::SSHConnectionAborted
 rescue Errno::EHOSTDOWN
   # This is raised if we get an ICMP DestinationUnknown error.
   raise Vagrant::Errors::SSHHostDown
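
The two communicator changes work as a pair: `connect` translates the low-level `Errno::ECONNABORTED` into a Vagrant error once retries are exhausted, and `wait_for_ready` turns that error into a short status message. A minimal sketch of that pattern, assuming hypothetical stand-in classes (`ConnectionAborted`, `ConnectionReset`) and helper names (`connect_with_translation`, `status_message`) rather than Vagrant's real ones:

```ruby
# Hypothetical stand-ins for Vagrant::Errors::SSHConnectionAborted and
# Vagrant::Errors::SSHConnectionReset.
class ConnectionError < StandardError; end
class ConnectionAborted < ConnectionError; end
class ConnectionReset < ConnectionError; end

# Hypothetical helper: runs the connection attempt and maps low-level
# socket errors onto domain-specific errors.
def connect_with_translation
  yield
rescue Errno::ECONNRESET
  raise ConnectionReset
rescue Errno::ECONNABORTED
  raise ConnectionAborted
end

# Hypothetical helper: reports a short message for each domain error,
# similar in spirit to the rescue chain in wait_for_ready.
def status_message(&block)
  connect_with_translation(&block)
  "Connected."
rescue ConnectionReset
  "Connection reset."
rescue ConnectionAborted
  "Connection aborted."
end

puts status_message { raise Errno::ECONNABORTED }
puts status_message { :ok }
```

Without the dedicated rescue branch, the raw `Errno::ECONNABORTED` would escape untranslated, so the caller could not report a specific message for it.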
7 changes: 7 additions & 0 deletions templates/locales/en.yml
@@ -1189,6 +1189,13 @@ en:
   If that doesn't work, destroy your VM and recreate it with a `vagrant destroy`
   followed by a `vagrant up`. If that doesn't work, contact a Vagrant
   maintainer (support channels listed on the website) for more assistance.
+ssh_connection_aborted: |-
+  SSH connection was aborted! This usually happens when the machine is taking
+  too long to reboot or the SSH daemon is not properly configured on the VM.
+  First, try reloading your machine with `vagrant reload`, since a simple
+  restart sometimes fixes things. If that doesn't work, destroy your machine
+  and recreate it with a `vagrant destroy` followed by a `vagrant up`. If that
+  doesn't work, contact support.
 ssh_connection_reset: |-
   SSH connection was reset! This usually happens when the machine is
   taking too long to reboot. First, try reloading your machine with
