Azure VM: Connection refused by host
I recently stopped a subscription with 14 VMs in it and restarted it a few days later. Now all my VMs are working fine except for the 6 used for MongoDB.
They respond to ping, so they show as online in the Azure dashboard, but they do not answer anything else.
I tried the following (from different locations, both inside and outside the Azure cloud):
- ssh : connect to host * port *: Connection refused
- telnet : Unable to connect to remote host: Connection refused
- mongo : exception: connect failed
The ports for SSH and Mongo are open in Azure. I tried restarting the VMs a few times through the Azure dashboard; they seem to restart successfully but still refuse all connections.
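A raw TCP probe confirms the refusal is an active reset from the host rather than a firewall silently dropping packets (a drop would time out instead of refusing). Something along these lines reproduces it, where the hostname and ports are placeholders for the values redacted above:
# -z: scan without sending data, -v: report the outcome
nc -zv myvm.cloudapp.net 22
nc -zv myvm.cloudapp.net 27017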
I have already looked for similar issues, and the best solution I found was to wait... the issue has been going on for 7 days now, so waiting is no longer an option.
Solution 1:
There is a better and faster solution than downloading/uploading the VHD. As mentioned, the problem "was recognized to be a missing newline in the sshd_config file!"
1. First, create a new VM in the same geographic region as the troubled VM, so that the Cloud Service keeps at least one running VM and your current IP is retained. (This first step differs from the source answer linked below: if you delete the last VM in a Cloud Service you can lose its IP, and on a web server that is a serious problem.)
2. Delete the troubled VM, but do not delete the associated disk. (You must delete the troubled VM to release the lock on the associated disk, which we will use later.) Note the name of the associated disk (AD).
3. Select the new VM -> Dashboard -> Attach -> "Attach Disk", and choose the correct AD name in the popup that opens. (The "Attach Disk" option only appears once the associated disk of the troubled VM has been freed up.)
4. Now SSH into the new VM and mount AD:
sudo mkdir /tmp/dsk
sudo mount /dev/sdc1 /tmp/dsk
(AD will typically be /dev/sdc1; if it isn't in your case, you can find it with sudo cat /var/log/syslog | grep scsi, looking for the device name preceding the message "Attached SCSI disk".)
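If grepping syslog feels roundabout, listing the block devices works too; this assumes lsblk is present on the gallery image (it usually is on recent Ubuntu images):
# The attached data disk shows up as an extra unmounted disk (typically sdc)
sudo lsblk -o NAME,SIZE,TYPE,MOUNTPOINT
# Or search the kernel log for the attach event directly
sudo grep "Attached SCSI disk" /var/log/syslog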
5. Edit the SSH daemon config on the mounted disk:
sudo nano /tmp/dsk/etc/ssh/sshd_config
In our case, at the end of the file, we had the line "UsePAM yesClientAliveInterval 180" - clearly the upgrade had erroneously removed a newline after "yes"! So we inserted the newline, saved the file, and continued with the steps below once sshd_config had been restored.
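To make the fix concrete, this is the shape of the corruption and of the repair (the option values are the ones from our file; yours may differ):
# Broken: the upgrade fused two directives onto one line, which sshd cannot parse
UsePAM yesClientAliveInterval 180
# Fixed: restore the newline so each directive stands on its own line
UsePAM yes
ClientAliveInterval 180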
(You may also want to diff /tmp/dsk/etc/ssh/sshd_config against the new VM's /etc/ssh/sshd_config to see if any other settings are off.)
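A sketch of that check, plus a syntax test: sshd's -t (test mode) and -f (alternate config file) flags let it validate the repaired file without restarting anything, assuming sshd lives at the usual /usr/sbin/sshd:
# Compare the repaired config against the known-good one on the new VM
diff /tmp/dsk/etc/ssh/sshd_config /etc/ssh/sshd_config
# Ask sshd to parse the repaired file and report any syntax errors
sudo /usr/sbin/sshd -t -f /tmp/dsk/etc/ssh/sshd_config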
6. Back in the Azure dashboard: select the new VM -> Dashboard -> Detach Disk, and detach AD.
7. Fire up a new VM based on the troubled VM's disk: + New -> Compute -> Virtual Machine -> From Gallery -> My Disks -> AD (AD should show up here).
That's it - hope SSH works now!
Source answer at the bottom of: http://social.msdn.microsoft.com/Forums/en-US/54c600c0-f4d6-4b20-ad87-1358fa10d27a/linux-vm-ssh-connection-refused?forum=WAVirtualMachinesforWindows
Solution 2:
A lot of Azure installations were hit hard by this bug in Ubuntu, which would lock people out of their VMs due to a faulty config. The bug would surface if you started the VMs from a halted state, or restarted them after running apt-get upgrade.
The fix for the Azure installations was to download the VHD, mount it as a disk, open the /etc/ssh/sshd_config file inside it and fix the faulty line, then re-upload the VHD.
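If you do go the download route, here is a minimal sketch of mounting the VHD locally on Linux with qemu-nbd; it assumes the qemu-utils package is installed, that the VHD is named troubled-vm.vhd (a placeholder), and that the root filesystem is the first partition:
sudo modprobe nbd max_part=8                                   # load the network block device driver
sudo qemu-nbd --format=vpc --connect=/dev/nbd0 troubled-vm.vhd # expose the VHD as /dev/nbd0
sudo mount /dev/nbd0p1 /mnt                                    # mount the first partition
sudo nano /mnt/etc/ssh/sshd_config                             # repair the faulty line
sudo umount /mnt                                               # unmount and disconnect before re-uploading
sudo qemu-nbd --disconnect /dev/nbd0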