autossh does not kill ssh when link down
I have started autossh with a poll time of 30 s:
AUTOSSH_POLL=30 AUTOSSH_LOGLEVEL=7 autossh -M 0 -f -S none -f -N -L localhost:34567:localhost:6543 user1@server1
And it is working fine:
Sep 5 12:26:44 serverA autossh[20935]: check on child 23084
Sep 5 12:26:44 serverA autossh[20935]: set alarm for 30 secs
But if I physically remove the network cable, meaning the tunnel can not be working anymore, autossh does not kill the ssh daemon. Why? I understand that autossh can not do anything if the link is down, but in my opinion it should try to do the following:
- Verify the child ssh process (check on child ...)
- Verify the far-end!!! (a ping-like operation through the tunnel)
- Realize that the tunnel is down
- Stop the ssh process
- Try to create the tunnel again
- Realize that it does not work, and set up a (exponentially increasing?) timer to check again soon
That is why I am running autossh: if something happens to the tunnel (be it a software or hardware problem), it should try to restart it. Instead, it is just waiting for the ssh process to die. Shouldn't it be trying to restart it, even if there is no hope of reestablishing the connection?
What kind of check is autossh doing? Does it just verify that ssh is up and running? Is it not doing any kind of far-end check?
Edit
As requested, I add the relevant part of the ssh config:
# (see http://aaroncrane.co.uk/2008/04/ssh_faster)
# The ServerAliveInterval tells SSH to send a keepalive message every 60 seconds while the connection is open;
# that both helps poor-quality NAT routers understand that the NAT table entry for your connection should
# be kept alive, and helps SSH detect when there’s a network problem between the server and client.
ServerAliveInterval 60
# The ServerAliveCountMax says that after 60 consecutive unanswered keepalive messages, the connection should
# be dropped. At that point, AutoSSH should try to invoke a fresh SSH client. You can tweak those
# specific values if you want, but they seem to work well for me.
ServerAliveCountMax 60
TCPKeepAlive yes
Solution 1:
But if I physically remove the network cable, meaning the tunnel can not be working anymore, autossh does not kill the ssh daemon. Why?
autossh runs on your client machine, so it cannot directly kill the ssh daemon process on the server. However, you can specify a non-zero value for ClientAliveInterval in /etc/ssh/sshd_config on the server (see man sshd_config) and restart the sshd service on the server to apply the config change. Then in the event of a network disconnection, the ssh daemon process will be killed after ClientAliveInterval * ClientAliveCountMax seconds (but not by autossh).
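As a rough sketch of the server-side change described above, the fragment below uses illustrative values that are my own assumption and do not come from the question. With these numbers, sshd would drop an unresponsive client after about 15 * 3 = 45 seconds.

```
# /etc/ssh/sshd_config on the server
# Illustrative values only; tune them to your environment.
ClientAliveInterval 15   # seconds between keepalive requests sent to the client
ClientAliveCountMax 3    # unanswered requests tolerated before disconnecting
```

Remember that sshd only reads this file at startup or reload, so the change takes effect after the service is restarted.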
Now, if you meant to ask "Why doesn't autossh kill the ssh client process?", you have specified -M 0. From the autossh man page: "Setting the monitor port to 0 turns the monitoring function off, and autossh will only restart ssh upon ssh's exit."
Instead of using autossh to monitor the connection, you are waiting for ssh to exit after a timeout of ServerAliveInterval * ServerAliveCountMax seconds. You have requested 60 server-alive checks before ssh exits, with a 60-second interval separating consecutive checks, so you will be waiting an hour before your ssh client exits.
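To make the arithmetic concrete, here is a small shell sketch that multiplies the two values from the config quoted in the question. The variable names are hypothetical and only serve the illustration.

```shell
# Values taken from the ssh config quoted in the question.
server_alive_interval=60    # seconds between keepalive probes
server_alive_count_max=60   # unanswered probes tolerated before ssh exits

# Approximate time until the ssh client gives up on a dead link.
timeout_seconds=$((server_alive_interval * server_alive_count_max))
echo "ssh exits after roughly ${timeout_seconds} s ($((timeout_seconds / 60)) min)"
```

Lowering either value shortens the wait proportionally; for example, an interval of 10 with a count of 3 would give about 30 seconds.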
You should also strongly consider using the ExitOnForwardFailure option on the client side (see man ssh_config), so that ssh will exit if it can't establish a tunnel, and then autossh can try to launch ssh again.
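A minimal client-side sketch, assuming the host is addressed as server1 as in the question's command. The keepalive values are illustrative assumptions, chosen so that ssh exits after about 10 * 3 = 30 seconds without a reply.

```
# ~/.ssh/config on the client
# Illustrative values only; the Host name matches the server1 used in the question.
Host server1
    ExitOnForwardFailure yes   # exit if the requested port forward cannot be set up
    ServerAliveInterval 10     # seconds between keepalive probes
    ServerAliveCountMax 3      # unanswered probes tolerated before ssh exits
```

The same settings can be passed on the command line instead, for example -o ExitOnForwardFailure=yes. With -M 0, autossh relaunches ssh once ssh itself has exited, so shorter keepalive settings translate directly into a faster restart.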