SSH Hangs. error: openpty: No such file or directory error: session_pty_req: session 0 alloc failed
One of our Ubuntu 14.04 production servers stopped accepting SSH connections. When we try to log in, we get the SSH banner text, but then the session just hangs. When we log in through the management console, we see the following error messages in /var/log/auth.log:
Oct 4 17:37:20 servername sshd[10975]: error: Could not load host key: /etc/ssh/ssh_host_ed25519_key
Oct 4 17:37:21 servername sshd[10975]: Accepted publickey for username from 10.0.0.1 port 57230 ssh2: RSA xx:xx:xx:xx
Oct 4 17:37:21 servername sshd[10975]: pam_unix(sshd:session): session opened for user username by (uid=0)
Oct 4 17:37:25 servername sshd[10975]: error: openpty: No such file or directory
Oct 4 17:37:25 servername sshd[6869]: error: session_pty_req: session 0 alloc failed
Using cat /proc/mounts | grep devpts; ls -hal /dev/{pts,ptmx}
I can verify that devpts is mounted and that the device nodes have the correct permissions, and df shows there aren't any disk or inode issues:
devpts /dev/pts devpts rw,nosuid,noexec,relatime,mode=600,ptmxmode=000 0 0
crw-rw-rw- 1 root tty 5, 2 Oct 4 17:01 /dev/ptmx
/dev/pts:
total 0
drwxr-xr-x 2 root root 0 Aug 14 00:52 .
drwxr-xr-x 17 root root 4.3K Oct 4 17:01 ..
crw--w---- 1 root tty 136, 18 Oct 4 17:41 18
crw--w---- 1 root tty 136, 24 Oct 1 13:57 24
crw--w---- 1 root tty 136, 3 Oct 4 17:39 3
crw--w---- 1 root tty 136, 30 Oct 4 11:29 30
c--------- 1 root root 5, 2 Aug 14 00:52 ptmx
df -h
Filesystem      Size  Used Avail Use% Mounted on
udev            252G  4.0K  252G   1% /dev
tmpfs            51G   53M   51G   1% /run
/dev/sdi2       220G   13G  197G   6% /
none            4.0K     0  4.0K   0% /sys/fs/cgroup
none            5.0M     0  5.0M   0% /run/lock
none            252G   12K  252G   1% /run/shm
none            100M     0  100M   0% /run/user
/dev/sdi1        75M   512   75M   1% /boot/efi
/dev/md1        3.5T  282G  3.0T   9% /ssd
df -hi
Filesystem     Inodes IUsed IFree IUse% Mounted on
udev              63M   526   63M    1% /dev
tmpfs             63M   725   63M    1% /run
/dev/sdi2         14M  171K   14M    2% /
none              63M     2   63M    1% /sys/fs/cgroup
none              63M     1   63M    1% /run/lock
none              63M     4   63M    1% /run/shm
none              63M     4   63M    1% /run/user
/dev/sdi1           0     0     0     - /boot/efi
/dev/md1         224M    46  224M    1% /ssd
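The devpts mount check above can also be scripted, which helps when comparing several servers. This is only a sketch: devpts_mounted is a hypothetical helper name, and the optional file argument exists only so the check can be run against a saved copy of the mount table.

```shell
# Hypothetical helper: succeed (exit 0) if the mount table lists a devpts
# filesystem mounted at /dev/pts, fail (exit 1) otherwise.
# The file argument defaults to the live table in /proc/mounts.
devpts_mounted() {
    awk '$2 == "/dev/pts" && $3 == "devpts" { found = 1 } END { exit !found }' \
        "${1:-/proc/mounts}"
}

# Example: devpts_mounted && echo "devpts is mounted"
```

A passing check only confirms the mount exists; as the rest of this post shows, pty allocation can still fail for other reasons.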
I also verified that sshd_config matches another server's and restarted the ssh service. I believe the devpts filesystem is mounted at startup, but is there any way to resolve the issue without restarting the server?
I see https://access.redhat.com/solutions/67972 has an unverified solution for this issue on Red Hat, but I don't have access to a Red Hat subscription.
I found I could get a non-tty based ssh session to work using:
$ ssh username@servername /bin/bash -i
bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell
username@servername:~$
I think in this case the ioctl error is expected, because I am starting an interactive session on something that doesn't have a tty. Lots of things have issues in this session (TERM env var isn't even set), but I was able to do some basic troubleshooting and found this:
# Count the lines of the process tree that mention badservice (the grep process itself is included)
ps -axfo pid,uname,cmd | grep badservice | wc -l
27917
We found that one of our services had over 27,900 processes running under its username. Comparing this with the good server:
$ salt 'server*' cmd.run 'ps -aux | grep badservice | wc -l'
server.good:
3
server.bad:
27918
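The comparison above greps for a service name that was already suspected. When the culprit is not yet known, counting processes per user would point at it. A minimal sketch, where count_by_user is a hypothetical helper name and not something from the original session:

```shell
# Hypothetical helper: read one user name per line on stdin and print
# "count user" pairs, largest count first.
count_by_user() {
    sort | uniq -c | sort -rn
}

# Example: show the five users that own the most processes.
# "ps -eo user=" prints only the owner column, with no header line.
#   ps -eo user= | count_by_user | head -n 5
```

Because this needs no tty, it should also work inside the non-tty ssh session described earlier.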
This was likely causing some form of resource exhaustion related to ptys. We stopped the bad service, and I killed the remaining processes for that user with sudo pkill -u badservice. After that, SSH started working as expected again!
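The pty exhaustion theory was not confirmed directly. One way to check it would be to compare the number of allocated ptys with the kernel limit, both of which Linux exposes under /proc/sys/kernel/pty. A rough sketch, assuming that is the resource that ran out; pty_usage is a hypothetical helper name:

```shell
# Hypothetical helper: print how many ptys are allocated versus the limit.
# "nr" is the number of ptys currently in use and "max" is the maximum
# the kernel will hand out. The directory argument defaults to the live
# values and is a parameter only so a saved copy can be inspected.
pty_usage() {
    dir="${1:-/proc/sys/kernel/pty}"
    nr=$(cat "$dir/nr") || return 1
    max=$(cat "$dir/max") || return 1
    echo "ptys in use: $nr of $max"
}

# Example: pty_usage
```

If nr is at or near max while logins hang, that would support the theory; if it is far below, the runaway processes were probably exhausting something else.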