How does systemd put sshd processes in slices?

I'm diagnosing an SSH bastion I manage. This machine has about 5500 SSH connections with port forwarding at any given point in time.

Recently, I ran into an issue where SSH connections were refused because the user slice that holds all these sshd processes ran into the TasksMax limit.

This was new to me, and while diagnosing it I noticed that the user.slice does not hold all sshd processes as I thought it would. Roughly half of them are held by system.slice. At first I thought those might be the root processes, with the user-specific processes (privilege separation) being held by the user.slice. However, this is not the case. It appears to be random.

I did notice the processes held by the user.slice are nicely separated per session, whereas the ones held by system.slice are just held under ssh.service with no further separation.

# systemd-cgls
[...]
│ ├─user-1031.slice
│ │ ├─session-719.scope
│ │ │ ├─5559 sshd: <user> [priv]
│ │ │ └─6224 sshd: <user>
│ │ ├─session-617.scope
│ │ │ ├─4963 sshd: <user> [priv]
│ │ │ └─5392 sshd: <user>
│ │ ├─session-515.scope
│ │ │ ├─3862 sshd: <user> [priv]
│ │ │ └─4693 sshd: <user>
│ │ ├─session-413.scope
│ │ │ ├─3049 sshd: <user> [priv]
│ │ │ └─3988 sshd: <user>
[...]
└─system.slice
  ├─ssh.service
  │ ├─  338 sshd: <user> [priv]
  │ ├─  352 sshd: <user>
  │ ├─  353 sshd: <user>
  │ ├─  358 sshd: <user>
  │ ├─  385 sshd: <user> [priv]
  │ ├─  391 sshd: <user>
  │ ├─  392 sshd: <user>
[...]
  • How does systemd decide to put a process in one slice or the other?
  • Do they get moved?
  • Is there a way to accurately and reliably put all these sessions under the appropriate user.slice so I can manage the limitations set for the number of processes allowed?

OpenSSH privilege separation is implemented with a privileged and unprivileged process per connection.

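Per-user slicing is a feature of systemd-logind.service, driven by pam_systemd. Every sshd process is forked from the daemon, so it starts out in the cgroup of ssh.service; when pam_systemd runs as part of the PAM session stack, it asks logind to create a session scope under the user's slice and the process is moved there. It is unclear to me why you still have a bunch in system.slice. Perhaps those use the PAM stack differently.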
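To see how the processes are actually distributed, you can read each process's cgroup from /proc. A minimal sketch, assuming the per-connection processes are named exactly "sshd"; the function names slice_of and count_sshd_by_slice are mine, not part of any tool:

```shell
#!/bin/sh
# Print the top-level slice of a /proc/<pid>/cgroup line, e.g.
# "0::/user.slice/user-1031.slice/session-719.scope" -> "user.slice".
slice_of() {
    cg=${1##*:}                  # strip everything up to the last colon
    cg=${cg#/}                   # strip the leading slash
    printf '%s\n' "${cg%%/*}"    # keep the first path component
}

# Count sshd processes per top-level slice.
# Assumption: the processes are named exactly "sshd" (newer OpenSSH
# releases may use a different name for per-connection processes).
count_sshd_by_slice() {
    for pid in $(pgrep -x sshd); do
        # Pick the unified (0::) or name=systemd hierarchy line.
        line=$(grep -E '^(0::|[0-9]+:name=systemd:)' "/proc/$pid/cgroup" 2>/dev/null | head -n 1)
        if [ -n "$line" ]; then
            slice_of "$line"
        fi
    done | sort | uniq -c
}
```

Running count_sshd_by_slice as root prints one line per slice with the number of sshd processes in it, which makes it easier to check whether the split changes over time.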

A single user slice for 5500 SSH connections? More than typical for one user, but you can do that.

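I suggest setting pids.max very high, but not infinite, on the user slices: in excess of twice the number of connections you expect, since each connection accounts for two sshd processes. Be aware that UserTasksMax= was removed in systemd 239; on newer versions, set TasksMax= in the [Slice] section of a drop-in under /etc/systemd/system/user-.slice.d/ instead. On older versions, create /etc/systemd/logind.conf.d/local.conf and customize: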

[Login]
UserTasksMax=16000
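The arithmetic behind that number, as a rough sketch. It assumes exactly two sshd processes per connection and nothing else running in the slice; min_tasks and headroom_pct are hypothetical helper names:

```shell
#!/bin/sh
# Rough lower bound on tasks: two sshd processes per connection
# (one privileged, one unprivileged).
min_tasks() {
    echo $(( $1 * 2 ))
}

# Headroom left under a limit, as a whole-number percentage of the limit.
# Usage: headroom_pct CONNECTIONS LIMIT
headroom_pct() {
    need=$(min_tasks "$1")
    echo $(( ($2 - need) * 100 / $2 ))
}

# Example with the numbers from this thread:
#   min_tasks 5500          -> 11000
#   headroom_pct 5500 16000 -> 31
```

So 5500 connections need at least 11000 tasks, and a limit of 16000 leaves about 31% spare for connection spikes and anything else that lands in the slice.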

If ssh.service has more than a couple thousand tasks under it, also consider raising its limit. This time, use the common resource control directives, so the drop-in goes in /etc/systemd/system/ssh.service.d/local.conf

[Service] 
TasksMax=16000
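After changing a limit, it is worth checking how close each unit is to it. TasksCurrent and TasksMax are properties that systemctl show can print; the small parser below is only a sketch, and tasks_usage is a name I made up:

```shell
#!/bin/sh
# Read KEY=VALUE lines on stdin, as printed by
#   systemctl show -p TasksCurrent -p TasksMax UNIT
# and print "current/max (N%)". If either value is not a positive
# number (e.g. "infinity" or "[not set]"), print the raw values only.
tasks_usage() {
    cur=
    max=
    while IFS='=' read -r key value; do
        case $key in
            TasksCurrent) cur=$value ;;
            TasksMax)     max=$value ;;
        esac
    done
    case "$cur:$max" in
        *[!0-9:]*|:*|*:|*:0) echo "${cur:-?}/${max:-?}" ;;
        *) echo "$cur/$max ($(( cur * 100 / max ))%)" ;;
    esac
}
```

Typical use would be `systemctl show -p TasksCurrent -p TasksMax ssh.service | tasks_usage`, and the same for a user slice such as user-1031.slice. A new or edited service drop-in only takes effect after `systemctl daemon-reload`.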