One of the projects I'm working on is moving certain puppet-applied ulimit settings away from "that sounds about right" to values dynamically allocated based on the environment. This is for single-application environments, so I'm mostly worried about keeping the application from being starved of resources while leaving the kernel and utility space enough handles and whatnot to do what they need to do.

We get persistent requests from app-teams for "moar file handles!", so I'm attempting to find a way to deal with that. So I made a puppet fact:

Facter.add('app2_nofile') do
  confine :kernel => 'Linux'
  setcode do
    # Kernel-wide maximum number of file handles.
    kernel_nofile = File.read('/proc/sys/fs/file-max').to_i
    # Give the application 85% of that, leaving 15% for everything else.
    (kernel_nofile * 0.85).round
  end
end

Which does what it says on the tin. It takes the kernel value defined in /proc/sys/fs/file-max and takes 85% of it, leaving 15% for system usage. Set a soft and hard nofile ulimit using this ::app2_nofile fact in another puppet resource so /etc/security/limits.conf is updated, and tada! Simple! If they want more file-handles, they'll have to be smarter about writing the app.
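
For illustration only (the username and the computed value here are made up, not taken from the real manifest), the limits.conf entries that other resource manages end up looking like:

# Managed by puppet -- illustrative values
app2    soft    nofile    677932
app2    hard    nofile    677932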

Except, it didn't work. When attempting to open a user session (su app2_user -) as the user with that nofile ulimit, we get the error message:

Could not open session

Which is bad.

Clearly, there is an upper-bound somewhere independent of simple ulimits. Or maybe I'm misunderstanding how they fundamentally work. How do nofile limits interact with each other, and what would cause the session to fail to be created?


Further testing suggests that the upper-bound may be a static boundary, or at least something more complicated than a simple percentage. A small-RAM system with a file-max of 797,567 can have this ulimit set very high and I get no reproduction. On a larger system with a file-max of 1,619,938, I can set that ulimit to about 63% of file-max before I get "could not open session." I don't have anything larger to test with right now to see whether that percentage moves with bigger RAM.
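
Running the numbers (my own back-of-the-envelope arithmetic, not additional testing) backs up the static-boundary idea, since the failing values on the larger box are absolute numbers the smaller box can't even reach:

file_max_small = 797_567
file_max_large = 1_619_938

(file_max_small * 0.85).round   # => 677932   -- what the fact sets; never reproduces
(file_max_large * 0.63).round   # => 1020561  -- still works
(file_max_large * 0.85).round   # => 1376947  -- what the fact sets; fails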

I do get an audit.log entry:

type=USER_START msg=audit(1416420909.479:511331): user pid=5022 uid=0 auid=1194876420 ses=44826 
subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 msg='op=PAM:session_open 
acct="app2" exe="/bin/su" hostname=? addr=? terminal=pts/0 res=failed'

The op field shows the failure happened during PAM session setup (op=PAM:session_open).


Solution 1:

This appears to be a feature of PAM:

https://bugzilla.redhat.com/show_bug.cgi?id=485955

While not definitive (reading the source would be the place to go to confirm it), it is strongly suggestive that PAM is enforcing a ceiling of some kind on certain resources. The break came when I ran strace against the su command to see what it was trying to do that was getting denied, and saw this line:

setrlimit(RLIMIT_NOFILE, {rlim_cur=1049000, rlim_max=1049000}) = -1 EPERM (Operation not permitted)

Nothing is logged in audit.log other than the PAM failure, and syslog doesn't show anything; it's just this one failed syscall.
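
One possibility worth checking, and this is speculation on my part rather than something confirmed on these boxes: the setrlimit(2) man page says that raising the hard RLIMIT_NOFILE above the kernel's fs.nr_open sysctl fails with EPERM, and the default nr_open of 1,048,576 sits just under the 1,049,000 in that strace line. A quick check along those lines:

# Speculative: compare the computed limit against fs.nr_open, which
# setrlimit(2) describes as the upper bound for RLIMIT_NOFILE.
nr_open  = File.read('/proc/sys/fs/nr_open').to_i    # defaults to 1048576
file_max = File.read('/proc/sys/fs/file-max').to_i

wanted = (file_max * 0.85).round
puts "requested #{wanted}, kernel ceiling #{nr_open}"
puts 'setrlimit would fail with EPERM' if wanted > nr_open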

For my purposes, I'll rewrite that fact to take the lower of a static value or 85% of the kernel's file-max. I need to do more testing to figure out what that static value will be, but it seems this hybrid method will be better supported by the tooling.
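
A minimal sketch of that hybrid fact, with the static ceiling as a placeholder I haven't settled on yet:

Facter.add('app2_nofile') do
  confine :kernel => 'Linux'
  setcode do
    # 85% of the kernel-wide handle limit, as before.
    dynamic_limit = (File.read('/proc/sys/fs/file-max').to_i * 0.85).round
    # Static ceiling -- placeholder value, still to be determined by testing.
    static_limit = 1_000_000
    # Use whichever is lower.
    [dynamic_limit, static_limit].min
  end
end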