Too many open files (CentOS7) - already tried setting higher limits

This is my first time setting up a VPS – I've tried to do my due diligence and provide context before asking here.

On my remote VPS, almost every command I run in the terminal ends with an Error: Too many open files message, and I need help moving forward.

I run CentOS Linux release 7.6.1810 (Core) on a machine with 1 CPU core and 2048 MB RAM. It has been set up with a LEMP stack (Nginx 1.16.1, PHP-FPM 7.3.9, MariaDB 10.4.8) intended for a simple WordPress site.

I have tried:

  1. Google and forum searches.
  2. Applied the settings below (manually restarting the VPS via the control panel each time):

Per-user nofile limits in /etc/security/limits.conf:

nginx       soft    nofile      1024
nginx       hard    nofile      65536
root        hard    nofile      65536
root        soft    nofile      1024

Adjustments to memory limits and uploads in /etc/php.ini:

memory_limit = 256M
file_uploads = On
upload_max_filesize = 128M
max_execution_time = 600
max_input_time = 600
max_input_vars = 3000

PHP rlimit settings in /etc/php-fpm.d/www.conf:

rlimit_files = 65535

Setting NGINX limits (and other settings) in nginx.conf:

user  nginx;
worker_processes  1;

error_log  /var/log/nginx/error.log warn;
pid        /var/run/nginx.pid;


events {
    worker_connections  10000;
}

worker_rlimit_nofile 100000;


http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;

    sendfile        on;
    #tcp_nopush     on;

    keepalive_timeout  65;
    client_body_buffer_size 128k;
    client_header_buffer_size 10k;
    client_max_body_size 100m;
    large_client_header_buffers 4 256k;

    #gzip  on;

    include /etc/nginx/conf.d/*.conf;
    include /etc/nginx/sites-enabled/*.conf;
    server_names_hash_bucket_size 64;
}
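
As a sanity check (not a fix in itself), I understand the config can be validated without restarting the server using nginx's built-in test, e.g.:

# parse the config without (re)starting the server; -c matches the
# path nginx is started with
nginx -t -c /etc/nginx/nginx.conf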

Here is the output of cat /proc/sys/fs/file-nr:

45216   0   6520154

Here is the output of ps aux|grep nginx|grep -v grep:

root       928  0.0  0.0  46440  1192 ?        Ss   00:25   0:00 nginx: master process /usr/sbin/nginx -c /etc/nginx/nginx.conf
nginx      929  0.0  0.2  50880  6028 ?        S    00:25   0:00 nginx: worker process
nginx     9973  0.0  0.1 171576  4048 ?        S    04:28   0:00 php-fpm: pool www
nginx     9974  0.0  0.1 171576  4048 ?        S    04:28   0:00 php-fpm: pool www
nginx     9975  0.0  0.1 171576  4048 ?        S    04:28   0:00 php-fpm: pool www
nginx     9976  0.0  0.1 171576  4048 ?        S    04:28   0:00 php-fpm: pool www
nginx     9977  0.0  0.1 171576  4052 ?        S    04:28   0:00 php-fpm: pool www

Switching to the nginx user with su - nginx and checking the limits: ulimit -Sn returns 1024 and ulimit -Hn returns 65536.

The lsof | wc -l command returns: 4776
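
If it helps, the effective limits of the already-running daemons can also be read straight from /proc (a check I believe is more reliable than ulimit after su, since /etc/security/limits.conf only applies to PAM login sessions; PIDs taken from the ps output above):

# effective nofile limits of the running daemons (928 = nginx master,
# 929 = worker, 9973 = one of the php-fpm children, per ps above)
grep 'Max open files' /proc/928/limits
grep 'Max open files' /proc/929/limits
grep 'Max open files' /proc/9973/limits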

Hope you can steer me in the right direction to solve this Too many open files problem!

EDIT - the following command shows more info:

service nginx restart

Redirecting to /bin/systemctl restart nginx.service
Error: Too many open files
Job for nginx.service failed because a configured resource limit was exceeded. See "systemctl status nginx.service" and "journalctl -xe" for details.
[root@pars ~]# systemctl status nginx.service
● nginx.service - nginx - high performance web server
   Loaded: loaded (/usr/lib/systemd/system/nginx.service; enabled; vendor preset: disabled)
  Drop-In: /usr/lib/systemd/system/nginx.service.d
           └─worker_files_limit.conf
   Active: failed (Result: resources) since Fri 2019-09-13 05:32:23 CEST; 14s ago
     Docs: http://nginx.org/en/docs/
  Process: 1113 ExecStop=/bin/kill -s TERM $MAINPID (code=exited, status=0/SUCCESS)
  Process: 1125 ExecStart=/usr/sbin/nginx -c /etc/nginx/nginx.conf (code=exited, status=0/SUCCESS)
 Main PID: 870 (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/virtualizor.service/system.slice/nginx.service

Sep 13 05:32:22 pars.work systemd[1]: Starting nginx - high performance web server...
Sep 13 05:32:22 pars.work systemd[1]: PID file /var/run/nginx.pid not readable (yet?) after start.
Sep 13 05:32:22 pars.work systemd[1]: Failed to set a watch for nginx.service's PID file /var/run/nginx.pid: Too many open files
Sep 13 05:32:23 pars.work systemd[1]: Failed to kill control group: Input/output error
Sep 13 05:32:23 pars.work systemd[1]: Failed to kill control group: Input/output error
Sep 13 05:32:23 pars.work systemd[1]: Failed to start nginx - high performance web server.
Sep 13 05:32:23 pars.work systemd[1]: Unit nginx.service entered failed state.
Sep 13 05:32:23 pars.work systemd[1]: nginx.service failed.

Solution 1:

It is not actually open file handles that have run out, but inotify watches.

You can see this in the error message:

Sep 13 05:32:22 pars.work systemd[1]: Failed to set a watch for nginx.service's PID file /var/run/nginx.pid: Too many open files

To solve the problem, you need to raise the limit on inotify watches available per user. If you check, you will probably find it set to a rather low default such as 8192.

$ sysctl fs.inotify.max_user_watches
fs.inotify.max_user_watches = 8192
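
If you want to confirm that this is really the bottleneck before making anything persistent, you can raise the value at runtime first (a temporary change, lost on reboot):

# temporary, non-persistent bump for testing; pick any reasonable value
sysctl -w fs.inotify.max_user_watches=65536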

You can set the sysctl fs.inotify.max_user_watches to a higher value persistently by editing /etc/sysctl.conf or creating a file in the /etc/sysctl.d directory. For example, my system has:

$ cat /etc/sysctl.d/10-user-watches.conf 
fs.inotify.max_user_watches = 1048576

Then load it with sysctl -p /etc/sysctl.d/10-user-watches.conf (a plain sysctl -p only reads /etc/sysctl.conf; sysctl --system reloads everything).

You may not want to go straight to that number, since every watch that actually gets registered consumes a bit of unswappable kernel memory; instead, take the current value and keep doubling it until the problem stops occurring.
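
To get a rough idea of how many watches are actually in use before picking a value, you can count them from /proc. The sketch below sums the inotify wd: entries in each process's fdinfo; run it as root so all processes are visible:

#!/bin/bash
# Rough total of inotify watches currently registered, summed over all
# processes. Every inotify fd is a symlink to anon_inode:inotify, and its
# fdinfo file lists one "inotify wd:" line per watch.
total=0
for fd in $(find /proc/[0-9]*/fd -lname 'anon_inode:*inotify*' 2>/dev/null); do
    pid=${fd#/proc/}; pid=${pid%%/*}
    watches=$(grep -c '^inotify wd:' "/proc/$pid/fdinfo/${fd##*/}" 2>/dev/null)
    total=$((total + ${watches:-0}))
done
echo "inotify watches in use: $total"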

Solution 2:

To change ulimit settings for systemd-managed services, you need to modify the unit itself; limits set in /etc/security/limits.conf only apply to PAM login sessions, not to daemons started by systemd.

sudo systemctl edit --full nginx.service

And add the desired value to the [Service] section:

[Service]
LimitNOFILE=<integer>
...
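
After saving the override, restart the service and verify that the new limit actually reached the running process, for example:

systemctl restart nginx.service
systemctl show nginx.service -p LimitNOFILE
# the running master process should now report the raised limit
grep 'Max open files' /proc/$(cat /var/run/nginx.pid)/limits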