How to track down a file descriptor leak?

To see the top 20 processes by number of open file handles:

for pid in $(ps -eo pid=); do echo "$(ls /proc/$pid/fd 2>/dev/null | wc -l) $pid $(cat /proc/$pid/cmdline 2>/dev/null | tr '\0' ' ')"; done | sort -n -r | head -n 20

The output format is: file handle count, PID, command line of the process. Run it as root, otherwise the fd directories of other users' processes are unreadable and show a count of 0.

Example output:

701 1216 /sbin/rsyslogd -n -c5
169 11835 postgres: spaceuser spaceschema [local] idle
164 13621 postgres: spaceuser spaceschema [local] idle
161 13622 postgres: spaceuser spaceschema [local] idle
161 13618 postgres: spaceuser spaceschema [local] idle
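
Once one PID stands out, it helps to see what its descriptors actually point to. Here is a minimal sketch, assuming a Linux /proc filesystem; the function name fd_summary is made up for illustration. It groups the targets of /proc/PID/fd and strips the inode number from socket and pipe entries so they collapse into one line each.

```shell
# fd_summary PID: count a process's open descriptors by target.
# Hypothetical helper name; assumes Linux /proc and read access to the process.
fd_summary() {
    pid=$1
    for fd in /proc/"$pid"/fd/*; do
        # Each entry is a symlink to a path, or to socket:[inode], pipe:[inode], etc.
        readlink "$fd" 2>/dev/null
    done |
        sed -e 's/:\[[0-9]*\]$//' |   # socket:[12345] -> socket
        sort | uniq -c | sort -rn | head -n 20
}
```

For example, fd_summary 1216 would show what rsyslogd is holding. Hundreds of entries for socket, or for one particular path, usually point at the part of the program that is leaking.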

Become familiar with the strace command. It monitors system calls. I recently used it to track down file descriptor leaks that were causing our snmpd daemon to crash repeatedly. It takes some getting used to, but it's a powerful tool.

You can use strace to attach to a running process (don't forget the -f flag to follow child processes).
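
A rough sketch of how that can look in practice. Both function names are made up, and the log parser is deliberately simplistic: it assumes one process (threads share descriptors), ignores pipe() and any calls that strace splits into "unfinished"/"resumed" lines, and only understands the default strace output format. The desc and network syscall classes are standard strace classes; newer versions also spell them %desc and %network.

```shell
# Hypothetical helper: attach to a running PID (and its children) and log
# descriptor-related and network-related system calls to a file.
trace_fds() {
    strace -f -p "$1" -e trace=desc,network -o "$2"
}

# Hypothetical helper: print calls from a strace log whose returned
# descriptor was never closed afterwards.
unclosed_fds() {
    awk '
        # Successful call that returns a new descriptor: remember the line.
        /^([0-9]+ +)?(open|openat|socket|accept|accept4|dup|dup2|dup3)\(.*\) += [0-9]+$/ {
            opened[$NF] = $0
            next
        }
        # Successful close: forget that descriptor.
        /^([0-9]+ +)?close\([0-9]+\) += 0$/ {
            fd = $0
            sub(/^.*close\(/, "", fd)
            sub(/\).*$/, "", fd)
            delete opened[fd]
        }
        END { for (fd in opened) print opened[fd] }
    ' "$1"
}
```

Typical use would be trace_fds 1216 /tmp/fd.log, let it run while the count climbs, stop it with Ctrl-C, then run unclosed_fds /tmp/fd.log. Descriptors the process legitimately keeps open will show up too, so look for the same kind of call repeated many times.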


What exactly are you trying to track down? The remote IP address(es) associated with the leaked FDs, the defective code, or something else?

Since you've already established that there is a leak, contacting the engineers responsible for this Java process seems like a reasonable next step.