Determine Location of Inode Usage

Solution 1:

Don't expect this to run quickly...

cd to a directory where you suspect there might be a subdirectory with lots of inodes. If this script takes a huge amount of time, you've likely found where in the filesystem to look. /var is a good start...

Otherwise, change to the top-level directory of that filesystem, run this, and wait for it to finish; you'll find the directory with all the inodes.

find . -type d | 
while IFS= read -r line    # IFS= and -r keep odd filenames intact
do 
  # count entries (dotfiles included) one level down; the count also
  # includes the directory itself, a constant off-by-one
  echo "$(find "$line" -maxdepth 1 | wc -l) $line"  
done | 
sort -rn | less

I'm not worried about the cost of the sort. I ran a test: sorting the unsorted output of this across 350,000 directories took 8 seconds. The initial find took . The real cost is opening all of these directories in the while loop (the loop itself takes 22 seconds). (The test data was a subdirectory containing 350,000 directories, one of which held a million files; the rest held between 1 and 15 directories.)

Various people pointed out that ls is not great for this because it sorts its output. I had tried echo, but that isn't great either. Someone else pointed out that stat gives this info (the number of directory entries), but it isn't portable. It turns out that find -maxdepth is really fast at opening directories and counts dotfiles too, so... here it is. Points for everyone!
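
To make the counting differences concrete, here is a tiny illustration (testdir is just a scratch name; note that the find count includes the directory itself, a constant off-by-one that doesn't change the ordering):

mkdir testdir && touch testdir/.hidden testdir/visible
ls testdir | wc -l                  # 1 -- ls skips the dotfile
find testdir -maxdepth 1 | wc -l    # 3 -- .hidden, visible, and testdir itself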

Solution 2:

If the issue is one directory with too many files, here is a simple solution:

# Let's find which partition is out of inodes:
$ df -hi
Filesystem            Inodes   IUsed   IFree IUse% Mounted on
/dev/sda3               2.4M    2.4M       0  100% /
...

# Okay, now we know the mount point with no free inodes,
# let's find a directory with too many files:
$ find / -xdev -size +100k -type d

The idea behind the find line is that the size of a directory is roughly proportional to the number of files directly inside it. So here we look for directories with tons of files inside them.
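
A quick way to see that effect (a throwaway demo; exact sizes depend on the filesystem, and note that most filesystems never shrink a directory once it has grown, even after the files are deleted):

mkdir /tmp/crowded
ls -ld /tmp/crowded    # an empty directory is typically one block, e.g. 4096 bytes
for i in $(seq 1 10000); do touch "/tmp/crowded/file$i"; done
ls -ld /tmp/crowded    # now many times larger -- each entry consumes directory space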

If you don't want to guess a number, and prefer to list all suspect directories ordered by "size", that's easy too:

# Remove the "sort" command if you want incremental output
find / -xdev -size +10k -type d -printf '%s %p\n' | sort -n

Solution 3:

Grrr, commenting requires 50 rep. So this answer is actually a comment on chris's answer.

Since the questioner probably doesn't care about all the directories, only the worst ones, using sort is likely very expensive overkill.

find . -type d | 
while IFS= read -r line    # IFS= and -r keep odd filenames intact
do 
  echo "$(ls "$line" | wc -l) $line"  
done | 
perl -ane 'next unless $F[0] >= $max; print; $max = $F[0]' | less

This isn't as complete as your version, but it only prints a line when its count exceeds the previous maximum, greatly reducing the noise printed out and saving the expense of the sort.

The downside of this is that if you have two very large directories, and the first happens to have one more inode than the second, you'll never see the second.

A more complete solution would be to write a smarter perl script that keeps track of the top 10 values seen and prints those out at the end. That's normally too long for a quick serverfault answer, but a rough sketch follows.
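
For what it's worth, a minimal sketch of that approach (the cutoff of 10 is arbitrary, and this reuses the find-based counting from solution 1):

find . -type d | 
while IFS= read -r line
do 
  echo "$(find "$line" -maxdepth 1 | wc -l) $line"  
done | 
perl -ane '
  push @top, [ $F[0], $_ ];                  # remember this candidate
  @top = ( sort { $b->[0] <=> $a->[0] } @top )[ 0 .. 9 ]
      if @top > 10;                          # keep only the 10 biggest so far
  END { print $_->[1] for sort { $b->[0] <=> $a->[0] } @top }
'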

Also, some mildly smarter perl scripting would let you skip the while loop entirely - on most platforms, ls sorts its results, and that can be very expensive for large directories. The ls sort is not necessary here, since all we care about is the count; see the sketch below.
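
Something like this would do it (a sketch: readdir hands back entries in whatever order the filesystem stores them, with no sorting, and the count includes the . and .. entries, a constant offset):

find . -type d | perl -ne '
  chomp;
  opendir( my $dh, $_ ) or next;     # skip directories we cannot open
  my $count = () = readdir($dh);     # count entries without sorting them
  closedir($dh);
  print "$count $_\n";
'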