Tools for tracking disk usage

I manage a number of Linux fileservers. They all run applications written anywhere from 0 to 10 years ago. As sometimes happens, a machine comes close to running out of disk space, or runs out entirely. Reasons include applications not rotating log files, a machine with 500GB of disk producing 150GB of new files every month that were not written to tape, databases gradually increasing in size, people doing silly things... generally a bit of chaos.

Anyway, when a machine unexpectedly goes from 50% to 100% full in a couple of hours, I figure out what broke (lots of "du") and delete files or contact someone. I can also look at Cacti graphs to see what the machine's normal disk usage is (e.g. for /home).

Does anyone know of any tools that give finer-grained information on historical usage than a Cacti/RRD graph? Something like "/home/abc/xyz increased by 50GB in the last day".


I think mathematical curve fitting might be an answer here but I haven't explored it yet. I was at a talk where John Adams from Twitter talked about doing this for their capacity planning and it seemed like a useful idea.

My understanding of curve fitting is that it takes the existing data and gives you a usage extrapolation. That can be used to answer questions like "based on current usage, when will our disk hit 100% full?".
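As a rough illustration of the idea, here is a minimal sketch that fits a straight line to daily usage samples and extrapolates to the point where the disk would be full. It assumes growth is roughly linear, which real disks often are not, and the sample numbers and the function names `fit_line` and `day_when_full` are made up for this example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def day_when_full(samples, capacity_gb):
    """Extrapolate the fitted line to the day usage reaches capacity.

    samples is a list of (day_number, used_gb) pairs.
    Returns None if usage is flat or shrinking.
    """
    xs = [day for day, _ in samples]
    ys = [used for _, used in samples]
    slope, intercept = fit_line(xs, ys)
    if slope <= 0:
        return None
    return (capacity_gb - intercept) / slope


# Hypothetical data: a 500GB disk growing by about 5GB per day.
samples = [(0, 250.0), (1, 255.0), (2, 260.0), (3, 265.0), (4, 270.0)]
full_day = day_when_full(samples, capacity_gb=500.0)
print(f"Projected to fill up around day {full_day:.0f}")
```

The samples could come from whatever already records usage, such as a nightly "df" run. A straight line is only the simplest possible model; a dedicated fitting tool would let you try other curves.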

Here's a Wikipedia page on curve fitting. The package fityk looks like a good place to start.

Some programming seems to be required to do this. Unfortunately, I don't see any simple drop-in plugin for a monitoring package like Cacti.


Munin will monitor disk usage and send alarms. The graphs will be similar to what you would get with Cacti/RRD, since Munin uses RRD for storage. I have replaced Nagios and MRTG with Munin for many things, though there are uses for Nagios that Munin doesn't cover.


I once had to do something similar. I solved the problem with a cron job that ran a du on the affected filesystem every night and saved the output to a date-named file. When the server fills up, it's easy to compare the current du to one of the archived ones and figure out what happened. This also gives great information on growth over time for your future disk estimation needs.
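The comparison step could look something like the sketch below. It assumes the snapshots were produced with "du -k", so each line is a size in KiB, a tab, and a path. The function names and the sample data are invented for illustration, not taken from the original setup.

```python
def parse_du(text):
    """Turn du output (size<TAB>path per line) into a {path: size_kb} dict."""
    sizes = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        size, path = line.split("\t", 1)
        sizes[path] = int(size)
    return sizes


def biggest_growth(old_text, new_text, top=10):
    """Return up to `top` (growth_kb, path) pairs, largest growth first.

    Paths missing from the old snapshot count as having grown from zero.
    """
    old = parse_du(old_text)
    new = parse_du(new_text)
    deltas = [(kb - old.get(path, 0), path) for path, kb in new.items()]
    grown = [d for d in deltas if d[0] > 0]
    grown.sort(reverse=True)
    return grown[:top]


# Hypothetical snapshots; in practice these would be read from the
# date-named files written by the nightly cron job.
yesterday = "100\t/home/abc\n40\t/home/abc/xyz\n"
today = "160\t/home/abc\n95\t/home/abc/xyz\n5\t/home/abc/new\n"

for growth_kb, path in biggest_growth(yesterday, today):
    print(f"{path} grew by {growth_kb} KiB")
```

Because du reports every directory along with its parents, a parent directory will always show at least as much growth as its children. Reading down the list until the growth drops off usually points at the culprit.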


Have you thought about monitoring?

Perhaps it's better to use something like Nagios to monitor your servers. Then, when a disk is more than 90% full, for example, you get an email or something like that.

With this setup you can still use Cacti to look at the history, but Nagios warns you when one or more checks reach a critical state: for example, a warning at 70% disk usage and a critical alert at 90%.
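To show how the thresholds work, here is a minimal sketch of a threshold check written in the style of a Nagios plugin, where exit code 0 means OK, 1 means WARNING and 2 means CRITICAL. Nagios already ships a disk check, so this is only an illustration; the function names and the 70/90 defaults are assumptions based on the example above.

```python
import shutil

# Exit codes that Nagios plugins use to report state.
OK, WARNING, CRITICAL = 0, 1, 2
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}


def classify(percent_used, warn=70.0, crit=90.0):
    """Map a usage percentage to a Nagios-style status code."""
    if percent_used >= crit:
        return CRITICAL
    if percent_used >= warn:
        return WARNING
    return OK


def check_disk(path="/", warn=70.0, crit=90.0):
    """Print a one-line status for `path` and return the status code.

    The percentage is used/total as reported by shutil.disk_usage, which
    can differ slightly from what df shows because of reserved blocks.
    """
    usage = shutil.disk_usage(path)
    percent = 100.0 * usage.used / usage.total
    status = classify(percent, warn, crit)
    print(f"DISK {LABELS[status]} - {path} is {percent:.1f}% full")
    return status
```

To use it as an actual check, a small wrapper would call `sys.exit(check_disk("/home"))` so the monitoring system sees the exit code.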

Nagios is only an example. With it you can monitor all your Linux servers from one application, and not only the disks.