How do I stop this constant loss of free space?

Solution 1:

You've got some out-of-control logs. Instead of deleting like crazy every day, find the fast-growing file or files, and look inside to investigate what may be causing this. Maybe some program is spinning in a loop logging some condition. Either disable that program, disable its logging, or try to fix the condition that it's complaining about.
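If you don't yet know which file is the fast-growing one, here is a rough sketch of one way to find it: record every file's size, wait, record again, and print what grew. The helper name growing_files is made up for this example, and it assumes GNU find (for -printf) and paths without tab characters.

```shell
# growing_files DIR [SECONDS]
# Print "bytes-grown<TAB>path" for each file under DIR that got bigger
# during the interval, largest growth first.
# (Hypothetical helper; assumes GNU find and no tabs in file names.)
growing_files() (
    dir=$1
    interval=${2:-10}
    before=$(mktemp) || exit 1
    after=$(mktemp) || exit 1

    # Snapshot sizes, wait, snapshot again. -xdev stays on one filesystem.
    find "$dir" -xdev -type f -printf '%s\t%p\n' 2>/dev/null > "$before"
    sleep "$interval"
    find "$dir" -xdev -type f -printf '%s\t%p\n' 2>/dev/null > "$after"

    # Compare the two snapshots; files that are new count as grown from 0.
    awk -F '\t' '
        FILENAME == ARGV[1] { old[$2] = $1; next }
        {
            grown = $1 - (($2 in old) ? old[$2] : 0)
            if (grown > 0) printf "%.0f\t%s\n", grown, $2
        }
    ' "$before" "$after" | sort -rn

    rm -f "$before" "$after"
)
```

Run as root, something like growing_files /var/log 30 should put the worst offender on the first line.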

If a file is growing before your eyes, and you have no idea which program is writing to it, you may be able to find that out easily. Here is an example. Who has /var/log/syslog open? We use the fuser command:

# fuser /var/log/syslog
/var/log/syslog:      602

Only one process has /var/log/syslog open. It is process 602. What is that? Let us not bother with ps and grep, but look at the /proc filesystem directly:

# ls -l /proc/602/exe
lrwxrwxrwx 1 root root 0 Mar 29 17:45 /proc/602/exe -> /usr/sbin/rsyslogd

Aha, it is rsyslogd. We are not surprised that rsyslogd has /var/log/syslog open.
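If fuser is not installed, roughly the same lookup can be done by walking /proc yourself. This is only a sketch: open_by is a made-up helper name, it assumes a Linux /proc and GNU readlink, and unless you run it as root it only sees your own processes.

```shell
# open_by FILE
# Print "PID<TAB>executable" for every process that has FILE open,
# by checking where each /proc/PID/fd/N symlink points.
# (Hypothetical helper; assumes Linux /proc and readlink -f.)
open_by() (
    target=$(readlink -f "$1") || exit 1
    for fd in /proc/[0-9]*/fd/*; do
        # Skip descriptors that point somewhere else (or vanished meanwhile).
        [ "$(readlink "$fd" 2>/dev/null)" = "$target" ] || continue
        pid=${fd#/proc/}
        pid=${pid%%/*}
        printf '%s\t%s\n' "$pid" "$(readlink "/proc/$pid/exe" 2>/dev/null)"
    done | sort -u
)
```

On the system above, open_by /var/log/syslog run as root would be expected to show PID 602 with /usr/sbin/rsyslogd.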

This method is not guaranteed to work. The reason is that programs do not have to keep files open in order to write to them. Suppose you have a process which opens a file, appends to it, and then closes it. You will have a somewhat more difficult investigation. You could run fuser many times until by chance you catch the process "red-handed". That process itself could be going into and out of existence quickly. Another problem is that multiple processes could have the file open, but only one is making it larger. In that case, you can trace their system calls.

# fuser /var/log/huge-annoying-file
/var/log/huge-annoying-file:   1234 23459

Oops! Two processes have it open: 1234 and 23459. Let's see what they are doing:

# strace -p 1234
Process 1234 attached - interrupt to quit
select(1, NULL, NULL, NULL, {9, 922666}

It's not doing anything, just blocking in a select call. Ctrl-C to break the trace:

select(1, NULL, NULL, NULL, {9, 922666}^C <unfinished ...>

Check the next one:

# strace -p 23459
write(5, "Useless garbage ..."..., 512) = 512
write(5, "More useless garbage ..."..., 512) = 512
write(5, "More useless garbage ..."..., 512) = 512
write(5, "More useless garbage ..."..., 512) = 512
write(5, "More useless garbage ..."..., 512) = 512
write(5, "More useless garbage ..."..., 512) = 512
write(5, "More useless garbage ..."..., 512) = 512
^C

Oops, that one is writing constantly. It must be the bad one. We can even check that the file descriptor 5 which the process is writing to is in fact the large file:

# ls -l /proc/23459/fd/5
l-wx------ 1 root root 64 Apr  3 23:39 /proc/23459/fd/5 -> /var/log/huge-annoying-file
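When many processes have the file open, attaching strace to each one gets tedious. A cheaper first pass, which is my own suggestion rather than part of the procedure above, is to compare the wchar counter in /proc/PID/io before and after a short wait. The helper name bytes_written is made up, and it assumes your kernel provides /proc/PID/io. Note that wchar counts bytes written to anything (files, pipes, terminals), so it only narrows down the suspects; strace and /proc/PID/fd still have to confirm the target.

```shell
# bytes_written PID [SECONDS]
# Print how many bytes PID handed to write-style system calls during the
# interval, using the wchar counter from /proc/PID/io.
# (Hypothetical helper; assumes the kernel exposes /proc/PID/io.)
bytes_written() (
    pid=$1
    interval=${2:-5}
    read_wchar() { awk '$1 == "wchar:" { print $2 }' "/proc/$pid/io"; }

    first=$(read_wchar) || exit 1
    [ -n "$first" ] || exit 1
    sleep "$interval"
    second=$(read_wchar) || exit 1
    [ -n "$second" ] || exit 1

    echo $((second - first))
)
```

For the two processes above, bytes_written 1234 and bytes_written 23459 (as root) would be expected to show a small or zero number for the first and a steadily large one for the second.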

I don't suspect that you have a corrupt filesystem, but if you want to force a full check anyway, you don't have to boot from a DVD.

Firstly, review your filesystem's maximum mount count setting. Identify your partition using the df command. Example on an Ubuntu system I have here:

# df
Filesystem     1K-blocks    Used Available Use% Mounted on
/dev/sda1       18062108 5499320  11645284  33% /
udev              392152       4    392148   1% /dev
tmpfs             159768     768    159000   1% /run
none                5120       0      5120   0% /run/lock
none              399416     200    399216   1% /run/shm
/dev/sr0           43668   43668         0 100% /media/VBOXADDITIONS_4.1.4_74291

You can see that /dev/sda1 is mounted on /. So /dev/sda1 is the storage device of the root partition (and the only partition in this particular system).

Let's look at some attributes of that filesystem. This is safe to do even though it is mounted. The command spews a lot of output. Here is an excerpt:

# dumpe2fs /dev/sda1
dumpe2fs 1.42 (29-Nov-2011)
Filesystem volume name:   <none>
Last mounted on:          /
[ ... SNIP ... ]
Last mount time:          Fri Mar 29 17:45:18 2013
Last write time:          Tue Mar  5 09:08:03 2013
Mount count:              22
Maximum mount count:      22
[ ... SNIP ... ]

Hey look, the mount count is equal to the maximum mount count. Next time I reboot, there will be a filesystem check. The important thing is that the maximum mount count is a positive value. If yours is zero or -1, change it to some positive value like 22 using tune2fs -c 22 /dev/whatever. Zero or -1 means that a check is never forced regardless of how many times the partition is mounted. Rarely rebooted systems should have low values here. A server that goes down once a year could probably use a fsck each time it reboots. You can also set date-based check intervals.

Now to force a check, you can override the actual count to be greater than or equal to the maximum, and then reboot. That's done with capital C: tune2fs -C 1234 /dev/whatever. Now the partition looks like it has been mounted 1234 times without a check, which is greater than the one- or two-digit maximum.
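To save squinting at the dumpe2fs output, the comparison between the two counters can be scripted. This is a minimal sketch under the assumption that the output contains the "Mount count:" and "Maximum mount count:" lines shown in the excerpt; fsck_due is a made-up helper name.

```shell
# fsck_due
# Read dumpe2fs output on standard input and report whether the
# mount-count rule will trigger a check on the next boot.
# (Hypothetical helper; relies on the two field labels shown above.)
fsck_due() {
    awk -F ': *' '
        $1 == "Mount count"         { count = $2 }
        $1 == "Maximum mount count" { max = $2 }
        END {
            if (max + 0 <= 0)
                print "never (maximum mount count is " max ")"
            else if (count + 0 >= max + 0)
                print "due at next boot (" count "/" max ")"
            else
                print "not due (" count "/" max ")"
        }'
}
```

As root, dumpe2fs -h /dev/whatever 2>/dev/null | fsck_due would be one way to use it; the -h option limits dumpe2fs to the superblock information.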

Solution 2:

A disk check freed some of the space, suggesting that this problem (or part of it) may be due to filesystem corruption. If that is the case, then you should be able to free more space by scanning and repairing the filesystem. However, if the corruption is perpetually happening (which might or might not be the case), that usually means the hard drive is dying. If your backups (of your documents and any other important files that would be hard to replace) aren't completely up to date, please back up everything important now!

The filesystem cannot be mounted (at least not read-write) while it is being checked and repaired. So you should run the repair utility from a live environment (live CD/DVD or USB). First, you'll have to find out the device name of the partition that contains your files.

To do that, in the installed system, run:

mount | grep ' on / '

(Make sure to include the space between the / and '.)

You'll get something like:

/dev/sda8 on / type ext4 (rw,errors=remount-ro)

The text before on--in the example from my machine, /dev/sda8--is the full device name for your root partition (/). Write this down--you'll need it.
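If you would rather have just the device name printed, the same information can be pulled out of the mount output with awk. A small sketch, assuming the "DEVICE on / type ..." line format shown above; root_device is a made-up helper name.

```shell
# root_device
# Read `mount` output on standard input and print the device
# that is mounted on / (the first field of the matching line).
# (Hypothetical helper; assumes the "DEVICE on MOUNTPOINT type ..." format.)
root_device() {
    awk '$2 == "on" && $3 == "/" { print $1 }'
}
```

Use it as mount | root_device. On systems that ship util-linux's findmnt, findmnt -n -o SOURCE / should give the same answer.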

Then boot your computer from an Ubuntu desktop CD/DVD or USB flash drive, like what you used to install Ubuntu originally. (If this is a Wubi system, installed with the Windows installer, please let us know. I don't expect that, given what you've reported, but if that is the case, the procedure will be different.)

Select Try Ubuntu without installing (not Install Ubuntu). When you get a functioning desktop, press Ctrl+Alt+T to open a Terminal window. Then run this command:

sudo e2fsck -fkccp /dev/sda8

But make sure to replace /dev/sda8 with the right full device name for your / partition, as you obtained through the method detailed above.

This may take a while. The c options included in that command cause it to scan the surface of the disk for errors as well as the filesystem (and to mark any bad areas as bad so they are not used). You can leave cc out if you like (if you do, you can also leave out k), but I recommend keeping them in.

The p makes e2fsck automatically fix any problems that it is confident it can fix without causing complications. If it finds a problem where it thinks a fix could cause data loss, it does not fix it on its own: it prints a description of the problem and stops. In that case, run the command again without the p, and you will be prompted about each fix.

I recommend that you be strongly inclined to allow it to fix whatever it wants, since you should only be doing this after making sure your backups are current, anyway. If you want it to attempt even potentially dangerous fixes without prompting you, replace the p with y.
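When e2fsck finishes, its exit status tells you what happened. According to the e2fsck man page, the status is a sum of flag values, so it can be decoded like this. The helper name explain_e2fsck_status is made up for this sketch.

```shell
# explain_e2fsck_status STATUS
# Decode an e2fsck exit status (a sum of the flag values listed in the
# e2fsck man page) into one line per condition.
# (Hypothetical helper name.)
explain_e2fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no errors"
    else
        for pair in \
            "1:file system errors corrected" \
            "2:file system errors corrected, system should be rebooted" \
            "4:file system errors left uncorrected" \
            "8:operational error" \
            "16:usage or syntax error" \
            "32:e2fsck canceled by user request" \
            "128:shared library error"
        do
            bit=${pair%%:*}
            # Print the message if this flag is set in the status.
            if [ $((status & bit)) -ne 0 ]; then
                echo "${pair#*:}"
            fi
        done
    fi
}
```

Run it right after the check, for example: sudo e2fsck -fkccp /dev/sda8; explain_e2fsck_status $?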

After this, boot back into your Ubuntu system and see if space is freed. If it is not, or if the problem continues, please comment and edit your question to provide details.