Is there any other reason for "no space left on device"?
I am using Dirvish on an Ubuntu server to back up a hard disk to an external USB 3.0 drive. Until a few days ago everything worked fine, but now every backup fails with "No space left on device (28)" and "filesystem full". Unfortunately it is not that simple: there are more than 500 GB free on the device.
Details:
rsync_error:
rsync: write "/mnt/backupsys/shd/gesichert1/20130223_213242/tree/<SomeFilename1>.eDJiD9": No space left on device (28)
rsync: writefd_unbuffered failed to write 4 bytes to socket [sender]: Broken pipe (32)
rsync: write "/mnt/backupsys/shd/gesichert1/20130223_213242/tree/<SomeFilename2>.RHuUAJ": No space left on device (28)
rsync: write "/mnt/backupsys/shd/gesichert1/20130223_213242/tree/<SomeFilename3>.9tVK8Z": No space left on device (28)
rsync: write "/mnt/backupsys/shd/gesichert1/20130223_213242/tree/<SomeFilename4>.t3ARSV": No space left on device (28)
[... some more files ...]
rsync: connection unexpectedly closed (2712185 bytes received so far) [sender]
rsync error: error in rsync protocol data stream (code 12) at io.c(605) [sender=3.0.9]
The log looks pretty much as usual until it hits:
<SomeFilename1>
<SomeFilename2>
<SomeFilename3>
<SomeFilename4>
<PartOfAFilename>filesystem full
write error, filesystem probably full
broken pipe
RESULTS: warnings = 0, errors = 1
But, as said above, there is lots of space on the device:
df -h
/dev/sdg1 2.7T 2.0T 623G 77% /mnt/backupsys/shd
and also there are lots of inodes left:
df -i
/dev/sdg1 183148544 2810146 180338398 2% /mnt/backupsys/shd
The device is mounted as rw:
mount
/dev/sdg1 on /mnt/backupsys/shd type ext3 (rw)
The process is running as root.
I was about to say that I haven't changed anything but that's not quite true: I have switched on acl for the drive I am backing up:
/dev/md0 on /mnt/md0 type ext4 (rw,acl)
Could that be the problem? If yes, how? root still has full access to the files.
EDIT:
I just checked the temp directories:
- /tmp contains only a .webmin folder that is empty
- /var/tmp is empty
The file system where these directories reside has plenty of free space and inodes:
df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 289G 55G 220G 20% /
df -i
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/sda1 19202048 167644 19034404 1% /
EDIT2:
The directories are quite large, but not > 2 GB. The one where the backup fails is not even one of the largest; it contains 7530 files.
EDIT3:
One piece of information which I did not consider relevant when posting this question:
The day before the backups started to fail I had activated ACLs on the file systems that were backed up. I now assume that this made Dirvish (or rsync) think all the files had changed, so the list of files to be copied rather than hard-linked was very large. This could possibly mean that some buffers were too small.
Today a full backup to an empty disk worked flawlessly. I'll try an incremental backup next. This will show whether activating ACLs was the cause of the problem.
My suspicion (see EDIT3) apparently was right: adding ACL support to the file system made rsync/Dirvish think that all the files had changed. So instead of making an incremental backup and just creating hard links to the already existing files, it tried to create a full backup, which of course failed because the hard disk did not have enough space for that.
So the error message was actually correct.
After starting again with an empty backup disk, the incremental backups worked as before.
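A rough pre-flight check along the following lines might have flagged this before the run. It is only a sketch and not part of Dirvish or rsync: `avail_kb` and `would_fit` are names made up for this illustration, and the check assumes the worst case in which nothing can be hard-linked and every file has to be copied.

```shell
#!/bin/sh
# avail_kb MOUNTPOINT
# Print the space available on MOUNTPOINT in KiB.
# -P forces the one-line POSIX output format; column 4 is "Available".
avail_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# would_fit NEEDED_KB MOUNTPOINT
# Succeed (exit status 0) if NEEDED_KB is no more than the available space.
# Hypothetical helper: it knows nothing about hard links, so it answers the
# question "would a complete copy fit?", not "would an incremental run fit?".
would_fit() {
    [ "$1" -le "$(avail_kb "$2")" ]
}
```

With the paths from this question it could be used like this: `would_fit "$(du -sk /mnt/md0 | cut -f1)" /mnt/backupsys/shd || echo "a full copy will not fit"`.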
Looking at the 2% of inodes left made me think about the root reserves that the EXT filesystem imposes. You may want to check these out:
- "Reserved space for root on a filesystem - why?"
- "Reasonable size for “filesystem reserved blocks” for non-OS disks?"
I would try to .tar.gz some of the older backups hoping that it would reduce the number of inodes in use.
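To see how large the root reserve actually is, the output of `tune2fs -l` can be summarised as in the sketch below. `reserved_pct` is a made-up helper name; it assumes the usual "Block count:" and "Reserved block count:" lines of `tune2fs -l`. As far as I know, the Avail column of `df` already excludes the reserve and root may write into it, so for a backup running as root this is more of a sanity check than a likely cause.

```shell
#!/bin/sh
# reserved_pct
# Read "tune2fs -l" output on stdin and print the share of blocks reserved
# for root as a percentage with one decimal place.
# Hypothetical helper; prints nothing if the expected lines are missing.
reserved_pct() {
    LC_ALL=C awk -F': *' '
        $1 == "Block count"          { total = $2 }
        $1 == "Reserved block count" { reserved = $2 }
        END { if (total > 0) printf "%.1f\n", 100 * reserved / total }
    '
}
```

Usage as root would be `tune2fs -l /dev/sdg1 | reserved_pct`. If the reserve turns out to be unnecessarily large for a pure backup disk, `tune2fs -m` can lower the percentage.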
I see that dummzeuch found a solution to his problem, but there is one more case I have come across where a disk has enough inodes and free space and still reports "No space left on device" while transferring certain directories.
This is caused by hash collisions on block devices formatted with ext4 with directory indexing enabled, especially where a single directory holds more than 100k files and the file names are generated by the same algorithm (cache files, md5sum file names, etc.).
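To find out whether any directory is large enough to be a candidate for this, something like the following sketch can help. `big_dirs` is a name invented for this example; it only counts directory entries and cannot tell whether hash collisions are really occurring.

```shell
#!/bin/sh
# big_dirs ROOT THRESHOLD
# Print "COUNT<TAB>DIRECTORY" for every directory under ROOT that directly
# contains more than THRESHOLD entries. -xdev keeps find on one file system.
# Assumptions: no newlines in file names; -mindepth/-maxdepth require GNU or
# BSD find.
big_dirs() {
    find "$1" -xdev -type d | while IFS= read -r dir; do
        # Count the direct children of this directory only.
        n=$(find "$dir" -mindepth 1 -maxdepth 1 | wc -l | tr -d ' ')
        if [ "$n" -gt "$2" ]; then
            printf '%s\t%s\n' "$n" "$dir"
        fi
    done
}
```

For example, `big_dirs /mnt/md0 100000` would list the directories on the source that hold more than 100k entries. Whether directory indexing is enabled at all should be visible in the "Filesystem features:" line of `tune2fs -l` for the device.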
One solution is to try another directory indexing algorithm:
tune2fs -E "hash_alg=tea" /dev/blockdev_name
or to disable directory indexing completely for that block device (this may hurt performance):
tune2fs -O ^dir_index /dev/blockdev_name
Another solution is to find out what is filling the directory with such files and fix that software.
A further possibility is to split the contents of the directory holding the huge number of files into multiple subdirectories.
A full description of the problem is given by Axel Wagner here:
http://blog.merovius.de/2013/10/20/ext4-mysterious-no-space-left-on.html
Cheers.
There is a 2 GB size limit on the directory itself, i.e. if you have so many files that the directory size is > 2 GB (NOT the size of the files IN the directory), you'll have an issue. Having said that, with only 2.8M inodes used, that shouldn't be an issue. It usually happens around 15M inodes.
So this may not be much help - but try ext4 on your backup device?