Very strange file size (more than 600 PB) on a small filesystem

I had a file on an XFS filesystem whose size was about 200 GB. It was a QCOW2 image containing a virtual disk of a KVM-driven virtual machine. Something went wrong (maybe a glitch in qemu-kvm, I'm not sure), the virtual machine crashed, and now I have a file that looks like this:

191090708 -rwxr--r--. 1 root root 737571587400425984 Oct 10 10:03 973d10e0-a5e3-4a59-9f98-4b9b9f072ade

So, it still occupies 191090708 blocks, but ls reports its size as 656 petabytes.

Moreover, I have another file with a similar history, but on another filesystem (GFS2 instead of XFS):

410855320 -rwxr--r--. 1 root root 7493992262336241664 Dec 13  2014 ac2cb28f-09ac-4ca0-bde1-471e0c7276a0

It occupies 410855320 blocks, but ls reports its size as ~6.6 exabytes.
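For reference, both views of such a file can be compared directly (assuming GNU coreutils, with the filename from the first listing above):

    # Apparent size: the offset of the last byte, which is what ls -l shows
    $ du -h --apparent-size 973d10e0-a5e3-4a59-9f98-4b9b9f072ade   # ~656P
    # Allocated size: blocks actually backed by disk space
    $ du -h 973d10e0-a5e3-4a59-9f98-4b9b9f072ade                   # ~183G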

What do you think, is it safe to remove these files? Thank you!

P.S. It's so good to have snapshots taken on a regular basis! :) I don't know what I would do without them.


Solution 1:

I can see two possible reasons why you might be seeing those file sizes:

  • Sparse files
  • File system corruption

Sparse files are a feature of some file systems whereby you can create a file with holes in it. No physical space is allocated for the holes, and reading across a hole returns NUL bytes all the way.
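As a quick illustration (a sketch assuming GNU coreutils and a filesystem with sparse-file support; hole.img is just an example name):

    # Create a 1 TiB file that is one big hole: no data blocks are allocated
    $ truncate -s 1T hole.img
    # ls reports the apparent size (1.0T), du the space actually allocated (0)
    $ ls -lh hole.img
    $ du -h hole.img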

If the reason for what you are seeing is sparse files, then deleting them is as safe as deleting any non-sparse file.

If the reason for what you are seeing is file system corruption, then it is not safe to delete the files without a file system check first. If a file system is corrupted in such a way that multiple files claim the same blocks, then deleting either file frees blocks the other still uses, and once those freed blocks are reused the corruption gets worse.

If you have seen any other symptoms making you think the file system may be corrupted, you should force a full check of the file system before deleting the files.
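For the record, a read-only check would look something like this (a sketch only; the device paths are placeholders, the XFS filesystem must be unmounted first, and a GFS2 filesystem must be unmounted on every cluster node):

    # XFS: no-modify mode (-n) reports problems without changing anything
    $ umount /mnt/images
    $ xfs_repair -n /dev/mapper/vg0-images

    # GFS2: -n likewise opens the filesystem read-only and answers "no" to all repairs
    $ fsck.gfs2 -n /dev/mapper/vg0-gfs2vol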

If there is no evidence suggesting the file system is corrupted, and the files appear to be sparse, I would just delete the files once I no longer needed them.
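To tell which case you are in, compare each file's apparent size with its allocated space; for example with GNU stat (%s is the apparent size in bytes, %b the number of allocated blocks, %B the block size):

    # A sparse file shows far fewer allocated bytes than apparent bytes
    $ stat -c '%s bytes apparent, %b blocks of %B bytes allocated' 973d10e0-a5e3-4a59-9f98-4b9b9f072ade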

Solution 2:

The problem is the way a file's size is computed.

One way is to look at the offset of the last byte (what ls reports). The other is to sum the blocks that are actually allocated (what du reports).

What you see is probably a file with data written at a very large offset, meaning that most of the file's address space is not allocated. But you can still read it; the holes simply read back as zeroes.
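For illustration, such a file can be produced on purpose (a sketch using GNU dd; huge.img is an example name, and the filesystem must allow files this large, as XFS does):

    # Write a single byte at an offset of 700 PiB; everything before it is a hole
    $ dd if=/dev/zero of=huge.img bs=1 count=1 seek=700P
    # Apparent size is ~700P, yet almost no blocks are allocated
    $ ls -lhs huge.img
    # Reading inside the hole works fine and returns zeroes
    $ dd if=huge.img bs=1M count=1 skip=4096 2>/dev/null | hexdump -C | head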