Monotonic growth of Linux directory size/block count
On Linux (perhaps as a function of the filesystem block size), when I create a directory and stat it, it returns a size of 4096. I can create files in this directory, up to a point, without increasing the perceived size of the directory (as reported by stat).
At some point, as the directory fills up with many files, the directory size balloons (I am not talking about the contents of the directory, I am talking about the blocks consumed to represent the directory itself). If files are deleted, the directory size remains the same.
Here's a quick example:
[root@uxlabtest:/]$ mkdir test
[root@uxlabtest:/]$ stat test
File: `test'
Size: 4096 Blocks: 8 IO Block: 4096 directory
Device: fd00h/64768d Inode: 1396685 Links: 2
Access: (0755/drwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2011-07-26 14:06:04.000000000 -0400
Modify: 2011-07-26 14:06:04.000000000 -0400
Change: 2011-07-26 14:06:04.000000000 -0400
Then touch a bunch of files:
[root@uxlabtest:/]$ for i in `seq 1 10000`; do touch /test/$i; done
[root@uxlabtest:/]$ stat test
File: `test'
Size: 155648 Blocks: 312 IO Block: 4096 directory
Device: fd00h/64768d Inode: 1396685 Links: 2
Access: (0755/drwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2011-07-26 14:06:04.000000000 -0400
Modify: 2011-07-26 14:06:56.000000000 -0400
Change: 2011-07-26 14:06:56.000000000 -0400
Then delete the files:
[root@uxlabtest:/]$ rm -rf /test/*
[root@uxlabtest:/]$ stat test
File: `test'
Size: 155648 Blocks: 312 IO Block: 4096 directory
Device: fd00h/64768d Inode: 1396685 Links: 2
Access: (0755/drwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2011-07-26 14:07:11.000000000 -0400
Modify: 2011-07-26 14:07:12.000000000 -0400
Change: 2011-07-26 14:07:12.000000000 -0400
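The same experiment can be condensed into a small script. This is only a sketch: it assumes GNU stat and find, uses a temporary directory rather than /test, and the helper name dir_size and the default count of 5000 are made up for illustration. The numbers it prints depend entirely on the filesystem backing the temporary directory.

```shell
#!/bin/sh
# Sketch: record a directory's own size while entries are added and removed.
# Assumes GNU coreutils/findutils; dir_size is a hypothetical helper name.

dir_size() {
    # %s is the size in bytes that stat reports for the directory itself
    stat -c '%s' "$1"
}

N=${N:-5000}          # number of empty files to create (arbitrary default)
d=$(mktemp -d)

before=$(dir_size "$d")

i=1
while [ "$i" -le "$N" ]; do
    : > "$d/$i"       # create an empty file without forking touch
    i=$((i + 1))
done
full=$(dir_size "$d")

# Remove every entry but keep the directory (and its inode) in place.
find "$d" -mindepth 1 -delete
after=$(dir_size "$d")

echo "empty: $before  full: $full  emptied: $after"
rmdir "$d"
```

On ext2/ext3/ext4 I would expect the last two figures to match, as in the transcript above; other filesystems may well shrink the directory again.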
My questions are:
- Why does the size/block count of a directory increase monotonically?
- Is this a function of the underlying filesystem or the Linux VFS?
- Can the directory size ever be reduced without deleting and recreating the directory?
- Bonus points: Point me at the kernel source code where this behavior is implemented.
Here are the answers that are true for ext2/ext3/ext4. Whether they hold for other file systems depends on their implementation.
- user48838 answered this one correctly. More files consume more metadata. Directory space is allocated in 4K chunks, or in whatever block size was chosen when the file system was created.
- Yes, it is a feature/problem of the underlying file system, not of the VFS.
- In an ext3 file system this is not possible. The only way is to recreate the (empty) directory.
- The source code is around here and in related files
But you are in luck: if you recreate the same number of files that you deleted, the directory size will stay the same, because the freed space is reused. It will only grow again once you add more files than it held before.
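If the larger size does matter, the recreate-the-directory workaround from the list above can be sketched roughly as follows. The function name rebuild_dir is hypothetical, the script assumes GNU mktemp, chmod, find and mv, and it is not atomic: nothing else should be using the directory while it runs. It copies only the mode bits of the old directory, not its ownership, ACLs or extended attributes.

```shell
#!/bin/sh
# Sketch: replace DIR with a freshly created directory holding the same
# entries, so the directory starts again at its minimum size.
# rebuild_dir is a made-up helper name; GNU tools are assumed.

rebuild_dir() {
    old=${1%/}
    # Create the replacement next to the original, on the same filesystem.
    new=$(mktemp -d "$old.XXXXXX") || return 1
    # Copy the permission bits of the old directory (GNU chmod).
    chmod --reference="$old" "$new" || return 1
    # Move every entry, dotfiles included, into the new directory.
    find "$old" -mindepth 1 -maxdepth 1 -exec mv -t "$new" -- {} + || return 1
    # Swap the new directory into place.
    rmdir "$old" && mv "$new" "$old"
}

# Example call (not executed here):
#   rebuild_dir /test
```

The rebuilt directory gets a new inode number, so anything that held the old directory open will not see the replacement.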
The block increments that you are seeing are due to how the file system manages its storage of files and related file management information. In your described situation, that would appear to be increments of 4K, so each "new"/"unique" entry into the file system will reserve 4K, whether or not the actual data fills up the entire 4K. If the related data takes up the entire 4K, then another 4K block is reserved and filled as needed to store the entire related data stream/sequence.
Depending on "hard" versus "soft" deletes as managed by the file system, the deletion may not (usually not, for "undelete" functionality) immediately free the block(s) that were reserved. Some file systems may differentiate between types of "deletes" and provide corresponding storage block management capabilities.
How storage management is approached and implemented differs between file systems, so in OSes that support multiple/modular file systems, the OS will typically only provide "hooks" for the file system to integrate into.
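To put numbers on the 4K increments, here is the arithmetic for the stat output in the question. The three values are copied from that output. The only outside fact is that stat's "Blocks" field counts 512-byte units.

```shell
#!/bin/sh
# Figures copied from the stat output of the full directory in the question.
size=155648      # "Size:" in bytes
blocks=312       # "Blocks:" in 512-byte units
fs_block=4096    # "IO Block:" in bytes

echo "4K blocks implied by Size:   $((size / fs_block))"        # prints 38
echo "4K blocks actually allocated: $((blocks * 512 / fs_block))" # prints 39
```

So the directory grew from 1 block to 38 blocks of entries. The one extra allocated block is presumably file system bookkeeping (on ext3 it could be an indirect block, since the directory has outgrown its direct block pointers), but that is my assumption rather than something the output proves.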
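Since the behaviour depends on the file system, it helps to check which one actually backs a given directory before drawing conclusions. A minimal sketch, assuming GNU stat (the helper name fs_type is made up):

```shell
#!/bin/sh
# Sketch: print the type of the filesystem that holds a given path.
# With GNU stat, -f queries the filesystem instead of the file and
# %T prints the filesystem type in human-readable form.

fs_type() {
    stat -f -c '%T' "$1"
}

fs_type /
```

The name printed is whatever GNU stat maps the file system's magic number to, so closely related file systems may be reported under a shared name.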
Adding some rambling commentary to user48838's good answer:
Everything is a file, including directories. To store all that file information, you need space.
It would also be valid to show, say, '64B used' for a small directory, i.e. the space its entries actually occupy, but we'd be using multiples of 4K on disk anyways, so it was a design decision to just show the amount of space allocated.
From a FS design perspective, why would you bother going through the trouble of calculating what was used? Not necessary. And then you'd have to move entries to avoid leaving holes… ick.
When deletes happen and dir size drops so that you could free up a block, all that management would need to happen before you could actually do so. Why bother to save a few KB? Odds are you'll have to expand it later anyways.
Left as an exercise for the reader: Think about why your /lost+found directory is created empty but takes up 16K (on ext3 at least).