How many files can I put in a directory?
Does it matter how many files I keep in a single directory? If so, how many files in a directory is too many, and what are the impacts of having too many files? (This is on a Linux server.)
Background: I have a photo album website, and every image uploaded is renamed to an 8-hex-digit id (say, a58f375c.jpg). This is to avoid filename conflicts (if lots of "IMG0001.JPG" files are uploaded, for example). The original filename and any useful metadata is stored in a database. Right now, I have somewhere around 1500 files in the images directory. This makes listing the files in the directory (through FTP or SSH client) take a few seconds. But I can't see that it has any effect other than that. In particular, there doesn't seem to be any impact on how quickly an image file is served to the user.
I've thought about reducing the number of images by making 16 subdirectories: 0-9 and a-f. Then I'd move the images into the subdirectories based on what the first hex digit of the filename was. But I'm not sure that there's any reason to do so except for the occasional listing of the directory through FTP/SSH.
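A rough sketch of what I mean (in C just for illustration; the images/ root and the helper name are made up):

```c
/* Sketch of the proposed bucketing: use the first hex digit of the image id
 * as the subdirectory name, e.g. a58f375c.jpg -> images/a/a58f375c.jpg. */
#include <stdio.h>

static void bucketed_path(const char *id, char *out, size_t outlen)
{
    /* id[0] is one of 0-9 or a-f, so this yields 16 subdirectories */
    snprintf(out, outlen, "images/%c/%s", id[0], id);
}

int main(void)
{
    char path[64];
    bucketed_path("a58f375c.jpg", path, sizeof path);
    puts(path);   /* prints: images/a/a58f375c.jpg */
    return 0;
}
```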
Solution 1:
FAT32:
- Maximum number of files: 268,173,300
- Maximum number of files per directory: 2^16 - 1 (65,535)
- Maximum file size: 2 GiB - 1 without LFS, 4 GiB - 1 with
NTFS:
- Maximum number of files: 2^32 - 1 (4,294,967,295)
- Maximum file size
  - Implementation: 2^44 - 2^16 bytes (16 TiB - 64 KiB)
  - Theoretical: 2^64 - 2^16 bytes (16 EiB - 64 KiB)
- Maximum volume size
  - Implementation: 2^32 - 1 clusters (256 TiB - 64 KiB)
  - Theoretical: 2^64 - 1 clusters (1 YiB - 64 KiB)
ext2:
- Maximum number of files: 10^18
- Maximum number of files per directory: ~1.3 × 10^20 (performance issues past 10,000)
- Maximum file size
  - 16 GiB (block size of 1 KiB)
  - 256 GiB (block size of 2 KiB)
  - 2 TiB (block size of 4 KiB)
  - 2 TiB (block size of 8 KiB)
- Maximum volume size
  - 4 TiB (block size of 1 KiB)
  - 8 TiB (block size of 2 KiB)
  - 16 TiB (block size of 4 KiB)
  - 32 TiB (block size of 8 KiB)
ext3:
- Maximum number of files: min(volumeSize / 2^13, numberOfBlocks)
- Maximum file size: same as ext2
- Maximum volume size: same as ext2
ext4:
- Maximum number of files: 2^32 - 1 (4,294,967,295)
- Maximum number of files per directory: unlimited
- Maximum file size: 2^44 - 1 bytes (16 TiB - 1)
- Maximum volume size: 2^48 - 1 bytes (256 TiB - 1)
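For a sense of scale on the ext3 formula above (one inode per 2^13 = 8 KiB of volume): a 1 TiB volume gives 2^40 / 2^13 = 2^27, i.e. roughly 134 million files at most, and that inode count is fixed when the filesystem is created.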
Solution 2:
I have had over 8 million files in a single ext3 directory. The reason ls and find are slow on a directory like that is that libc readdir(), which is used by find, ls and most of the other methods discussed in this thread to list large directories, only reads 32 KB of directory entries at a time, so on slow disks it takes many, many reads to list the whole directory. There is a solution to this speed problem. I wrote a pretty detailed article about it at: http://www.olark.com/spw/2011/08/you-can-list-a-directory-with-8-million-files-but-not-with-ls/
The key takeaway is: use getdents() directly (http://www.kernel.org/doc/man-pages/online/pages/man2/getdents.2.html) rather than anything based on libc readdir(), so that you can specify the buffer size when reading directory entries from disk.
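As a rough illustration of that approach (a minimal sketch, not the program from the article), the following lists a directory by calling getdents64() through syscall() with a multi-megabyte buffer; the buffer size is an arbitrary choice:

```c
/* List a directory using the raw getdents64 syscall with a large buffer,
 * so each system call returns a big batch of entries (Linux only).
 * Build with: gcc -o bigls bigls.c */
#define _GNU_SOURCE
#include <dirent.h>       /* struct dirent64 */
#include <fcntl.h>        /* open, O_RDONLY, O_DIRECTORY */
#include <stdio.h>
#include <stdlib.h>
#include <sys/syscall.h>  /* SYS_getdents64 */
#include <unistd.h>       /* syscall, close */

#define BUF_SIZE (5 * 1024 * 1024)   /* 5 MiB: many entries per syscall */

int main(int argc, char *argv[])
{
    int fd = open(argc > 1 ? argv[1] : ".", O_RDONLY | O_DIRECTORY);
    if (fd == -1) { perror("open"); return 1; }

    char *buf = malloc(BUF_SIZE);
    if (buf == NULL) { perror("malloc"); return 1; }

    for (;;) {
        /* Ask the kernel for as many directory entries as fit in buf. */
        long nread = syscall(SYS_getdents64, fd, buf, BUF_SIZE);
        if (nread == -1) { perror("getdents64"); return 1; }
        if (nread == 0)              /* end of directory */
            break;

        /* Walk the variable-length records in the buffer. */
        for (long pos = 0; pos < nread; ) {
            struct dirent64 *d = (struct dirent64 *)(buf + pos);
            puts(d->d_name);
            pos += d->d_reclen;
        }
    }

    free(buf);
    close(fd);
    return 0;
}
```

Compared with readdir()'s 32 KB chunks, a buffer this size lets a directory with millions of entries be read back in a handful of system calls.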
Solution 3:
I have a directory with 88,914 files in it. Like yours, it is used for storing thumbnails, and it is on a Linux server.
Listing the files via FTP or a PHP function is slow, yes, but there is also a performance hit when serving a file: e.g. www.website.com/thumbdir/gh3hg4h2b4h234b3h2.jpg has a wait time of 200-400 ms. By comparison, on another site I have with around 100 files in a directory, the image is displayed after just ~40 ms of waiting.
I've given this answer because most people have only written about how directory search functions will perform, which you won't be using on a thumbnail folder that just serves files statically; what matters there is the performance of actually using the files.