What is the best Linux filesystem for MySQL (InnoDB)?
I tried to look for benchmarks comparing the performance of various filesystems with MySQL InnoDB but couldn't find any.
My database workload is the typical web-based OLTP, about 90% read, 10% write. Random IO.
Among popular filesystems such as ext3, ext4, xfs, jfs, Reiserfs, Reiser4, etc. which one do you think is the best for MySQL?
How much do you value the data?
Seriously, each filesystem has its own tradeoffs. Before I go much further, I am a big fan of XFS and Reiser both, although I often run Ext3. So there isn't a real filesystem bias at work here, just letting you know...
If the filesystem is little more than a container for you, then go with whatever provides you with the best access times.
If the data is of any significant value, you will want to avoid XFS on older kernels. Why? Because if it can't recover a portion of a file that is journaled, it will zero out the blocks and make the data unrecoverable. This issue is fixed as of Linux kernel 2.6.22.
ReiserFS is a great filesystem, provided that it never crashes hard. The journal recovery works fine, but if for some reason you lose your partition info, or the core blocks of the filesystem are blown away, you may have a quandary if there are multiple ReiserFS partitions on a disk, because the recovery mechanism basically scans the entire disk, sector by sector, looking for what it "thinks" is the start of the filesystem. If you have three partitions with ReiserFS but only one is blown, you can imagine the chaos this will cause as the recovery process stitches together a Frankenstein mess from the other two systems...
Ext3 is "slow", in an "I have 32,000 files and it takes time to find them all running ls" kinda way. If you're going to have thousands of small temporary tables everywhere, you will have a wee bit of grief. Newer versions now include a directory index option (dir_index) that dramatically cuts down the directory traversal, but it can still be painful.
I've never used JFS. I can only comment that every review of it I've ever read has been something along the lines of "solid, but not the fastest kid on the block". It may merit investigation.
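If you want a feel for the Ext3 directory-listing cost described above on your own hardware, a rough sketch like the following can help. It is only an illustration: the function name and the `tmp_table_N.frm` file names are made up, the files are empty, and the listing will usually be served from a warm cache, so treat the numbers as a relative comparison between filesystems rather than a benchmark.

```python
import os
import tempfile
import time


def time_directory_listing(file_count):
    """Create file_count empty files in a scratch directory and time one listing."""
    with tempfile.TemporaryDirectory() as scratch:
        for i in range(file_count):
            # Empty files are enough here: only the directory lookup is measured.
            open(os.path.join(scratch, f"tmp_table_{i}.frm"), "w").close()
        start = time.perf_counter()
        names = os.listdir(scratch)
        elapsed = time.perf_counter() - start
    # The scratch directory and its files are removed when the block exits.
    return len(names), elapsed
```

Calling `time_directory_listing(32_000)` with the `TMPDIR` environment variable pointed at each candidate filesystem gives a crude side-by-side number. Note that `ls` also sorts and may stat each entry, so it will be slower than a bare listing.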
Enough of the Cons, let's look at the Pros:
XFS:
- screams with enormous files, fast recovery time
- very fast directory search
- Primitives for freezing and unfreezing the filesystem for dumping
ReiserFS:
- Highly optimal small-file access
- Packs several small files into same blocks, conserving filesystem space
- fast recovery, rivals XFS recovery times
Ext3:
- Tried and true, based on well-tested Ext2 code
- Lots of tools around to work with it
- Can be re-mounted as Ext2 in a pinch for recovery
- Can be both shrunk and expanded (other filesystems can only be expanded)
- Newest versions can be expanded "live" (if you're that daring)
So you see, each has its own quirks. The question is, which is the least quirky for you?
It may also be worth noting that you can run InnoDB without a filesystem and improve performance without filesystem overhead. I'm not sure I'd recommend it, but I've used it before without issues.
InnoDB Raw Devices
In addition, if you're running at 90% reads and 10% writes, unless you need the transactional ability of InnoDB you might look into porting to MyISAM for better read performance.
The answers here are seriously outdated and need updating, as this is coming up in Google results.
For production environments, XFS. Every time. XFS is journaled and non-blocking. Make sure you have the following variables for a modern (2011/2012) MySQL database using InnoDB in production:
innodb_file_per_table = 1
innodb_flush_log_at_trx_commit = 1 # an ACID requirement
sync_binlog = 1 # more ACID
innodb_flush_method = O_DIRECT
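A quick way to audit an existing config against the four settings above is a small script along these lines. This is a minimal sketch, not a full my.cnf parser: the function names are my own, it only looks at the `[mysqld]` section, it does not follow `!include` directives, it strips `#` comments naively, and it compares values as plain strings (so `ON` would not count as `1`).

```python
RECOMMENDED = {
    "innodb_file_per_table": "1",
    "innodb_flush_log_at_trx_commit": "1",
    "sync_binlog": "1",
    "innodb_flush_method": "O_DIRECT",
}


def parse_mysqld_section(text):
    """Return the key/value pairs found under [mysqld] in my.cnf-style text."""
    settings = {}
    section = None
    for raw in text.splitlines():
        # Drop trailing '#' comments (naive: does not handle quoted values).
        line = raw.split("#", 1)[0].strip()
        if not line or line.startswith(";") or line.startswith("!"):
            continue  # blank line, ';' comment, or an !include directive
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip().lower()
            continue
        if section == "mysqld" and "=" in line:
            key, value = line.split("=", 1)
            # MySQL accepts dashes and underscores interchangeably in option names.
            settings[key.strip().replace("-", "_")] = value.strip()
    return settings


def find_mismatches(text, recommended=RECOMMENDED):
    """Map each setting that differs from the recommendation to (found, wanted)."""
    found = parse_mysqld_section(text)
    return {
        key: (found.get(key), wanted)
        for key, wanted in recommended.items()
        if found.get(key) != wanted
    }
```

Feed it the contents of your config file, e.g. `find_mismatches(open("/etc/my.cnf").read())` (the path is an assumption, yours may differ). An empty dict means all four settings match; a `None` in the "found" slot means the setting is not set explicitly in that file.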
Do not use EXT3 or even EXT4. One day BTRFS will get there.
EXT3, and perhaps EXT4, locks at the inode level, not smart!
Sources:
- www.mysqlperformanceblog.com
- http://dev.mysql.com/doc/internals/en/index.html
- Understanding MySQL Internals by Sasha Pachev
- https://www.facebook.com/note.php?note_id=10150210901610933
- http://oss.sgi.com/projects/xfs/training/
- Some swing kit, trial and error.
EDIT: An update. EXT4 seems to be doing pretty well as of mid-2013! BTRFS still isn't a good option. And RHEL may well make XFS the new default file system. Again, do NOT use EXT3.
The short version is that the closest thing to a recommendation I've seen MySQL make on filesystems is XFS. However, ext3 should be OK as well. ext4 promises to be a nice improvement, but it's still not quite stable, although it should be before the end of the year.
If you're running cluster filesystems, CXFS, OCFS2 and GFS should all be OK.
I'd strongly warn against any Reiser derivatives, and JFS, although once nice, has been mostly beaten by XFS and ext4, which are both more widely deployed.
It's not likely to make much difference. Go with whatever your distribution uses as its default, provided it's sufficient.
Spend your effort tuning other things:
- Get enough RAM
- Get a RAID controller which doesn't suck
- Fix the application's lame (ab)use of the database (NB: this is the main culprit in most cases where it hasn't already been done)
Do, however, carefully consider the filesystem you put your MySQL tmpdir on; this will affect performance, particularly for queries which do disc-based filesort()s (see EXPLAIN for more details).
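If you're not sure which filesystem your tmpdir actually lives on, something like this will tell you. It is a sketch that assumes Linux and the usual `/proc/mounts` layout (device, mount point, type, ...); `filesystem_for` is just a name I picked, and mount points containing spaces (which the kernel escapes as `\040`) are not handled.

```python
import os


def filesystem_for(path, mounts_text):
    """Return (mount_point, fs_type) for the longest mount point containing path."""
    target = os.path.abspath(path).rstrip("/") + "/"
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # not a well-formed mounts line
        mount_point, fs_type = fields[1], fields[2]
        # Compare with a trailing slash so /tmp does not match /tmpfoo.
        prefix = mount_point.rstrip("/") + "/"
        if target.startswith(prefix):
            # '>=' lets a later mount on the same point win over an earlier one.
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, fs_type)
    return best
```

Typical use would be `filesystem_for(tmpdir, open("/proc/mounts").read())`, where `tmpdir` is whatever `SHOW VARIABLES LIKE 'tmpdir'` reports on your server.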
I think a filesystem which supports delayed allocation is really handy here, as you can avoid IO completely for short-lived files when there is enough RAM to keep them in the cache. XFS, for instance, doesn't bother writing files at all if they get closed and deleted before their blocks have been allocated.
Of course putting a tmpdir on a tmpfs is attractive from a performance perspective, but leads to a risk of exhausting the space and having queries which would otherwise succeed (albeit with disc temporary files used) fail.
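If you do go the tmpfs route, it's worth at least watching how much room is left so you get a warning before queries start failing. A minimal sketch, where the function name and the threshold are purely illustrative and you'd wire the result into whatever monitoring you already have:

```python
import shutil


def tmpdir_headroom(path, min_free_bytes):
    """Return (free_bytes, ok); ok is True when free space meets the threshold."""
    usage = shutil.disk_usage(path)
    return usage.free, usage.free >= min_free_bytes
```

For example, `tmpdir_headroom("/tmp", 2 * 1024**3)` checks for 2 GiB free on `/tmp`. How much headroom is enough depends on the largest on-disk temporary table or filesort your workload produces, which you will have to measure yourself.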