Is the F.U.D. about ext4 justified? Or would it be safe to use in some production systems?

I am wondering if ext4 is safe to use on my servers. But I've heard so much FUD about it that I am concerned.

Our system could lose some data, and it would not be too big a deal. Even a full day's worth of data would not ruffle too many feathers. And our system could most definitely benefit from delayed writes.

That said, a full file system restore from backup would take days and be unacceptable.

Any experience or informed opinions on the subject out there?


Solution 1:

Honestly, I'd hold off on ext4 right now for production use.

There are other options if you're running into real performance problems with the filesystem. (I can understand that situation: at my last job we had performance limitations in an application due to ext3.) Depending on your chosen distribution, you might be able to use jfs, xfs, or reiserfs. All three will generally outperform ext3 in different ways, and all three are far better tested and more stable than ext4 right now.

So, my recommendation comes in multiple parts. First, investigate thoroughly to make sure you're optimizing in the right place. Test your application on different filesystems and confirm that performance improves enough to make a filesystem change worthwhile.
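As a rough starting point for that kind of comparison, here is a minimal sketch in Python that times a burst of small, fsync'd file writes in a given directory. It is only a stand-in for testing your real application: the function name, the file count and size, and the mount points in the example are all assumptions for illustration, and a synthetic loop like this says nothing about read patterns or concurrency.

```python
import os
import tempfile
import time


def time_small_writes(directory, count=200, size=4096):
    """Create `count` files of `size` bytes each, fsync every one,
    and return the elapsed time in seconds."""
    payload = b"x" * size
    # A private subdirectory keeps the test files apart from real data.
    workdir = tempfile.mkdtemp(dir=directory)
    start = time.perf_counter()
    try:
        for i in range(count):
            path = os.path.join(workdir, "f%06d" % i)
            with open(path, "wb") as fh:
                fh.write(payload)
                fh.flush()
                # Force the data to disk so the page cache does not hide the cost.
                os.fsync(fh.fileno())
        elapsed = time.perf_counter() - start
    finally:
        # Clean up the test files whether or not the run succeeded.
        for name in os.listdir(workdir):
            os.remove(os.path.join(workdir, name))
        os.rmdir(workdir)
    return elapsed


if __name__ == "__main__":
    # Hypothetical mount points: replace with directories that live on
    # the filesystems you actually want to compare.
    for mount in ("/mnt/ext3-test", "/mnt/xfs-test"):
        if os.path.isdir(mount):
            print(mount, "%.3f s" % time_small_writes(mount))
```

Run it several times on each filesystem on otherwise idle hardware, and treat the numbers as a sanity check rather than a verdict. Your application's own workload is the benchmark that matters.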

Also, depending on your application, adding more RAM might improve performance. Linux, by default, will use any RAM that isn't committed to applications as disk cache. Sometimes having a few GB of "unused" RAM can yield a significant performance increase on boxes with heavy disk activity.
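To see how much memory the kernel is currently using as disk cache, you can read /proc/meminfo. The sketch below is one way to do that in Python. The helper names are made up for illustration, and it assumes the usual `Name:   value kB` layout with `MemTotal`, `Buffers`, and `Cached` fields.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> number.
    Most fields are reported in kB."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def cache_summary(fields):
    """Return (total_kb, cached_kb, cached_fraction), counting
    Cached + Buffers as disk cache."""
    total = fields.get("MemTotal", 0)
    cached = fields.get("Cached", 0) + fields.get("Buffers", 0)
    fraction = cached / total if total else 0.0
    return total, cached, fraction


if __name__ == "__main__":
    # Linux only: /proc/meminfo does not exist on other systems.
    with open("/proc/meminfo") as fh:
        total, cached, fraction = cache_summary(parse_meminfo(fh.read()))
    print("MemTotal: %d kB, cache + buffers: %d kB (%.0f%%)"
          % (total, cached, fraction * 100))
```

If the cached figure is already pinned near the total while the box is busy with disk I/O, that is a hint (not proof) that more RAM could help.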

Finally, what's your timeline requirement here? If ext3 wasn't cutting it and I had to build a machine with a different filesystem today, I'd probably use xfs or jfs. If I could push it off for 6-8 months, I'd probably wait and see how ext4 has shaped up.

Solution 2:

Certainly Ubuntu 9.04 (Jaunty) is still working out the bugs of ext4 in its version of kernel 2.6.28. Some bugs appear to be only in the Ubuntu kernel rather than the mainline, but that indicates that if you have a non-mainline kernel you may run into similar troubles.

This page is a search of issues with ext4 that could be worth a browse. One current (6th May 2009) serious issue that causes the kernel to lock up is issue 330824. A previous issue (now fixed) involved data loss. But I have not heard of any loss of an entire file system, and I think it would be big news if it were happening.

So I would say it is not entirely ready for prime time. If you really need it, then it may be worth setting up a test server to play with it. For the time being I would stick with the mainline kernels and measure the performance gain. If the gain is dramatic and stress testing does not turn up any problems, then it might be worth trying it ...