I have a server with two HDDs (2× 1 TB) running in RAID 1 (software RAID). I want to improve I/O performance by using flashcache. The server runs KVM virtual machines, which use LVM.

Regarding this, I have the following questions:

  • Will this even work? flashcache works on block devices, but these are all virtual machines with their own setups.
  • How much of a performance increase can I expect? Most of the virtual machines run websites and some host games.
  • How big does the SSD need to be? Would a bigger SSD increase performance, since it can cache more files?
  • What happens if the SSD dies? Would flashcache fall back to the traditional HDDs so that I could simply replace the SSD?
  • How much faster would writeback be in comparison with writethrough and writearound?

Unfortunately I have no access to a test system, so could I install flashcache on a live server without unmounting the disks? I found a great tutorial here which I would be using.


Solution 1:

Flashcache, for those who haven't seen it before, is a method for extending the Linux block cache with an SSD. It's cheaper than running a server with half a TB of RAM just for caching.

Will this even work?

It should. The Linux block cache works by caching accessed blocks, not files. As long as you're not giving the KVM machines direct access to the physical block devices (you're not), the Linux block cache will be in play. If you were giving the KVM machines direct block-device access, the answer would be less clear.

If you're using file-backed virtual-disks, it'll definitely work.

If you're using LV-backed virtual-disks, I don't know.
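
One way to see which case applies is to list each guest's disks from the host. A quick check (a sketch; "myguest" and the image path are placeholders for your own VM and disk image):

# Source paths under /dev/<vg>/<lv> mean LV-backed disks,
# regular file paths mean file-backed disks.
virsh domblklist myguest

# For file-backed images, qemu-img shows the format and any backing file.
qemu-img info /var/lib/libvirt/images/myguest.img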

How much of a performance increase can I expect?

That is something we can't answer; it depends on too many variables. In the abstract, you'll get the best results by sizing your SSD to be larger than the active set of blocks. If you get perfect caching, your performance will be similar to running your entire system on SSDs, which is effectively what you'd be doing.

How big does the SSD need to be?

Finding out the exact size you need is something we can't help with. More is better, obviously, but finding the exact ratio between cache SSD and primary storage is not a simple matter.

Complicating this are writes set to flush immediately, such as certain file-system operations and some database configurations. Those writes will only be briefly cached, and their performance will not be affected in any way by the presence or absence of flashcache.
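
In practice the most useful sizing signal is empirical: flashcache exposes per-device hit/miss counters, so you can watch the hit rate under your real workload and grow the cache if it stays low. A rough sketch (assuming you named the cache device "cachedev"; the exact counters shown vary with the flashcache version):

# Statistics for the flashcache device-mapper target,
# including read/write hits and hit percentages.
dmsetup status cachedev

# Depending on the version, the same counters may also appear under
# /proc/flashcache/<cachedev>/flashcache_stats.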

What happens if the SSD dies?

The same thing that happens when you tell Linux to drop caches, but with a twist. With drop-caches, any unflushed writes in the block cache get flushed to disk. What happens when the SSD disappears depends on the caching mode:

Writethrough: All writes are written to the cache and primary storage in parallel, so the chances of a sudden SSD loss causing errors on the VMs are very small.

Writearound: All writes are written to primary storage and only cached when read. No chance of errors in the VMs.

Writeback: All writes go to the cache first and are written to primary storage in the background. This is the mode most likely to cause errors in your VMs should the SSD fail, and I wouldn't use it in production.
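
For reference, the mode is chosen when the cache device is created. A minimal sketch with flashcache_create, where /dev/sdc1 stands in for your SSD partition and /dev/md0 for your RAID 1 array (you would run only one of these):

flashcache_create -p thru   cachedev /dev/sdc1 /dev/md0   # writethrough
flashcache_create -p around cachedev /dev/sdc1 /dev/md0   # writearound
flashcache_create -p back   cachedev /dev/sdc1 /dev/md0   # writeback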

How much faster would writeback be in comparison with writethrough and writearound?

That depends on how much writing you're doing. If your writes periodically saturate your primary storage, the performance increase could be rather significant. If your workload is mostly reads with some writes, you're not likely to notice an improvement.

Also, writeback is a bad policy for what you're doing, so don't use it.
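
If you want numbers rather than guesses, a small random-write benchmark against a scratch file on a cached volume, repeated once per caching mode, will show the difference on your hardware. A sketch using fio (the path and sizes are placeholders; don't point it at data you care about):

# 4k random writes against a scratch file on a flashcache-backed volume.
fio --name=writetest --filename=/mnt/scratch/fio.dat --size=2G \
    --rw=randwrite --bs=4k --iodepth=16 --ioengine=libaio \
    --direct=1 --runtime=60 --time_based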

Solution 2:

Yes, it will work fine as long as you use the right block devices. And there's a trick.

When LVM scans for PVs, it will see the same partition both through the actual hard drive itself and through the flashcache "virtual" device.

One obvious symptom should be that LVM tools complain about duplicate PVs.

The fix, both to avoid those warnings and, more importantly, to make sure that the flashcache device is the one used by LVM2, is to adapt the filter in /etc/lvm/lvm.conf.

The lvm.conf(5) manpage will explain it better than I can, but I'll leave you with an example, assuming all physical volumes are backed by flashcache:

filter = [ "a/.*dm.*/" ]
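
Note that a filter containing only "accept" patterns still lets LVM use devices that match no pattern at all. If you want the underlying HDD partitions rejected explicitly as well, a stricter variant (a sketch, assuming your flashcache devices appear under /dev/mapper; adjust the patterns to your layout) would be:

filter = [ "a|/dev/mapper/.*|", "r|.*|" ]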

Solution 3:

There is also tier, from the creator of lessfs. It allows you to create hybrid devices combining an SSD and an HDD, and tier's performance seems to outperform flashcache's.

http://www.lessfs.com/wordpress/

http://www.lessfs.com/wordpress/?p=776

//Christian

Solution 4:

Some applications open files in a non-buffered way.

http://man7.org/linux/man-pages/man2/open.2.html

O_DIRECT (since Linux 2.4.10) Try to minimize cache effects of the I/O to and from this file. In general this will degrade performance, but it is useful in special situations, such as when applications do their own caching. File I/O is done directly to/from user-space buffers. The O_DIRECT flag on its own makes an effort to transfer data synchronously, but does not give the guarantees of the O_SYNC flag that data and necessary metadata are transferred. To guarantee synchronous I/O, O_SYNC must be used in addition to O_DIRECT. See NOTES below for further discussion.

This is very common for databases, for example, so double-check whether flashcache works with this set of applications.
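
A quick way to check whether a given application really opens its files with O_DIRECT is to watch its open calls. A rough sketch using strace, where 1234 is a placeholder for the PID of the database or other suspect process:

# Print open/openat calls of a running process;
# flags containing O_DIRECT indicate unbuffered I/O.
strace -f -e trace=open,openat -p 1234 2>&1 | grep O_DIRECT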