Do I need to RAID Fusion-io cards?

Can I run reliably with a single Fusion-io card installed in a server, or do I need to deploy two cards in a software RAID setup?

Fusion-io's marketing materials aren't very clear (almost misleading) on the topic. Given the cost of the cards, I'm curious how other engineers deploy them in real-world scenarios.

I plan to use the HP-branded Fusion-io ioDrive2 1.2TB card for a proprietary standalone database solution running on Linux. This is a single server setup with no real high-availability option. There is asynchronous replication with a 10-minute RPO that mirrors transaction logs to a second physical server.
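(For context, the log shipping is just a periodic copy of the transaction log directory to the standby box. The sketch below is only the general shape of it, with made-up paths and hostname; it is not the vendor's actual mechanism.)

    # Rough illustration of 10-minute transaction-log shipping.
    # Paths and hostname are hypothetical; this is not the database
    # vendor's actual replication mechanism.
    import subprocess
    import time

    LOG_DIR = "/var/lib/mydb/txlogs/"              # hypothetical log directory
    STANDBY = "standby01:/var/lib/mydb/txlogs/"    # hypothetical second server

    while True:
        # rsync only transfers new or changed log segments on each pass
        subprocess.run(["rsync", "-a", "--partial", LOG_DIR, STANDBY], check=True)
        time.sleep(600)                            # 600 s = the 10-minute RPO window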

Traditionally, I would specify a high-end HP ProLiant server with the top CPU stepping for this application. I need to go to SSD, and I'm able to acquire Fusion-io at a lower price than enterprise SAS SSD for the required capacity.

  • Do I need to run two ioDrive2 cards and join them with software RAID (md or ZFS), or is that unnecessary?
  • Should I be concerned about Fusion-io failure any more than I'd be concerned about a RAID controller failure or a motherboard failure?
  • System administrators like RAID. Does this require a different mindset, given the different interface and on-card wear-leveling/error-correction available in this form-factor?
  • What IS the failure rate of these devices?

Edit: I just read a Fusion-io reliability whitepaper from Dell, and the takeaway seems to be "Fusion-io cards have lots of internal redundancies... Don't worry about RAID!!".


Ultimately, it comes down to your failure model. What is the impact of a failure?

Historically, we've always RAIDed everything since the cost of doing so has been negligible. Another $500 for a drive for mirroring? Totally worth the cost without even considering it.

When you're talking about another $10K+ to turn on mirroring, it needs a bit more consideration.


No, you do not need to mirror

The Fusion-io cards do have quite good internal redundancy; this isn't the kind of hardware where your disk is a single chip. In most of the situations where I've observed failure, it's been a firmware problem that affected both members of a mirror, so RAID would not have mattered.

Think of a Fusion-io card as a RAID controller with disks behind it. Are you fine with a single-controller setup? Probably. Treat it like that.
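And if you do treat it like a single controller, watch its health the way you would watch a controller's. A minimal sketch, assuming the fio-status utility from the ioMemory/VSL driver package is installed (output wording varies by driver version, so the string match is an assumption to adjust):

    # Hedged health check built around fio-status; adjust the string match
    # to whatever your VSL version actually reports.
    import subprocess

    def card_looks_healthy():
        out = subprocess.run(["fio-status", "-a"],
                             capture_output=True, text=True, check=True).stdout
        return "healthy" in out.lower()

    if not card_looks_healthy():
        print("ioDrive reporting a non-healthy state - investigate or open a support case")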

In many setups where you would deploy Fusion-io drives, you'll have other safeguards built in (redundancy at the node level), so mirroring the cards doesn't make as much sense.


Yes, you need to mirror

RAID increases your availability. Do you need absolute maximum availability despite the cost? Is a failure and the resulting downtime expensive? Then go ahead and mirror the drives. In a statistically large deployment, you will see drive failures despite the internal safeguards.
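To put a rough number on "statistically large": with an assumed annualized failure rate (the 1% figure below is purely an illustration, not vendor data), the chance of seeing at least one failure grows quickly with fleet size.

    # Back-of-envelope: probability of at least one card failure per year.
    # afr is an assumed placeholder, not a measured figure for ioDrive2 cards.
    afr = 0.01   # assume 1% annualized failure rate per card, for illustration only
    for n_cards in (1, 10, 100, 500):
        p_any = 1 - (1 - afr) ** n_cards
        print(f"{n_cards:>4} cards: {p_any:.1%} chance of at least one failure per year")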


The on-device redundancy should do the job just fine for failures of the flash chips - analogous to RAID among all of the components doing actual data storage.

Should I be concerned about Fusion-io failure any more than I'd be concerned about a RAID controller failure or a motherboard failure?

A failure of the entire device would be pretty much analogous to the loss of a RAID controller or motherboard. I'd be approximately as worried about the Fusion-io card as about those other single-point-of-failure components, though I don't have large-scale experience with these devices to compare failure rates with hard data.

Do I need to run two ioDrive2 cards and join them with software RAID (md or ZFS), or is that unnecessary?

Adding redundancy on top of what the device already has (say, software RAID across multiple Fusion-io cards) is a lot like running software RAID across two hardware RAID groups on two different RAID controllers: it might be worthwhile for systems that warrant extreme redundancy, to remove one more single point of failure, but not for common deployments (and your existing 10-minute-RPO replica should be good enough for most applications).
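For completeness, if you decided the extra layer was worth it, either option from the question is straightforward. A minimal sketch, assuming the ioMemory driver's default /dev/fioa and /dev/fiob device names; the filesystem, pool name, and mount point are placeholders:

    # Hedged sketch of the two software-RAID options mentioned in the question.
    # Device names assume the ioMemory driver's /dev/fio* naming; everything
    # else (filesystem, pool name, mount point) is a placeholder.
    import subprocess

    def run(cmd):
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)

    def md_mirror():
        # Option 1: classic md RAID1 across the two cards
        run(["mdadm", "--create", "/dev/md0", "--level=1",
             "--raid-devices=2", "/dev/fioa", "/dev/fiob"])
        run(["mkfs.xfs", "/dev/md0"])
        run(["mount", "/dev/md0", "/var/lib/mydb"])

    def zfs_mirror():
        # Option 2: a ZFS mirror vdev, which also adds checksumming
        run(["zpool", "create", "-o", "ashift=12",
             "-O", "mountpoint=/var/lib/mydb",
             "fiopool", "mirror", "/dev/fioa", "/dev/fiob"])

    if __name__ == "__main__":
        md_mirror()   # or zfs_mirror()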

Sysadmins like RAID. Does this require a different mindset, given the different interface and on-card wear-leveling/error-correction available in this form-factor?

Yeah, I think so. You're essentially getting a RAID controller and a bunch of storage devices behind it in one package. It's tempting to worry about putting your sensitive data on a single device, but you need some level of trust in its internal redundancy, counterbalanced with a healthy understanding that RAID is not a backup: always be prepared, with good backups, for a redundant component to fail or for a user to delete the data on it.


As you know, we've used their kit for a while, in both RAID and non-RAID setups. I wish I had some failure experience to give you, but I don't: we've had no failures that RAID would have helped with, and their on-board resilience features are only getting better. Also, the main function we use them for is now horizontally scaled/clustered, so we have even less reason to RAID them. Great cards, though; highly recommend them.