IOPS required to boot a KVM instance

Boot storms shouldn't be taken lightly, as they can cause seriously big storage arrays to grind to a halt. Avoiding the sudden storm caused by that many instances starting at once should be a major design consideration.

How many IOPS are really needed is difficult to model, so you have to measure it yourself. Which tools you choose for that depends on the NAS you have. Be more specific if you can.

It also depends heavily on what you are actually booting.


It depends on the VM, obviously - just booting GRUB? Maybe 50-100 operations. An Exchange server? Many, many more...

The answer is easy: benchmark one - it's the only way to know.
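One rough way to do that on a Linux guest: the counters in /proc/diskstats start at zero at boot, so reading them right after the system comes up gives the number of I/O operations the boot issued. The sketch below only parses that file's text; the function name, the device name `vda` and the sample line are illustrative assumptions, not something from this thread.

```python
def boot_io_ops(diskstats_text, device):
    """Return (reads_completed, writes_completed) for `device`.

    Expects text in /proc/diskstats format: major, minor, device name,
    then the counters. Field 4 is reads completed, field 8 is writes
    completed (1-based), both cumulative since boot.
    """
    for line in diskstats_text.splitlines():
        fields = line.split()
        if len(fields) >= 8 and fields[2] == device:
            return int(fields[3]), int(fields[7])
    raise ValueError(f"device {device!r} not found")


# Made-up sample line, for illustration only (not measured data).
SAMPLE = " 253       0 vda 8421 112 530112 3920 1675 940 61264 2210 0 4100 6130\n"

reads, writes = boot_io_ops(SAMPLE, "vda")
total_ops = reads + writes
```

In the guest you would pass in the contents of /proc/diskstats read shortly after boot. Dividing the total by the boot time gives an average IOPS figure, which will understate the peak.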

Oh, and does it have to be KVM? This sounds more like the kind of thing VMware's View product was built for.


Some comparative test data (200 VMware VMs) here: http://ctistrategy.com/2009/12/28/vmware-boot-storm-netapp-part-2/

Cheers


Booting 1000 instances of anything concurrently is pretty serious. Even if individual performance is acceptable at 20-30 IOPS (which would be a slow disk on a single machine), you're looking at 20-30K IOPS. Get yer checkbook out. It's actually worse than that, as most OSes will consume a lot more than that acceptable minimum if nothing is throttling them. As an example, if you give a Windows XP client VM effectively infinite IOPS by connecting it to an SSD array that can deliver 20K IOPS or more, I've seen individual virtual machines consume almost 1000 IOPS.
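The arithmetic behind those figures, as a small sketch. The function name is made up for illustration, and the unthrottled case is only an upper bound that assumes every VM could pull the roughly 1000 IOPS mentioned above at the same moment.

```python
def aggregate_iops(instances, iops_per_instance):
    """Total IOPS the storage must deliver if all instances run at once."""
    return instances * iops_per_instance


# 1000 instances at the "acceptable" 20-30 IOPS each.
low = aggregate_iops(1000, 20)    # 20000
high = aggregate_iops(1000, 30)   # 30000

# Assumed worst case: every VM unthrottled at about 1000 IOPS.
unthrottled = aggregate_iops(1000, 1000)  # 1000000
```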

Staging such boot sequences is vital. If these are very low-overhead systems then you might be able to get away with about 5 IOPS per system under steady state, but the boot storm is called that for a reason. Read/write IO ratios are also critically important - it's a lot easier (cheaper!) to deliver 5-10K IOPS for read-heavy IO, but the sustained IO patterns of typical non-server systems are very heavily write-biased, and it will be much more expensive to find a solution that can reliably deliver 5000 IOPS in a 50:50 R/W pattern than one that delivers 5000 IOPS in an 80:20 R/W pattern.
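A minimal sketch of sizing such a staged boot, assuming you have already measured a per-VM boot IOPS figure and know roughly what the array can sustain. The function name, the 5000 IOPS budget and the 60-second boot time are assumptions for illustration. The model ignores the steady-state load of VMs that have already booted and the R/W mix.

```python
import math


def staging_plan(instances, boot_iops_per_vm, array_iops_budget, boot_seconds):
    """Return (batch_size, batches, total_seconds) for a staged boot.

    Each batch is sized so that its combined boot IOPS stays within the
    array budget; batches are started one after another.
    """
    batch_size = array_iops_budget // boot_iops_per_vm
    if batch_size < 1:
        raise ValueError("array budget is below a single VM's boot IOPS")
    batch_size = min(batch_size, instances)
    batches = math.ceil(instances / batch_size)
    return batch_size, batches, batches * boot_seconds


# 1000 VMs at 30 IOPS each during boot, 5000 IOPS budget, 60 s per boot.
plan = staging_plan(1000, 30, 5000, 60)  # (166, 7, 420)
```

With these assumed numbers, 166 VMs boot per batch and the whole fleet is up after 7 batches, about 7 minutes, instead of needing 30K IOPS all at once.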

But seriously - there are very few storage solutions out there that can reliably boot 1000 VM instances concurrently.