VMware VMFS5 and LUN sizing - multiple smaller datastores, or 1 big datastore?

With VMFS5 no longer having the 2TB limit for a VMFS volume, I'm considering which scenario would be more beneficial overall:

  • Fewer LUNs of larger size, or
  • More LUNs of smaller size.

In my case, I have a new 24-disk storage array with 600GB disks. I'll be using RAID10, so roughly 7.2TB, and I'm trying to decide whether to go with one big 7TB datastore or multiple datastores of 1TB each.

What are the pros and cons of each approach?

Update: Of course, I neglected to include hot spares in my calculation, so it'll be just under 7.2TB, but the general idea is the same. :-)
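
For what it's worth, the maths is simple enough to script; the hot-spare count here is just a placeholder assumption:

    # Quick sanity check of the usable space, with the hot-spare count as a
    # placeholder assumption (adjust to the real spare policy).
    DISKS = 24
    DISK_GB = 600
    HOT_SPARES = 2                            # assumed for illustration only

    data_disks = DISKS - HOT_SPARES           # disks left for the RAID10 set
    usable_gb = (data_disks // 2) * DISK_GB   # RAID10 mirroring halves capacity
    print(f"Usable: {usable_gb} GB (~{usable_gb / 1000:.1f} TB)")
    # 24 disks / no spares -> 7200 GB; 22 disks + 2 spares -> 6600 GB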

Update 2: There are 60 VMs and 3 hosts. None of our VMs are particularly I/O intensive. Most of them are web/app servers, and also things like monitoring (munin/nagios), 2 Windows DCs with minimal load, and so on. DB servers are rarely virtualised unless they have VERY low I/O requirements. Right now I think the only virtual DB server we have is an MSSQL box and the DB on that box is <1GB.

Update 3: Some more info on the array and FC connectivity. The array is an IBM DS3524, 2 controllers with 2GB cache each. 4x 8Gbit FC ports per controller. Each ESXi host has 2x 4Gbit FC HBAs.


You didn't specify how many VMs you have or what they're going to be doing. Even without that information, I'd avoid making one big LUN, for block size/performance, contention and flexibility reasons.


I will assume you are going to virtualize servers, not desktops, all right? Next, I'm going to assume that you will use several ESX/ESXi servers to access your storage, managed by vCenter Server.

When deciding on LUN size and the number of VMFS datastores, you are balancing several factors: performance, configuration flexibility and resource utilisation, while staying within the supported maximum configuration of your infrastructure.

You could get the best performance with a 1 VM to 1 LUN/VMFS mapping. There is no competition between machines on the same VMFS, no locking contention, each load is separated, and all is good. The problem is that you are going to manage an ungodly number of LUNs, may hit supported maximum limits, face headaches with VMFS resizing and migration, have underutilised resources (those few percentage points of free space on each VMFS add up) and generally create a thing that is not nice to manage.

The other extreme is one big VMFS designated to host everything. You'll get the best resource utilisation that way, there will be no problem deciding what to deploy where, and no problem with VMFS X being a hot spot while VMFS Y is idling. The cost will be aggregate performance. Why? Because of locking. When one ESX host updates metadata on a given VMFS (creating or growing files, powering VMs on or off, and so on), it takes a SCSI reservation on the LUN; the others are locked out for the time it takes to complete the IO and have to retry. This costs performance. Outside playground/test and development environments, it is the wrong approach to storage configuration.

The accepted practice is to create datastores large enough to host a number of VMs, and to divide the available storage space into appropriately sized chunks. What that number of VMs is depends on the VMs. You may want only one or a couple of critical production databases on a VMFS, but allow three or four dozen test and development machines onto the same datastore. The number of VMs per datastore also depends on your hardware (disk size, rpm, controller cache, etc.) and access patterns (for any given performance level, you can host many more web servers than mail servers on the same VMFS).
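
To put very rough numbers on that last point: this is a back-of-the-envelope sketch, and every figure in it (spindles behind the LUN, per-disk IOPS, per-VM IOPS, read/write mix) is an assumption to be replaced with your own measurements:

    # Back-of-the-envelope: how many VMs of a given profile one datastore can
    # host. Every number here is an illustrative assumption, not a measurement.
    SPINDLES_PER_LUN = 6        # e.g. a 6-disk RAID10 group behind the LUN
    IOPS_PER_SPINDLE = 150      # ballpark for a 10k/15k SAS disk
    WRITE_PENALTY = 2           # RAID10: each front-end write costs 2 backend IOs
    READ_RATIO = 0.7            # assumed 70% read / 30% write mix

    profiles = {"web server": 30, "mail server": 150}   # assumed IOPS per VM

    raw_backend = SPINDLES_PER_LUN * IOPS_PER_SPINDLE
    frontend = raw_backend / (READ_RATIO + (1 - READ_RATIO) * WRITE_PENALTY)

    for name, per_vm in profiles.items():
        print(f"{name}: roughly {int(frontend // per_vm)} VMs per datastore")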

Smaller datastores also have one more advantage: they physically prevent you from cramming too many virtual machines onto each datastore. No amount of management pressure will fit an extra terabyte of virtual disks into half a terabyte of storage (at least until they hear about thin provisioning and deduplication).

One more thing: when creating those datastores, standardize on a single block size. It simplifies a lot of things later on, when you want to do something across datastores and would otherwise run into ugly "not compatible" errors.

Update: the DS3k series has active/passive controllers (i.e. any given LUN is served by either controller A or controller B; accessing the LUN through the non-owning controller incurs a performance penalty), so it will pay off to have an even number of LUNs, evenly distributed between the controllers.

I could imagine starting with 15 VMs/LUN with space to grow to 20 or so.
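
A minimal sketch of how that could play out for the 60 VMs mentioned in the question; the usable-capacity figure and the 15 VMs/LUN starting point are planning assumptions, not hard rules:

    # Sketch: carve the usable space into LUNs of ~15 VMs each, keeping an even
    # LUN count and alternating preferred controller ownership (A/B) so the
    # active/passive DS3k controllers share the load. Figures are assumptions.
    import math

    TOTAL_VMS = 60
    VMS_PER_LUN = 15
    USABLE_TB = 6.6              # assumed usable space once spares are excluded

    lun_count = math.ceil(TOTAL_VMS / VMS_PER_LUN)
    if lun_count % 2:            # round up to an even number for A/B balance
        lun_count += 1
    lun_size_tb = USABLE_TB / lun_count

    for i in range(lun_count):
        owner = "A" if i % 2 == 0 else "B"
        print(f"LUN {i + 1}: ~{lun_size_tb:.2f} TB, preferred controller {owner}")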


The short answer to your question is: it all depends on what your IO patterns are, and this will be unique to your environment.

I suggest you have a look at http://www.yellow-bricks.com/2011/07/29/vmfs-5-lun-sizing/ as it may help you think through your anticipated IOPS and how many LUNs might be suitable. That said, if you were to err on the side of caution, some people would advise having many LUNs (if my correction to a previous answer is approved, see my comments re LUN IO queues on the array side). I tend to agree, but would go further and join those LUNs together as extents of a single VMFS volume, or a few (don't believe the FUD about extents and other VMFS limits: http://virtualgeek.typepad.com/virtual_geek/2009/03/vmfs-best-practices-and-counter-fud.html). This gives you the benefit of managing a single datastore (or a few) within vSphere and, since vSphere automatically balances VMs across the available extents rather than simply filling the first extent before moving to the next, the performance benefit of spreading your IO over multiple LUNs.
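
On those LUN IO queues: each LUN gets its own device queue on the host, so more LUNs means more IO can be in flight before things start queuing host-side. A tiny illustration (the per-LUN queue depth of 32 is the commonly quoted ESXi default for FC HBAs, but treat it as an assumption and check your own hosts):

    # Why more LUNs can mean more usable parallelism: outstanding IO per host
    # scales with the number of device queues. The per-LUN depth of 32 is the
    # commonly quoted ESXi default for FC HBAs; verify it on your own hosts
    # (e.g. with esxtop) before relying on it.
    PER_LUN_QUEUE_DEPTH = 32     # assumption; check your HBA/host settings

    for lun_count in (1, 4, 8):
        total = lun_count * PER_LUN_QUEUE_DEPTH
        print(f"{lun_count} LUN(s): up to {total} outstanding IOs per host")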

Something else to consider... You say none of the VMs are particularly IO intensive. Given this, you may like to consider a combination of RAID5 and RAID10, to get the best of both worlds (space and speed).

Further, if you have your VMs configured with multiple VMDKs, with the OS and application IO patterns spread across those virtual disks (i.e. OS, web, DB, logs, etc. each on a separate VMDK), you can then locate each VMDK on a different datastore to match the IO abilities of that physical LUN (e.g. OS on RAID5, logs on RAID10). It's all about keeping similar IO patterns together to take advantage of the mechanical behaviour of the underlying disks so that, for example, log writes in one VM don't impact your web read rates in another VM.
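
To make the tiering idea concrete, here is a rough sketch; the write-penalty figures are the standard rules of thumb, but the spindle counts, per-disk IOPS, read/write mix and the role-to-tier mapping are purely illustrative assumptions:

    # Sketch: estimate what each RAID tier can deliver for an assumed IO mix,
    # then map VMDK roles onto tiers by IO pattern. Spindle counts, per-disk
    # IOPS and the mapping itself are illustrative, not a recommendation.
    IOPS_PER_SPINDLE = 150
    READ_RATIO = 0.7             # assumed 70% read / 30% write mix

    tiers = {
        "RAID10": {"spindles": 8, "write_penalty": 2},  # 2 backend IOs per write
        "RAID5":  {"spindles": 8, "write_penalty": 4},  # read-modify-write: 4
    }

    for name, t in tiers.items():
        raw = t["spindles"] * IOPS_PER_SPINDLE
        frontend = raw / (READ_RATIO + (1 - READ_RATIO) * t["write_penalty"])
        mix = f"{int(READ_RATIO * 100)}/{int((1 - READ_RATIO) * 100)}"
        print(f"{name}: ~{int(frontend)} front-end IOPS at a {mix} read/write mix")

    # Hypothetical placement of VMDKs by IO pattern:
    placement = {"OS": "RAID5", "web content": "RAID5",
                 "DB data": "RAID10", "logs": "RAID10"}
    print(placement)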

FYI... you can successfully virtualise DB servers, you just need to analyse the IO patterns and IOPS rates and target that IO at a suitable LUN, all the while being aware of the IO patterns and IOPS that the LUN is already handling. This is why many admins blame virtualisation for poor DB performance... because they didn't carefully calculate the IO/IOPS that multiple servers would generate when they put them on a shared LUN (i.e. it's the admins' fault, not virtualisation's fault).
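
As a trivial example of that arithmetic (every IOPS figure below is a made-up placeholder):

    # Sketch: before dropping a DB VM onto a shared LUN, add up what the LUN is
    # already serving and compare it with what it can deliver. Every number is
    # a made-up placeholder; substitute your own measurements.
    LUN_CAPABILITY_IOPS = 700                 # assumed deliverable IOPS

    existing_vms = {"web01": 40, "web02": 35, "nagios": 60}   # measured IOPS
    candidate_name, candidate_iops = "mssql01", 250           # DB VM peak IOPS

    headroom = LUN_CAPABILITY_IOPS - sum(existing_vms.values())
    verdict = "fits" if candidate_iops <= headroom else "does NOT fit"
    print(f"{candidate_name} needs {candidate_iops} IOPS, "
          f"{headroom} IOPS of headroom on the LUN: {verdict}")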