What happens when a disk fails in LVM?
I am configuring a Linux server on an ESX 4.1 host. This server needs to have several TBs of data stored on it. We are currently debating whether or not to use LVM. Our current reasoning is that it is best to have multiple 2TB virtual disks (a limit imposed by ESX), each mounted at its own mount point, like so:
/disk1 - 2TB
/disk2 - 2TB
/disk3 - 2TB
We will be storing directories that range in size from 100GB to 400GB. Each directory needs to be stored whole and cannot be split across disks. The concern is that there will be a lot of wasted space: if we end up having 1.7TB stored on /disk1 and need to store an additional 400GB directory, it would have to go on /disk2, leaving 300GB unused on /disk1.
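To make the concern concrete, here is a rough Python sketch of that placement problem. It is only an illustration: the function name `place_first_fit`, the directory sizes, and the rounding of 2TB to 2000GB are assumptions, not part of our actual setup.

```python
def place_first_fit(dir_sizes_gb, disk_count, disk_capacity_gb):
    """Place each directory whole on the first disk that has room for it.

    Returns the free space left on every disk and the directories
    that did not fit on any single disk.
    """
    free = [disk_capacity_gb] * disk_count
    unplaced = []
    for size in dir_sizes_gb:
        for i in range(disk_count):
            if free[i] >= size:
                free[i] -= size
                break
        else:
            # No single disk has enough room, even if the total free space does.
            unplaced.append(size)
    return free, unplaced


# Hypothetical workload: 1.7TB lands on the first disk, then one more 400GB directory.
directories = [400, 400, 400, 400, 100, 400]
free, unplaced = place_first_fit(directories, disk_count=3, disk_capacity_gb=2000)
print(free)      # free GB per disk
print(unplaced)  # directories that fit nowhere
```

With these numbers the last 400GB directory skips the first disk and lands on the second, which strands 300GB on the first disk. That is the waste we would like to avoid.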
One solution to this problem is LVM, configured as:
--------
Disk 1 |
       |
Disk 2 |---->/disk
       |
Disk 3 |
--------
However, we are stuck on one simple question: what happens if Disk 2 fails?
In the first scenario it is obvious what happens if Disk 2 fails: /disk2 would no longer be accessible.
In the LVM setup, if Disk 2 were to fail, would it be similar (as in, only the data that was stored on Disk 2 is no longer available) or would all data on /disk no longer be accessible?
You've omitted a number of important abstraction layers that come with LVM. Logical volumes (LVs) do not sit directly on disks - they are allocated from volume groups (VGs). VGs in turn consist of physical volumes (PVs), which can be whole disks. To cut a long story short, the VG would not come up with a missing PV - i.e. a missing disk - so you would not be able to access the logical volumes in the group.
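As a rough illustration of why a single /disk volume depends on every disk underneath it, here is a toy Python model of one linear LV laid end to end across three PVs. It is not LVM code and does not call any LVM tool; the function names, the 2000GB PV size, and the assumption that the LV is a simple concatenation of the PVs in order are all made up for the sketch.

```python
from bisect import bisect_right
from itertools import accumulate


def linear_pv_ends(pv_sizes_gb):
    """End offset (exclusive, in GB) of each PV inside one linear LV."""
    return list(accumulate(pv_sizes_gb))


def pvs_backing_range(pv_ends, start_gb, length_gb):
    """Indices of the PVs that back the LV range [start, start + length)."""
    first = bisect_right(pv_ends, start_gb)
    last = bisect_right(pv_ends, start_gb + length_gb - 1)
    return list(range(first, last + 1))


# Three hypothetical 2000GB PVs (Disk 1, Disk 2, Disk 3) concatenated into one LV.
pv_ends = linear_pv_ends([2000, 2000, 2000])

# A 400GB region starting at the 1700GB mark of the LV.
straddling = pvs_backing_range(pv_ends, start_gb=1700, length_gb=400)
print(straddling)  # PV indices, 0 = Disk 1
```

In this model the 400GB region is backed by both PV 0 and PV 1, so the "which disk holds my data" question no longer has a per-directory answer once the disks are pooled. A real filesystem may also scatter a directory's blocks and its own metadata anywhere in the LV, which the model does not attempt to show.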
There are recovery procedures, but usually, in a virtualized environment, you would see "all-or-nothing" availability anyway - all disk files would be contained in a single directory which is either accessible with its entire content or not at all (if the datastore is not available for example).
As for the storage efficiency, consider using thin provisioning - "unused" space is not claimed on the datastore. However, it comes at the cost of higher administrative overhead.