VDO (Virtual Data Optimizer) limitations in the storage stack
RHEL 7.5 was released with an important addition, VDO, which adds thin-provisioned, compressed, and deduplicated volumes. That is great, and we should get these benefits in derivatives and other distros too, since the technology was acquired from Permabit and is open source.
According to the official docs (https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/storage_administration_guide/vdo-qs-requirements), there are some placement considerations (see the "Placement of VDO in the Storage Stack" section):
As a general rule, you should place certain storage layers under VDO and others on top of VDO:
- Under VDO: DM-Multipath, DM-Crypt, and software RAID (LVM or mdraid).
- On top of VDO: LVM cache, LVM Logical Volumes, LVM snapshots, and LVM Thin Provisioning.
Since this is only a "general" rule, I see no problem with it. Next, the docs say:
The following configurations are not supported:
- VDO on top of VDO volumes: storage → VDO → LVM → VDO
- VDO on top of LVM Snapshots
- VDO on top of LVM Cache
- VDO on top of the loopback device
- VDO on top of LVM Thin Provisioning
- Encrypted volumes on top of VDO: storage → VDO → DM-Crypt
- Partitions on a VDO volume: fdisk, parted, and similar partitions
- RAID (LVM, MD, or any other type) on top of a VDO volume
This is somewhat "scary" and means we have to be careful with the design, because it looks like the following would not be "supported":
storage -> LVM PV -> LVM VG -> LVM Thin -> LVM LV -> Storage (in VM) -> VDO (in VM) -> EXT4 (in VM)
Note that the final VDO/EXT4 layers are inside the VM, and the LVM LV is attached directly to the VM, so this is similar to:
storage -> LVM PV -> LVM VG -> LVM Thin -> LVM LV -> VDO -> Storage (in VM) -> EXT4 (in VM)
- Is that really problematic or dangerous, and therefore not supported?
- Why?
Creating everything on the underlying device is not always a good option, but I don't see a clear explanation of why these limitations exist.
Maybe it is because these VDO volumes would be exposed to both the host and the guest?
What is the point of creating VDO on top of thin LVM? VDO is already thin provisioned and works on 4 KB blocks.
- VDO on top of VDO volumes: storage → VDO → LVM → VDO - it does not make sense to deduplicate already deduplicated data
- VDO on top of LVM Snapshots - it does not make sense to snapshot deduplicated data
- VDO on top of LVM Cache - do you really need to deduplicate a cache?
- VDO on top of LVM Thin Provisioning - as I said above, VDO is already a thin device. Moreover, VDO itself switches to read-only mode when there is no free space left on the underlying storage, whereas if you put VDO on top of LVM thin, VDO will not know that the space has run out, which can lead to data corruption
- Encrypted volumes on top of VDO: storage → VDO → DM-Crypt - by design, deduplication of encrypted data is not possible (obviously, because an encrypted device requires its fully provisioned size)
- RAID (LVM, MD, or any other type) on top of a VDO volume - why would you need to create RAID groups for deduplicated objects?
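The thin provisioning point can be illustrated with a toy model. This is only a sketch: the `ThinPool` class and its block counts are made up for illustration and do not reproduce how LVM or VDO actually handle exhaustion, which is more nuanced. It just shows how a layer that only sees the advertised size still believes it has free space after the physical pool is full.

```python
class ThinPool:
    """Toy thin pool (hypothetical): advertises more blocks than it physically has."""

    def __init__(self, physical_blocks, virtual_blocks):
        self.physical_blocks = physical_blocks
        self.virtual_blocks = virtual_blocks
        self.allocated = set()

    def write(self, block):
        """Return True if the block could be stored, False if the pool is full."""
        if block >= self.virtual_blocks:
            raise IndexError("block outside the advertised size")
        if block not in self.allocated:
            if len(self.allocated) >= self.physical_blocks:
                return False  # pool exhausted, the write cannot be stored
            self.allocated.add(block)
        return True


# The upper layer (VDO in the scenario above) only sees 10 advertised blocks.
pool = ThinPool(physical_blocks=4, virtual_blocks=10)
attempted = 6
results = [pool.write(b) for b in range(attempted)]

# Free space as the upper layer would compute it from the advertised size.
advertised_free = pool.virtual_blocks - attempted

print(results)          # [True, True, True, True, False, False]
print(advertised_free)  # 4
```

The last two writes fail although the upper layer still counts four free blocks, which is the mismatch described in the thin provisioning item above.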
Regarding your scenario, just build it like this (the LVM layer must be redundant at the physical level):
storage → LVM PV → LVM VG → LVM LV → VDO → Storage (in VM) → EXT4 (in VM)
I put some test VMs in a similar scenario and everything is working just fine.
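For comparing layouts, the unsupported list from the docs can be written down as a small checker. This is a rough sketch and not a Red Hat tool: the layer labels and the `check_stack` function are invented for illustration, and it treats the VM boundary as transparent, which is an assumption.

```python
# Layer labels are illustrative only, not identifiers used by any real tool.
# Layers the docs list as unsupported underneath a VDO volume.
UNSUPPORTED_BELOW_VDO = {"vdo", "lvm-snapshot", "lvm-cache", "loop", "lvm-thin"}
# Layers the docs list as unsupported on top of a VDO volume.
UNSUPPORTED_ABOVE_VDO = {"dm-crypt", "partition", "raid"}


def check_stack(layers):
    """Return a list of problems for a stack given bottom-to-top."""
    problems = []
    for i, layer in enumerate(layers):
        if layer != "vdo":
            continue
        for lower in layers[:i]:
            if lower in UNSUPPORTED_BELOW_VDO:
                problems.append(f"vdo on top of {lower}")
        for upper in layers[i + 1:]:
            if upper in UNSUPPORTED_ABOVE_VDO:
                problems.append(f"{upper} on top of vdo")
    return problems


# The layout from the question, with the VM boundary ignored (assumption).
asker_stack = ["storage", "lvm-pv", "lvm-vg", "lvm-thin", "lvm-lv", "vdo", "ext4"]
# The layout suggested in this answer.
suggested_stack = ["storage", "lvm-pv", "lvm-vg", "lvm-lv", "vdo", "ext4"]

print(check_stack(asker_stack))      # ['vdo on top of lvm-thin']
print(check_stack(suggested_stack))  # []
```

Only the thin layer under VDO is flagged in the original layout; dropping it gives a stack that matches none of the listed unsupported configurations.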