KVM+DRBD replicated between two active-passive servers with manual switching

I have an installation very similar to the setup you described: a KVM server with a standby replica via DRBD active/passive. To keep the system as simple as possible (and to avoid any automatic split-brain, e.g. due to my customer messing with the cluster network), I also ditched automatic cluster failover.

The system is 5+ years old and has never given me any problems. My volume setup is the following:

  • a dedicated RAID volume for VM storage;
  • a small overlay volume containing QEMU/KVM config files;
  • bigger volumes for virtual disks;
  • a DRBD resource managing the entire dedicated array block device (a minimal config sketch follows below).
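
For reference, a single whole-array DRBD resource along these lines could look like the following sketch (the resource name, hostnames, addresses and backing device are all hypothetical placeholders, not my actual config):

    # /etc/drbd.d/vmstore.res -- one resource covering the whole array
    resource vmstore {
        device    /dev/drbd0;       # replicated device exposed to the host
        disk      /dev/md127;       # the dedicated RAID array (placeholder)
        meta-disk internal;

        on node1 {
            address 10.0.0.1:7789;  # replication link, node 1
        }
        on node2 {
            address 10.0.0.2:7789;  # replication link, node 2
        }
    }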

I wrote some shell scripts to help me in case of failover. You can find them here.
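
The promote-side logic boils down to something like the minimal sketch below (the resource name "vmstore", the mount point and the guest name are hypothetical placeholders; the real scripts add more sanity checks):

    #!/bin/sh
    # Promote this node to primary after the peer has failed (or has been
    # manually demoted with "drbdadm secondary vmstore").
    set -e

    drbdadm primary vmstore                      # promote the DRBD resource
    mount /dev/drbd0 /srv/vmstore                # mount the replicated volume
    virsh define /srv/vmstore/conf/guest1.xml    # register the guest config
    virsh start guest1                           # start the virtual machine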

Please note that the system was architected for maximum performance, even at the expense of features such as fast snapshots and file-based (rather than volume-based) virtual disks.

If rebuilding a similar active/passive setup today, I would lean heavily toward using ZFS and continuous async replication via send/recv. It is not real-time, block-based replication, but it is more than sufficient for 90%+ of cases.
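
As a sketch of what that could look like (dataset and host names are hypothetical, and an initial full send is assumed to have created the @repl_prev snapshot on both sides), a cron-driven incremental cycle might be:

    #!/bin/sh
    # Incremental ZFS replication to the standby host.
    set -e

    zfs snapshot tank/vmstore@repl_new
    zfs send -i tank/vmstore@repl_prev tank/vmstore@repl_new | \
        ssh standby zfs receive -F tank/vmstore

    # Rotate the snapshots on both sides so the next run has a common base.
    zfs destroy tank/vmstore@repl_prev
    zfs rename tank/vmstore@repl_new tank/vmstore@repl_prev
    ssh standby "zfs destroy tank/vmstore@repl_prev && \
        zfs rename tank/vmstore@repl_new tank/vmstore@repl_prev"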

If real-time replication is really needed, I would use DRBD on top of a ZVOL + XFS; in fact, I tested such a setup, with automatic Pacemaker switching, in my lab with great satisfaction. If using third-party modules (as ZFS on Linux is) is not possible, I would use a DRBD resource on top of an lvmthin volume + XFS.
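
For illustration, the backing device for such a DRBD resource could be created in either of these ways (pool, volume group and sizes are hypothetical placeholders):

    # Option A: ZVOL as the DRBD backing device (ZFS on Linux)
    zfs create -V 500G tank/drbd_backing

    # Option B: thin LV as the DRBD backing device (in-kernel lvmthin)
    lvcreate --type thin-pool -L 500G -n tpool vg0
    lvcreate --type thin -V 500G -n drbd_backing --thinpool tpool vg0

    # Either way, XFS goes on the replicated DRBD device, not on the
    # backing volume itself:
    mkfs.xfs /dev/drbd0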


Why not use something that has been tested by thousands of users and has proven its reliability? You can just deploy the free Hyper-V Server with, for example, StarWind VSAN Free and get true HA without any issues. Check out this manual: https://www.starwindsoftware.com/resource-library/starwind-virtual-san-hyperconverged-2-node-scenario-with-hyper-v-server-2016