DRBD / Heartbeat on Virtual Machines

Solution 1:

Have you checked whether the VMs complain about dropped interrupts or similar problems? The host hardware may simply be overloaded, or too few resources may be allocated to your VMs.
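One quick way to get a hint about host overload from inside a Linux guest is to look at CPU "steal" time, i.e. time the hypervisor spent running something else while the guest wanted the CPU. Below is a minimal sketch that compares two samples of the aggregate `cpu` line from `/proc/stat`. The function names are made up for illustration, and not every hypervisor reports steal time to its guests, so a value of zero does not prove the host is idle.

```python
def cpu_counters(stat_line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of ints."""
    fields = stat_line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(x) for x in fields[1:]]


def steal_percent(before_line, after_line):
    """Percentage of CPU time stolen by the hypervisor between two samples."""
    before = cpu_counters(before_line)
    after = cpu_counters(after_line)
    deltas = [a - b for a, b in zip(after, before)]
    # Columns: user nice system idle iowait irq softirq steal [guest guest_nice]
    # Guest time is already included in user time, so only the first 8 are summed.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    steal = deltas[7] if len(deltas) > 7 else 0  # old kernels lack the column
    return 100.0 * steal / total


def read_cpu_line(path="/proc/stat"):
    """Return the first line of /proc/stat, which is the aggregate cpu line."""
    with open(path) as f:
        return f.readline()
```

To use it, call `read_cpu_line()`, wait a few seconds, call it again, and pass both lines to `steal_percent`. A steadily high percentage while heartbeat is timing out would point at the host rather than at your cluster configuration.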

If the network is flaky or overloaded, the right thing to do would of course be fixing that. If your hosting provider is not keen on that, can you use multiple physical paths by attaching multiple bridged networks to different host devices (hopefully on different switches)?

Just using redundant network paths via 802.3ad couldn't hurt in that case, either.
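If you do set up a bond on Linux, the bonding driver exposes its state as text under `/proc/net/bonding/`. Here is a small sketch that pulls the per-slave link status out of that text so a monitoring script can notice when one path has gone down. The bond name `bond0` in the usage note and the function name are assumptions for illustration.

```python
def slave_link_status(bonding_text):
    """Map each slave interface name to its reported MII status."""
    status = {}
    current = None
    for raw in bonding_text.splitlines():
        line = raw.strip()
        if line.startswith("Slave Interface:"):
            current = line.split(":", 1)[1].strip()
        elif line.startswith("MII Status:") and current is not None:
            # An "MII Status" line before any slave section describes
            # the bond itself and is skipped here.
            status[current] = line.split(":", 1)[1].strip()
    return status
```

Something like `slave_link_status(open("/proc/net/bonding/bond0").read())` would then return a dictionary such as `{"eth0": "up", "eth1": "down"}`, depending on what the driver reports.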

A commenter on another question mentioned split-brain, which is the one thing you want to avoid at all costs. Normally a STONITH script would, for example, switch off a networked PDU strip feeding the other host so that the other host is down for sure; in a VM you might try a script that powers off the other VM via the VMware API.
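The important part of any such script is the ordering: power the peer off, confirm that it really is off, and only then take over its resources. Here is a rough sketch of that logic. The three callables (`power_off_peer`, `peer_is_off`, `take_over`) are hypothetical hooks that you would have to implement against whatever your virtualization platform offers; nothing here is an actual VMware call.

```python
import time


def fence_then_take_over(power_off_peer, peer_is_off, take_over,
                         timeout_s=60.0, poll_s=2.0, sleep=time.sleep):
    """Fence the peer and take over only once it is confirmed to be off.

    All three callables are hypothetical hooks supplied by the caller.
    Returns True if resources were taken over, False if fencing
    could not be confirmed within the timeout.
    """
    power_off_peer()
    waited = 0.0
    while waited <= timeout_s:
        if peer_is_off():
            take_over()
            return True
        sleep(poll_s)
        waited += poll_s
    # Fencing unconfirmed: do NOT take over, or both nodes may write at once.
    return False
```

Refusing to take over when the power-off cannot be confirmed costs you availability, but it is what keeps a network hiccup from turning into two nodes both believing they are primary.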

Finally, DRBD may just not be right for your scenario. If you have a SAN, you may want to open the same device on the fabric on both VMs as a raw disk and then run OCFS2 or a similar cluster FS on it. Friends have seen OCFS2 run rock-solid on up to four nodes simultaneously, which would free you up to build multi-node clusters with heartbeat 2 instead of being locked into two-node fail-over by DRBD, as you are on heartbeat 1.

Caveat emptor: heartbeat 2 uses XML config files. Not everyone (e.g., me) likes that.
