Zero downtime with Kubernetes on top of GlusterFS on top of a ZFS RAID - Is this the best solution?

Solution 1:

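When you do clustering, you have to plan for split brain. To avoid it you need at least 3 nodes, so that a majority (quorum) still exists when one node fails or gets cut off from the others.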
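To illustrate why 2 nodes are not enough, here is a minimal sketch of majority-quorum arithmetic. The function names are made up for this example and do not come from Pacemaker, GlusterFS or any other tool; real cluster stacks have additional options (tie-breakers, arbiters, quorum devices) that are ignored here.

```python
def quorum_size(total_nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return total_nodes // 2 + 1


def has_quorum(reachable_nodes: int, total_nodes: int) -> bool:
    """True if a partition that sees `reachable_nodes` members holds a majority."""
    return reachable_nodes >= quorum_size(total_nodes)


def surviving_partitions(partition_sizes: list[int]) -> list[int]:
    """Return the sizes of the partitions that keep quorum after a network split."""
    total = sum(partition_sizes)
    return [size for size in partition_sizes if has_quorum(size, total)]


if __name__ == "__main__":
    # 2 nodes split 1/1: neither side has a majority, so nobody may safely write.
    print(surviving_partitions([1, 1]))  # []
    # 3 nodes split 2/1: the side with 2 nodes keeps quorum and continues.
    print(surviving_partitions([2, 1]))  # [2]
```

With 2 nodes, a split leaves you choosing between stopping everything or letting both sides write (split brain). With 3 nodes, exactly one side can continue.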

With ZFS I would prefer striped mirrors (the RAID10 equivalent) over RAIDZ (the RAID5 equivalent), mostly for performance.
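As a rough sketch of the trade-off, the following compares the two layouts for the same disks. It assumes 2-way mirrors and single-parity RAIDZ, ignores ZFS metadata and padding overhead, and relies on the common rule of thumb that random I/O scales roughly with the number of vdevs. The function names and disk sizes are only illustrative.

```python
def striped_mirrors(disks: int, disk_tb: float) -> dict:
    """Pool built from 2-way mirror vdevs (RAID10-like)."""
    if disks < 2 or disks % 2 != 0:
        raise ValueError("2-way mirrors need an even number of disks (>= 2)")
    vdevs = disks // 2
    # Each mirror vdev contributes the capacity of one disk.
    return {"vdevs": vdevs, "usable_tb": vdevs * disk_tb}


def raidz1(disks: int, disk_tb: float) -> dict:
    """Pool built from a single RAIDZ1 vdev (RAID5-like)."""
    if disks < 3:
        raise ValueError("this sketch assumes at least 3 disks for RAIDZ1")
    # One disk's worth of capacity goes to parity.
    return {"vdevs": 1, "usable_tb": (disks - 1) * disk_tb}


if __name__ == "__main__":
    # Example: 4 disks of 4 TB each (illustrative numbers).
    print(striped_mirrors(4, 4.0))  # {'vdevs': 2, 'usable_tb': 8.0}
    print(raidz1(4, 4.0))           # {'vdevs': 1, 'usable_tb': 12.0}
```

Mirrors give up capacity in exchange for more vdevs, which is where the performance advantage comes from.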

For MySQL/MariaDB I would use the Galera plugin for replication.

You will need cluster management software such as ClusterLabs Pacemaker.

For storage I would also consider Ceph.

But there is another aspect of this setup: complexity. Where do you test it? Do you plan to automate the installation? Will your automation let you install the setup on VMs? How do you plan to configure fencing? Will you use a storage VLAN?

Do you plan to use a load balancer (e.g. HAProxy)?

Network redundancy? LACP, Spanning tree, OSPF/BGP...

How high is the server load? Maybe you can install the whole setup in VMs. You would still need 3 physical hosts, but you would have more flexibility.

You also need to write documentation and scripts for various failure scenarios, including those caused by human error.

With only 2 machines, for written data (storage, database) it's better to use a master-slave configuration where you write only to the master and keep the slave as a backup. Stateless services can be configured in active-active mode.