High-availability MariaDB with only two servers
I'm not worried about split-brain since the connection between the two servers is solid (and because I don't have a third machine).
I want to set up MariaDB replication with automatic failover, so that even if one database dies, the application keeps working. I've looked at MaxScale, but since I only have two machines it would have to run on the same machine as one of the database servers, and if that server dies, nothing works. AFAIK, a MariaDB Galera cluster will refuse to give me automatic failover on only two nodes (it requires quorum). However, I might be able to run an arbitrator on another machine, or even run another database on it, but it'd be slow.
Additionally, the backend is PHP. I'm willing to change the mysqli setup and such, but I don't know whether, or what, I'd have to change there.
EDIT: I'm willing to forgo auto failover, but the behavior I would then want would be the following:
If I connect to Server A, it connects to Database A (master) and reads/writes normally.
If I connect to Server B, it connects to Database B (a read-only slave) and reads work fine. If it has to write, it still can, but the writes get forwarded to Database A.
Would this be possible using MaxScale on both servers or something like that?
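For reference, what I picture is a local MaxScale instance on each server with a readwritesplit listener that PHP connects to instead of the database itself. A rough sketch of the kind of config I mean (addresses, credentials, and ports are just placeholders, not something I've tested):

```
# /etc/maxscale.cnf (sketch) -- one copy per server
[dbA]
type=server
address=192.0.2.1        # placeholder for Server A (master)
port=3306
protocol=MariaDBBackend

[dbB]
type=server
address=192.0.2.2        # placeholder for Server B (slave)
port=3306
protocol=MariaDBBackend

[Replication-Monitor]
type=monitor
module=mariadbmon        # tracks which node is currently the master
servers=dbA,dbB
user=maxmon
password=maxmon_pw
monitor_interval=2000

[Split-Service]
type=service
router=readwritesplit    # reads go to the slave, writes to the master
servers=dbA,dbB
user=maxsvc
password=maxsvc_pw

[Split-Listener]
type=listener
service=Split-Service
protocol=MariaDBClient
port=4006                # PHP/mysqli would connect to 127.0.0.1:4006
```

If that's viable, the only PHP-side change should be pointing mysqli at the local MaxScale listener (e.g. 127.0.0.1:4006) instead of port 3306.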
Solution 1:
You have two nodes. Don't use master-master replication of any kind; it's incredibly prone to split-brain on two nodes (it's almost guaranteed to happen eventually).
A stateful application like this can't be expected to handle a two-node cluster deployment well on its own. Either operator intervention or a CRM (cluster resource manager) will be necessary to make the cluster at all robust in the face of failure, which is the reason it's clustered in the first place.
You have a two-node cluster. You absolutely should be worried about split-brain, because that architecture is very prone to split-brain conditions. Just because the inter-node network link is solid today doesn't mean it always will be, and that link is one of the largest components of risk in a two-node cluster. Losing it will instantly split-brain the cluster unless FENCING or QUORUM is established between the nodes. This is one of the biggest considerations in a two-node cluster, because fencing reduces the chance of a split-brain condition from high to near zero.
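To make that concrete, fencing just means each node has an out-of-band way to power off its peer before taking over its resources. A minimal sketch of IPMI-based STONITH under pcs, assuming fence_ipmilan and placeholder hostnames, addresses, and credentials (exact parameter names vary between fence-agents versions):

```
# one fence device per node, each able to power-cycle the *other* node
pcs stonith create fence_nodeA fence_ipmilan \
    pcmk_host_list=nodeA ip=192.0.2.10 username=admin password=secret lanplus=1
pcs stonith create fence_nodeB fence_ipmilan \
    pcmk_host_list=nodeB ip=192.0.2.11 username=admin password=secret lanplus=1

# never let a node run the device that fences itself
pcs constraint location fence_nodeA avoids nodeA
pcs constraint location fence_nodeB avoids nodeB
```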
I would recommend handling this with Pacemaker/Corosync. It's a complicated stack, but it provides the mechanisms required to build a production-grade cluster on two nodes. I would also recommend running only a single master instance at a time, rather than multi-master, even under the enforcement of such a cluster manager.
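As a rough illustration of that single-master, active-passive shape under pcs (resource names, paths, and agent options here are placeholders and untested, so treat it as a sketch rather than a working config):

```
# floating IP that the PHP application always connects to
pcs resource create p_vip ocf:heartbeat:IPaddr2 ip=192.0.2.100 cidr_netmask=24 \
    op monitor interval=10s

# single MariaDB instance managed by the cluster
pcs resource create p_mariadb ocf:heartbeat:mysql \
    binary=/usr/sbin/mariadbd config=/etc/my.cnf datadir=/var/lib/mysql \
    op monitor interval=20s timeout=30s

# keep the VIP on whichever node runs the database, and start the database first
pcs constraint colocation add p_vip with p_mariadb INFINITY
pcs constraint order p_mariadb then p_vip
```

The point is that the application only ever talks to the floating IP, and Pacemaker decides which node owns it.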
There's a good guide for HA MariaDB that can serve as a starting point. It's behind a subscription wall, but it's an end-to-end guide to configuring an active-passive MySQL cluster that runs on one node at a time and replicates block storage between the nodes. It does NOT cover the use of fencing. If you cannot accomplish fencing, Corosync also has the ability to use a small arbitrator node running a voting daemon to provide the cluster with quorum at no application-overhead cost (see the information on Corosync qdevice).
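For the qdevice option, the third machine only has to run the tiny corosync-qnetd voting daemon; nothing database-related runs there. Roughly, assuming the pcs tooling and a placeholder hostname 'arbiter':

```
# on the arbitrator machine (needs pcs and corosync-qnetd installed)
pcs qdevice setup model net --enable --start

# on one of the two cluster nodes: point the cluster at the arbitrator
pcs quorum device add model net host=arbiter algorithm=ffsplit
```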
Pacemaker's advanced resource types cover most of your questions about how to gracefully orchestrate failover: they let you group resources into linear dependency chains, and express multi-state (leader-election) semantics for running more than one instance of an application across nodes. That can be found here.
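For example, under pcs that looks roughly like the following (hypothetical resource names, and two independent illustrations rather than one combined config; the promotable syntax is pcs 0.10+, older releases used 'pcs resource master'):

```
# linear dependency chain: filesystem, then database, then VIP
pcs resource group add g_database p_fs p_mariadb p_vip

# multi-state / leader election: run an instance on each node, promote one master
pcs resource promotable p_mariadb promoted-max=1
```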
Bundles are a way to accomplish application isolation in Pacemaker via container runtimes like Docker and RKT. This opens up another avenue of fencing, since bundles appear to the cluster as Pacemaker nodes themselves, so they can be "fenced" by the cluster independently of other applications. That can be found here.
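A hedged sketch of what defining a bundle looks like with pcs (the image, paths, and port are placeholders, and the available options vary by Pacemaker version):

```
# run MariaDB inside a Docker container that Pacemaker manages as its own node
pcs resource bundle create mariadb-bundle \
    container docker image=mariadb:10.6 replicas=1 \
    network control-port=3121 \
    storage-map source-dir=/var/lib/mysql target-dir=/var/lib/mysql
```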
Solution 2:
I ran various DBs (Mongo, Elasticsearch, SQL Server, others) with the same philosophy: "I don't care about problems, I can only run two nodes."
It was a WORLD of hurt.
If you run master-slave, fine. But there will be headaches.
After years of dancing around the issue, and dealing with various devops headaches created by my insistence on only two nodes (which I insisted on because our databases were really large, and the cost of a third node was material), I finally started running three nodes.
And then everything got better.
The lesson I took, after years of that dancing, is that there are two options:
- Single node with a warm-ish spare (e.g. master-slave)
- Three nodes
From my experience, I would never run two nodes active-active again (unless there is some magic piece that completely prevents split-brain; I have seen such things, and they are crazy ugly).
That's from five years of running multiple 0.5-1.5 TB databases on various stacks.