How to slave real-time data from two MySQL sources merging in to one MySQL destination?

I have two different MySQL instances that I would like to 'slave' to a third instance. (so I can do easy joins on the third instance)

eg.

mysql1> show databases
db1
db2

mysql2> show databases
db3
db4

mysql3> show databases
db1
db2
db3
db4

I've looked at maatkit (percona toolkit) pt-table-sync but people say it can corrupt data. (It apparent removes and re-adds data to generate inserts)

pt-archiver sort of works for 'snapshots', but db1 is approximately 6GB and copying the whole thing over is a lot more data than really needed. The real time updates are only about 100MB a day.

The natural concept to me would be to allow mysql3 to run as a replica slave of BOTH mysql1 and mysql2, but that doesn't seem to be an option in MySQL

Tungsten Replicator seems to allow for this type of data sync, but it seems a bit unwieldy to configure and I'm concerned about reliability.

Does anyone have other solutions they've used for this problem?


Solution 1:

By design, one mysqld process cannot simultaneously listen to two different Masters.

The CHANGE MASTER TO command only allows you to set one Master as a source to read.

In order to emulate this, you would have to alternate between the two Masters programmatically. How do you do that ?

I described in StackOverflow how to Connect a Slave Manually to different Masters where each Master was a Laptop and the Slave was a Central Computer.

Here is the basic idea

  • Master M1
  • Master M2
  • Slave S1

Setup Replication of M1 to S1 and then M2 to S1 like this

  • 1) Have S1 run CHANGE MASTER TO with M1 as the Source
  • 2) START SLAVE;
  • 3) Run Replication for a Little While
  • 4) STOP SLAVE;
  • 5) Have S1 run CHANGE MASTER TO with M2 as the Source
  • 6) START SLAVE;
  • 7) Run Replication for a Little While
  • 8) STOP SLAVE;
  • 9) Go Back to Step 1

Each time you switch from one Master to another, you must record two values from SHOW SLAVE STATUS\G

  1. Relay_Master_Log_file
  2. Exec_Master_Log_Pos

These two values represent the last SQL Statement that came from the Master and was is next to be executed on the Slave.

There is one major caution : As long as M1 and M2 are updating mutually exclusive databases, this algorithm should be just fine.

Believe it or not, I addressed a question like this in ServerFault back in May 2011. I actually explained how to emulate true multimaster/single slave using the BLACKHOLE Storage Engine based on the book "High Performance MySQL".