What is the CouchDB replication protocol? Is it like Git?
Is there technical documentation describing how replication between two Couches works?
What is the basic overview of CouchDB replication? What are some noteworthy characteristics about it?
Unfortunately there is no detailed documentation describing the replication protocol. There is only the reference implementation built into CouchDB, and Filipe Manana's rewrite of the same, which will probably become the new implementation in the future.
However, this is the general idea:
Key points
If you know Git, then you know how Couch replication works. Replicating is very similar to pushing or pulling with distributed source managers like Git.
CouchDB replication does not have its own protocol. A replicator simply connects to two DBs as a client, then reads from one and writes to the other. Push replication is reading the local data and updating the remote DB; pull replication is vice versa.
- Fun fact 1: The replicator is actually an independent Erlang application, in its own process. It connects to both couches, then reads records from one and writes them to the other.
- Fun fact 2: CouchDB has no way of knowing who is a normal client and who is a replicator (let alone whether the replication is push or pull). It all looks like client connections. Some of them read records. Some of them write records.
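To make the "replicator is just a client" point concrete, here is a minimal sketch. It is not CouchDB code: `ToyCouch`, `read_all`, `write`, and `replicate` are hypothetical names over an in-memory dict, standing in for real HTTP reads and writes. Push and pull are the same function with the arguments swapped.

```python
class ToyCouch:
    """Hypothetical in-memory stand-in for a CouchDB database (not a real API)."""

    def __init__(self):
        self.records = {}  # (doc_id, rev_id) -> document body

    def read_all(self):
        # Any client can read records; a replicator is not special.
        return dict(self.records)

    def write(self, key, body):
        # Any client can write records; the database cannot tell who is asking.
        self.records[key] = body


def replicate(source, target):
    """Act as an ordinary client: read from source, write what target lacks."""
    copied = 0
    existing = target.read_all()
    for key, body in source.read_all().items():
        if key not in existing:
            target.write(key, body)
            copied += 1
    return copied


local, remote = ToyCouch(), ToyCouch()
local.write(("doc1", "1-abc"), {"name": "Jason"})
remote.write(("doc2", "1-def"), {"awesome": True})

pushed = replicate(source=local, target=remote)  # "push": local -> remote
pulled = replicate(source=remote, target=local)  # "pull": remote -> local
```

After both calls the two toy databases hold the same records, and neither one ever needed to know a replication was happening.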
Everything flows from the data model
The replication algorithm is trivial, uninteresting. A trained monkey could design it. It's simple because the cleverness is the data model, which has these useful characteristics:
- Every record in CouchDB is completely independent of all others. That sucks if you want to do a JOIN or a transaction, but it's awesome if you want to write a replicator. Just figure out how to replicate one record, and then repeat that for each record.
- Like Git, records have a linked-list revision history. A record's revision ID is the checksum of its own data. Subsequent revision IDs are checksums of: the new data, plus the revision ID of the previous.
- In addition to application data ({"name": "Jason", "awesome": true}), every record stores the evolutionary timeline of all previous revision IDs leading up to itself.
  - Exercise: Take a moment of quiet reflection. Consider any two different records, A and B. If A's revision ID appears in B's timeline, then B definitely evolved from A. Now consider Git's fast-forward merges. Do you hear that? That is the sound of your mind being blown.
- Git isn't really a linear list. It has forks, when one parent has multiple children. CouchDB has that too.
  - Exercise: Compare two different records, A and B. A's revision ID does not appear in B's timeline; however, one revision ID, C, is in both A's and B's timeline. Thus A didn't evolve from B. B didn't evolve from A. But rather, A and B have a common ancestor C. In Git, that is a "fork." In CouchDB, it's a "conflict."
  - In Git, if both children go on to develop their timelines independently, that's cool. Forks totally support that.
  - In CouchDB, if both children go on to develop their timelines independently, that's cool too. Conflicts totally support that.
  - Fun fact 3: CouchDB "conflicts" do not correspond to Git "conflicts." A Couch conflict is a divergent revision history, what Git calls a "fork." For this reason the CouchDB community pronounces "conflict" with a silent n: "co-flicked."
- Git also has merges, when one child has multiple parents. CouchDB sort of has that too.
  - In the data model, there is no merge. The client simply marks one timeline as deleted and continues to work with the only extant timeline.
  - In the application, it feels like a merge. Typically, the client merges the data from each timeline in an application-specific way. Then it writes the new data to the timeline. In Git, this is like copying and pasting the changes from branch A into branch B, then committing to branch B and deleting branch A. The data was merged, but there was no git merge.
  - These behaviors are different because, in Git, the timeline itself is important; but in CouchDB, the data is important and the timeline is incidental (it's just there to support replication). That is one reason why CouchDB's built-in revisioning is inappropriate for storing revision data like a wiki page.
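The data model described in this list can be sketched in a few lines of Python. This is a toy under stated assumptions, not CouchDB's actual format: real revision IDs look different and are computed over more than the body, and `Record`, `next_rev`, `evolved_from`, `common_ancestor`, and `resolve` are all names made up for illustration. MD5 over JSON is just a convenient checksum here.

```python
import hashlib
import json


def next_rev(data, parent_rev=None):
    """Toy revision ID: checksum of the new data plus the previous revision ID."""
    payload = json.dumps(data, sort_keys=True) + (parent_rev or "")
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


class Record:
    """Application data plus the timeline of revision IDs leading up to it."""

    def __init__(self, data, timeline=()):
        parent = timeline[-1] if timeline else None
        self.data = data
        self.timeline = tuple(timeline) + (next_rev(data, parent),)

    @property
    def rev(self):
        return self.timeline[-1]

    def update(self, data):
        # A new revision extends the existing timeline by one entry.
        return Record(data, self.timeline)


def evolved_from(b, a):
    """True if a's revision ID appears in b's timeline (the fast-forward case)."""
    return a.rev in b.timeline


def common_ancestor(a, b):
    """Latest revision ID shared by both timelines, or None if there is none."""
    shared = None
    for rev_a, rev_b in zip(a.timeline, b.timeline):
        if rev_a != rev_b:
            break
        shared = rev_a
    return shared


def resolve(winner, loser, merged_data):
    """App-level 'merge': write merged data to one timeline, delete the other."""
    return winner.update(merged_data), loser.update({"_deleted": True})


c = Record({"name": "Jason"})
a = c.update({"name": "Jason", "awesome": True})   # one child of c
b = c.update({"name": "Jason", "awesome": False})  # a second child: a "conflict"
kept, tombstone = resolve(a, b, {"name": "Jason", "awesome": True, "merged": True})
```

Here `evolved_from(a, c)` is true, neither `a` nor `b` evolved from the other, and `common_ancestor(a, b)` is `c.rev`. Note that `resolve` never joins the two timelines; it only extends each one, which matches the "no merge in the data model" point.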
Final notes
At least one sentence in this writeup (possibly this one) is complete BS.
Thanks Jason for the excellent overview! Jens Alfke, who is working on TouchDB and its replication for Couchbase, has (unofficially) described the CouchDB replication algorithm itself if you're interested in the technical details of how a "standard" CouchDB replicator protocol tends to work.
To summarize the steps he's outlined:
- Figure out how far any previous replication got
- Get the source database _changes since that point
- Use _revs_diff on a batch of changes to see which are needed on the target
- Copy any missing revision metadata and current document data+attachments from source to target, posting to _bulk_docs both for optimization and so as to store the docs differently than the usual higher-level MVCC handling does on PUT.
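As a rough illustration of those four steps, here is a sketch against an in-memory stand-in. Everything in it is an assumption for the sake of the example: `ToyDB` and its `changes`, `revs_diff`, and `bulk_docs` methods only mimic the roles of the similarly named HTTP endpoints, attachments and revision histories are left out, and keeping the checkpoint on the target alone is a simplification.

```python
class ToyDB:
    """Hypothetical in-memory database with a sequence-numbered change log."""

    def __init__(self):
        self.revs = {}         # (doc_id, rev_id) -> document body
        self.log = []          # append-only (doc_id, rev_id); position + 1 = sequence
        self.checkpoints = {}  # replication id -> last source sequence processed

    def put(self, doc_id, rev_id, body):
        if (doc_id, rev_id) not in self.revs:
            self.revs[(doc_id, rev_id)] = body
            self.log.append((doc_id, rev_id))

    def changes(self, since=0):
        """Changes after sequence `since`, as (seq, doc_id, rev_id) tuples."""
        return [(seq, d, r)
                for seq, (d, r) in enumerate(self.log, start=1)
                if seq > since]

    def revs_diff(self, candidates):
        """Return the (doc_id, rev_id) pairs that this database does not have."""
        return [key for key in candidates if key not in self.revs]

    def bulk_docs(self, docs):
        """Store revisions as given, keeping the revision IDs from the source."""
        for doc_id, rev_id, body in docs:
            self.put(doc_id, rev_id, body)


def replicate(source, target, rep_id="source->target", batch_size=2):
    since = target.checkpoints.get(rep_id, 0)   # 1. how far did we get last time?
    changes = source.changes(since)             # 2. source changes since then
    copied = 0
    for start in range(0, len(changes), batch_size):
        batch = changes[start:start + batch_size]
        # 3. ask the target which of these revisions it is missing
        missing = target.revs_diff([(d, r) for _, d, r in batch])
        # 4. copy only the missing revisions in one bulk write
        target.bulk_docs([(d, r, source.revs[(d, r)]) for d, r in missing])
        copied += len(missing)
        target.checkpoints[rep_id] = batch[-1][0]  # remember progress
    return copied


source, target = ToyDB(), ToyDB()
source.put("a", "1-x", {"n": 1})
source.put("b", "1-y", {"n": 2})
source.put("a", "2-z", {"n": 3})
target.put("b", "1-y", {"n": 2})     # the target already has this revision

first = replicate(source, target)    # copies the two revisions target lacked
second = replicate(source, target)   # checkpoint is current, so nothing to copy
```

The second call does no work because the checkpoint already points at the last source sequence, which is the whole purpose of the first step.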
I've glossed over many details here, and would recommend reading through the original explanation as well.
The documentation for CouchDB v2.0.0 covers the replication algorithm much more extensively. They have diagrams, example intermediate responses, and example errors. They use the "MUST", "SHALL", etc. language of IETF RFCs.
The specifics for 2.0.0 (still unreleased as of January 2016) are a bit different from 1.x, but the basics are still as @natevw described.