Using slapcat to back up LDAP
I'm running an OpenLDAP directory on a Debian server, using the hdb backend. I've been wondering about backups, and did some reading on the net. Slapcat seems to be the way to go, but I keep seeing posts saying that it is dangerous to use while slapd is running.
In what way is this dangerous? I'm planning to run these backups during the night, when nothing will be written to the database, although reads will probably still occur.
If there's any other backup solution better suited for this, I'd gladly hear about it.
Solution 1:
OpenLDAP supports various backends, the most popular currently being bdb/hdb. From the slapcat manpage:
For some backend types, your slapd(8) should not be running (at least,
not in read-write mode) when you do this to ensure consistency of the
database. It is always safe to run slapcat with the slapd-bdb(5),
slapd-hdb(5), and slapd-null(5) backends.
So as long as you're using bdb or hdb as a backend, there is no problem running slapcat while slapd is running. I use this on many servers and recommend it. It is really the best way to back up.
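As a rough illustration of a nightly job built on this, here is a minimal Python sketch. It assumes slapcat is on the PATH and is run by a user allowed to read the database (typically root). The function name backup_ldif and the destination path are made up for the example. It writes the dump to a temporary file first and only replaces the previous backup once the command has exited cleanly, so a failed run never overwrites a good backup. Note that it only catches outright failures (non-zero exit code or empty output).

```python
import os
import subprocess
import tempfile


def backup_ldif(dest_path, command=("slapcat",)):
    """Run the dump command and atomically write its output to dest_path.

    `command` defaults to a plain slapcat call; it is a parameter so the
    function can be exercised with any command that prints LDIF.
    """
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    # check=True raises CalledProcessError on a non-zero exit code.
    result = subprocess.run(command, stdout=subprocess.PIPE, check=True)
    if not result.stdout.strip():
        raise RuntimeError("dump command produced no output")
    # Write to a temporary file in the same directory, then rename it,
    # so an interrupted run never leaves a half-written backup in place.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(result.stdout)
        os.replace(tmp_path, dest_path)
    except BaseException:
        os.unlink(tmp_path)
        raise
    return len(result.stdout)


# Example call (hypothetical path):
# backup_ldif("/var/backups/ldap/backup.ldif")
```

Called from cron, an exception here makes the job exit non-zero, which is usually enough to trigger a mail from cron.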
Alternatives would include issuing an LDAP search against the server (for example with ldapsearch) to return the whole tree. This will be significantly slower than slapcat, because all communication must go through the network layers and access control rules must be checked. You also have to be very sure that the user you're searching as has sufficient access rights to read every entry and attribute.
Solution 2:
We have been using slapcat for our backups, much as the other answers describe. The backups were very reliable for 4+ years, but since the upgrade from Debian lenny to squeeze we have occasionally seen truncated backups, in about 1 out of 30 runs. We run slapcat while slapd is running. The bad part is that slapcat gives no indication of a problem: the exit code is zero. For historical reasons we use the bdb backend; the hdb backend might behave better.
We looked around for better backup strategies, but apart from setting up a slave that can be stopped for the backup, there is not much else in the way of standard procedures or best practices. This led us to rethink our backup policy and to write some scripts to automate the new policy.
First of all, we have a script called safe-ldif which runs slapcat repeatedly until it gets the same number of LDAP entries twice in a row. This is a trade-off between speed and reliability. Second, we decided to store each LDIF entry individually in a Git repository. This gives us much better compression than compressing each LDIF file from slapcat on its own. In addition, it is very convenient to have a history of the individual LDIF stanzas.
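To illustrate the two ideas, here is a minimal Python sketch. It is not the actual safe-ldif script, just the same approach under a few assumptions: entries are counted by lines starting with "dn:" (which also matches base64-encoded "dn::" lines), stanzas are separated by blank lines as in slapcat output, and all function names are made up for the example.

```python
import subprocess


def count_entries(ldif_text):
    """Count LDIF entries by counting lines that start a DN ('dn:' or 'dn::')."""
    return sum(1 for line in ldif_text.splitlines() if line.startswith("dn:"))


def dump_until_stable(dump, max_attempts=5):
    """Call dump() until two consecutive dumps have the same entry count."""
    previous = None
    for _ in range(max_attempts):
        text = dump()
        current = count_entries(text)
        if previous is not None and current == previous:
            return text
        previous = current
    raise RuntimeError(f"entry count not stable after {max_attempts} attempts")


def split_entries(ldif_text):
    """Split a dump into individual stanzas, one per entry."""
    stanzas = []
    current = []
    for line in ldif_text.splitlines():
        if line.strip():
            current.append(line)
        elif current:
            # A blank line ends the current stanza.
            stanzas.append("\n".join(current) + "\n")
            current = []
    if current:
        stanzas.append("\n".join(current) + "\n")
    return stanzas


def slapcat_dump():
    """Return the output of a plain slapcat call as text (assumes slapcat is on the PATH)."""
    result = subprocess.run(["slapcat"], stdout=subprocess.PIPE, check=True)
    return result.stdout.decode("utf-8")


# Example: stanzas = split_entries(dump_until_stable(slapcat_dump))
```

Each stanza could then be written to its own file and committed. Two truncated dumps that happen to have the same entry count would still slip through, which is the speed versus reliability trade-off mentioned above.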
If anybody else is interested, have a look at https://github.com/elmar/ldap-git-backup