Is there a way to mirror two servers on Ubuntu?

It depends very much on the job at hand.

Why do you need file mirroring? Do you want to update something like a website or content repository, where it's usually okay to update periodically? Or do you need real-time synchronization of data?

For periodic asynchronous mirroring of files, it is usually sufficient to have a staging area that you upload all your data to and from which you distribute it to the servers. In your case, with two servers, you could create a staging fileshare on srv1 to which you transfer the data (via FTP, NFS, DAV, SFTP, etc.) and then have a cron job rsync the files to the "live" directories of srv1 and srv2. The easiest way to use rsync in that case is to generate an SSH keypair that you use for data transfers and that is authorized on all servers in your cluster.
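The key setup could look roughly like the sketch below. It assumes a dedicated account (called syncuser here, as in the cron example in this answer) that exists on both servers, and it uses the default OpenSSH key path so that a plain "-e ssh" picks the key up. The function names and the --setup switch are made up for illustration.

```shell
#!/bin/sh
# Sketch: run once as the sync user on srv1. "syncuser" and "srv2" are the
# names used in this answer; the key path is the OpenSSH default for
# ed25519 keys. Function names are hypothetical.
KEY="${KEY:-$HOME/.ssh/id_ed25519}"
REMOTE="${REMOTE:-syncuser@srv2}"

create_sync_key() {
    mkdir -p "$(dirname "$KEY")" && chmod 700 "$(dirname "$KEY")" || return 1
    # -N "" means no passphrase, so cron can use the key unattended.
    [ -f "$KEY" ] || ssh-keygen -q -t ed25519 -N "" -C "syncdata" -f "$KEY"
}

authorize_sync_key() {
    # Appends the public key to ~/.ssh/authorized_keys on the remote host.
    ssh-copy-id -i "$KEY.pub" "$REMOTE"
}

# Only act when asked to, so the file can also be sourced.
if [ "${1:-}" = "--setup" ]; then
    create_sync_key && authorize_sync_key
fi
```

A passphrase-less key is convenient for cron but should be readable by the sync user only.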

Example:

srv1:/data/staging/  <= is where you upload your data
srv1:/data/production/ <= is where your servers get their production data from
srv2:/data/production/

srv1$ cat /etc/cron.d/syncdata
=====
# Note: cron on Debian/Ubuntu ignores files in /etc/cron.d whose names contain a dot.
*/5 * * * * syncuser rsync -a --delete /data/staging/ /data/production/
*/5 * * * * syncuser rsync -az --delete -e ssh /data/staging/ srv2:/data/production/
=====

This should give you a basic idea. Of course, you would want to wrap the rsync calls in a script and implement proper locking so that the sync doesn't run twice if it takes more than 5 minutes. Also, a staging area is not mandatory: you could just as well sync srv1:production to srv2:production directly. In that case, though, srv2 might serve data that is up to 5 minutes older than that of srv1, which might be a problem depending on how you balance load between the two.
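A minimal sketch of such a wrapper, assuming the paths from the example above. The script name, the lock location and the RSYNC override are made up for illustration. It uses mkdir as the lock because that is atomic and needs no extra tools; flock(1) from util-linux is a common alternative that also copes with stale locks.

```shell
#!/bin/sh
# Hypothetical wrapper, e.g. saved as /usr/local/bin/sync-production.sh.
# Paths match the example layout; override them via the environment.
STAGING="${STAGING:-/data/staging/}"
LOCAL_TARGET="${LOCAL_TARGET:-/data/production/}"
REMOTE_TARGET="${REMOTE_TARGET:-srv2:/data/production/}"
LOCKDIR="${LOCKDIR:-/tmp/syncdata.lock}"   # assumed lock location
RSYNC="${RSYNC:-rsync}"                    # set RSYNC=echo for a dry run

run_sync() {
    # mkdir is atomic: it fails if a previous run still holds the lock.
    if ! mkdir "$LOCKDIR" 2>/dev/null; then
        echo "sync already running, skipping this round" >&2
        return 1
    fi
    status=0
    # $RSYNC is left unquoted on purpose so that it may carry extra options.
    $RSYNC -a --delete "$STAGING" "$LOCAL_TARGET" || status=$?
    $RSYNC -az --delete -e ssh "$STAGING" "$REMOTE_TARGET" || status=$?
    rmdir "$LOCKDIR"
    # Note: if the script is killed mid-run, the lock directory stays behind
    # and has to be removed by hand.
    return "$status"
}

# Only sync when asked to, so the file can also be sourced for testing.
if [ "${1:-}" = "--run" ]; then
    run_sync
fi
```

With something like this in place, a single cron line that calls the script with --run would replace the two rsync entries.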

Another way to asynchronously distribute files is to package them as RPM or, in your case, deb files. Put these in a central repository and have them installed/updated via something like cfengine, monkey, or some DIY message-bus-based solution. This has the nice side effect of versioning the deployed data, but it is only suitable for smaller amounts of data that you produce and deploy yourself (like versions of your own software). You wouldn't want to distribute TBs of data this way, and it's also not suited to mirroring content that changes frequently, like every other minute or so.

If you need to replicate data in near real time but not necessarily synchronously, then instead of calling a cron job every so often you can use an inotify-based method, like the already mentioned incron, to call your sync scripts. Another possibility is to use Gamin (which also uses inotify if it is present in the kernel) and write your own little sync daemon. Last but not least, if all the files are uploaded to one server via e.g. SFTP, you might check whether your SFTP server allows you to define hooks that are called after certain events, like a file upload. That way you could have the server trigger your sync script whenever new data is uploaded.
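To illustrate the inotify idea, here is a rough sketch of a tiny watcher loop. Note that it uses inotifywait from the inotify-tools package rather than incron or Gamin, simply because it fits in a few lines of shell. WATCH_DIR, SYNC_CMD and the script path are placeholders for whatever you actually use.

```shell
#!/bin/sh
# Sketch only. inotifywait comes from the inotify-tools package; it is used
# here instead of incron or Gamin. WATCH_DIR, SYNC_CMD and the default
# script path are assumed names.
WATCH_DIR="${WATCH_DIR:-/data/staging/}"
SYNC_CMD="${SYNC_CMD:-/usr/local/bin/sync-production.sh --run}"
# -m: keep running, -r: recurse, -q: quiet; prints one line per event.
WATCHER="${WATCHER:-inotifywait -m -r -q -e close_write,create,delete,move --format %w%f}"

watch_and_sync() {
    # $WATCHER and $SYNC_CMD are unquoted on purpose (command plus options).
    $WATCHER "$WATCH_DIR" | while read -r changed_path; do
        echo "change detected: $changed_path"
        # </dev/null keeps ssh (via rsync) from swallowing queued events.
        $SYNC_CMD </dev/null || echo "sync failed for $changed_path" >&2
    done
}

# Only start watching when asked to, so the file can also be sourced.
if [ "${1:-}" = "--watch" ]; then
    watch_and_sync
fi
```

This triggers one sync per event, so a burst of uploads causes a burst of syncs. The sync script should do its own locking, or you could batch events before syncing.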

If you need real-time synchronous mirroring of data, a cluster filesystem might be in order. DRBD has already been named. It is very nice for replication at the block level and is often used for highly available MySQL setups. You might also want to take a look at GFS2, OCFS2, Lustre and GlusterFS, though Lustre and GlusterFS are not really suited to a two-server setup.


Basically you have 3 possibilities:

  1. Let your application push the files to both servers.
  2. Asynchronous replication, e.g. rsync every 15 minutes (or less) with a cron job.
  3. Synchronous replication at the file system level (e.g. GlusterFS) or the block device level (e.g. DRBD). If you replicate at the block device level, you need a file system that supports distributed locking (e.g. OCFS2 or GFS2) if you want read/write access to the files from both servers at the same time.

cron + rsync = mirrored directories/files


Depending on your specific use case, you could use something similar to DRBD: http://www.drbd.org/