How to synchronize the home folder between multiple computers?

Solution 1:

Here's a list of things that could potentially solve this problem. Each balances the trade-offs differently, so you'll have to make your own choices and try things out for yourself:

  • Unison - as mentioned by others, this is run manually, but it is very fast, reliable and effective. It requires both machines being synchronised to be on at the same time. It has a nice user interface to help you deal with the almost inevitable conflicts, and it tracks and propagates deletions correctly. The graphical app/package is called unison-gtk. (There's a short usage sketch after this list.)

  • OwnCloud - cloud storage run on your own server, so you'll need a machine to leave on. Requires a reasonable amount of setup: it runs a full Apache 2 webserver and an SQLite or MySQL database on the server. Works similarly to Dropbox with a desktop client, but the server is under your control. Edit: OwnCloud has recently gone through some changes in how the project is run, and there is now a fully open-source fork (i.e. no closed-source 'enterprise' edition) called NextCloud (see this YouTube interview with the original OwnCloud developer for more details).

  • SparkleShare - uses git to keep files in sync. According to the homepage: good for many smaller files, but not good for lots of large files such as music or photo collections.

  • Seafile - provides a server component you can install on a local machine. Seafile uses a data model similar to git for tracking changes, and offers sync clients for desktops, tablets and smartphones. A blog post describing setup can be found at http://openswitch.org/blog/2013/07/18/installing-and-configuring-seafile-on-ubuntu-12-dot-04/

  • Osync - "... bidirectional file synchronization tool written in bash and based on rsync. It works on local and / or remote directories via ssh tunnels. It's mainly targeted to be launched as cron task" (text from the website)

  • PowerFolder - Java-based GPL v2 project. The main website pushes commercial offerings, so it's not clear how to use the provided .jar file.

  • Rsync - fast, effective and has been around for decades; however, it doesn't keep any history, so you have to choose a sync direction: it can't tell whether a file was deleted on one side or newly created on the other. Graphical tools are available, such as gwRsync. (See the usage sketch after this list.)

  • Lsyncd - monitors folders/files and triggers rsync replication when they change.

  • dvcs-autosync - written in Python; uses git to store and share changes between machines, and XMPP to communicate changes.

  • git-annex - command line tool for shunting files around, based on git. There's an illustrative walkthrough here: http://git-annex.branchable.com/walkthrough/

  • Tonido - freeware. Provides a desktop app that will share files to other devices. Also provides commercial cloud offerings and the TonidoPlug plug computer.

  • BitTorrent Sync (freeware) - peer-to-peer file sync based on BitTorrent. I don't know much about it, as I won't be using it: it isn't open source and I don't trust it to keep my data within my LAN. Feel free to edit this answer with better information / real experiences.

  • SyncThing - Developed as an open source alternative to BitTorrent Sync. It currently lacks some of the advanced features of BitTorrent Sync, such as untrusted peers. It is under active development.

  • Commercial hosted services such as Dropbox, Ubuntu One, Google Drive and Apple iCloud are all quick, cheap and convenient; however, they all require trusting a company with all your data, and need a reasonably fast internet connection.

  • Git / Subversion - use a source control system directly. Completely manual and can be a little complex, but a popular approach with users who already know these systems from using them as programming tools.

  • CloudFS - synchronises a whole filesystem; based on cluster technology.

  • NFS mount - basically your home folder lives on one machine and you access it over the network; no good for laptops you take with you. More info: http://www.linuxjournal.com/article/4880
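
To give a feel for the manual tools above, here's a minimal sketch of pairwise syncs with Unison and rsync (the hostname desktop, the user alice and the paths are placeholders of mine; both machines need the tool installed):

    # Unison: two-way sync of a pair of replicas over ssh.
    # It prompts interactively on conflicts; add -batch for unattended runs.
    unison ~/Documents ssh://desktop//home/alice/Documents

    # Rsync: one-way push only -- you have to pick a direction.
    # -a preserves permissions and timestamps, --delete removes files on
    # the destination that no longer exist on the source (use with care).
    rsync -av --delete ~/Documents/ desktop:/home/alice/Documents/

Note the trailing slashes on the rsync paths: they make rsync copy the contents of the folder rather than the folder itself.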


Factors to consider in making your decision:

  • Central server - some solutions require a machine to be on all the time (or at least when you need to synchronise) for other machines to synchronise with. This could be one of your existing machines, or a separate machine such as a NAS. Watch out for increased power bills.

  • Automatic / Manual / Scheduled - the best way to avoid having to resolve conflicts, where something has changed on more than one machine, is to have a program on every machine that watches for changes and synchronises immediately; this reduces the opportunity to end up with multiple versions. With manual processes you always have to remember to run the synchronisation. (A sketch of a scheduled run follows this list.)

  • Remote access - do you want to synchronise while away from your LAN (i.e. home)? If so, think about the security implications.

  • Security - does your data leave your network encrypted? How secure is the transfer between machines? What if someone captures your data in transit and the encryption is later found to have flaws? Who controls the server that holds your data, is the data encrypted at rest, and can you trust any third parties? Do you have to poke holes in your router to get remote access? How long do 'deleted' files and related metadata stick around on the synchronised devices and on the central server? Are you synchronising between encrypted and unencrypted storage?

  • Moving large folders - all the solutions I've tried share a problem: when you move or rename a file or folder, the sync doesn't recognise the move, uploads it all over again as new, and then deletes the old copy.

  • Disk capacity

  • Backups - synchronisation is not backup. Delete an important file by mistake and many of the above will merrily delete all your other copies. I recommend reading Mat Honan's piece on being hacked for a good account of what can happen when you put all your digital eggs in one digital basket, so to speak.
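
As an example of the scheduled option, here's a sketch of a crontab entry that pushes changes one way every hour with rsync (host, user and paths are placeholders; remember that a one-way push like this will happily propagate a mistaken deletion):

    # Edit your crontab with: crontab -e
    # At minute 0 of every hour, push ~/Documents to the other machine and log the output.
    0 * * * * rsync -a --delete /home/alice/Documents/ desktop:/home/alice/Documents/ >> /home/alice/.sync.log 2>&1

For this to work unattended you'll need passwordless ssh keys set up between the machines.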


I recommend not syncing the entire home folder, but instead picking specific folders to sync, such as Documents/, Pictures/ etc. This avoids the pain of dealing with the speed / performance / disk space issues of automatically synchronising everything, and it also avoids having to maintain exclusion lists. (See the sketch below for one way to do this with Unison.)
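
As a concrete example, Unison profiles support path directives that restrict a sync to chosen subfolders of the roots. A minimal sketch (the profile name, hostname, user and paths are placeholders of mine):

    # Create a profile at ~/.unison/selected.prf
    cat > ~/.unison/selected.prf <<'EOF'
    # Two replicas of the home folder...
    root = /home/alice
    root = ssh://desktop//home/alice
    # ...but only these subfolders are synchronised.
    path = Documents
    path = Pictures
    EOF

    # Run it by profile name (add -batch to skip the interactive prompts):
    unison selected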

As I continue to try to find something that works for me personally, I'll try to keep this answer up to date with useful info. I've aggregated the information from all the other answers into one complete answer.

References:

  • Linux Format, February 2014, LXF180, p31, "Hosted Storage Roundup"


Solution 2:

Unison might be a good candidate:

Unison is a file-synchronization tool for Unix and Windows. It allows two replicas of a collection of files and directories to be stored on different hosts (or different disks on the same host), modified separately, and then brought up to date by propagating the changes in each replica to the other.

It already does two-way syncs. See the update below.

I've learnt that there are very few things rsync cannot do, and it can probably provide an equal or better solution, but you'll have to wait for an rsync expert to turn up for that.

Update: Yes, Unison can sync more than 2 machines. From their user manual:

Using Unison to Synchronize More Than Two Machines

Unison is designed for synchronizing pairs of replicas. However, it is possible to use it to keep larger groups of machines in sync by performing multiple pairwise synchronizations.

If you need to do this, the most reliable way to set things up is to organize the machines into a “star topology,” with one machine designated as the “hub” and the rest as “spokes,” and with each spoke machine synchronizing only with the hub. The big advantage of the star topology is that it eliminates the possibility of confusing “spurious conflicts” arising from the fact that a separate archive is maintained by Unison for every pair of hosts that it synchronizes.
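
As a rough illustration of the star topology, each spoke machine only ever syncs with the hub. A minimal sketch (the hostname hub, the user alice and the path are placeholders; the hub must be reachable over ssh):

    # Run on every spoke machine (laptop, workstation, ...):
    # a two-way sync against the hub, never spoke-to-spoke.
    unison -batch ~/Documents ssh://hub//home/alice/Documents

Because each spoke only ever pairs with the hub, Unison maintains one archive per spoke-hub pair and the "spurious conflicts" mentioned above can't arise.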