Sync/mirror directory with Rackspace Cloud files bucket

Solution 1:

The easiest way to sync a local directory to Rackspace Cloud Files is with the command-line tools provided by the OpenStack Swift project. On Ubuntu, the tools can be installed with: apt-get install python-swiftclient

Then, assuming you are in the directory you want to upload, run the following command in the terminal:

$ swift -A https://auth.api.rackspacecloud.com/v1.0 -U <username> -K <api-key> upload <containername> . --changed

This will recursively upload the files from your current directory to the <containername> container, saving time by uploading only changed files. You need to supply the <username> you use to log in to the Cloud Control Panel and the <api-key> available under Account / Account Settings in the same control panel.

Attention: If you pass a relative or absolute path instead of ., swift uploads the files into the container under the pseudo-path given on the command line. So if you sync /var/www/test instead of ., the files end up under the /var/www/test pseudo-path of the container, which is most likely not what you want.
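One way to avoid the pseudo-path problem from a script is to change into the directory first and still upload ".". The sketch below is only an illustration: sync_to_container is a hypothetical helper name, and RS_USER and RS_API_KEY are assumed environment variables holding your <username> and <api-key>.

```shell
#!/bin/sh
# Hypothetical helper: upload the contents of a directory so that object
# names stay relative to it instead of carrying the full path.
# Assumes RS_USER and RS_API_KEY are set by the caller.
sync_to_container() {
    src_dir=$1      # local directory to upload
    container=$2    # target Cloud Files container
    # Run in a subshell so the caller's working directory is left unchanged.
    (
        cd "$src_dir" || exit 1
        swift -A https://auth.api.rackspacecloud.com/v1.0 \
              -U "$RS_USER" -K "$RS_API_KEY" \
              upload "$container" . --changed
    )
}
```

It would be called as, for example, sync_to_container /var/www/test <containername>.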

Solution 2:

For Linux I found this old project: http://code.google.com/p/cloudfiles-sync/wiki/Instructions

For Windows there is this GUI tool: http://www.cloudberrylab.com/free-openstack-storage-explorer.aspx

I also just found this tool that might let you mount cloud files storage: http://smestorage.com/?p=static&page=LinuxDrive

Solution 3:

You can use the Rackspace Cloud Files FUSE module (http://www.rackspace.com/knowledge_center/article/mounting-rackspace-cloud-files-to-linux-using-cloudfuse) to create a mountable file system, but beware of the following caveats:

  • when running rsync against the mount, use something like --size-only to determine if the file was fully written, not -a or anything like that, since setting permissions and times is not supported
  • using --bwlimit is not going to work, because the module caches writes in a temp file in memory, then eats up all the bandwidth when uploading; I'm conducting an experiment using the trickle utility to see if that helps
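
As a rough sketch of the first caveat, an rsync run against the mounted file system could look like the following. mirror_to_mount is a hypothetical helper name, and the mount location (something like /mnt/cloudfiles/<containername>) is an assumption that depends on how you mounted CloudFuse.

```shell
#!/bin/sh
# Hypothetical helper: mirror a local directory into a directory on the
# CloudFuse mount without trying to preserve permissions or times.
mirror_to_mount() {
    src_dir=$1      # local directory to mirror
    mount_dir=$2    # directory inside the mount (assumed location)
    # -r recurses without the permission and time preservation that -a implies;
    # --size-only skips files whose size already matches on the mount.
    # The trailing slash on the source copies its contents, not the directory itself.
    rsync -r --size-only "${src_dir%/}/" "${mount_dir%/}/"
}
```

Bandwidth limiting is deliberately left out here, since --bwlimit does not help for the reason given above.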

Solution 4:

Another option, as an alternative to CloudFuse for mounting Rackspace Cloud Files as a volume to run rsync against, is to run Caimito as a Cloud-Files-to-WebDAV bridge.

http://caimito.ngasi.com/

Then, while you could conceivably try to employ the Fuse DAV2 filesystem plugin with rsync and a bunch of special command-line options to get files up to Cloud Files via the bridge, I instead recommend "sitecopy", which at its core isn't terribly dissimilar to Unison.

https://www.howtoforge.com/maintaining-remote-web-sites-with-sitecopy-debian-squeeze-ubuntu-11.10

Sitecopy does a GREAT job of pushing files via WebDAV to its target (even if our target is a fronted emulation layer to Cloud Files). This is because "sitecopy" maintains a local database of remote-end file metadata that makes for speedy batch comparisons vs rsync.
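
For orientation, a sitecopy site definition for such a bridge might look roughly like the sketch below. Everything specific in it is a placeholder rather than something taken from Caimito's documentation: the site name cloudfiles, the host localhost, port 8080, the credentials and both paths are assumptions, and write_sitecopy_rc is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: write a minimal sitecopy rc file for a WebDAV target.
# All values in the site block are placeholders; adjust them to your bridge.
write_sitecopy_rc() {
    rc_file=$1
    cat > "$rc_file" <<'EOF'
site cloudfiles
  server localhost
  port 8080
  protocol webdav
  username <username>
  password <password>
  local /var/www/test
  remote /<containername>/
EOF
    # sitecopy expects the rc file to be readable by its owner only.
    chmod 600 "$rc_file"
}

# Typical first run (assumes the ~/.sitecopy directory exists with mode 700):
#   write_sitecopy_rc ~/.sitecopyrc
#   sitecopy --init cloudfiles     # treat the remote side as empty
#   sitecopy --update cloudfiles   # upload new and changed files
```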

Caimito was surprisingly stable and easy to install and configure despite its Java roots.

You might conclude that using "swift" (mentioned above) is a more direct route to the solution, but this approach gives you a few more places to probe, dissect, debug and control the data flow.