How to make my own Dropbox / Ubuntu One server at home?

Solution 1:

There are actually lots of them.

  • SparkleShare (deps: git/subversion, mono, python), hosted on GitHub. GUI-based sync software.

    a. Versioning: through a source control system, hence it's mutex-based on a central server through a version number.

    b. State: under development

    c. Pros: OSS, Mono-based so easily moddable. Cons: user-level process, GC-dependent, a sharing protocol that is inefficient by orders of magnitude since git is primarily meant for small text files, fairly hard to compile (I tried). Uses high-level tools.

  • lipsync (deps: Unison, rsync) Command-line service-based software.

    a. Versioning: through the rsync delta algorithm. I assume the programmer must choose the conflict resolution.

    b. State: I can't find its source code, so I have no idea. The only things in the author's git repo are binaries.

    c. Pros: nice setup, using middle-level tools.

  • iFolder - Novell's Dropbox. I haven't studied its source yet. I just want to get this edit over with and if people are interested I'll add more.

    a. Versioning:

    b. State: Problematic getting it to even compile on Ubuntu, let alone packages. Here's a detailed install guide.

    c. Pros: Windows x64 client, mature, AD integration with ACLs, features no other project has started to implement. I think this might be a good starting point. Cons: Novell might not use its public svn repo as the primary repo and might only do code drops; I don't know exactly about this though. It might be too coupled to openSUSE to install easily on Ubuntu. Its algorithms are still to be checked out.

  • scp/rcp - deprecated in favor of rsync

  • DRBD - block device mirroring tools for distributed RAID-1, i.e. a server variant of Dropbox. I haven't checked out its source code yet, but it's Linux only. The actual algorithm would probably be easy to combine with the source code in my musings below this software listing.

    a. Versioning: internal message format over LAN/WAN

    b. State: seems mature enough

    c. Pros: stable enough for Linux. Cons: no other operating systems are supported.


Right now I'm investigating how to improve compile times on a virtualized Windows 7: a compile takes 40 s on Windows 7 on bare metal, but approx. 3 min 20 s virtualized. I'm thinking of writing an ioctl driver that is a write-through cache that looks like a ram-disk for selected folders on NTFS.

I think a week of full-time development by 2-3 people, combining the software above, would produce a usable alpha that doesn't lose your files.


On my system, then, the general idea would be:

  1. Mount a virtual drive \?{GUID}, that is the ram-disk and RW-cache. The software creating this virtual drive takes two input parameters (that are vital):

    a. The target folder; this is the SMB folder, so I will be letting the operating system's network stack handle the actual IO. In my case this is in turn the VMware virtual folder, which itself has a target on an ext4 drive, but it could just as easily be your file server using Samba/SMB.

    b. The path of the folder to be mounted, e.g. C:\ramdisk

    The code for creating virtual volumes could be taken from TrueCrypt's source, in /Driver/DriverFilter.c (among other files).

  2. The drive uses the SMB/VMware/network protocol to fetch data when it starts; it fetches asynchronously from the network at a low task priority and fills its cache. It could use a simple compacting algorithm and a single thread that uses message-box-style continuation passing to get great performance. On Windows it could use the normal async IO calls, and on Linux it could use epoll/inotify and borrow code from nginx.

  3. My service that is the ram-disk mounts the unnamed ramdisk drive as an NTFS folder. All programs can continue writing to C:\ramdisk, or whatever I call it.

  4. The async copy from the network is still going on. With a read rate of approx. 100 MiB/s and a 2 GiB ramdisk, it would take about 20.5 s to read all the data.
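The 20.5 s figure in step 4 is just cache size divided by read rate. A minimal sketch of that arithmetic, assuming a constant read rate and binary units (the function name `fill_time_seconds` is made up for illustration):

```python
MIB = 1024 ** 2  # bytes in one MiB
GIB = 1024 ** 3  # bytes in one GiB


def fill_time_seconds(cache_bytes: int, read_rate_bytes_per_s: float) -> float:
    """Time to read the whole cache once, assuming a constant read rate."""
    return cache_bytes / read_rate_bytes_per_s


# 2 GiB ramdisk filled at roughly 100 MiB/s, as in step 4.
seconds = fill_time_seconds(2 * GIB, 100 * MIB)
print(f"{seconds:.2f} s")  # 20.48 s
```

A real network share will not hold a constant rate, so this should be read as a lower bound rather than a prediction.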

Each call to read would perform an in-CPU calculation of the index into a fixed n:ulong GiB max sized array. It would require conflict resolution or read-write locks, though. If we implemented a conflict-resolution algorithm like those available through Microsoft Sync, we could pass each conflicting chunk as a message to a separate conflict-resolution process.

Dropbox solves it by creating a new file and naming it "PrevFileName Username's Conflicted Copy (yyyy-MM-dd).ext". Perhaps this could be altered through a small widget, if one is compiling against that single source: the widget would detect outstanding changes as messages/events and choose the conflict-resolution protocol. As such, when programming against a folder in exclusive mode, the Windows VM could set the widget to 'exclusive'.
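To make the naming scheme concrete, here is a small sketch that builds a file name following the "PrevFileName Username's Conflicted Copy (yyyy-MM-dd).ext" pattern quoted above. The function name and its parameters are hypothetical, and this only mimics the pattern as described here; it is not Dropbox's actual code.

```python
from datetime import date
from pathlib import PurePath


def conflicted_copy_name(filename: str, username: str, day: date) -> str:
    """Build a conflicted-copy name in the pattern described in the text.

    Hypothetical helper: only the last extension is treated as the suffix,
    so "archive.tar.gz" keeps ".gz" at the end.
    """
    path = PurePath(filename)
    return f"{path.stem} {username}'s Conflicted Copy ({day:%Y-%m-%d}){path.suffix}"


print(conflicted_copy_name("report.txt", "Alice", date(2011, 3, 5)))
# report Alice's Conflicted Copy (2011-03-05).txt
```

Keeping the extension last means the duplicate still opens in the same program as the original.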

This would have these pros:

  • It would be non-blocking / async
  • It would assume, but not require, that one computer does most of the writing to the files.
  • It would work for arbitrarily large files
  • It would work on *nix and Windows by tying together the mentioned projects.
  • It would work when high read-performance is needed (i.e. the files are physically located on disk)
  • When the conflicting events are reached, one could provide a user interface app that allows the user to write/download plugins that act sanely for different sorts of events -- i.e. different sorts of files. E.g. a text file could be brought up with Kompare/WinDiff, while a binary would be duplicated and saved as another file.
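The plugin idea in the last point could look roughly like the sketch below: a table maps file extensions to handlers, text files get a diff, and anything else falls back to keeping both copies. All names here (`HANDLERS`, `resolve`, the `.conflict` suffix) are assumptions for illustration, and Python's `difflib` stands in for an external tool such as Kompare/WinDiff.

```python
import difflib
from pathlib import PurePath
from typing import Callable, Dict

# A handler receives the file name plus both versions and returns a
# human-readable description of what it did (hypothetical interface).
Handler = Callable[[str, bytes, bytes], str]


def text_handler(name: str, local: bytes, remote: bytes) -> str:
    """Produce a unified diff of the two versions for the user to review."""
    diff = difflib.unified_diff(
        local.decode("utf-8", errors="replace").splitlines(),
        remote.decode("utf-8", errors="replace").splitlines(),
        fromfile=f"{name} (local)",
        tofile=f"{name} (remote)",
        lineterm="",
    )
    return "\n".join(diff)


def binary_handler(name: str, local: bytes, remote: bytes) -> str:
    """Fallback: keep both versions by duplicating the file."""
    return f"keep both: {name} and {name}.conflict"


# Plugins would register themselves here, keyed by file extension.
HANDLERS: Dict[str, Handler] = {".txt": text_handler, ".md": text_handler}


def resolve(name: str, local: bytes, remote: bytes) -> str:
    """Dispatch a conflict to the handler registered for the file type."""
    suffix = PurePath(name).suffix.lower()
    return HANDLERS.get(suffix, binary_handler)(name, local, remote)


print(resolve("notes.txt", b"a\n", b"b\n"))
print(resolve("photo.jpg", b"\x00", b"\x01"))
```

Registering handlers by extension keeps the core sync loop ignorant of file types; a downloaded plugin would only need to add an entry to the table.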

Solution 2:

Currently there's no great open source alternative that's going to work out of the box. The best thing to keep an eye on is the SparkleShare project: http://www.sparkleshare.org/

Hopefully that will grow into a great do-it-yourself alternative.

Solution 3:

ownCloud sounds like what you're looking for.

Solution 4:

I heard about Syncany on the Ubuntu UK Podcast. It's currently in beta, but it looks like it meets the requirements.