How does Dropbox version/upload large files? [closed]
I have a free Dropbox account (2GB), and I was wondering how the versioning of large files works.
I have a full backup of all my web files that sits at just over 1GB. After the initial 1GB upload, will Dropbox figure out the delta of the file every time it syncs, or will it have to upload the entire thing again to version it?
It would be cool to always have an up-to-date version of a large file, but I don't want to kill my bandwidth uploading 1GB every time.
Is this possible?
Thanks,
Dropbox uses a binary diff algorithm that breaks every file down into blocks and uploads only the blocks it doesn't already have in the cloud. All of this is worked out locally on your computer.
Dropbox doesn't just compare against the files you have already uploaded; it aggregates everyone's files into one database of blocks and checks each local block hash against that database.
This means that if someone else has already uploaded the same file as you (say, the latest Ubuntu ISO), the upload will seem instant because there is nothing to upload. If you are updating a file that changes regularly, like your backup file, then only the changes are uploaded. If you upload a totally unique file, you have to wait for all of it to upload.
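To make the block idea concrete, here is a minimal sketch of block-level deduplication. It is not Dropbox's actual code: the tiny block size, the use of SHA-256, the in-memory `store` set standing in for the server-side database, and the function names are all assumptions made for illustration.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical tiny block size (bytes) so the example is easy to follow


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list[bytes]:
    """Cut the data into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def block_hash(block: bytes) -> str:
    """Hash one block; SHA-256 is an assumption, any strong hash would do here."""
    return hashlib.sha256(block).hexdigest()


def blocks_to_upload(data: bytes, known_hashes: set[str]) -> list[bytes]:
    """Return only the blocks whose hashes the store has not seen, and record them."""
    missing = []
    for block in split_blocks(data):
        digest = block_hash(block)
        if digest not in known_hashes:
            known_hashes.add(digest)
            missing.append(block)
    return missing


store: set[str] = set()  # stands in for the shared database of block hashes

first = blocks_to_upload(b"AAAABBBBCCCCDDDD", store)   # brand-new file: every block is sent
second = blocks_to_upload(b"AAAABBBBCCCCDDDD", store)  # same file again: nothing is sent
third = blocks_to_upload(b"AAAAXXXXCCCCDDDD", store)   # one block changed: only that block is sent
```

In this sketch the second call models the "instant" upload of a file someone else already stored, and the third models a regularly changing backup where only the modified block travels over the wire.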
For what it's worth, Dropbox claims to hash each file in 4MB blocks. That way, if you change a contiguous 2MB of a 100MB file, it will likely only need to upload 4MB (or 8MB if the change crosses into a second 4MB block) to re-sync the file.
The hashes we use are only for the 4MB file chunks
Source: https://blogs.dropbox.com/tech/2016/05/inside-the-magic-pocket/
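The 4MB-versus-8MB arithmetic can be checked with a short sketch. It assumes fixed 4MB blocks aligned to the start of the file and an in-place overwrite that does not shift the rest of the file; the function names are hypothetical and this is not Dropbox's implementation.

```python
MB = 1024 * 1024
BLOCK_SIZE = 4 * MB  # block size quoted above


def blocks_touched(offset: int, length: int, block_size: int = BLOCK_SIZE) -> range:
    """Indices of the fixed-size blocks overlapped by an in-place edit."""
    if length <= 0:
        return range(0)
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return range(first, last + 1)


def upload_bytes(file_size: int, offset: int, length: int,
                 block_size: int = BLOCK_SIZE) -> int:
    """Bytes that must be re-sent if every touched block is uploaded whole."""
    total = 0
    for index in blocks_touched(offset, length, block_size):
        start = index * block_size
        end = min(start + block_size, file_size)  # the final block may be short
        total += end - start
    return total


# A 2MB change that fits inside one block of a 100MB file: one block is re-sent.
aligned = upload_bytes(100 * MB, 8 * MB, 2 * MB)

# The same 2MB change starting at 3MB straddles blocks 0 and 1: two blocks are re-sent.
straddling = upload_bytes(100 * MB, 3 * MB, 2 * MB)
```

Under these assumptions `aligned` comes to 4MB and `straddling` to 8MB, matching the figures in the answer.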
It's also important to highlight that Dropbox doesn't upload your whole file again when you change it. For example, say you have a unique 2GB file, such as an encrypted disk container (like the ones TrueCrypt or PGPdisk create), and you change just a couple of files inside the encrypted disk. Dropbox will only upload the blocks that actually changed. So if you upload your 2GB PGPdisk file to Dropbox and then change, say, 100MB of it, Dropbox is smart enough to detect and upload only what has changed, and you don't waste upload bandwidth on data that is already there.
Another feature I saw the Dropbox team working on is having Dropbox detect other instances of Dropbox running on your local network and sync data between them directly. For example, say you have a laptop and a desktop on the same Dropbox account, and you update your files on the desktop, which syncs with the "cloud" right away. When you plug your laptop in, instead of going to the cloud, Dropbox will download the diff directly from your desktop computer and won't waste your download bandwidth. This is still to come, but it will be a sweet feature!