What could cause files to end up zeroed out?

Solution 1:

It seems the metadata was written correctly, so the files appear in the directory tree with their names, access modes and so on, but the data itself is corrupt (it never reached the media).

How this is possible depends on the file system, mount options, caching modes for the drive and so on.

Let's take ext4 as an example, where this is relatively easy to trigger. The default mount options (data=ordered) journal metadata only, so the file system generally guarantees that the on-disk structures will be consistent in any case: after a crash, everything looks either as if the operation never touched the drive or as if it was applied completely, much like a transaction in an ACID database. But the data itself isn't journalled by default. So it is possible that the system completed the system call, reported success to the application, and created all the necessary structures (so far only in the journal), while the data is still sitting in the cache... and now the power is cut. When you power the system up again and mount the volume, the file system driver replays the journal and the files appear, but their contents may be whatever was left in those blocks from previous use. That leftover content could well be zeros. In the end, cutting the power during a write is likely to produce zero-filled files. I'd expect the same result when unplugging the drive too early (for example, pulling out the USB cable).
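
To illustrate the gap between "the system call succeeded" and "the data reached the media", here is a minimal Python sketch of a write that explicitly asks the kernel to flush the data before returning. The function name durable_write is hypothetical, and the directory fsync assumes a POSIX system. Even this is not an absolute guarantee: a drive or USB bridge that acknowledges writes while they are still in its own cache can lose them anyway.

```python
import os


def durable_write(path: str, data: bytes) -> None:
    """Write data to path and ask the kernel to push it to the device.

    Hypothetical helper for illustration only; assumes a POSIX system.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # Without this call the data may sit in the page cache for a while,
        # even though every write above already reported success.
        os.fsync(fd)
    finally:
        os.close(fd)

    # Flush the directory as well, so the new directory entry is persisted.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Most applications, including ordinary file copy tools, do not do this for every file, which is why a copy can look finished while the data is still only in memory. For a removable drive, the practical equivalent is to run sync, or to unmount or "safely remove" the drive, before unplugging it.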

This unplugging scenario is quite likely, given that you're talking about an external drive. The same thing can certainly happen with other file systems too.