Why is data so spread out on partitions that evolved naturally?

I've been doing work on VMs and porting operating systems recently, and I've noticed something I've actually known for a while: data written by operating systems seems to be allocated somewhat randomly and ends up spread throughout the volume. Here is an example of what I mean:

[Image 1: Normal OS Allocation]

By comparison, here's what this same volume looks like as a partition image (with only the data retained):

[Image 2: Virtual Volume Allocation]

So I'm wondering:

  1. What is the benefit to having such large separation in the data?
  2. How do computers/operating systems decide where a set of files will go?
  3. Is having data tightly packed in each "sector" a bad thing?

Off the top of my head, I'd say #1 is to leave room for changes so related data stays close together (more important for HDDs than NAND, I'd imagine). For #2, I have no idea beyond MBR and similar standards. #3 I'm not sure about... I've had no problems so far (and virtual drives seem to pack data like this by default). I'd love to hear more on the topic and learn why physical drives end up looking like Image 1.


Solution 1:

By comparison, here's what this same volume looks like as a partition image (with only the data retained):

That looks unusual even for a partition image. It looks like the result of an overzealous "defragment" tool, but it's nothing that is inherently image-specific or VM-specific.

(I would expect that even an imaging tool that only retains the data would instead create a "sparse" image where the data remains exactly where it originally was – disk image formats already support efficiently storing gaps/holes at any location, there doesn't seem to be any advantage in packing all files like this before imaging.)
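For illustration, here's a minimal sketch (Python on Linux/macOS; the file name and sizes are made up, and it assumes a filesystem that supports sparse files) of how such a hole works: writing at a far offset leaves a gap that is never actually allocated, so the apparent size and the allocated size differ wildly.

    import os

    path = "sparse_demo.img"  # placeholder name

    with open(path, "wb") as f:
        f.write(b"header data")          # a little real data at the start
        f.seek(1024 * 1024 * 1024)       # jump 1 GiB forward without writing anything
        f.write(b"data at the far end")  # more real data after the hole

    st = os.stat(path)
    print(f"apparent size: {st.st_size} bytes")          # roughly 1 GiB
    print(f"allocated    : {st.st_blocks * 512} bytes")  # only a few kB

    os.remove(path)

Image formats such as qcow2 or VHDX do much the same thing at the container level, so a data-only image has no need to relocate anything.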

What is the benefit to having such large separation in the data?

Reduced fragmentation. If a file needs to grow, and all extents are tightly packed next to each other, the additional data has to go somewhere else – you soon end up with the file being highly fragmented (consisting of many small extents), which is quite bad for performance on mechanical disks with their non-zero seek time.
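To make that concrete, here's a toy allocator simulation (Python; this is not how any real filesystem allocates, and all block counts are invented) that grows a handful of files a few times, once with the files packed back-to-back and once with free space left after each one, then counts the extents per file:

    def simulate(gap_blocks, n_files=10, initial=100, growth=50, rounds=3, disk=10_000):
        free = [True] * disk   # True = free block
        files = []             # each file = list of (start, length) extents

        # Initial layout: files placed one after another, optionally leaving a gap.
        pos = 0
        for _ in range(n_files):
            files.append([(pos, initial)])
            for b in range(pos, pos + initial):
                free[b] = False
            pos += initial + gap_blocks

        # Grow every file a few times. Try to extend its last extent in place;
        # if the next block is taken, allocate a new extent in the first free run.
        for _ in range(rounds):
            for extents in files:
                need = growth
                start, length = extents[-1]
                while need and start + length < disk and free[start + length]:
                    free[start + length] = False
                    length += 1
                    need -= 1
                extents[-1] = (start, length)
                while need:
                    b = free.index(True)
                    run = 0
                    while b + run < disk and free[b + run] and run < need:
                        free[b + run] = False
                        run += 1
                    extents.append((b, run))
                    need -= run

        avg = sum(len(e) for e in files) / len(files)
        print(f"gap={gap_blocks:4d} blocks -> average extents per file: {avg:.1f}")

    simulate(gap_blocks=0)    # tightly packed: growth has to land somewhere else
    simulate(gap_blocks=200)  # free space after each file: growth stays contiguous

In the packed layout every growth step has to go elsewhere, so each file ends up split into several extents; with room left behind each file, the growth stays contiguous.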

How do computers/operating systems decide where a set of files will go?

The partition table (such as MBR or GPT, or sometimes a more complex scheme such as LVM) is used to divide the disk into fixed-size areas (partitions) and doesn't really have any effect beyond that.
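As a rough illustration of how little the partition table knows, here's a sketch (Python; it assumes a classic MBR with 512-byte sectors, and the image path is a placeholder) that reads the four primary partition entries. Each entry is just a partition type, a starting LBA and a sector count; nothing in it says anything about file placement.

    import struct

    IMAGE = "disk.img"  # placeholder path to a raw disk image

    with open(IMAGE, "rb") as f:
        sector0 = f.read(512)

    assert sector0[510:512] == b"\x55\xaa", "no MBR boot signature"

    # The four 16-byte partition entries live at offset 446.
    for i in range(4):
        entry = sector0[446 + 16 * i : 446 + 16 * (i + 1)]
        ptype = entry[4]
        lba_start, num_sectors = struct.unpack_from("<II", entry, 8)
        if ptype == 0:
            continue  # empty slot
        print(f"partition {i + 1}: type=0x{ptype:02x}, "
              f"start=LBA {lba_start}, size={num_sectors * 512 // 2**20} MiB")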

The filesystem (such as NTFS or ext4) controls how data is stored in a given partition. It is the filesystem's job to keep track of the metadata (file tree) and of the data allocations.

Each filesystem has its own format for its data structures (e.g. storing file data as extents vs. cluster chains vs. indirect blocks), and its own logic for deciding where new blocks are allocated. For example, the Linux ext4 filesystem uses the so-called Orlov allocator algorithm.
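If you want to see a real allocator's decisions, one option on Linux (a sketch; it assumes the filefrag tool from e2fsprogs is installed, and the file name is a placeholder) is to ask for a file's extent list:

    import subprocess

    # Prints the physical extents the filesystem chose for this file.
    subprocess.run(["filefrag", "-v", "somefile.bin"], check=False)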

Most filesystems update existing files in place, while some always allocate new blocks even when overwriting existing areas (copy-on-write allocation). Most filesystems also work directly at cluster granularity (a power-of-two multiple of the disk sector size, e.g. 4 kB or 16 kB), but the Btrfs filesystem first allocates disk space in 1 GB "chunks" and places file data only within the already-allocated chunks.
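A quick way to see cluster granularity in action (Python sketch for Linux/macOS; results vary, since some filesystems store tiny files inline in their metadata, and st_blksize is only a hint, not necessarily the real cluster size):

    import os

    path = "one_byte_file"  # placeholder name
    with open(path, "wb") as f:
        f.write(b"x")

    st = os.stat(path)
    print(f"file size       : {st.st_size} byte(s)")
    print(f"space allocated : {st.st_blocks * 512} bytes")  # typically one full cluster
    print(f"reported blksize: {st.st_blksize} bytes")

    os.remove(path)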

The mechanism may even differ between implementations. For example, although an NTFS filesystem obviously uses the same data structures on Windows and on Linux (e.g. the MFT and the free-space bitmap), as that's what defines "NTFS", the logic for allocating new sectors will differ between the Windows ntfs.sys driver and the Linux ntfs-3g one.

Is having data tightly packed in each "sector" a bad thing?

For static files, no, but it probably doesn't give you a great advantage either.

(Not all filesystems support packing multiple files' data into the same "sector", however. Most likely what you're seeing isn't 100% packed: there are probably small sub-cluster gaps after every file whose size isn't a round number of clusters, which is why the graph bars don't always reach the top.)
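If you're curious how much those sub-cluster gaps add up to, here's a back-of-the-envelope sketch (Python; the 4 kB cluster size and the scanned directory are assumptions, so adjust them to your filesystem):

    import os

    CLUSTER = 4096  # assumed cluster size
    ROOT = "."      # directory to scan; placeholder

    total_size = total_slack = 0
    for dirpath, _, names in os.walk(ROOT):
        for name in names:
            try:
                size = os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                continue
            total_size += size
            # space left unused in the last, partially filled cluster of the file
            total_slack += (-size) % CLUSTER

    print(f"file data: {total_size / 1e6:.1f} MB")
    print(f"slack    : {total_slack / 1e6:.1f} MB at {CLUSTER}-byte clusters")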