Is a file system just the layout of folders?

Just the layout of the folders?

Sounds too good to be true...

Let's take the FAT32 file system as an example. I can install Windows XP on it, but I can also use it on a memory card. On a memory card, you don't have the folders you listed.

So... Don't confuse the directory layout of a family of operating systems with a file system.

Is this what a file system means?

No... It refers to the underlying bits and bytes that make your directory structure work.

The underlying bits and bytes? Show me FAT32!

Let's look at what FAT32 looks like. It has:

  • Some header sectors at the beginning, like the Volume ID and Reserved Sectors.
  • Two File Allocation Tables, allowing us to figure out where our files are.
  • Clusters containing all our directory and file data.
  • A very small amount of leftover space that we can't use.
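
To make the layout above a bit more concrete, here is a minimal Python sketch that works out where each region starts from the first header sector. The field offsets are the ones from the published FAT32 boot sector layout; the helper names (fat32_layout, cluster_to_sector, make_demo_boot_sector) and the demo values are made up for illustration, and a real volume would need more validation than this.

```python
import struct


def fat32_layout(boot_sector):
    """Work out where the FAT32 regions start, in sectors from the volume start."""
    # Offsets 11..16: bytes per sector, sectors per cluster,
    # reserved sector count, number of FATs.
    bytes_per_sector, sectors_per_cluster, reserved_sectors, num_fats = (
        struct.unpack_from("<HBHB", boot_sector, 11)
    )
    # Offset 36: sectors per FAT (32-bit field); offset 44: root directory cluster.
    (sectors_per_fat,) = struct.unpack_from("<I", boot_sector, 36)
    (root_cluster,) = struct.unpack_from("<I", boot_sector, 44)
    return {
        "fat_start_sector": reserved_sectors,
        "data_start_sector": reserved_sectors + num_fats * sectors_per_fat,
        "sectors_per_cluster": sectors_per_cluster,
        "cluster_size": bytes_per_sector * sectors_per_cluster,
        "root_cluster": root_cluster,
    }


def cluster_to_sector(layout, cluster):
    """First sector of a data cluster; cluster numbering starts at 2."""
    return layout["data_start_sector"] + (cluster - 2) * layout["sectors_per_cluster"]


def make_demo_boot_sector():
    """Hypothetical header sector with invented values, only for the demo."""
    sector = bytearray(512)
    struct.pack_into("<HBHB", sector, 11, 512, 8, 32, 2)
    struct.pack_into("<I", sector, 36, 1000)
    struct.pack_into("<I", sector, 44, 2)
    return bytes(sector)


if __name__ == "__main__":
    layout = fat32_layout(make_demo_boot_sector())
    print(layout)
    print("root directory starts at sector", cluster_to_sector(layout, layout["root_cluster"]))
```

On a real disk you would read the first 512 bytes of the volume instead of calling make_demo_boot_sector().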

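The File Allocation Table consists of a lot of 32-bit entries, one per cluster, each recording which cluster comes next in a file's chain. Directories hold separate 32-byte entries that tell us where a directory or file starts in the cluster space, along with some attributes and its size.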

An entry for a directory points to clusters that hold a further list of directory/file entries...
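
As a rough illustration of what one of these entries carries, the sketch below decodes a single 32-byte short-name directory entry. The byte offsets follow the published FAT layout; parse_dir_entry and make_demo_entry are names I made up, the sample entry is invented, and long file names are ignored entirely.

```python
import struct

ATTR_DIRECTORY = 0x10  # attribute bit that marks an entry as a directory


def parse_dir_entry(raw):
    """Decode one 32-byte short-name entry into a small dict."""
    name = raw[0:8].decode("ascii").rstrip()
    ext = raw[8:11].decode("ascii").rstrip()
    attributes = raw[11]
    # The starting cluster is split into a high half (offset 20)
    # and a low half (offset 26); the size follows at offset 28.
    (cluster_hi,) = struct.unpack_from("<H", raw, 20)
    cluster_lo, size = struct.unpack_from("<HI", raw, 26)
    return {
        "name": name + "." + ext if ext else name,
        "is_directory": bool(attributes & ATTR_DIRECTORY),
        "first_cluster": (cluster_hi << 16) | cluster_lo,
        "size": size,
    }


def make_demo_entry():
    """Hypothetical entry for a 1234-byte file starting at cluster 5."""
    raw = bytearray(32)
    raw[0:11] = b"README  TXT"  # 8-character name plus 3-character extension
    raw[11] = 0x20              # "archive" attribute, not a directory
    struct.pack_into("<H", raw, 20, 0)
    struct.pack_into("<HI", raw, 26, 5, 1234)
    return bytes(raw)


if __name__ == "__main__":
    print(parse_dir_entry(make_demo_entry()))
```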

In the cluster space, we can now walk the clusters to find the data we need. A cluster holds the data itself, while the matching entry in the File Allocation Table tells us where the next fragment is.
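
Finding all the fragments of a file can be sketched as a walk through the File Allocation Table, which stores one 32-bit entry per cluster giving the number of the next cluster. This is a minimal sketch, not production code: cluster_chain and make_demo_fat are hypothetical names, and the tiny table is invented for the demo.

```python
import struct

ENTRY_MASK = 0x0FFFFFFF   # only the low 28 bits of a FAT32 entry are used
BAD_CLUSTER = 0x0FFFFFF7  # bad-cluster marker; larger values mean "end of chain"


def cluster_chain(fat, first_cluster):
    """Return the list of clusters that make up one file or directory."""
    chain = []
    seen = set()
    cluster = first_cluster
    while 2 <= cluster < BAD_CLUSTER:
        if cluster in seen:
            raise ValueError("loop in cluster chain")
        if cluster * 4 + 4 > len(fat):
            raise ValueError("cluster number outside the table")
        seen.add(cluster)
        chain.append(cluster)
        # Entry N of the table holds the number of the cluster that follows N.
        (entry,) = struct.unpack_from("<I", fat, cluster * 4)
        cluster = entry & ENTRY_MASK
    return chain


def make_demo_fat():
    """Hypothetical six-entry table: a fragmented file in clusters 2 -> 3 -> 5."""
    return struct.pack(
        "<6I",
        0x0FFFFFF8,  # entry 0: reserved
        0x0FFFFFFF,  # entry 1: reserved
        3,           # cluster 2 continues in cluster 3
        5,           # cluster 3 continues in cluster 5
        0,           # cluster 4 is free
        0x0FFFFFFF,  # cluster 5 is the last one
    )


if __name__ == "__main__":
    print(cluster_chain(make_demo_fat(), 2))
```

The starting cluster passed in would normally come from the file's directory entry.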


Do other file systems differ? Show me NTFS!

I'm going to show you an image so you can see the differences; the rest is homework for the reader. More information can be found on this blog archive or Google.

The main idea is that NTFS is a huge improvement over FAT32: it is more robust and more efficient. For example, it keeps track of (un)used space with a bitmap, which also helps against fragmentation. And so on...

— http://thinkdifferent.typepad.com/photos/uncategorized/04ntfsfilesystem.png
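
To see why a bitmap of (un)used space helps against fragmentation, here is a generic sketch of looking for a contiguous run of free clusters. It is not NTFS's actual on-disk code: find_free_run is a made-up name, and the bit order (lowest bit of each byte is the lowest cluster) is an assumption for the demo.

```python
def find_free_run(bitmap, length):
    """Return the first cluster of a run of `length` free clusters, or None.

    One bit per cluster: 1 means used, 0 means free. The lowest bit of each
    byte is assumed to describe the lowest-numbered cluster.
    """
    run_start = None
    run_length = 0
    for cluster in range(len(bitmap) * 8):
        used = (bitmap[cluster // 8] >> (cluster % 8)) & 1
        if used:
            run_start, run_length = None, 0
            continue
        if run_start is None:
            run_start = cluster
        run_length += 1
        if run_length == length:
            return run_start
    return None


if __name__ == "__main__":
    # Clusters 0-3 and 12-15 are used, clusters 4-11 are free.
    demo = bytes([0b00001111, 0b11110000])
    print(find_free_run(demo, 8))
    print(find_free_run(demo, 9))
```

With a map like this, a file system can place a new file in one unbroken run when one exists, instead of scattering it over whatever clusters happen to be free.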

What about the file systems on Linux? Show me ext2/3!

The idea is that ext2/ext3 use superblocks and inodes; this allows for soft and hard links, directories that are files, files with multiple names and so on. The main gist is that this layer of abstraction lets the file system do more meta-ish stuff...

— http://thinkdifferent.typepad.com/photos/uncategorized/03extfilesystem.png
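
The "files with multiple names" part is easy to try out. The sketch below assumes a POSIX system and a file system that supports both hard and soft links; link_demo and the file names are invented for the example.

```python
import os
import tempfile


def link_demo():
    """Create a hard link and a soft link to one file and compare them."""
    with tempfile.TemporaryDirectory() as folder:
        original = os.path.join(folder, "original.txt")
        alias = os.path.join(folder, "alias.txt")
        soft = os.path.join(folder, "soft.txt")
        with open(original, "w") as handle:
            handle.write("hello")

        os.link(original, alias)    # hard link: a second name for the same inode
        os.symlink(original, soft)  # soft link: a small file that stores a path

        info = os.stat(original)
        result = {
            "same_inode": info.st_ino == os.stat(alias).st_ino,
            "link_count": info.st_nlink,
            "symlink_has_own_inode": os.lstat(soft).st_ino != info.st_ino,
        }

        # Removing one name leaves the data reachable through the other name,
        # while the soft link now points at a path that no longer exists.
        os.remove(original)
        with open(alias) as handle:
            result["content_via_alias"] = handle.read()
        result["symlink_dangles"] = not os.path.exists(soft)
        return result


if __name__ == "__main__":
    print(link_demo())
```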


The big difference between Linux and Windows, at least when it comes to their filesystems and directory trees, is that in Linux "everything is a file", and everything descends from a single root. This also applies to almost all Unix-derived OSes such as BSD, OS X, Solaris, etc., but I'm going to just say "Linux" to be generic (if not entirely accurate).

But what does that mean in practice?

Windows allows for multiple named roots for its filesystems. You know these as drive letters: C: D: E: and so on. Each one has a root (\), and a tree that descends from it. Recent versions of Windows allow for things like volume mountpoints, where a volume (what you'd consider a partition) can be mounted to an existing, empty folder. So instead of D:\ representing the root of, say, your optical (CD/DVD/BR) drive, you could mount it at C:\Optical instead. This is more similar to what Linux does. There's also an underlying, single-rooted object namespace for everything in Windows, similar to what Linux uses. It is managed by the Object Manager, but most users rarely see it referenced since it's primarily for kernel use.

Linux has a single root: /. Everything descends from it, and it doesn't necessarily need to represent your hard drive. Hard Drives, Optical Drives, Memory Cards, Network Shares, Printers, Scanners, CPUs, RAM, Processes, ... everything is represented somewhere inside this single namespace, and can be accessed by any process with standard file management APIs, presuming you have a high enough level of access.

Just because you can read or write from it doesn't mean it's a file on your hard drive in Linux. For example, devices are typically exposed under /dev, so accessing things in there often means you're talking to a device: maybe it's the sound card, or a scanner, or a camera, etc. These are known as device files.

Procfs is a special "filesystem" that's normally mounted to /proc and has a "directory" for every running process, with files in each directory relating to things like the command line used to invoke that process, memory maps, open files, etc. Sysfs is another special filesystem (mounted on /sys) used to expose a wealth of information about the running kernel objects, and it can also be used to fine-tune the running kernel by simply writing to a particular file.
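
To see that these really are reachable with ordinary file APIs, here is a small sketch that reads the current process's command line from /proc and a few bytes from the /dev/urandom device file. It assumes a Linux system with /proc mounted; the function names are made up, and both functions return None when the path is missing.

```python
import os


def read_own_cmdline():
    """Read this process's command line through procfs, or None if unavailable."""
    path = "/proc/self/cmdline"
    if not os.path.exists(path):
        return None
    with open(path, "rb") as handle:
        raw = handle.read()
    # Arguments are separated by NUL bytes rather than spaces.
    return [part.decode(errors="replace") for part in raw.split(b"\0") if part]


def read_random_bytes(count):
    """Read bytes from the kernel's random device using plain file I/O."""
    path = "/dev/urandom"
    if not os.path.exists(path):
        return None
    with open(path, "rb") as handle:
        return handle.read(count)


if __name__ == "__main__":
    print(read_own_cmdline())
    print(read_random_bytes(4))
```

Neither path is a file on your hard drive; the kernel generates the contents when you read them.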