What problems does udev actually solve?

For that matter, what exactly was wrong with a bunch of static files in /dev? It's apparently unsatisfactory enough that developers have reinvented this wheel, by my count, three times now (devfs -> udev + HAL -> udev), and now apparently it's going into the Grand Unified Init Program too, so four times.

I remember being surprised, when I first started using Linux years ago, that despite the claim that "everything is a file", there's no /dev/eth0 (that later made sense, since a network interface isn't a char or block device -- though a "packet" device type would be interesting...). Given that, why is the program that handles the char and block device file tree also responsible for network devices? I've seen vague references to "flexibility", but what does this add over what, say, ifconfig(8) does by just looking in /proc/net/dev? I know, for instance, that NetworkManager won't be in NetBSD or OpenBSD any time soon because it depends on udev, which neither team wants to write; what I don't understand is why a program that is at least nominally there to manage the /dev tree is apparently the only way to expose network devices that the kernel already exposes in multiple ways (none of them in /dev!).

Is it just because of hotplugging? Were there problems with the kernel just listening to the physical buses and loading the appropriate modules on a "device added" message? Or, God forbid, the actual administrator doing so? I do remember back in the early 2000s my servers would sometimes initialize their network cards in an unexpected order, and I suppose it makes sense to have that naming decided in userland (though it wasn't terribly hard to fix back then), but this seems like a sledgehammer for a cockroach. (Or maybe that problem hits use cases I'm not thinking of much harder than it hits rackmounted servers or PCs, which are my experience.)

So, to state my question plainly: what problems does udev actually solve, and how did devfs, HAL, and/or a plain old file fail to solve them? Is there a particular reason for that many different things (hotplugging, general device management, network device management, device naming, driver priority, etc.) to all be one program?


Two more things: Linux's move into enterprise and other large servers was exposing static /dev as broken, and advancing technology, in both consumer and enterprise hardware, was exposing it as a joke. [This answer fills in more of the backstory; it doesn't specifically address why devfs was replaced with udev.]

Exhaustion of Major & Minor Number Space

/dev files are identified inside the kernel by their major and minor numbers. The kernel has never actually cared about the name (and you could, for example, mv /dev/sda /dev/disk-1 and it'd continue to work—though of course programs wouldn't know where to find it).
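As a concrete illustration of the name/number split (plain Python using the standard os module; nothing udev-specific), the kernel-visible identity of a device node is just the packed major/minor pair:

```python
import os

# A device number (dev_t) packs the major and minor together; the name
# in /dev is just a directory entry pointing at that number, which is
# why renaming the node doesn't confuse the kernel.
dev = os.makedev(8, 1)                # classic sda1: major 8, minor 1
print(os.major(dev), os.minor(dev))   # prints: 8 1
```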

With a static /dev, you need to allocate a major/minor number for every potential device that could exist. These numbers need to be unique globally, as they're shipped as part of distros, not created on demand. The problem is that they're each 8-bit numbers—the range is 0–255.

Originally, for example, Linux started with 8,0 being sda, 8,1 being sda1, 8,16 being sdb, etc. But people kept adding more and more disks to machines, especially when you consider things like Fibre Channel. So at some point, major numbers 65–71 were added for more disks. Later, major numbers 128–135. And yet people kept wanting more disks...
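To make that allocation concrete, here is a small sketch (my own illustration, not kernel code) of how the static sd scheme mapped a major/minor pair to a name: each sd major holds 16 disks, each with 16 minors (whole disk plus 15 partitions).

```python
# Majors historically allocated to sd: 8, then 65-71, then 128-135.
SD_MAJORS = [8] + list(range(65, 72)) + list(range(128, 136))

def _letters(n):
    # Disk index to letters: 0 -> "a", 25 -> "z", 26 -> "aa", ...
    s = ""
    while n >= 0:
        s = chr(ord("a") + n % 26) + s
        n = n // 26 - 1
    return s

def sd_name(major, minor):
    # 16 disks per major, 16 minors per disk (disk + 15 partitions).
    disk = SD_MAJORS.index(major) * 16 + minor // 16
    part = minor % 16
    return "sd" + _letters(disk) + (str(part) if part else "")

print(sd_name(8, 0), sd_name(8, 17), sd_name(65, 0))  # sda sdb1 sdq
```

Note how every one of those names had to be allocated ahead of time, whether or not the disk ever existed on your machine.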

And partition table formats like GPT came around, supporting more partitions per disk. And of course other devices were eating through the number space: various RAID controllers, logical volume management, etc.

The end result can be seen in the LANANA Linux Device List. If you look at the 2.6 list (the only one still there), a lot of the block major numbers through 200 (out of a maximum of 255) are used. Clearly, the numbers would eventually have run out.

Changing to larger numbers wasn't easy: it changes the kernel ABI and, depending on the filesystem, the on-disk layout. And of course most of those devices didn't exist on any one system; even a machine that was, say, running out of SCSI disk numbers probably had plenty of unused ranges elsewhere (it probably didn't need an IBM XT hard disk, for example).

With a dynamic /dev, the distro doesn't have to ship the device numbers. They no longer have to be globally unique. They don't even have to be unique across boots.

Device names were unpredictable

It used to be really easy to assign a number to everything. A board had two IDE channels; each channel supported one master and one slave. You could assign in channel order, master before slave: hda was the first channel's master; hdb the first channel's slave; hdc the second channel's master; and so on. Those names were predictable and stable. They might change if you added or removed a drive, but absent hardware changes, they were static.
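The old IDE rule above is simple enough to state as a one-liner (again, just an illustration of the scheme, not kernel code):

```python
# Letter index = channel * 2 + (0 for master, 1 for slave).
def hd_name(channel, master):
    return "hd" + chr(ord("a") + channel * 2 + (0 if master else 1))

print(hd_name(0, True), hd_name(0, False), hd_name(1, True))  # hda hdb hdc
```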

You could put /dev/hda1 in your /etc/fstab and be confident it'd stay working, at least absent hardware changes.

IDE worked like that. Nothing after it does.

SATA appears to be simple: one port, one disk. But it isn't; SATA allows port multipliers, and it allows hot-swap. Still, absent hardware changes, you could usually keep the mapping working.

USB is much worse. Not only does it allow hot-swapping; hot-swapping is the typical case: people plug in USB flash drives all the time. Further, devices can take a while to probe, and can actually change whenever they feel like it (e.g., when you turn USB storage mode on or off on your phone). FireWire is similar. With neither can you really come up with a stable mapping.

Network-attached disks don't have any inherent port order. The only order the kernel has is the order in which they appeared. The same goes for logical volumes.

The quest for boot speed also made things worse. Originally, the kernel would happily sit around and wait a fairly long time for, e.g., all USB devices to initialize and all SCSI buses to be fully probed. Those probes were made into background tasks; boot no longer waits on them, and the devices are added as the probes complete.

So the kernel was left with, more or less, "whatever order they show up in". This meant that many types of devices could and did change order every boot: what was /dev/sdb on one boot was /dev/sdc on another. That makes the idea of a static /dev a joke.

Summary

Combine static /dev becoming increasingly meaningless due to unpredictable device probe order with the substantial work required to keep static major/minor allocation from running out of numbers, and it becomes clear why Linux's developers chose to switch to a dynamic /dev.


Good question.

In a way, this argument could be turned around: since kernel 2.6.13 introduced a new version of the uevent interface, it was bound to happen that devfs would need to be rewritten to take advantage of the interface's new features. So, in a way, the question ought to be why the kernel changed.

However, taking it at face value, your question is answered in this Wikipedia article:

Unlike traditional Unix systems, where the device nodes in the /dev directory have been a static set of files, the Linux udev device manager dynamically provides only the nodes for the devices actually present on a system. Although devfs used to provide similar functionality, Greg Kroah-Hartman cited a number of reasons for preferring udev's implementation over devfs:

1) udev supports persistent device naming, which does not depend on, for example, the order in which the devices are plugged into the system. The default udev setup provides persistent names for storage devices. Any hard disk is recognized by its unique filesystem ID, the name of the disk, and the physical location of the hardware it is connected to.
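udev's default rules already populate stable symlinks under /dev/disk/by-id, /dev/disk/by-uuid, and /dev/disk/by-path; a custom rule adding one more stable name might look like this sketch (the serial number and symlink name are made up for illustration):

```
# Hypothetical rule: give the disk with this serial a stable symlink,
# regardless of which /dev/sdX it happens to land on this boot.
SUBSYSTEM=="block", ENV{ID_SERIAL}=="WD-Example123", SYMLINK+="disk/my-backup"
```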

2) udev executes entirely in user space, as opposed to devfs' kernel space. One consequence is that udev moved the naming policy out of the kernel and can run arbitrary programs to compose a name for the device from the device's properties before the node is created; the whole process is also interruptible, and it runs at a lower priority.

I should probably add that udev avoids the race conditions which basically undermined device naming under devfs and hotplug. In other words: with devfs there was no way to ensure that your leftmost Ethernet port would be called eth0 and the rightmost one eth1, making it difficult, to take a simple example, to set up a router (one port to the WAN, one to the LAN).
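With udev, pinning each port to a role is a matter of matching on the hardware address, in the style of the old 70-persistent-net.rules files udev used to generate (the MAC addresses below are made up):

```
# Pin each NIC's name to its MAC address so the "left" and "right"
# ports keep their roles across boots.
SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="00:16:3e:aa:bb:01", NAME="eth0"
SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="00:16:3e:aa:bb:02", NAME="eth1"
```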

The adoption of a GUID-based disk-naming scheme is another plus, and moving the whole process to user space an even bigger one: have you searched this site to see how many people write their own udev rules?
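That GUID/UUID-based naming pays off directly in configuration files: an /etc/fstab entry can refer to a filesystem by UUID, so it keeps working no matter which sdX the disk lands on after a given boot (the UUID below is a made-up example; blkid(8) prints the real ones):

```
# Mount by filesystem UUID instead of an unstable /dev/sdXN name.
UUID=0a1b2c3d-4e5f-6789-abcd-ef0123456789  /data  ext4  defaults  0  2
```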

As a simple example of the advantages inherent in having udev in userspace, check either this question or this other question, both on this very site.