Udev Overview

Introduction

Just a quick note on the purpose of the udev daemon in Linux. It is responsible for:

  • loading drivers when the kernel detects new hardware
  • creating “device nodes” (usually in the /dev directory) once the driver has finished loading
  • and until kernel 3.7 (end of 2012) it was also responsible for loading firmware required by a driver; this is now done by the kernel itself.

Loading Drivers

In the early days of computing, devices could be attached in ways which were not “discoverable”: the operating system had no way of knowing if a device was attached, and if so what type of device it was. Configuring a device-driver for the device therefore meant probing: loading drivers for devices that might be there, and having those devices read or write various IO addresses that the device might be attached to, to see what happens.

Fortunately, that problem went away with the invention of the PCI bus. PCI provides a standard way for the operating system to ask “is there a device plugged into slot N”, and “what type of device is plugged into slot N”. Every device is required to provide a unique ID which can then be used to determine which driver is needed. The PCIe bus is also “discoverable”. USB also provides similar functionality; every USB device can be queried for its “class” and “subclass”. Of course, some device manufacturers are lazy or incompetent, and reuse the same IDs for different devices - but even in these cases the id gives a rough indication of which device it is, so probing becomes more targeted.

On boot, all drivers compiled into the kernel are initialized; they register themselves with a list of the IDs of devices they handle. The kernel pci-bus code then scans the PCI bus. For each device found, the kernel creates a small record representing that device. Then:

  • an entry is created in sysfs representing that device on the PCI bus; this info appears under /sys/devices/pci* (with symlinks under /sys/bus/pci/devices);
  • if a built-in driver has registered a matching ID then it is called;
  • else a uevent is sent to userspace (ie a string is sent over the NETLINK_KOBJECT_UEVENT netlink socket)

A similar process applies to other (ie non-PCI) types of buses. One particular instance is the virtio bus; a Linux kernel with virtio support built in will look in a special memory location for a “virtio bus”; a kernel running on real hardware will not find one but a kernel running inside KVM will (provided by the host). The virtio-bus driver in the guest kernel can then “discover” the set of virtual devices that the host is exporting to the guest; these get registered in sysfs just like real devices on a PCI bus, and udev then detects them and loads the corresponding drivers.

When the (userspace) udev daemon is started, it starts listening on the NETLINK_KOBJECT_UEVENT netlink socket. It also scans the /sys filesystem looking for devices registered earlier (ie uevents it missed). For each device which has no driver associated with it yet, udev consults /lib/modules/{kernel}/modules.alias (see the entries of form “pci:…”) then inserts the matching module (driver).

See in particular /sys/devices/pci*/*/modalias (also accessible as /sys/bus/pci/devices/*/modalias), which provides the ID that udev uses to find the matching entry in modules.alias.
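To make the lookup concrete, here is a minimal sketch in C of matching one modalias string against one line of modules.alias. The function name match_alias_line and the sample strings in the usage note are illustrative and not taken from udev; as far as I know, current udev versions delegate this work to libkmod rather than scanning the text file, so treat this purely as an illustration of the idea.

```c
#include <fnmatch.h>
#include <stdio.h>
#include <string.h>

/*
 * Check one line of modules.alias ("alias <pattern> <module>") against a
 * modalias string. On a match, copy the module name into `module` and
 * return 1; otherwise return 0.
 */
int match_alias_line(const char *line, const char *modalias,
                     char *module, size_t module_len)
{
    char pattern[512];
    char name[128];

    /* Comments and blank lines do not start with "alias", so they fail here. */
    if (sscanf(line, "alias %511s %127s", pattern, name) != 2)
        return 0;

    /* Alias patterns use shell-style wildcards, so fnmatch() can compare. */
    if (fnmatch(pattern, modalias, 0) != 0)
        return 0;

    /* Refuse to truncate the module name. */
    if (strlen(name) >= module_len)
        return 0;
    strcpy(module, name);
    return 1;
}
```

For example, calling it with the line "alias pci:v00008086d0000100Esv*sd*bc*sc*i* e1000" and the content of a device's modalias file would report the module name e1000 if the vendor and device fields agree. A real lookup would loop over every line of the file.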

Some modern embedded systems (particularly Arm-based devices) have peripherals that are not “discoverable”. These systems use a device tree (effectively a configuration-file) to declare which devices exist and what addresses they are mapped to. I’m not very familiar with this area, but presume that once the device-tree configuration has been read by the kernel, processing continues similarly to the pci-bus behaviour - ie that devices are “discovered” by simply walking the list of entries in the device-tree.

Uevent messages are plain strings (ie a sequence of ASCII characters), making them easy to debug. They nevertheless contain a reasonably large amount of information.
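As a sketch of how little parsing is needed: a uevent as sent by the kernel is a buffer of NUL-terminated strings, the first of the form "action@devpath" and the rest of the form "KEY=VALUE". The helper below (uevent_get is a name made up for this example) looks up one key. It assumes the buffer ends with a NUL byte, as kernel-generated messages do.

```c
#include <stddef.h>
#include <string.h>

/*
 * Look up `key` in a uevent payload of `len` bytes.
 * Returns a pointer to the value inside buf, or NULL if the key is absent.
 */
const char *uevent_get(const char *buf, size_t len, const char *key)
{
    size_t key_len = strlen(key);
    size_t pos = 0;

    while (pos < len) {
        const char *field = buf + pos;
        size_t field_len = strnlen(field, len - pos);

        /* A hit is a field that starts with "<key>=". */
        if (field_len > key_len && field[key_len] == '=' &&
            strncmp(field, key, key_len) == 0)
            return field + key_len + 1;

        pos += field_len + 1;   /* step over the terminating NUL */
    }
    return NULL;
}
```

Typical keys in a real message include ACTION, DEVPATH and SUBSYSTEM; the leading "action@devpath" string contains no "=" before the "@" and so is skipped naturally.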

Managing Device Nodes

Early in boot, the kernel creates a filesystem of type tmpfs (ie a simple in-memory filesystem) which it calls devtmpfs. During initialisation, a driver can choose to create entries (device nodes, ie special files) in the devtmpfs which point back to the driver. The devtmpfs always shows the owner of such “device nodes” as root, with access-permissions of 0600 (rw-------). The driver specifies the name of the file.

One of the first things the init process does is mount the devtmpfs filesystem on /dev, meaning that all “device nodes” created by drivers are visible in /dev, but with the fixed owner and permissions listed above, and with names chosen by the device-drivers not the sysadmin.
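The way a device node “points back” to its driver is a detail not spelled out above: each node carries a type (character or block) plus a major/minor number pair which the kernel uses to find the driver. A small sketch for inspecting a node follows; the function name device_numbers is made up for this example.

```c
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/*
 * Report the type ('c' or 'b') and major/minor numbers of a device node.
 * Returns 0 on success, -1 if the path is missing or not a device node.
 */
int device_numbers(const char *path, char *type,
                   unsigned int *maj, unsigned int *min)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;

    if (S_ISCHR(st.st_mode))
        *type = 'c';
    else if (S_ISBLK(st.st_mode))
        *type = 'b';
    else
        return -1;

    /* st_rdev holds the device number for special files. */
    *maj = major(st.st_rdev);
    *min = minor(st.st_rdev);
    return 0;
}
```

On a typical Linux system, /dev/null is a character device with major number 1 and minor number 3.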

Whenever a driver completes initialising a device (ie after any of the above files are created), a uevent is sent (over the netlink socket). The udev daemon listens for such events, and applies a set of rules to the received uevent to determine whether it should change the file name, owner, or permissions, and whether it should create additional symlinks to the device-file. Of course, on first startup udev also needs to emulate such uevents for devices that were created before udev started.

The udev rules are configurable by the sysadmin, meaning that userspace does have control over the name, owner, and permissions of device-nodes created by the kernel. See /lib/udev/rules.d for the set of “default rules”, which can be overridden by creating files of the same name under /etc/udev/rules.d. The default rules should not be modified locally.

Firmware

When the driver loads, it might need access to firmware.

Prior to kernel 3.7, the driver would send a “need firmware” message over the NETLINK_KOBJECT_UEVENT socket; udev would receive that, find the relevant file on disk and write the contents of that file to a file in /sys. The driver would then read that data and forward it to the device. However the udev developers were never very happy with udev performing this role, and the logic was eventually moved into the kernel. This also helps systems that don’t use udev.

Kernel versions 3.7 and newer load firmware (from /lib/firmware) directly. When a device-driver is loaded into the kernel by udev, the driver calls a kernel function which calls into the filesystem code to fetch the data. Of course, this assumes that the root filesystem has already been mounted; if the root filesystem is on a device which needs firmware (including a network-mounted rootfs where the networking system requires firmware) then an initramfs with the necessary firmware would be needed. Drivers compiled into the kernel which require firmware will also presumably only work with an initramfs which holds the required firmware.
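The search the kernel performs can be imitated from userspace to check whether a firmware file would be found. The sketch below is an assumption-laden illustration: the text above only mentions /lib/firmware, and the additional “updates” and per-release directories, as well as their order, reflect my understanding of the in-kernel loader rather than anything stated here. The function name find_firmware is made up.

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Look for firmware file `name` in the directories the kernel is assumed
 * to search, for kernel release `release` (as printed by "uname -r").
 * On success the full path is left in `out` and 0 is returned; otherwise
 * `out` is emptied and -1 is returned.
 */
int find_firmware(const char *name, const char *release,
                  char *out, size_t out_len)
{
    char dirs[4][160];
    int i;

    /* Assumed search order, most specific first. */
    snprintf(dirs[0], sizeof dirs[0], "/lib/firmware/updates/%s", release);
    snprintf(dirs[1], sizeof dirs[1], "/lib/firmware/updates");
    snprintf(dirs[2], sizeof dirs[2], "/lib/firmware/%s", release);
    snprintf(dirs[3], sizeof dirs[3], "/lib/firmware");

    for (i = 0; i < 4; i++) {
        snprintf(out, out_len, "%s/%s", dirs[i], name);
        if (access(out, R_OK) == 0)
            return 0;
    }

    if (out_len > 0)
        out[0] = '\0';
    return -1;
}
```

Running the same check against the contents of an unpacked initramfs is one way to see whether a driver needed for the root filesystem will find its firmware early in boot.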

For more information, see this LWN article.

Development

The udev application is currently maintained by the systemd team, and the source-code is stored within the systemd git repository. Udev does rely on some other libraries from the systemd git repository. Nevertheless, there is no dependency between udev and systemd-init.

A libudev library was recently created to hold the sysfs-scanning and uevent-string-parsing code that was previously embedded within the udev application. This allows other applications to perform udev-like functionality without duplicating the logic.

MDev

Busybox has built-in functionality called mdev which works a little like udev.

AFAICT, mdev is mostly irrelevant in a system with devtmpfs (ie all modern Linux kernels) - but of course busybox is also used on non-Linux systems (hurd? bsd?). Mdev does have some ability to define “rules” like udev does, where devices can be renamed and their permissions changed. Mdev also supports the “load firmware” functionality that has not been needed in Linux since 2012.

EUDev

Some developers decided to fork udev, and call their project eudev. As far as I can tell, it is not widely used.

Netlink

Netlink provides a fast communications channel between userspace and kernel. The following syscall is used to open a netlink “socket”:

int fd = socket(AF_NETLINK, SOCK_DGRAM, protocol);   /* SOCK_RAW is accepted too and behaves the same */

where protocol indicates what kind of data will be transferred across the socket, ie which component of the kernel the socket should communicate with. For example, protocol=NETLINK_ROUTE connects the socket to the kernel network stack. The socket is used to transfer messages, ie datagram-style, rather than streams of data. Userspace which is sending data is required to create the appropriate (fairly simple) datastructure which includes a “message type” and “message length”. Data received by userspace from the kernel has a similar structure.
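As an illustration of the “fairly simple” datastructure, here is a sketch that fills in a NETLINK_ROUTE request asking the kernel to list network interfaces. The header struct nlmsghdr and the constants come from the Linux headers; the wrapper struct link_dump_request and the function name are made up for this example, and the request is only built, not sent.

```c
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <string.h>
#include <sys/socket.h>

/* A netlink message is a fixed header followed by a type-specific body. */
struct link_dump_request {
    struct nlmsghdr hdr;
    struct ifinfomsg body;
};

void build_link_dump_request(struct link_dump_request *req)
{
    memset(req, 0, sizeof *req);

    /* Message length covers the header plus the body. */
    req->hdr.nlmsg_len = NLMSG_LENGTH(sizeof req->body);
    /* Message type: "get link", ie describe network interfaces. */
    req->hdr.nlmsg_type = RTM_GETLINK;
    /* This is a request, and we want all entries, not just one. */
    req->hdr.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
    req->hdr.nlmsg_seq = 1;

    req->body.ifi_family = AF_UNSPEC;
}
```

The filled-in struct would then be passed to send() on a socket opened with protocol NETLINK_ROUTE, and the replies read back with recv().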

Netlink socket protocol NETLINK_KOBJECT_UEVENT provides a one-way stream of data from the kernel to userspace, consisting of strings in the uevent format. The messages indicate the discovery or removal of peripheral devices.
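A minimal sketch of subscribing to these messages from C follows. The function name open_uevent_socket is made up; the assumption that kernel-originated events arrive on multicast group 1 matches what tools such as udev rely on, as far as I know. Whether an unprivileged or sandboxed process is allowed to open the socket depends on the system.

```c
#include <errno.h>
#include <linux/netlink.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Open a netlink socket subscribed to kernel uevents.
 * Returns the file descriptor, or -1 on failure (errno is set).
 */
int open_uevent_socket(void)
{
    struct sockaddr_nl addr;
    int saved_errno;
    int fd = socket(AF_NETLINK, SOCK_DGRAM | SOCK_CLOEXEC,
                    NETLINK_KOBJECT_UEVENT);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.nl_family = AF_NETLINK;
    addr.nl_groups = 1;   /* assumed: group 1 carries events from the kernel */

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) != 0) {
        saved_errno = errno;   /* keep the bind() error across close() */
        close(fd);
        errno = saved_errno;
        return -1;
    }
    return fd;
}
```

Each subsequent recv() on the returned descriptor yields one complete uevent; plugging in a USB stick while such a program is blocked in recv() is an easy way to see one arrive.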
