What is a machine id
From man 5 machine-id:
- The /etc/machine-id file contains the unique machine ID of the local system that is set during installation or boot. The machine ID is a single newline-terminated, hexadecimal, 32-character, lowercase ID. When decoded from hexadecimal, this corresponds to a 16-byte/128-bit value. This ID may not be all zeros.
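The format rules quoted above are simple enough to check mechanically. The following is a minimal sketch in POSIX shell; `is_valid_machine_id` is a hypothetical helper name chosen for illustration, not something shipped by systemd or dbus.

```shell
#!/bin/sh
# Sketch: check that a string has the format described in machine-id(5).
# is_valid_machine_id is a hypothetical helper, not part of systemd or dbus.
is_valid_machine_id() {
    id=$1
    # Must be exactly 32 characters long.
    [ "${#id}" -eq 32 ] || return 1
    # Must contain only lowercase hexadecimal digits. The digits are spelled
    # out instead of using ranges so the result does not depend on the locale.
    case $id in
        *[!0123456789abcdef]*) return 1 ;;
    esac
    # Must not be all zeros: at least one character has to differ from "0".
    case $id in
        *[!0]*) return 0 ;;
    esac
    return 1
}
```

On a system that has the file, it could be used as `is_valid_machine_id "$(cat /etc/machine-id)" && echo valid`; the command substitution strips the trailing newline before the check.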
That man page has a lot more information and is recommended reading. In addition, there seems to be a connection between systemd and dbus in this matter, as dbus keeps track of the machine-id as well. The machine id is stored in /etc/machine-id, with the dbus copy of the machine id either symlinked from or copied to /var/lib/dbus/machine-id.
For historical reasons (dbus originated the concept and systemd generalized it into a non-D-Bus-specific "API"), each of systemd-machine-id-setup and dbus-uuidgen tries to copy the other's machine ID, to avoid problems where processes that read the two files in opposite orders disagree on what the machine's unique ID is. If you delete or empty /etc/machine-id you should also delete /var/lib/dbus/machine-id.
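To see whether the two files currently agree, they can simply be compared. This is a minimal sketch; `machine_ids_agree` is a hypothetical helper, and its optional path arguments are an assumption added so the check can be pointed at a chroot or a test directory.

```shell
#!/bin/sh
# Sketch: report whether the systemd and D-Bus machine ID files agree.
# machine_ids_agree is a hypothetical helper, not part of systemd or dbus.
# Exit status: 0 = identical, 1 = different, 2 = a file is missing or empty.
machine_ids_agree() {
    etc_id=${1:-/etc/machine-id}
    dbus_id=${2:-/var/lib/dbus/machine-id}
    # Both files must exist and be non-empty before a comparison makes sense.
    [ -s "$etc_id" ] && [ -s "$dbus_id" ] || return 2
    # cmp -s prints nothing and succeeds only if the contents are identical.
    # Symlinks are followed, so a symlinked dbus copy compares as equal.
    cmp -s "$etc_id" "$dbus_id"
}
```

Called without arguments it inspects the two standard paths named above.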
The machine-id isn't really particularly specific to either D-Bus or systemd: they both provide it as a piece of generically useful functionality for anything else that wants it. Asking which applications use it is a bit like asking which applications use gethostname(2): you are not going to get an exhaustive list unless you use something like codesearch.
It's intended as an opaque, non-human-meaningful, persistent unique identifier for a machine (or more precisely an OS installation), used as a lookup key in state/configuration storage in the same sorts of places you might be tempted to use a hostname.
Being opaque and non-human-meaningful is important for some of the places where it's useful, because if a string is human-meaningful (like a hostname), then people will sometimes want to change it, and when they do, anything that was recording machine-specific state with the hostname as unique identifier will no longer be able to associate the machine-specific state with the machine, effectively resulting in data loss.
It is documented that every machine should have a unique machine-id, and strange things may happen when multiple machines with the same machine-id operate simultaneously.
What is it used for
List of things the machine-id is used for (please supplement the list with your knowledge)
- creation of the DHCP host identifier (duplicate IDs probably cause multiple machines to fight over the same IP address on the DHCP server)
- GNOME stores screen layout configuration keyed by machine ID
- the systemd-boot EFI bootloader stores the OS installation's kernel(s) in a directory named after the machine ID to prevent collisions.
machine id and cloned systems, generating a new machine id
The machine id is frequently left unchanged when a machine is cloned. A new machine-id can be generated with:
rm -f /etc/machine-id /var/lib/dbus/machine-id
dbus-uuidgen --ensure=/etc/machine-id
dbus-uuidgen --ensure
Making /etc/machine-id a 0-byte file is considered to be the canonical way to clear it, rather than actually deleting it, because if systemd is running on a completely read-only root filesystem, it has code to create a machine ID on a tmpfs and bind-mount it over the top of the empty file.
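Truncating instead of deleting can be sketched as follows. `clear_machine_id` is a hypothetical helper; the optional path argument is an assumption added so the function can be tried on a scratch file, and on a real system it would have to run as root.

```shell
#!/bin/sh
# Sketch: clear a machine ID by truncating the file rather than deleting it.
# clear_machine_id is a hypothetical helper, not part of systemd or dbus.
clear_machine_id() {
    target=${1:-/etc/machine-id}
    # ": >" truncates the file to zero bytes, creating it if it is missing.
    # The file itself stays in place, so a bind mount over it remains possible.
    : > "$target"
}
```

As noted earlier, /var/lib/dbus/machine-id should be removed at the same time so that the two files cannot disagree.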
If you are doing cloning, stateless systems or similar activities, and you know you will have a valid /etc/machine-id (you either use systemd or have taken other steps to have one), then you can make /var/lib/dbus/machine-id a symlink to /etc/machine-id (dbus comes with a systemd-tmpfiles file to do this). This is not done by default in Debian, or by dbus-uuidgen --ensure, for historical reasons; maybe it should be, but to be confident that it was a correct change I'd have to think about the ways in which it might go wrong on non-systemd systems (with either a non-systemd init like sysvinit, or no init at all like minimal chroots).
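Creating that symlink by hand could look roughly like this. It is only a sketch: `link_dbus_machine_id` is a hypothetical helper, and the optional root-directory argument is an assumption added so it can be applied to a chroot or image tree instead of the running system.

```shell
#!/bin/sh
# Sketch: make /var/lib/dbus/machine-id a symlink to /etc/machine-id.
# link_dbus_machine_id is a hypothetical helper, not part of systemd or dbus.
# The optional argument is the root of the tree to operate on (default: /).
link_dbus_machine_id() {
    root=${1:-}
    # Refuse to create the link unless /etc/machine-id holds a non-empty ID.
    [ -s "$root/etc/machine-id" ] || return 1
    mkdir -p "$root/var/lib/dbus"
    # -f replaces an existing copy; the link target is the absolute in-tree
    # path, so it resolves correctly once the tree is used as the root.
    ln -sf /etc/machine-id "$root/var/lib/dbus/machine-id"
}
```

The check at the top reflects the precondition stated above: the link is only useful when a valid /etc/machine-id is known to exist.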
Missing or desynchronized machine-id files can lead to all kinds of weird behavior, for example:
- machine coming up without any networking if systemd-networkd is used
machine id and containers / chroots
Debian chroots and containers will often have neither systemd nor sysvinit (or any of the other alternatives), but perhaps they should have a machine-id anyway - or perhaps container managers that don't run a full init system, like schroot, should be responsible for that? Or perhaps this requirement isn't necessary for containers that don't provide either system services or user logins? (The elephant in the room here is that Docker doesn't arrange to have a machine-id, and also doesn't set the $container_uuid proposed in <https://www.freedesktop.org/wiki/Software/systemd/ContainerInterface/>.)
systemd-nspawn already sets up a machine ID for its containers, and lxc (presumably also lxd) normally runs init, but schroot and Docker don't normally run init and also don't take any particular steps to have a machine ID.
Flatpak copies the machine ID from the host system into its containers, and I would assume that other frameworks with "app containers" that are conceptually part of the host machine rather than their own machine, like Snap and AppImage, probably do the same.
What should be made clearer in the future
- what is the machine id actually used for?
- a comprehensive list is probably not possible without grepping the code
- the list of uses for a machine id is of similar scope to the list of reasons to use the hostname (as in gethostname(2)), and indeed some current uses of the hostname would probably be better served by the machine ID.
- for which actions is it helpful
- should only containers/chroots having an init system have a machine id?
- under which circumstances is it ok to have a container/chroot share the machine id of the host system?
- which chroot/container managers do set up machine ids, which don't? Should they?