- Binary and Source Packages
- Real and Virtual Packages
- Direct, indirect and transitive dependencies
- Binary and build dependencies
- Dependency alternatives
- builds-from relationship
- Dependency graph
The Debian dependency system is powerful enough to play sudoku on it using aptitude and dose3. In fact, our dependency system is so expressive that scientists research it. Unfortunately, because of this complexity, it is often hard to talk about Debian dependencies without knowing the underlying terminology. This page gives a gentle introduction to the terminology used when talking about package dependencies in Debian. It is written for the average Debian Developer who doesn't necessarily have a strong math or computer science background.
This page is not meant to override Debian policy. Where possible, the terminology used in Debian policy is also used here. The page explains the terminology for concepts that result from Debian policy as well as for things which are not policy yet but are accepted packaging practice (like multiarch).
Binary and Source Packages
In Debian there are binary packages and source packages. People often refer to both simply as "packages", which can be unambiguous in the right context. If there is any chance of ambiguity, it is better to clearly state whether one is talking about binary packages or source packages. Another handy trick is to prefix source package names with src: to indicate that the given name is the name of a source package and not of a binary package with the same name. In this document we will always be explicit about whether we mean binary or source packages.
Real and Virtual Packages
Binary packages can be real (Debian policy §7.5 also calls them "actual" or "concrete") or virtual. We are mostly talking about real binary packages, so when we are just talking about a "binary package" we usually mean a real binary package. If we want to talk about virtual binary packages, we explicitly point out that the binary package is virtual. If we want to talk about both, we say "real and virtual binary packages".
Direct, indirect and transitive dependencies
Binary and source package dependencies can be direct or indirect. Each binary and source package declares its direct dependencies but it can also have indirect dependencies because of the direct dependencies of the binary packages it depends on. Example:
Package: foo
Depends: bar

Package: bar
Depends: baz

Package: baz
A direct dependency of the binary package foo is the binary package bar. An indirect dependency of the binary package foo is the binary package baz.
If we want to talk about the direct as well as the indirect dependencies of a binary or source package, then we can either say "direct as well as indirect dependencies" or we can simply say "transitive dependencies". In the example above, the transitive dependencies of the binary package foo are the binary packages bar and baz.
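The distinction between direct and transitive dependencies can be sketched in a few lines of Python. This is only an illustration on the foo/bar/baz example; the dictionary layout and the function name are assumptions made for this sketch, and alternatives, versions and virtual packages are ignored.

```python
# Direct dependencies from the example above (binary package -> direct dependencies).
direct_deps = {
    "foo": ["bar"],
    "bar": ["baz"],
    "baz": [],
}


def transitive_dependencies(package, deps):
    """Return the direct as well as the indirect dependencies of `package`."""
    seen = set()
    stack = list(deps[package])
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        # Follow the direct dependencies of each dependency in turn.
        stack.extend(deps.get(current, []))
    return seen


print(sorted(transitive_dependencies("foo", direct_deps)))  # ['bar', 'baz']
```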
Essential:yes binary packages
Binary packages marked as Essential: yes must provide their functionality even when not yet configured as per Debian policy §3.8. The exact same rule implicitly applies for all packages which are transitive Pre-Depends of binary packages marked as Essential: yes. What doesn't apply to this set of packages is the rule that binary packages can implicitly assume that they are always installed and thus do not need to be explicitly depended-upon. Because of dependency alternatives and multiple providers of virtual packages, this set might not be unique.
Another interesting set in relation with Essential: yes packages are the possible installation sets of all Essential: yes packages. Because of dependency alternatives and multiple providers of virtual packages, there exist multiple installation sets of all Essential: yes binary packages.
Binary and build dependencies
Dependencies can be expressed by binary packages as well as by source packages. A direct dependency expressed by a binary package is called a binary dependency. A direct dependency expressed by a source package is called a build dependency. The transitive dependencies of a binary package are its transitive binary dependencies. Since a source package can only build depend on binary packages and not directly on other source packages, its direct build dependencies together with their transitive binary dependencies are called its transitive build dependencies.
Dependency alternatives
Often there exist many ways to satisfy the binary or build dependencies of a given binary or source package, respectively. This is because binary and source packages can express dependency alternatives and because multiple real binary packages can provide the same virtual binary package.
We will use the following example to illustrate the remaining concepts in this section:
Package: foo
Depends: flubber

Package: flubber
Depends: bar

Package: blub
Provides: bar
Conflicts: plim

Package: plim
Provides: bar
Instead of using the virtual binary package bar we can also use dependency alternatives to express the same inter-package relationships as displayed above. All explanations for the rest of this section equally apply to the following example which will use alternatives instead of virtual binary packages.
Package: foo
Depends: flubber

Package: flubber
Depends: blub | plim

Package: blub
Conflicts: plim

Package: plim
An installation set of a given binary or source package is a set of binary and/or source packages which satisfies all dependency and conflict relationships of all binary and source packages inside the set and contains the given binary or source package. You can think of an installation set of a given binary or source package like the set of binary packages that apt would install for that binary or source package on a system where (hypothetically) no binary packages were installed. In the above example, because of the Conflicts relationship of the binary package blub there only exist two choices for installation sets that could be installed together to install the binary package foo: One is to install foo, flubber and blub and the other is to install foo, flubber and plim. Each binary or source package might have zero, one or more possible installation sets. If it has zero installation sets, then that means that it is impossible to satisfy the transitive dependencies of that binary or source package. If there exists more than one possible installation set, then it is the task of the dependency resolver like apt to pick a fitting set to install.
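To make the definition concrete, the following Python sketch enumerates the installation sets of foo for the example that uses alternatives. It is a brute-force illustration only: the data layout and function names are assumptions of this sketch, it tries every subset of packages (which is exponential), and it does not reflect how apt or any other resolver works internally.

```python
from itertools import combinations

# Each entry in "depends" is a list of alternatives; any one of them satisfies it.
packages = {
    "foo":     {"depends": [["flubber"]],      "conflicts": []},
    "flubber": {"depends": [["blub", "plim"]], "conflicts": []},
    "blub":    {"depends": [],                 "conflicts": ["plim"]},
    "plim":    {"depends": [],                 "conflicts": []},
}


def is_consistent(candidate, universe):
    """Check that all dependencies and conflicts inside `candidate` are satisfied."""
    for name in candidate:
        pkg = universe[name]
        for clause in pkg["depends"]:
            if not any(alternative in candidate for alternative in clause):
                return False
        if any(other in candidate for other in pkg["conflicts"]):
            return False
    return True


def installation_sets(target, universe):
    """Return every consistent set of packages that contains `target`."""
    others = sorted(set(universe) - {target})
    result = []
    for size in range(len(others) + 1):
        for extra in combinations(others, size):
            candidate = frozenset((target, *extra))
            if is_consistent(candidate, universe):
                result.append(candidate)
    return result


for found in installation_sets("foo", packages):
    print(sorted(found))
# ['blub', 'flubber', 'foo']
# ['flubber', 'foo', 'plim']
```

The set containing both blub and plim is rejected because of the Conflicts relationship, which leaves exactly the two installation sets described above.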
The build or binary dependency closure is the set of packages which can be created by recursively following all dependency relationships starting from an initial binary or source package. In the above example, the dependency closure of the binary package foo would be foo, flubber, blub and plim. It is a superset of all possible installation sets. The dependency closure might even contain packages that are part of no installation set. Each binary or source package has a unique dependency closure.
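A minimal sketch of computing the dependency closure for the same example follows. The representation (a list of alternative lists per package) and the function name are assumptions of this sketch. Note that conflicts play no role here: every alternative is followed, which is why the closure can contain packages that cannot be installed together.

```python
# Dependencies from the example that uses alternatives; each inner list is one
# dependency whose entries are its alternatives.
depends = {
    "foo": [["flubber"]],
    "flubber": [["blub", "plim"]],
    "blub": [],
    "plim": [],
}


def dependency_closure(start, depends):
    """Follow every dependency and every alternative, starting from `start`."""
    closure = {start}
    queue = [start]
    while queue:
        current = queue.pop()
        for clause in depends.get(current, []):
            for alternative in clause:
                if alternative not in closure:
                    closure.add(alternative)
                    queue.append(alternative)
    return closure


print(sorted(dependency_closure("foo", depends)))
# ['blub', 'flubber', 'foo', 'plim']
```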
The strong dependencies of a binary or source package are the binary and/or source packages which are part of every possible installation set for it. In the above example, the strong dependency set of the binary package foo would be the binary packages foo and flubber. The binary package foo can be installed without the binary package blub (by installing the binary package plim) and it can also be installed without the binary package plim (by installing the binary package blub), and thus neither of the two is part of the strong dependency set of the binary package foo. Each binary or source package has a unique strong dependency set.
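Given the installation sets of a package, its strong dependency set is simply their intersection. The sketch below takes the two installation sets of foo from the example as input; the function name is made up for this illustration, and raising an error for a package without any installation set is an assumption of the sketch, since the text does not say what the strong dependency set of an uninstallable package should be.

```python
# The two installation sets of the binary package foo from the example.
foo_installation_sets = [
    {"foo", "flubber", "blub"},
    {"foo", "flubber", "plim"},
]


def strong_dependencies(installation_sets):
    """Return the packages that are part of every given installation set."""
    if not installation_sets:
        # Assumption of this sketch: treat the uninstallable case as an error.
        raise ValueError("package has no installation set")
    return set.intersection(*(set(s) for s in installation_sets))


print(sorted(strong_dependencies(foo_installation_sets)))  # ['flubber', 'foo']
```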
builds-from relationship
In a bootstrapping context, it is important to also consider the relationship between a binary package and the source package that the given binary package builds from. One could say that in a bootstrapping scenario, a binary package depends on the source package that builds it through a builds-from relationship.
One special set of packages that involves the builds-from relationship is called the "Build-Depends-transitive essential" package set which is a set of binary and source packages that is created by following all build dependency, binary dependency and builds-from relationships, starting from the build-essential binary package.
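The following sketch shows how such a set could be collected by following all three kinds of relationship. Everything except the name build-essential is hypothetical: the toy-* packages, their relationships and the function names are invented for illustration, and dependency alternatives are ignored for brevity.

```python
# Hypothetical toy archive. Source packages carry the src: prefix.
depends = {            # binary package -> binary dependencies
    "build-essential": ["toy-cc"],
    "toy-cc": ["toy-libc"],
    "toy-libc": [],
    "toy-doc-tool": [],
    "toy-unrelated": [],
}
builds_from = {        # binary package -> source package it builds from
    "build-essential": "src:build-essential",
    "toy-cc": "src:toy-cc",
    "toy-libc": "src:toy-libc",
    "toy-doc-tool": "src:toy-doc-tool",
    "toy-unrelated": "src:toy-unrelated",
}
build_depends = {      # source package -> build dependencies (binary packages)
    "src:build-essential": [],
    "src:toy-cc": ["toy-libc", "toy-doc-tool"],
    "src:toy-libc": ["toy-cc"],
    "src:toy-doc-tool": [],
    "src:toy-unrelated": ["toy-cc"],
}


def successors(vertex):
    """Source packages lead to their build dependencies; binary packages lead
    to their binary dependencies and to the source package they build from."""
    if vertex.startswith("src:"):
        return list(build_depends.get(vertex, []))
    return list(depends.get(vertex, [])) + [builds_from[vertex]]


def reachable(start):
    seen = {start}
    stack = [start]
    while stack:
        for nxt in successors(stack.pop()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


for name in sorted(reachable("build-essential")):
    print(name)
```

In this toy archive, toy-doc-tool ends up in the set only because src:toy-cc build depends on it, while toy-unrelated and its source package are never reached.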
Dependency graph
A graph contains a set of vertices (or nodes) which are connected by edges (or arcs). There are many different ways to represent package relationships as a dependency graph, but most of them represent the binary and source packages as vertices and the direct dependencies between them as directed edges, where the edge points from the depender to the depended-upon binary or source package. In terms of dependency graphs of binary and source packages, the following terms are important.
A dependency path in a dependency graph is a sequence of two or more vertices where each vertex points to the next by an edge. The following textual representation would be an example of a dependency path involving binary packages:
apt -> libapt-pkg5.0 -> libc6 -> libgcc1
A dependency cycle is a dependency path where the last element of the path points back to the first. The following would be an example:
dpkg -> zlib1g -> libc6 -> libgcc1
The binary package libgcc1 depends on the binary package dpkg because dpkg is marked as Essential: yes and is thus an implicit dependency of all binary packages. This implicit edge from libgcc1 back to dpkg closes the cycle.
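The definitions of dependency path and dependency cycle can be checked mechanically. The sketch below encodes only the edges named in the two examples above (the real packages have more dependencies), and the function names are assumptions of this illustration.

```python
# Only the edges mentioned in the examples above.
edges = {
    "apt": ["libapt-pkg5.0"],
    "libapt-pkg5.0": ["libc6"],
    "dpkg": ["zlib1g"],
    "zlib1g": ["libc6"],
    "libc6": ["libgcc1"],
    "libgcc1": ["dpkg"],  # implicit dependency on an Essential: yes package
}


def is_dependency_path(vertices, edges):
    """Two or more vertices where each one points to the next by an edge."""
    if len(vertices) < 2:
        return False
    return all(b in edges.get(a, []) for a, b in zip(vertices, vertices[1:]))


def is_dependency_cycle(vertices, edges):
    """A dependency path whose last vertex points back to the first."""
    return (is_dependency_path(vertices, edges)
            and vertices[0] in edges.get(vertices[-1], []))


path = ["apt", "libapt-pkg5.0", "libc6", "libgcc1"]
cycle = ["dpkg", "zlib1g", "libc6", "libgcc1"]
print(is_dependency_path(path, edges), is_dependency_cycle(path, edges))    # True False
print(is_dependency_path(cycle, edges), is_dependency_cycle(cycle, edges))  # True True
```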
Dependency cycles become more interesting when they involve source packages:
src:gobject-introspection -> gnome-pkg-tools -> python3-gi -> libgirepository-1.0-1
The binary package libgirepository-1.0-1 builds from the source package src:gobject-introspection.
And even more interesting are cycles involving multiple source packages:
src:cups-filters -> libcups2-dev -> src:cups -> ghostscript -> libgs9 -> libcupsimage2 -> libcupsfilters1
This brings us to the next section.
Strongly connected component
A strongly connected component is a part of a graph where every vertex is in a cycle with all the other vertices of the same component. So all cycles are actually also strongly connected components. Cycles are just a special form of strongly connected component. Cycles are easy because removing any one edge will immediately break the cycle. Things are more complicated with strongly connected components because it is hard to figure out a preferably small set of edges which, when removed, will make the graph acyclic. Making a dependency graph acyclic is important for binary package installation as well as for ordering source packages in a bootstrapping scenario.
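Strongly connected components can be found with standard graph algorithms. The sketch below uses Tarjan's algorithm in a plain recursive form, which is fine for small graphs but would hit Python's recursion limit on a graph the size of the Debian archive. The graph contains only the binary package edges from the path and cycle examples above, and the function name is an assumption of this illustration.

```python
def strongly_connected_components(edges):
    """Tarjan's algorithm; returns a list of sets of vertices."""
    index_of = {}
    lowlink = {}
    stack = []
    on_stack = set()
    components = []
    counter = 0

    def visit(v):
        nonlocal counter
        index_of[v] = lowlink[v] = counter
        counter += 1
        stack.append(v)
        on_stack.add(v)
        for w in edges.get(v, []):
            if w not in index_of:
                visit(w)
                lowlink[v] = min(lowlink[v], lowlink[w])
            elif w in on_stack:
                lowlink[v] = min(lowlink[v], index_of[w])
        if lowlink[v] == index_of[v]:
            # v is the root of a component: pop everything above it.
            component = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.add(w)
                if w == v:
                    break
            components.append(component)

    vertices = set(edges) | {w for targets in edges.values() for w in targets}
    for v in sorted(vertices):
        if v not in index_of:
            visit(v)
    return components


edges = {
    "apt": ["libapt-pkg5.0"],
    "libapt-pkg5.0": ["libc6"],
    "dpkg": ["zlib1g"],
    "zlib1g": ["libc6"],
    "libc6": ["libgcc1"],
    "libgcc1": ["dpkg"],
}

components = strongly_connected_components(edges)
nontrivial = [sorted(c) for c in components if len(c) > 1]
print(nontrivial)  # [['dpkg', 'libc6', 'libgcc1', 'zlib1g']]
```

The packages apt and libapt-pkg5.0 each form a component of their own, because nothing in this small graph depends back on them.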
The bad news is that in Debian there exists a huge strongly connected component around the essential and build-essential packages. Back in 2013 it looked like this:
More bad news is that its size is growing over time. Even worse, what you see is already a simplified graph in which all installation sets for binary packages are grouped into a single vertex each, instead of every single binary package being represented as its own vertex.
Specifying the object of a dependency relation
When talking about multiarch, we often have to talk about both sides of a dependency. One side is the package that expresses a dependency. The other side is the package that fulfills that dependency. There are multiple terms to refer to either. The package expressing a dependency can be called:
- the depender
- the dependent package
- the depending package
- the source of a dependency
The package fulfilling a dependency can be called:
- the dependee
- the depended-upon package
- the target of a dependency
A real binary package may be marked with a Multi-Arch: foreign control header if the provided interfaces are independent of the architecture of the package. As per Debian policy §7.5 any provided virtual packages inherit this property. Dependency relations usually enforce that both ends of a direct dependency relationship share the same architecture, but when the target of a dependency is marked with this header no such restriction is enforced.
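The effect of the header on the architecture restriction can be sketched as a small predicate. This is a deliberately simplified model: the class, the function name and the package names are hypothetical, and the sketch ignores Architecture: all packages, the other Multi-Arch values, architecture qualifiers and version constraints.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryPackage:
    name: str
    architecture: str
    multi_arch: str = "no"  # value of the Multi-Arch control header


def architecture_constraint_satisfied(source, target):
    """Can `target` satisfy a dependency of `source` as far as architecture goes?"""
    if target.multi_arch == "foreign":
        # No architecture restriction is enforced for Multi-Arch: foreign targets.
        return True
    # Otherwise both ends of the dependency must share the same architecture.
    return source.architecture == target.architecture


# Hypothetical packages for illustration.
app = BinaryPackage("toy-app", "armhf")
tool = BinaryPackage("toy-tool", "amd64", multi_arch="foreign")
lib = BinaryPackage("toy-lib", "amd64")

print(architecture_constraint_satisfied(app, tool))  # True
print(architecture_constraint_satisfied(app, lib))   # False
```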
There are four main areas that can contribute to the interface of a package and if any of them provides an architecture-dependent interface, a package must not be marked with Multi-Arch: foreign.
The content of a package: For Architecture: all binary packages, there is nothing to check in this area. If a file or the set of files of a binary package changes when built for different architectures, it has to be considered whether the provided interface is affected. Since the interface to shared or static libraries is an architecture-dependent ABI, shared or static libraries are never considered architecture independent. Binary executables generally differ with architectures, but if the interface that they expose to be used by other software (for example the command line interface, the content on standard input/output, the way they process files) does not allow distinguishing the architecture, they are still considered architecture-independent as far as this area is concerned.
Maintainer scripts and triggers: A package may behave in an architecture-dependent way when its maintainer scripts or invoked triggers behave differently on different architectures. For instance, byte-compiling source files into architecture-dependent bytecode during postinst makes the interface of a package architecture-dependent.
The dependencies of a package: A package may expose functionality of other packages by depending on them. An executable that links against a library generally does not expose that library, and thus libraries generally do not contribute to the interface. On the other hand, the whole point of transitional packages is to expose such functionality.
Implicit and foreign dependencies of a package: Essential packages are implicitly depended upon and need not show up in Depends:. Yet their behaviour can be architecture-dependent. For instance, dpkg --print-architecture emits the native architecture even though dpkg is marked Multi-Arch: foreign. Similarly, calling pkg-config (without a prefix) will behave differently on different architectures because its search path is architecture-dependent, even though pkg-config is marked Multi-Arch: foreign.
The set of functionality that a package exposes for consumption by others or what one can call its interface depends on the intentions of the package providing that interface. Example: consider an architecture dependent program written in C which contains an internal table which it uses to map characters the program receives on standard input to a certain other character which it prints on standard output. This binary package containing the program could be marked Multi-Arch: foreign even though the binary package would be architecture dependent. This is because the interface it provides (read data from standard input and print data to standard output) acts the same way independent of the architecture the program is compiled for. Its internal character mapping table which could be read out of the binary executable by a third party is not part of its interface and thus this table looking different in the executable on different architectures (due to different alignment and endian-ness) does not mean that the binary package containing this program cannot be made Multi-Arch: foreign.
TODO: list of general ideas of what does and doesn't constitute the interface of a program and a remark that exceptions must be listed in /usr/share/doc/$pkg/README.multiarch