/!\ This page is work in progress and may be inaccurate. /!\ TODO: Integrate into Multiarch/InterpreterProposal.

Proposed changes to the Multiarch Spec

The current multiarch specification cannot express a few situations that actually occur in the Debian archive. This document explains the specific sub-problems and proposes solutions. It is based on discussions held during DebConf13.

Issues

This section describes known limitations of the current specification and explains the kind and number of packages affected.

Library extensions

Consider a shared library in one package and an extension to this library in a different package. As soon as the library becomes M-A:same, it can be installed for multiple architectures simultaneously. For the extension to work everywhere the library is used, it must be available for the same architectures as the library. Currently there is no way to express this condition to dpkg. The issue comes in two flavours.

LD_PRELOAD

It is possible to change the behaviour of the C library by exporting the LD_PRELOAD environment variable. A package that uses this technique usually has a shared library and a shell script that sets up LD_PRELOAD to point to its library. A user installing such a package would expect it to work in a multiarch environment. That means that the shared library should be available for all the architectures the user uses. Currently, such packages are only installed for the native architecture (by default) and are not available for foreign architectures. As of this writing, the only tool that can be used on foreign architectures at all is fakechroot, because it depends on an M-A:same libfakechroot. A user can install libfakechroot for multiple architectures, but this does not happen automatically.

Usually there is no dependency relation between any of these tools and the programs whose behaviour is changed.

Affected packages:

Library plugin

A shared library may provide an interface for extension by loading further libraries at runtime. Two examples of this technique are PAM and NSS. PAM modules are loaded dynamically into programs that use libpam for authenticating users. NSS modules are loaded dynamically into programs that use the C library for name or user resolution. In both cases programs link against libpam0g or libc6, both of which are M-A:same. Neither usually expresses any dependency relation on the modules, so it is possible that the modularized library is installed for multiple architectures, but the configured extension modules are not.

Usually there is no dependency relation between these plugins and programs that are affected by linking libpam0g or libc6.
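The gap described above can be illustrated with a minimal sketch. It assumes the system state is given as a mapping from package names to the architectures they are installed for; the function name and the package "libpam-example" are made up for illustration and are not part of dpkg or the archive.

```python
def missing_plugin_architectures(installed, library, plugin):
    """Return architectures where `library` is installed but `plugin` is not."""
    return installed.get(library, set()) - installed.get(plugin, set())


# Hypothetical system state: package name -> architectures it is installed for.
installed = {
    "libpam0g": {"amd64", "i386"},
    "libpam-example": {"amd64"},  # made-up PAM module package
}

# Architectures on which programs can load libpam0g but not the module.
print(sorted(missing_plugin_architectures(installed, "libpam0g", "libpam-example")))
```

Any architecture reported here is one where programs linked against the library would fail to load the configured module, even though no dependency is violated.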

Affected packages:

TODO: update numbers against sid

Interpreter issue

Interpreted languages such as Perl or Python can be extended with architecture dependent as well as architecture independent modules that may interact with each other. This case was envisaged when the current multiarch specification was written. The idea was that interpreters should be marked with M-A:allowed, so that architecture independent modules could annotate their interpreter dependencies with :any. What has happened instead is that embeddable interpreters are marked with M-A:same, effectively allowing them to be available for multiple architectures at the same time. The availability of interpreters as shared libraries renders the :any annotation unusable: a module using it would introduce architecture boundaries where there are none. A good explanation of the issues, including examples, was given by Guillem Jover. Essentially, the usability of an architecture independent module on a particular architecture depends on the availability of all of its recursive dependencies for that architecture. This restriction currently cannot be expressed to the dependency system, and therefore all architecture independent modules are considered to have the native architecture.

Affected packages:

TODO: refresh numbers TODO: link to generation script

Note that not all languages mentioned above can be embedded, but at least Perl, Python and Ruby can. A lower bound on the number of affected packages therefore is ~280.

In all of these cases there is a dependency path starting in one of the affected architecture independent modules, passing an architecture dependent module, and ending in an interpreter package.

Solutions

The solutions presented here are roughly ordered by ascending complexity.

Conversion to Arch:any M-A:same

The interpreter issue can be mitigated by turning affected architecture independent modules into architecture dependent packages marked with M-A:same.

Solves: interpreter

Pros:

Cons:

Multi-Arch: all

A new value "all" for the Multi-Arch field can be added. This value implies the semantics of the value "same". In addition, it causes the package to be installed automatically for all native and foreign architectures configured with dpkg.
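As a rough sketch of the intended effect, the following function picks the architectures a package would be installed for. The function name is hypothetical, and the fallback for values other than "all" is a simplifying assumption (native architecture only), not something the proposal specifies.

```python
def target_architectures(multi_arch, native, foreign):
    """Return the architectures a package would be installed for."""
    if multi_arch == "all":
        # Proposed value: every architecture configured with dpkg.
        return [native] + list(foreign)
    # Simplifying assumption: any other value installs for the native
    # architecture only, which is the current default behaviour.
    return [native]


print(target_architectures("all", "amd64", ["i386", "armhf"]))
print(target_architectures("same", "amd64", ["i386", "armhf"]))
```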

Solves: library extension, interpreter

Cons:

Install-same-as header

The basic idea is to tell packages to be installed for all architectures that a given other package is installed for. To that end, a new optional header for binary packages can be added. Its value names one of the package's dependencies. A package carrying this header must be M-A:same. It is only considered installed if it is installed for all architectures that the listed package is installed for.

In the case of libpam0g and libc6, all plugins and LD_PRELOAD libraries would list these packages in the new header. Architecture dependent modules for interpreters would list the respective shared interpreter library package.
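The "considered installed" rule can be sketched as a subset check. This is only an illustration under assumptions: the state is modelled as a mapping from package names to installed architectures, the function name is hypothetical, and "libnss-example" is a made-up NSS module package.

```python
def considered_installed(installed, package, same_as):
    """Apply the proposed rule: `package` counts as installed only if it
    covers every architecture that `same_as` is installed for."""
    have = installed.get(package, set())
    need = installed.get(same_as, set())
    # Assumption: a package installed for no architecture is never installed.
    return bool(have) and need <= have


# Hypothetical state: the module lists libc6 in the new header.
installed = {
    "libc6": {"amd64", "armhf"},
    "libnss-example": {"amd64"},  # made-up NSS module package
}

print(considered_installed(installed, "libnss-example", "libc6"))

# Installing the module for the remaining architecture satisfies the rule.
installed["libnss-example"].add("armhf")
print(considered_installed(installed, "libnss-example", "libc6"))
```

The module may be installed for more architectures than the listed package; only missing architectures make the check fail.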

Solves: library extension, interpreter

Cons:

Running architectures

The idea behind this approach is very similar to turning relevant packages into Architecture: any and M-A:same. The major difference is that it is done internally to dpkg and applied to every architecture independent package automatically. Instead of considering architecture independent packages to be installed for the native architecture, dpkg tracks a set of architectures for which they are considered installed. New operations are added to dpkg to augment or shrink these sets. For instance, when removing a dependency of an architecture independent package, the set may need to be shrunk. On the other hand, installing a package can be done without extending the architecture sets of other packages. Thus the dpkg state underapproximates the available functionality provided by packages. At a later time one may notice that the dependencies of an architecture independent package are satisfied for another architecture and then extend its corresponding architecture set.

In the context of the interpreter issue, this extension causes architecture independent modules to inherit the architecture information of the interpreters. For this to work with embedded interpreters, modules need to add the embedded interpreters (being M-A:same) as an alternative to their main interpreter dependency (at least in the Perl and Python worlds).
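The largest architecture set that dpkg could track for an architecture independent package can be sketched as an intersection over its recursive dependencies. This is a simplified model, not dpkg's algorithm: it assumes an acyclic dependency graph, ignores alternatives, versions and M-A:foreign, and treats an architecture dependent package as usable exactly where it is installed. All names, including the "-example" packages, are hypothetical.

```python
def running_architectures(package, depends, installed_for, configured):
    """Architectures for which `package` and all of its recursive
    dependencies are available (simplified model, see assumptions above)."""
    if package in installed_for:
        # Architecture dependent package: usable where it is installed.
        return set(installed_for[package])
    # Architecture independent package: start from every configured
    # architecture and narrow it down by each dependency.
    result = set(configured)
    for dep in depends.get(package, []):
        result &= running_architectures(dep, depends, installed_for, configured)
    return result


configured = {"amd64", "i386"}
installed_for = {
    "libpython-example": {"amd64", "i386"},  # made-up shared interpreter library
    "python-ext-example": {"amd64"},         # made-up architecture dependent module
}
depends = {
    # Made-up architecture independent module.
    "python-pure-example": ["python-ext-example", "libpython-example"],
}

print(sorted(running_architectures("python-pure-example", depends, installed_for, configured)))

# Installing the extension for i386 allows the set to be extended later.
installed_for["python-ext-example"].add("i386")
print(sorted(running_architectures("python-pure-example", depends, installed_for, configured)))
```

Because the tracked state is an underapproximation, dpkg may record any subset of the computed set; removing a dependency for one architecture is the case where the recorded set has to shrink.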

Solves: interpreter

Pros:

Cons:

Running architectures with group tracking

As the title suggests, this is an extension to the "Running architectures" proposal. It addresses the package splitting aspect by tracking multiple architectures for architecture independent packages. Interpreter packages (M-A:allowed) are used as special markers in the dependency tree and induce new sets of architectures. A package (possibly indirectly) depending on e.g. Python is then considered to be installed for the purpose of using it with Python on a set of architectures that may be different from the set of architectures for other purposes.

When resolving dependencies, all M-A:allowed packages in the transitive dependency set are considered as purposes. For each purpose, the subset of packages indirectly depending on it is considered. Dependencies are resolved within such a purpose and a set of running architectures is determined. In addition, there is an empty purpose covering all packages. A package's dependencies are considered satisfied if they are satisfied with respect to the empty purpose, or if the package indirectly depends on at least one M-A:allowed package and its dependencies are satisfied with respect to all purposes except for the empty purpose.

In practice this means that packages using e.g. both Perl and Python are no longer considered to be installed for a single architecture. Instead, the package may be running on a mix of architectures. Unless such a package is M-A:foreign itself, it will not be able to fulfil dependencies on it at all.

Pros: (see also "Running architectures")

Cons:

TODO: specification TODO: implementation plan