Revision 3 as of 2013-10-21 06:02:04 (size: 8661, editor: HelmutGrohne, comment: fix moin markup in section headers)
Revision 4 as of 2013-10-24 16:34:35 (size: 8718, editor: HelmutGrohne)

/!\ This page is work in progress and may be inaccurate. /!\

TODO: Integrate into Multiarch/InterpreterProposal.

Proposed changes to the Multiarch Spec

The current multiarch specification cannot express a few situations that actually occur in the Debian archive. This document explains the specific sub-problems and proposes solutions. It is based on discussions that happened during DebConf13.

Issues

This section describes known limitations of the current specification and explains the kind and number of packages affected.

Library extensions

Consider a shared library in one package and an extension to this library in a different package. As soon as the library becomes M-A:same, it can be installed for multiple architectures simultaneously. To extend the library, the extension must be available for the same architectures as the library. Currently there is no way to express this condition to dpkg. The issue comes in two flavours.

LD_PRELOAD

It is possible to change the behaviour of the C library by exporting the LD_PRELOAD environment variable. A package that uses this technique usually ships a shared library and a shell script that sets up LD_PRELOAD to point to its library. A user installing such a package would expect it to work in a multiarch environment. That means that the shared library should be available for all the architectures the user uses. Currently, such packages are only installed for the native architecture (by default) and are not available for foreign architectures. As of this writing, the only tool that can be used on foreign architectures at all is fakechroot, because it depends on an M-A:same libfakechroot. A user can install libfakechroot for multiple architectures, but this does not happen automatically.

Usually there is no dependency relation between any of these tools and the programs whose behaviour is changed.
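To make the wrapper technique concrete, the following is a minimal sketch of what such a launcher does, written in Python rather than shell. The function names are hypothetical and not taken from any of the packages listed below. The point to note is that the wrapper names exactly one library path, so it covers exactly one architecture.

```python
import os


def preload_env(library_path, base_env=None):
    """Return a copy of the environment with library_path prepended to LD_PRELOAD."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    # The dynamic linker accepts a colon-separated list of objects in LD_PRELOAD.
    env["LD_PRELOAD"] = library_path if not existing else f"{library_path}:{existing}"
    return env


def run_with_preload(library_path, argv):
    """Replace the current process with argv, preloading library_path (hypothetical helper)."""
    os.execvpe(argv[0], argv, preload_env(library_path))
```

A launcher built this way would call, for example, run_with_preload("/usr/lib/libeatmydata/libeatmydata.so", ["some-program"]). Whether the preloaded library takes effect then depends on the architecture of the program being started, which the wrapper cannot know in advance.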

Affected packages:

  • cowdancer: /usr/lib/cowdancer/libcowdancer.so

  • eatmydata: /usr/lib/libeatmydata/libeatmydata.so

  • datefudge: /usr/lib/datefudge/<triplet>/datefudge.so

  • devscripts: /usr/lib/devscripts/libvfork.so.0

  • fakechroot: /usr/lib/<triplet>/fakechroot/libfakechroot.so from libfakechroot

  • fakeroot: /usr/lib/<triplet>/libfakeroot/libfakeroot-*.so

  • faketime: /usr/lib/<triplet>/faketime/libfaketime*.so.1 from libfaketime

  • fl-cow: /usr/lib/fl-cow/libflcow.so

  • libc-bin: /lib/x86_64-linux-gnu/libSegFault.so from libc6

  • libroar-compat2: /usr/lib/x86_64-linux-gnu/roaraudio/complibs/*.so

  • postgresql-client-common: /lib/<triplet>/libreadline.so.6 from libreadline6

  • pulseaudio-utils: /usr/lib/<triplet>/pulseaudio/libpulsedsp.so

  • sdate: /usr/lib/libsdate/libsdate.so

  • torsocks: /usr/lib/torsocks/libtorsocks.so

  • tsocks: /usr/lib/libtsocks.so

  • vde2: /usr/lib/vde2/libvdetap.so

Library plugin

A shared library may provide an interface for extension by loading further libraries at runtime. Two examples of this technique are PAM and NSS. PAM modules are loaded dynamically into programs that use libpam for authenticating users. NSS modules are loaded dynamically into programs that use the C library for name or user resolution. In both cases programs link against libpam0g or libc6, both of which are M-A:same. Neither usually expresses any dependency relation on the modules, so it is possible that the modularized library is installed for multiple architectures while the configured extension modules are not.

Usually there is no dependency relation between these plugins and programs that are affected by linking libpam0g or libc6.

Affected packages:

  • libpam-* (~70)

  • libnss-* (~10)

TODO: update numbers against sid

Interpreter issue

Interpreted languages such as Perl or Python can be extended with architecture dependent as well as architecture independent modules that may interact with each other. This case was envisaged when the current multiarch specification was written: interpreters were to be marked M-A:allowed, and architecture independent modules could then annotate their interpreter dependencies with :any. What has happened instead is that embeddable interpreters are marked M-A:same, effectively allowing them to be installed for multiple architectures at the same time. The availability of interpreters as shared libraries renders the :any annotation unusable: a module using it would introduce architecture boundaries where there are none. A good explanation of the issues, including examples, was given by Guillem Jover. Essentially, the usability of an architecture independent module on a particular architecture depends on the availability of all of its recursive dependencies in that architecture. This restriction currently cannot be expressed to the dependency system, and therefore all architecture independent modules are considered to have the native architecture.
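The recursive-dependency condition can be illustrated with a small sketch. The data model (a dict per package with "arch", "installed" and "depends" keys) and all package names are assumptions made for this example only. It does not reflect how dpkg or apt store or resolve dependencies.

```python
def usable_arches(name, packages, all_arches, _visiting=None):
    """Return the architectures on which package `name` is usable.

    An architecture independent package imposes no restriction of its own;
    an architecture dependent package is usable only where it is installed.
    The result is intersected over all recursive dependencies.
    """
    visiting = set() if _visiting is None else _visiting
    if name in visiting:
        # Dependency cycle: this edge adds no further restriction.
        return set(all_arches)
    visiting.add(name)
    pkg = packages[name]
    result = set(all_arches) if pkg["arch"] == "all" else set(pkg["installed"])
    for dep in pkg["depends"]:
        result &= usable_arches(dep, packages, all_arches, visiting)
    visiting.discard(name)
    return result


# Hypothetical example data: an interpreter installed for two architectures,
# an extension module installed for only one, and two pure modules.
ARCHES = {"amd64", "armhf"}
PACKAGES = {
    "interp": {"arch": "any", "installed": {"amd64", "armhf"}, "depends": []},
    "mod-ext": {"arch": "any", "installed": {"amd64"}, "depends": ["interp"]},
    "mod-pure": {"arch": "all", "installed": set(), "depends": ["interp"]},
    "mod-top": {"arch": "all", "installed": set(), "depends": ["mod-ext", "interp"]},
}
```

With this data, mod-pure is usable wherever the interpreter is, while mod-top is limited to amd64 because its dependency mod-ext is only installed there. This is the information the dependency system currently has no way to represent.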

Affected packages:

  • Java: ~30 (lib*-java)

  • Mono: ~150 (lib*-cil)

  • Perl: ~170 (lib*-perl)

  • Python: ~80 (python-*, python3-*)

  • Ruby: ~30 (lib*-ruby)

  • Others: ~120 (upper bound, many false positives)

TODO: refresh numbers

TODO: link to generation script

Note that not all languages mentioned above can be embedded, but at least Perl, Python and Ruby can. A lower bound on the number of affected packages therefore is ~280.

In all of these cases there is a dependency path starting in one of the affected architecture independent modules, passing an architecture dependent module, and ending in an interpreter package.
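The dependency-path criterion can be sketched as a graph search. This is not the script used to generate the numbers above, which is not linked from this page. It is only an illustration of the criterion, with a hypothetical data model and hypothetical package names.

```python
def is_affected(name, packages, interpreters):
    """Check whether an architecture independent module is affected.

    A module is affected if some dependency path starts at it, passes through
    an architecture dependent module, and ends in an interpreter package.
    """
    if packages[name]["arch"] != "all":
        return False
    seen = set()
    # Each stack entry: (package, whether an arch dependent module was passed).
    stack = [(dep, False) for dep in packages[name]["depends"]]
    while stack:
        pkg, passed_arch_dep = stack.pop()
        if (pkg, passed_arch_dep) in seen or pkg not in packages:
            continue
        seen.add((pkg, passed_arch_dep))
        if pkg in interpreters:
            if passed_arch_dep:
                return True
            continue  # paths end at the interpreter
        passed = passed_arch_dep or packages[pkg]["arch"] != "all"
        stack.extend((dep, passed) for dep in packages[pkg]["depends"])
    return False


# Hypothetical example data.
INTERPRETERS = {"interp"}
GRAPH = {
    "interp": {"arch": "any", "depends": []},
    "mod-ext": {"arch": "any", "depends": ["interp"]},
    "mod-pure": {"arch": "all", "depends": ["interp"]},
    "mod-top": {"arch": "all", "depends": ["mod-ext"]},
}
```

Here mod-top is affected because it reaches the interpreter through mod-ext, whereas mod-pure depends on the interpreter directly and is not.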

Solutions

The solutions presented here are roughly ordered by ascending complexity.

Conversion to Arch:any M-A:same

The interpreter issue can be mitigated by turning affected architecture independent modules into architecture dependent packages marked with M-A:same.

Solves: interpreter

Pros:

  • The installation size does not change even though more packages are to be installed.
  • This solution does not require any changes to the infrastructure or the package management.

Cons:

  • The mirror and buildd usage grows.
  • This solution is fragile in that the addition of any package can cause existing modules to become affected.
  • Maintainer scripts must be careful with respect to bytecode removal when removing only one architecture of a package.

Multi-Arch: all

A new value "all" for the Multi-Arch field can be added. This value implies the semantics of the value "same". In addition it causes the package to be automatically installed for all native and foreign architectures configured with dpkg.

Solves: library extension, interpreter

Cons:

  • dpkg --add-architecture results in an inconsistent state.
  • Vastly over-approximates the set of needed packages and causes more packages to be installed than necessary.
  • Must be disabled for bootstrapping new architectures to use fakeroot.
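One possible reading of the first drawback is sketched below. If every M-A:all package must be present for every configured architecture, then adding an architecture immediately leaves some required package instances missing. The function names and the set-of-pairs model are assumptions for illustration only; the page does not specify the intended semantics in this detail.

```python
def expected_instances(ma_all_packages, configured_arches):
    """All (package, architecture) pairs that M-A:all would require."""
    return {(pkg, arch) for pkg in ma_all_packages for arch in configured_arches}


def missing_after_add_architecture(installed, ma_all_packages, configured_arches, new_arch):
    """Return the required pairs that are not installed once new_arch is configured."""
    wanted = expected_instances(ma_all_packages, set(configured_arches) | {new_arch})
    return wanted - set(installed)
```

For instance, with libfakechroot installed only for amd64, adding armhf leaves the pair ("libfakechroot", "armhf") unsatisfied until the package has been fetched and installed for the new architecture.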

Install-same-as header

The basic idea is to tell packages to be installed for all architectures that a given other package is installed for. To that end a new optional header for binary packages can be added. Its value names one of the package's dependencies. A package carrying this header must be M-A:same. It is only considered installed if it is installed for all architectures that the listed package is installed for.

In the case of libpam0g and libc6, all plugins and LD_PRELOAD libraries would list the respective package in the new header. Architecture dependent modules for interpreters would list the respective shared interpreter library package.
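The "considered installed" rule can be written down as a small predicate. The mapping install_same_as stands in for the value of the proposed header, and the plugin names libnss-example and libpam-example are hypothetical. Nothing here corresponds to an existing dpkg interface.

```python
def considered_installed(package, installed, install_same_as):
    """Apply the proposed rule to one package.

    installed maps a package name to the set of architectures it is installed for.
    install_same_as maps a package name to the package named in its header, if any.
    """
    own = installed.get(package, set())
    if not own:
        return False
    anchor = install_same_as.get(package)
    if anchor is None:
        # No header: the ordinary notion of "installed" applies.
        return True
    # The package must cover every architecture the listed package is installed for.
    return installed.get(anchor, set()) <= own


# Hypothetical example data.
INSTALLED = {
    "libc6": {"amd64", "i386"},
    "libpam0g": {"amd64"},
    "libnss-example": {"amd64"},
    "libpam-example": {"amd64"},
}
INSTALL_SAME_AS = {"libnss-example": "libc6", "libpam-example": "libpam0g"}
```

In this example libnss-example would not count as installed, because libc6 is also present for i386, while libpam-example would, because it covers every architecture of libpam0g.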

Solves: library extension, interpreter

Cons:

  • Installing a foreign interpreter or C library causes all extensions to be installed as well even if not all are needed.
  • Must be disabled for bootstrapping new architectures to use fakeroot.

Running architectures

The idea behind this approach is very similar to converting the relevant packages to Architecture: any with M-A:same. The major difference is that the conversion is done internally by dpkg and applied to every architecture independent package automatically.

TODO: specification

Solves: interpreter

Cons:

  • Requires significant changes to dpkg and apt.

TODO: implementation plan

Running architectures with group tracking

TODO: rough idea

TODO: specification

TODO: implementation plan