Based on Matt Taggart's earlier write-up at http://lackof.org/taggart/hacking/multiarch/
Introduction
The Debian Multiarch proposal represents a radical rethinking of the filesystem hierarchy with respect to library and header paths. As such, it is a very disruptive change: software has long assumed that on Unix systems the library and include paths are sister directories at the same level of the hierarchy, and multiarch thoroughly violates this assumption. Even if all the various upstream build systems avoided hard-coding this assumption and were perfectly content to trust the system path (which today they are not), there would still be the matter of getting these directories onto the system path to begin with, which means patching (or wrapping) all the various compilers in use.
If we are going to ask compiler upstreams to accommodate this seemingly gratuitous difference, and ask other distributions to embrace multiarch as a new standard, it behooves us to make a case for why such a change is needed at all.
Problem
Where supporting libraries of different ABIs on a single filesystem is concerned, there are several related problems to consider:
32/64 Architectures
Architectures with support for 32-bit and 64-bit versions of the same instruction set.
Examples
- i386 / x86_64
- ppc / ppc64
- s390 / s390x
- sparc / sparc64
- mips / mips64
- sh / sh64
- hppa / hppa64
- arm / arm64 (aarch64)
Current practices
The FHS and LSB have standardized the x86_64 architecture to use /lib64 as the path for 64-bit x86 libraries, with /lib reserved for 32-bit x86 libraries on such systems. This is in spite of the fact that for performance reasons, x86_64 is the preferred ABI to use on hardware that supports it.
Red Hat and SuSE have adopted this standard. Debian and Ubuntu have declined to adopt this provision of the FHS, because the inconsistency introduced by special-casing of x86_64 would require deep changes to the packaging tools for incremental benefit.
Comments
We have both the case of a mostly 32-bit system with some 64-bit libraries and binaries, and that of a mostly 64-bit system with some 32-bit libraries and binaries. Ideally they would be treated the same. So what goes in /lib? It may need to exist for legacy reasons; should it be a symlink to /lib32 or /lib64?
Mixed instruction sets
Architectures with support for running other instruction sets (via either hardware or software emulation).
Examples
- i386 / ia64 (hardware emulation)
- arm / any (via qemu)
- s390 / any (via Hercules)
- ia64 / i386 (via ski)
Current practices
- Debian ia64
- ia32-libs package, installs in /emul/ia32-linux
- /lib/ld-linux.so.2 is a symlink to the program interpreter (PI) in /emul/ia32-linux
- ia32 binaries in system root (by hand)
- qemu
- using the system root for native instruction set
- configurable, non-standard path (e.g., /etc/qemu-binfmt/arch; /usr/gnemul/qemu-arch) for emulated instruction sets
Comments
Since this could potentially be any target combination, it is wasteful to invent a new scheme for each emulator or target. These emulation environments also have all the same problems that the introduction of /lib64 for x86_64 was intended to address, yet they get none of the benefits of the special-case "biarch" solution for 32/64-bit environments, because their architecture pairs are not among the special-cased ones.
Mixed OS environment
Running binaries from one OS on another via a compatibility layer.
Examples
- Linux on Other
- Linux/i386 on FreeBSD/i386
- Linux/i386 on Solaris/x86
- Linux/ia64 on HP-UX/ia64
- Other on Linux
- Solaris/sparc on Linux/sparc
- HP-UX/hppa on Linux/hppa
- HP-UX/ia64 on Linux/ia64
- Irix/mips on Linux/mips
- OSF/1/alpha on Linux/alpha
- Non-FHS compliant on Linux and system level emulation
- dosemu
- wine
- bochs
- vmware
Current practices
- Linux on FreeBSD: /compat/linux
- Solaris on Linux: /usr/gnemul/sunos/
- Others usually use pseudoroots or chroots
Comments
As most proprietary Unices have long since been overtaken by Linux in the market, demand for this capability has probably already peaked. Nevertheless, there will be uses for OS compatibility layers for some time to come.
Cross-compilation environments
Building binaries for an architecture other than the current one.
Current practices
Cross-build environments commonly mirror the FHS structure (bin, include, lib) in a parallel, architecture-specific tree (e.g., /usr/arm-linux-gnueabi).
Comments
Of all the use cases that multiarch would try to address, cross-building is the one where existing practice comes closest to providing a complete solution: it approximates the FHS, and it addresses all architectures. Even so, it is not symmetrical: binaries built natively look for their auxiliary files in /usr/lib, while binaries targeting a cross-build environment look in /usr/arch/lib, so it is still impossible to reuse a single binary in both native and non-native contexts. Adding top-level arch/lib directories to replace /lib for system libraries would almost certainly also be unwelcome.
Mixed endian
Architectures that support running mixed endian binaries.
Examples
- potential: arm, hppa, ppc (can switch on the fly, but Linux syscalls are big-endian), ia64
- mips / mipsel: not supported at runtime but potentially on the same system
Comments
Not currently a large problem but documented here for completeness.
Mixed ABIs and instruction set extensions
Architectures with more than one ABI.
Examples
- i386 / i586 / i686 / MMX / SSE / etc.
- AltiVec
- sh3 vs sh4
- ABI transitions: ARM OABI -> ARM EABI; ARM soft-float -> ARM EABI hard-float
Comments
It is dubious that we should try to address ABI-compatible, optimized subarchs with the same solution as the other problems listed here. GNU libc's hwcaps implementation already does an adequate job of autodetecting appropriate optimized libraries for use; it is probably better to combine a hwcaps-style implementation for ABI-compatible subarchs with a multiarch solution for incompatible ABIs. See Multiarch/Tuples for an exploration of how compatible subarchs should be defined.
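The combination suggested above can be sketched as follows. This is a deliberately simplified illustration, not GNU libc's actual search algorithm: it assumes the prefix/lib/target layout proposed later in this document, and the function names and the "sse2" subdirectory name are hypothetical.

```python
import posixpath


def library_candidates(prefix, target, hwcaps=()):
    """List directories to try, most specific first.

    The multiarch directory selects the (incompatible) ABI; optional
    hwcaps-style subdirectories inside it select an ABI-compatible,
    optimized build. Subdirectory names are assumptions for illustration.
    """
    base = posixpath.join(prefix, "lib", target)
    optimized = [posixpath.join(base, cap) for cap in hwcaps]
    return optimized + [base]


def pick_library(name, prefix, target, hwcaps=(), exists=posixpath.exists):
    """Return the first existing candidate path for a library, or None."""
    for directory in library_candidates(prefix, target, hwcaps):
        candidate = posixpath.join(directory, name)
        if exists(candidate):
            return candidate
    return None


if __name__ == "__main__":
    for directory in library_candidates("/usr", "i386-linux-gnu", ["sse2"]):
        print(directory)
```

The point of the split is that the target component never changes for a given ABI, while the optimized subdirectory is purely an optional refinement that falls back to the plain multiarch directory.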
Proposed Solution
Requirements
In order to support binaries for multiple targets on the same system, we need:
- Runtime
- to install application binaries for multiple targets
- to ensure that those binaries' program interpreter is available without collision
- to support all those applications' library dependencies
- therefore have multiple copies of the same library (for different targets) installed without collision, since binaries for multiple targets will be installed on the same system
- Development
- install multiple copies (for different targets) of development libraries and headers
- a reasonable migration plan for existing implementations
- a consistent solution across implementations for each above problem
- no/few arch-specific solutions
- a cross-platform interface to query the correct multiarch directory for a target
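One possible shape for the query interface named in the last requirement is sketched below. It assumes that either `gcc -print-multiarch` or `dpkg-architecture -qDEB_HOST_MULTIARCH` is installed, which is typical of Debian-style systems but not guaranteed elsewhere. The function name `query_multiarch` and its injectable `which` and `run` parameters are hypothetical and exist only to keep the sketch testable.

```python
import shutil
import subprocess

# Commands that report a multiarch tuple. Their availability on any given
# system is an assumption; the list is tried in order.
CANDIDATE_COMMANDS = (
    ("gcc", "-print-multiarch"),
    ("dpkg-architecture", "-qDEB_HOST_MULTIARCH"),
)


def query_multiarch(commands=CANDIDATE_COMMANDS,
                    which=shutil.which, run=subprocess.run):
    """Return the first non-empty tuple reported by a tool, or None."""
    for argv in commands:
        if which(argv[0]) is None:
            continue  # tool not installed, try the next one
        try:
            result = run(list(argv), capture_output=True, text=True,
                         check=False)
        except OSError:
            continue
        output = result.stdout.strip()
        # A compiler built without multiarch support may print an empty line.
        if result.returncode == 0 and output:
            return output
    return None


if __name__ == "__main__":
    print(query_multiarch())
```

Returning None rather than guessing a tuple leaves the fallback policy (for example, using the unqualified prefix/lib) to the caller.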
Proposed general solution
Key: The term prefix is intended to be replaced with an FHS-compliant install prefix such as /, /usr, /usr/local, or /opt/foo. The term target represents the canonical GNU tuple (triplet) for the architecture/OS combination.
- FHS changes:
- Target-specific libraries, both shared and static, for target belong in prefix/lib/target/
- If present (for migration from FHS 2.3):
- /lib64 is symlinked (or bind mounted) to the appropriate /lib/target directory
- /lib32 is symlinked (or bind mounted) to the appropriate /lib/target directory
- Target-specific header files for target belong in prefix/include/target/
- Non-target-specific libraries and header files remain in prefix/lib and prefix/include
- FHS Examples:
- /usr/lib/i386-linux-gnu/
- /usr/include/i386-linux-gnu/
- /usr/lib/x86_64-linux-gnu/
- /usr/local/lib/powerpc-linux-gnu/
- /usr/local/include/powerpc-linux-gnu/
- /opt/foo/lib/sparc-solaris/
- /opt/bar/include/sparc-solaris/
- Linux Program Interpreter changes:
- The program interpreter will be /lib/target/ld.so.version
- For compatibility with existing binaries and the ABI standards, symlinks must be maintained at the historical locations in /lib and /lib64. If /lib64 is already a symlink to /lib/target, no additional symlink is required here.
- The program interpreter needs to be able to find the libraries installed in prefix/lib/target directories by default.
- Future versions of the LSB may wish to consider using /lib/target as the standard location for the LSB PI.
- Compiler changes:
- The compiler needs to be able to find the libraries installed in prefix/lib/target/ directories by default
- The compiler needs to be able to find the header files installed in prefix/include/target/ directories by default
- Library changes:
- Software can continue to install libraries and header files in the existing prefix/lib/ and prefix/include/ directories. If multiarch support is desired, the following changes are needed:
- Install libraries in the appropriate prefix/lib/target directories
- Install header files in the appropriate prefix/include/target directories
- Any software that has hard-coded library paths needs to be changed to use the PI to locate its libraries
- Any development software that has hard-coded library and include paths needs to be changed to use the compiler to locate its libraries and headers
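The path rules in the list above can be summarised in a short sketch. The helper names `multiarch_dirs` and `interpreter_compat_link` are hypothetical; the x86_64 interpreter path used in the demonstration is the historical /lib64 location, shown only as an example input.

```python
import posixpath


def multiarch_dirs(prefix, target):
    """Return the target-qualified library and header directories."""
    return {
        "lib": posixpath.join(prefix, "lib", target),
        "include": posixpath.join(prefix, "include", target),
    }


def interpreter_compat_link(historical_path, target):
    """Return (link, destination) that keeps a historical PI path valid.

    The link stays at the historical location and points at the
    interpreter's new home under /lib/target.
    """
    name = posixpath.basename(historical_path)
    return historical_path, posixpath.join("/lib", target, name)


if __name__ == "__main__":
    for prefix in ("/", "/usr", "/usr/local"):
        print(multiarch_dirs(prefix, "x86_64-linux-gnu"))
    print(interpreter_compat_link("/lib64/ld-linux-x86-64.so.2",
                                  "x86_64-linux-gnu"))
```

As the list notes, the per-file link is unnecessary when /lib64 is itself a symlink to /lib/target, since the historical path then already resolves to the new location.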
Impact
Nothing in this proposal renders existing FHS- or LSB-compliant implementations invalid; an implementor may choose not to transition to target-qualified paths, or choose to transition to them only for non-default ABIs, leaving the default ABI in the unqualified /lib, /usr/lib directories. However, one of the most important benefits of multiarch to implementors is that the same binaries can be installed in either a native or non-native context without having to recompile for updated paths, so binary distributions will likely do the transition for all architectures at once.
Related software that shares a directory under /usr/lib, such as for a plugin directory, must coordinate any transition to multiarch. In general it is recommended that such plugin loaders include both the multiarch directory and the traditional /usr/lib directory on their search path for compatibility.
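A plugin loader following this recommendation might build its search path as sketched below. The ordering (multiarch directory first, traditional directory second) is an assumption made for illustration, as are the function names and the "foo/plugins" subdirectory.

```python
import os
import posixpath


def plugin_search_path(subdir, target, prefix="/usr"):
    """Return plugin directories: multiarch first, traditional second."""
    return [
        posixpath.join(prefix, "lib", target, subdir),
        posixpath.join(prefix, "lib", subdir),
    ]


def find_plugin(filename, directories, exists=os.path.exists):
    """Return the first existing path for a plugin file, or None."""
    for directory in directories:
        candidate = posixpath.join(directory, filename)
        if exists(candidate):
            return candidate
    return None


if __name__ == "__main__":
    dirs = plugin_search_path("foo/plugins", "x86_64-linux-gnu")
    print(dirs)
    print(find_plugin("example.so", dirs))
```

Searching both locations lets a loader that has already moved to multiarch keep working with plugins that have not yet made the transition.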
While multiarch is compatible at runtime with existing systems, requiring only changes to the program interpreter, development environments are another matter.
- toolchains need to be updated for the new paths:
- compilers need to include /usr/include/arch on the default header path
- linkers need to know that the crt*.o files are moved to /usr/lib/arch
- compilers and linkers need to include /usr/lib/arch on the default library path
- This issue affects not only the distribution toolchain, but also upstream compilers and third-party compilers such as the Intel and ARM toolchains. Patches to make multiarch support available in GCC and GNU binutils upstream are in progress.
- Other build tools that walk the filesystem at runtime to locate headers and libraries also need to be updated with knowledge of multiarch paths by default. While arguably such tools have a design flaw in not trusting the toolchain's provided path, in practice this seems to be a surprisingly common flaw in build systems, with GNU autoconf appearing to be a singular exception:
- PHP, though using autoconf, bypasses the standard macros for header and library checks.
- Python's build system also needs to find libraries on the filesystem. For Python 2.7 and Python 3.1 and above, preliminary support for multiarch has been added upstream.
- A patch for multiarch support has been included in cmake upstream as well.
- both GNU make and pmake have features that rely on being able to find libraries on the filesystem.
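As an illustration of trusting the toolchain rather than walking the filesystem, a build tool could ask the compiler for its library search path. The sketch below parses the "libraries:" line printed by `gcc -print-search-dirs`; the function names are hypothetical, and it assumes the compiler accepts that option and prints the line in the usual `libraries: =dir1:dir2` form.

```python
import posixpath
import subprocess


def parse_library_dirs(search_dirs_output):
    """Extract library directories from -print-search-dirs output."""
    for line in search_dirs_output.splitlines():
        if line.startswith("libraries:"):
            _, _, value = line.partition("=")
            seen = []
            for entry in value.split(":"):
                if not entry:
                    continue
                # Lexical normalisation only; symlinks are not resolved.
                path = posixpath.normpath(entry)
                if path not in seen:
                    seen.append(path)
            return seen
    return []


def compiler_library_dirs(compiler="gcc"):
    """Ask the compiler where it looks for libraries (assumes it is installed)."""
    result = subprocess.run([compiler, "-print-search-dirs"],
                            capture_output=True, text=True, check=True)
    return parse_library_dirs(result.stdout)
```

A tool built this way picks up /usr/lib/arch automatically once the toolchain itself has been updated, with no multiarch-specific knowledge of its own.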
Benefit
After a moderately painful transition period, a multiarch system permanently addresses all of the following problems:
- 32/64-bit libraries coinstalled on the system, including future expansion for new architectures (and for triarch mips)
- no special-casing of lib64, lib32 required anywhere: all architectures are handled the same
- binaries for emulation targets installed alongside native binaries
- no recompilation required due to different install paths for native vs. emulated environments
- binaries for cross-compilation targets installed alongside native binaries
- existing binary packages for the target architecture can be installed without modification as part of a cross-build, without any munging of paths
- future transition path in the case of fundamental breaks in instruction set or ABI compatibility
- would have been useful in the past for i386 -> x86_64, arm -> armel, and m68k -> coldfire
- useful now for a soft transition from armel to armhf