Based on Matt Taggart's earlier write-up at http://lackof.org/taggart/hacking/multiarch/
Introduction
The Debian Multiarch proposal represents a radical rethinking of the filesystem hierarchy with respect to library and header paths. As such, it is a very disruptive change; software has long assumed that on Unix systems the library and include paths are sister directories at the same level of the hierarchy, and multiarch violates this assumption thoroughly. Even if all the various upstream build systems avoided hard-coding this assumption and were perfectly content to trust the system path (which today they are not), there would still be the matter of getting these directories on the system path to begin with, which means patching (or wrapping) all the various compilers in use.
If we are going to ask compiler upstreams to accommodate this seemingly gratuitous difference, and ask other distributions to embrace multiarch as a new standard, it behooves us to make a case for why such a change is needed at all.
Problem
Where supporting libraries of different ABIs on a single filesystem is concerned, there are several related problems to consider:
32/64 Architectures
Architectures with support for 32bit and 64bit versions of the same instruction set.
Examples
- i386 / x86_64
- ppc / ppc64
- s390 / s390x
- sparc / sparc64
- mips / mips64
- sh / sh64
- hppa / hppa64
Current practices
The FHS and LSB have standardized the x86_64 architecture to use /lib64 as the path for 64-bit x86 libraries, with /lib reserved for 32-bit x86 libraries on such systems. This is in spite of the fact that for performance reasons, x86_64 is the preferred ABI to use on hardware that supports it.
Red Hat and SuSE have adopted this standard. Debian and Ubuntu have declined to adopt this provision of the FHS, because the inconsistency introduced by special-casing of x86_64 would require deep changes to the packaging tools for incremental benefit.
Comments
We have both the case of a mostly 32-bit system with some 64-bit libs/binaries and that of a mostly 64-bit system with some 32-bit libs/binaries. Ideally they would be treated the same. So what goes in /lib? It may need to exist for legacy reasons; if so, should it be a symlink to /lib32 or to /lib64?
Mixed instruction sets
Architectures with support for running other instruction sets (either via hardware or software emulation)
Examples
- i386 / ia64 (hardware emulation)
- arm / any (via qemu)
- s390 / any (via Hercules)
- ia64 / i386 (via ski)
Current practices
- Debian ia64
- ia32-libs package, installs in /emul/ia32-linux
- /lib/ld-linux.so.2 symlink to the program interpreter (PI) in /emul/ia32-linux
- ia32 binaries in system root (by hand)
- qemu
- using the system root for native instruction set
- configurable, non-standard path (e.g., /etc/qemu-binfmt/arch; /usr/gnemul/qemu-arch) for emulated instruction sets
Comments
Since this could potentially be any target combination, it's wasteful to reinvent a new schema for each emulator or target. These emulation environments also have all the same problems that the introduction of /lib64 for x86_64 was intended to address, getting none of the benefits of the special-case "biarch" solution for 32/64bit environments since the pairs aren't the special-cased ones.
Mixed OS environment
Running binaries from one OS on another via a compatibility layer
Examples
- Linux on Other
- Linux/i386 on FreeBSD/i386
- Linux/i386 on Solaris/x86
- Linux/ia64 on HP-UX/ia64
- Other on Linux
- Solaris/sparc on Linux/sparc
- HP-UX/hppa on Linux/hppa
- HP-UX/ia64 on Linux/ia64
- Irix/mips on Linux/mips
- OSF/1/alpha on Linux/alpha
- Non-FHS compliant on Linux and system level emulation
- dosemu
- wine
- bochs
- vmware
Current practices
- Linux on FreeBSD: /compat/linux
- Solaris on Linux: /usr/gnemul/sunos/
- Others usually use pseudoroots or chroots
Comments
As most proprietary Unices have long since been overtaken by Linux in the market, demand for this capability has probably already peaked. Nevertheless, there will be uses for OS compatibility layers for some time to come.
Cross-compilation environments
Building binaries for an architecture other than the current one.
Current practices
Cross-build environments commonly mirror the FHS structure (bin, include, lib) in a parallel, architecture-specific tree (e.g., /usr/arm-linux-gnueabi).
Comments
Of all the use cases that multiarch would try to address, cross-building is the one where existing practice comes closest to providing a complete solution: it approximates the FHS, and it addresses all architectures. Even so, it's not symmetrical; binaries built natively look for their auxiliary files in /usr/lib, binaries targeting a cross-build environment would look in /usr/arch/lib, still making it impossible to reuse a single binary in both native and non-native contexts. Adding top-level arch/lib directories to replace /lib for system libraries would almost certainly also be unwelcome.
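The asymmetry described above can be illustrated with a short sketch. The function names and the example target are assumptions made for illustration only; the third function uses the prefix/lib/target layout that this document proposes further down.

```python
from pathlib import PurePosixPath


def native_lib_dir(prefix: str) -> PurePosixPath:
    """Directory where a natively built binary looks for its libraries today."""
    return PurePosixPath(prefix) / "lib"


def cross_lib_dir(prefix: str, target: str) -> PurePosixPath:
    """Directory used by a conventional cross-build tree (prefix/target/lib)."""
    return PurePosixPath(prefix) / target / "lib"


def multiarch_lib_dir(prefix: str, target: str) -> PurePosixPath:
    """Directory under the proposed multiarch layout (prefix/lib/target)."""
    return PurePosixPath(prefix) / "lib" / target


target = "arm-linux-gnueabi"  # example target, taken from the path quoted above

# Existing practice: the location depends on whether the build is native or cross.
print(native_lib_dir("/usr"))             # /usr/lib
print(cross_lib_dir("/usr", target))      # /usr/arm-linux-gnueabi/lib

# Multiarch: a single location, whichever architecture happens to be native.
print(multiarch_lib_dir("/usr", target))  # /usr/lib/arm-linux-gnueabi
```

Because the multiarch directory does not change with the host, the same binary could in principle be reused in both native and non-native contexts.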
Mixed endian
Architectures that support running mixed endian binaries.
Examples
- potential: arm, hppa, ppc (can switch on the fly, but Linux syscalls are big-endian), ia64
- mips / mipsel: not supported at runtime but potentially on the same system
Comments
Not currently a large problem but documented here for completeness.
Mixed ABIs and instruction set extensions
Architectures with more than one ABI.
Examples
- i386 / i586 / i686 / MMX / SSE / etc.
- AltiVec
- sh3 vs sh4
- ABI transitions: ARM OABI -> ARM EABI; ARM soft-float -> ARM EABI hard-float
Comments
It is dubious that we should try to address ABI-compatible, optimized subarchs with the same solution as the other problems listed here. GNU libc's hwcaps implementation already does an adequate job of autodetecting appropriate optimized libraries for use; it is probably better to combine a hwcaps-style implementation for ABI-compatible subarchs with a multiarch solution for incompatible ABIs. See Multiarch/Tuples for an exploration of how compatible subarchs should be defined.
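As a rough sketch of how the two mechanisms could be combined, the function below lists optimized subdirectories inside a multiarch directory before the generic one. This is not glibc's actual hwcaps algorithm; the function name, the placement of the subdirectories, and the capability names are hypothetical and chosen only for illustration.

```python
from pathlib import PurePosixPath
from typing import List


def candidate_dirs(prefix: str, target: str, hwcaps: List[str]) -> List[str]:
    """Return library directories to try, most specific first.

    `target` selects the incompatible-ABI level (the multiarch directory);
    `hwcaps` names ABI-compatible optimized variants, best match first
    (hypothetical names, supplied by the caller).
    """
    base = PurePosixPath(prefix) / "lib" / target
    optimized = [str(base / cap) for cap in hwcaps]
    # The plain multiarch directory is the fallback for every compatible subarch.
    return optimized + [str(base)]


for directory in candidate_dirs("/usr", "i386-linux-gnu", ["sse2", "i686"]):
    print(directory)
```

The point of the split is that the target directory separates binaries that cannot be mixed, while the subdirectories only select among interchangeable builds.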
Proposed Solution
Requirements
In order to support binaries for multiple targets on the same system, we need to be able to:
- Runtime
- install application binaries for multiple targets
- ensure that those binaries' program interpreter is available without collision
- support all those applications' library dependencies
- therefore have multiple copies of the same library (for different targets) installed without collision, since binaries for multiple targets will be installed on the same system
- Development
- install multiple copies (for different targets) of development libraries and headers
- Reasonable migration plan for existing implementations
- Consistent solution across implementations for each above problem
- No/few arch-specific solutions
Proposed General solution
Key: The term prefix is intended to be replaced with the FHS compliant install prefixes such as /, /usr, /usr/local, /opt/foo, etc. The term target is meant to represent the canonical GNU tuple (triplet) for the architecture / os combination.
- FHS changes:
- Target-specific libraries, both shared and static, for target belong in prefix/lib/target/
- If present (for migration from FHS 2.3):
- /lib64 is symlinked (or bind mounted) to the appropriate /lib/target directory
- /lib32 is symlinked (or bind mounted) to the appropriate /lib/target directory
- Target-specific header files for target belong in prefix/include/target/
- Non-target-specific libraries and header files remain in prefix/lib and prefix/include
- FHS Examples:
- /usr/lib/i386-linux-gnu/
- /usr/include/i386-linux-gnu/
- /usr/lib/x86_64-linux-gnu/
- /usr/local/lib/powerpc-linux-gnu/
- /usr/local/include/powerpc-linux-gnu/
- /opt/foo/lib/sparc-solaris/
- /opt/bar/include/sparc-solaris/
- Linux Program Interpreter changes:
- The program interpreter will be /lib/target/ld.so.version
- For compatibility with existing binaries and the ABI standards, symlinks must be maintained to the historical locations in /lib and /lib64. If /lib64 is already a symlink to /lib/target, no additional symlink is required here.
- The program interpreter needs to be able to find the libraries installed in prefix/lib/target directories by default.
- Future versions of the LSB may wish to consider using /lib/target as the standard location for the LSB PI.
- Compiler changes:
- The compiler needs to be able to find the libraries installed in prefix/lib/target/ directories by default
- The compiler needs to be able to find the header files installed in prefix/include/target/ directories by default
- Library changes:
- Software can continue to install libraries and header files in the existing prefix/lib/ and prefix/include/ directories. If multi-arch support is desired, the following changes are needed:
- Install libraries in the appropriate prefix/lib/arch-os/ directories
- Install header files in the appropriate prefix/include/arch-os/ directories
- Any software that has hard-coded library paths needs to be changed to use the PI to locate its libraries
- Any development software that has hard-coded library and include paths needs to be changed to use the compiler to locate its libraries and headers
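The directory rules above can be summarized in a minimal sketch that derives the library and header search directories for one target across several prefixes. The function name is hypothetical, and listing the target-specific directory ahead of the shared one is an assumption of this sketch; the proposal only requires that both be found by default.

```python
from typing import Dict, List


def multiarch_search_paths(prefixes: List[str], target: str) -> Dict[str, List[str]]:
    """Return library and header search directories for a single target.

    For each prefix, prefix/<kind>/<target> is listed before prefix/<kind>
    (ordering assumed for this sketch, not mandated by the proposal).
    """
    paths: Dict[str, List[str]] = {"lib": [], "include": []}
    for prefix in prefixes:
        # Strip the trailing slash so the root prefix "/" yields "/lib", not "//lib".
        root = prefix.rstrip("/")
        for kind in ("lib", "include"):
            paths[kind].append(f"{root}/{kind}/{target}")
            paths[kind].append(f"{root}/{kind}")
    return paths


paths = multiarch_search_paths(["/usr", "/usr/local"], "x86_64-linux-gnu")
for directory in paths["lib"]:
    print(directory)
for directory in paths["include"]:
    print(directory)
```

A build system that asks the compiler for these directories, rather than hard-coding prefix/lib and prefix/include, would pick up the target-specific locations without further changes.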
Implementation Options
Runtimes implementing this proposal can put the libraries/headers/program interpreters wherever they like as long as they can be accessed as described in the proposal. This allows for a range of possible implementations and the ability for a gradual transition. Here are some potential options.
- In all cases, software can expect to be able to install
  - to the old paths
  - to the new paths
  - the needed runtime libraries and PIs for a new target in the correct locations and provide support for that target without needing any changes from the OS provider (an example is a target emulator)
- Option A: Libraries remain in prefix/lib and header files in prefix/include
  - Hard links for prefix/lib/arch-os and prefix/lib
  - Due to the above links the PI and LSB-PI may already be correct, depending on the target. If not, some symlinks may be required.
  - If needed, a /lib64 symlink (or bind mount)
  - Hard links for prefix/include/arch-os and prefix/include
  - Because the libraries and headers haven't moved, the PI and compiler will continue to find them in their existing locations, so no changes to the PI and compiler search paths are needed.
  - No support for multi-arch yet in things like system package tools
- Option B: Libraries for a single target in the appropriate prefix/lib/arch-os/ and header files in prefix/include/arch-os/
  - If needed, a /lib64 symlink (or bind mount)
  - PI for the single target moved into /lib/arch-os/
  - Required PI symlinks for that single target
  - PI, compiler, etc. search paths changed to look in the new locations (in addition to the existing locations)
  - No support for multi-arch yet in things like system package tools
- Option C: Same as option B, but with native support for multiple targets
  - System totally multi-arch aware, including things like package tools
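The /lib64 compatibility symlink mentioned in the options above can be sketched as follows. The helper name is hypothetical, and the sketch deliberately works inside a scratch directory standing in for "/" so that it never touches the real system; a distribution would do this through its packaging tools instead.

```python
import os
import tempfile


def add_compat_symlink(root: str, legacy_name: str, target: str) -> str:
    """Create root/<legacy_name> as a symlink to root/lib/<target>.

    `root` stands in for "/" so the sketch can run against a scratch tree.
    Returns the path of the new symlink.
    """
    target_dir = os.path.join(root, "lib", target)
    os.makedirs(target_dir, exist_ok=True)
    link_path = os.path.join(root, legacy_name)
    # A relative link keeps working if the tree is later mounted elsewhere.
    os.symlink(os.path.join("lib", target), link_path)
    return link_path


with tempfile.TemporaryDirectory() as scratch:
    link = add_compat_symlink(scratch, "lib64", "x86_64-linux-gnu")
    print(os.readlink(link))
    # Both names now resolve to the same directory.
    print(os.path.samefile(link, os.path.join(scratch, "lib", "x86_64-linux-gnu")))
```

Existing binaries that refer to the historical location keep working, while new packages can install straight into the target-specific directory.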
Benefits
- Fixes the problems in a clean and consistent manner
- Meets our required and desired results described in the proposal
- Cleans up some inconsistent PI naming
- Has the added benefit of providing a way to do long-term, gradual migration from one target to another. This has a few interesting implications:
  - Migration from one standard, pervasive legacy target to a new, not yet widely adopted target, by providing support for the legacy target during a transition period. (Maybe we can finally move away from i386 and legacy UNIX OSes?)
  - Migration from one target ABI to another ABI on the same target, something that hasn't been easy before.
FAQ
- Isn't this impossible?
  No, Debian has a small-scale implementation already.
- Will all software need to switch to this proposal right away?
  No, it can continue to install to the existing paths indefinitely. The main reason for changing to the new model will be to enable that software to be used on a multi-arch system.
- Will I be able to install the same binary for two different targets on the same system?
  No, not without changing the name to avoid collision. This proposal assumes that bin directories don't need to be differentiated, since users won't want more than one version of a binary installed. If there is demand for that, it could be addressed in a similar manner, but we don't want to deal with it until there is sufficient demand.
- Linux's ldconfig will walk the whole prefix/lib/ tree finding libraries for various targets; won't it blow up?
  No, it's smart enough to ignore stuff not for its target.
- What about software that creates subdirectories under prefix/lib/ (or include)?
  These will need to be evaluated on a case-by-case basis, depending on whether the content they provide under that subdirectory is target-specific.
- What about software with binary plug-ins, like apache?
  You can't mix and match target plug-ins with one binary apache today. In the case where you could, that software would want to follow the proposal and structure its files accordingly to avoid filesystem collisions.
- Why not prefix/arch-os/lib/ (and include/)?
  It would pollute the prefix directory. Can you imagine adding one entry for each target to the root and /usr directories? Better that they go under the prefix/lib/ (and include/) directories, which already contain many files.
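The ldconfig answer above depends on being able to tell which target a library was built for. One way to do that is to read the ELF header, as in the sketch below. This is not ldconfig's actual implementation; the function names and the fake_header helper are hypothetical, while the field offsets and e_machine values come from the ELF specification.

```python
import struct
from typing import Optional, Tuple

ELF_MAGIC = b"\x7fELF"
# e_machine values defined by the ELF specification.
EM_386 = 3
EM_X86_64 = 62


def elf_identity(header: bytes) -> Optional[Tuple[int, int]]:
    """Return (word size in bits, e_machine) for an ELF header, else None."""
    if len(header) < 20 or header[:4] != ELF_MAGIC:
        return None
    ei_class, ei_data = header[4], header[5]
    if ei_class not in (1, 2) or ei_data not in (1, 2):
        return None
    bits = 32 if ei_class == 1 else 64
    byte_order = "<" if ei_data == 1 else ">"
    # e_machine is the 16-bit field at offset 18 in both ELF classes.
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return bits, machine


def matches_target(header: bytes, bits: int, machine: int) -> bool:
    """True if the header describes an object for the wanted word size and machine."""
    return elf_identity(header) == (bits, machine)


def fake_header(ei_class: int, ei_data: int, machine: int) -> bytes:
    """Build the first 20 bytes of an ELF header, for demonstration only."""
    ident = ELF_MAGIC + bytes([ei_class, ei_data, 1]) + bytes(9)  # 16-byte e_ident
    byte_order = "<" if ei_data == 1 else ">"
    return ident + struct.pack(byte_order + "HH", 3, machine)  # e_type, e_machine


amd64_lib = fake_header(2, 1, EM_X86_64)  # 64-bit, little-endian
i386_lib = fake_header(1, 1, EM_386)      # 32-bit, little-endian
print(matches_target(amd64_lib, 64, EM_X86_64))  # True
print(matches_target(i386_lib, 64, EM_X86_64))   # False
```

A tool walking prefix/lib/ could apply such a check to each file and skip anything that does not match the target it is indexing.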