Debian generally builds binaries for the baseline instruction set of each architecture.

Upstream projects often want to require newer instruction targets, which means that their binaries won't work on older systems.

There are several options to resolve these sorts of conflicts. These changes are often suitable for upstream but can also be done just in Debian with a small amount of maintenance needed as upstream code changes.

Porting between targets

The lack of a suitable instruction target is broadly separated into two cases:

Runtime selection

Manual

Your code can check the CPU it is currently running on and run the appropriate code based on the available instructions.

On x86 this is done with the CPUID instruction; the __builtin_cpu_supports builtin function offered by GCC and clang simplifies the check.
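As a minimal sketch of manual dispatch, assuming GCC or clang on x86 and a hypothetical function sum_ints with a baseline build and an AVX2 build (the names sum_baseline, sum_avx2 and pick_sum are illustrative only), the check could look like this. On other architectures the sketch simply falls back to the baseline.

```c
#include <stddef.h>

/* Baseline implementation: compiled for the architecture baseline. */
static long sum_baseline(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

#if defined(__x86_64__) || defined(__i386__)
/* Same loop, but the compiler is allowed to use AVX2 instructions here. */
__attribute__((target("avx2")))
static long sum_avx2(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}
#endif

typedef long (*sum_fn)(const int *, size_t);

/* Check the running CPU and return the implementation to use. */
static sum_fn pick_sum(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))
        return sum_avx2;
#endif
    return sum_baseline;
}

/* Hypothetical public entry point: selects an implementation on first use.
 * Note: this lazy initialisation is not thread-safe as written. */
long sum_ints(const int *v, size_t n)
{
    static sum_fn impl;
    if (!impl)
        impl = pick_sum();
    return impl(v, n);
}
```

Callers only ever see sum_ints; the AVX2 code is reached through a function pointer, so it is never executed on a CPU that lacks the instructions.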

The getauxval function (LWN article) can also be used for this, using the AT_HWCAP/AT_HWCAP2 keys. Running LD_SHOW_AUXV=1 sleep 0 prints all the getauxval results at the command line.
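A short sketch of reading these values from C on Linux with glibc follows. The struct and function names are hypothetical, and the meaning of each bit is architecture-specific, so the mask passed to hwcap_has is assumed to come from the kernel headers for the architecture in question.

```c
#include <sys/auxv.h>

/* Hardware capability words reported by the kernel in the auxiliary vector. */
struct hwcaps {
    unsigned long hwcap;   /* AT_HWCAP */
    unsigned long hwcap2;  /* AT_HWCAP2, 0 if the kernel does not provide it */
};

/* Read both capability words; getauxval returns 0 for a missing entry. */
struct hwcaps read_hwcaps(void)
{
    struct hwcaps caps;
    caps.hwcap = getauxval(AT_HWCAP);
    caps.hwcap2 = getauxval(AT_HWCAP2);
    return caps;
}

/* Returns nonzero if any bit of `mask` is set in AT_HWCAP.
 * The caller supplies an architecture-specific bit mask. */
int hwcap_has(unsigned long mask)
{
    return (getauxval(AT_HWCAP) & mask) != 0;
}
```

Because the values come from the kernel rather than from /proc, this keeps working in chroots without /proc mounted.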

The Google cpu_features library can detect CPU features on Linux using the above mechanisms and other ones on other platforms. An example of how to use cpu_features can be found in subarch-select.

/!\ Please note that using /proc/cpuinfo is not reliable, particularly in multiarch context and when using qemu-user.

isa-support package

The isa-support package builds a set of small programs, designed to be called from shell scripts, that remain reliable even without /proc mounted or under qemu.

The programs are installed under /usr/libexec/ARCH-TRIPLET/isa-support/test-* and exit with status 0 when the feature is detected.

Programs should run this test using: /usr/libexec/$(dpkg-architecture -q DEB_HOST_MULTIARCH)/isa-support/testname
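For a program written in C, the same check could be sketched as below. This assumes the multiarch triplet (the output of dpkg-architecture -q DEB_HOST_MULTIARCH) is known to the caller, for example passed in at build time; the function names isa_test_path and isa_test_run are hypothetical.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Build the path of an isa-support test program into buf.
 * Returns 0 on success, -1 if buf is too small. */
static int isa_test_path(char *buf, size_t len,
                         const char *triplet, const char *test)
{
    int n = snprintf(buf, len, "/usr/libexec/%s/isa-support/%s",
                     triplet, test);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/* Run an isa-support test program.
 * Returns 1 if the feature is detected (exit status 0), 0 if it is not,
 * and -1 if the test program is missing or could not be run. */
int isa_test_run(const char *triplet, const char *test)
{
    char path[512];
    int status;
    pid_t pid;

    if (isa_test_path(path, sizeof path, triplet, test) != 0)
        return -1;
    if (access(path, X_OK) != 0)
        return -1;              /* isa-support not installed, or unknown test */

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execl(path, path, (char *)NULL);
        _exit(127);             /* exec failed */
    }
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0;
}
```

A caller would treat -1 as "unknown" rather than "unsupported", since it only means the test program was not available.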

The features tested are listed in the following table:

DEBARCH               test name              test of
any-i386              test-SSE2              SSE2 [1]
any-i386/any-amd64    test-SSE3              SSE3 [2]
any-i386/any-amd64    test-SSE4.1            SSE4.1 [2]
any-i386/any-amd64    test-SSE4.2            SSE4.2 [2]
any-i386              test-amd64-baseline    amd64 baseline for Debian
any-i386              test-x86-64-v1         psABI v1
any-i386/any-amd64    test-x86-64-v2         psABI v2
any-i386/any-amd64    test-x86-64-v3         psABI v3
armel                 test-ARMv6             ARM v6 support
armel                 test-ARMv6K            ARM v6k support
armel                 test-ARMv7             ARM v7 support
any-arm [3]           test-ARMv8             ARM v8 support
armel                 test-VFP               VFP support
armel                 test-VFPv2             VFP v2 support
armel                 test-VFPv3             VFP v3 support
any-arm [3]           test-neon              neon support
any-arm [3]/arm64     test-ARMv8CRC          ARM v8 + CRC instruction
any-arm [3]/arm64     test-arm64-baseline    arm64 baseline
ppc* [4]              test-altivec           altivec instructions

  [1] Prefer test-amd64-v1

  [2] Prefer test-x86-64-v2

  [3] armel/armhf

  [4] powerpc/ppc64/ppc64el

Function multi-versioning

Function multi-versioning (LWN article) uses a compiler-supplied ifunc resolver, run at program start, to automatically resolve functions to the right instruction target.

This is supported in at least C, C++ and Rust, and is planned for Zig.

Manual

You can write your own ifunc that will be run at program start to automatically resolve functions to the right instruction target. This could be a useful replacement for the target attribute (see below) where it isn't supported.
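A minimal sketch of a hand-written ifunc, assuming GCC or clang on x86 with glibc, is shown here. The names foo_default, foo_sse42 and resolve_foo are illustrative; on other architectures the sketch falls back to a plain function.

```c
#if defined(__x86_64__) || defined(__i386__)

/* Baseline implementation. */
static int foo_default(void)
{
    return 0;
}

/* Implementation that may use SSE4.2 instructions. */
__attribute__((target("sse4.2")))
static int foo_sse42(void)
{
    return 1;
}

/* Resolver: run by the dynamic loader when foo is bound.
 * It runs very early, so the CPU model must be initialised explicitly
 * before __builtin_cpu_supports is used. */
static int (*resolve_foo(void))(void)
{
    __builtin_cpu_init();
    return __builtin_cpu_supports("sse4.2") ? foo_sse42 : foo_default;
}

/* foo is an indirect function whose address is chosen by resolve_foo. */
int foo(void) __attribute__((ifunc("resolve_foo")));

#else

/* No runtime selection on other architectures in this sketch. */
int foo(void)
{
    return 0;
}

#endif
```

Callers use foo like any other function; after the loader has resolved it once, each call goes straight to the selected implementation.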

target_clones attribute

The target_clones attribute can allow you to compile one implementation of a function for multiple instruction targets and then select the best one at runtime:

int foo (void);

__attribute__((target_clones("avx2", "default")))
int foo(){
  return 1;
}

This is supported for C and C++ source files in GCC 6+ (with glibc 2.23+) and clang 14+ compilers.

In theory it should be possible to use #ifdef in target_clones functions in C source files in GCC and clang to get different implementations for different targets, but the necessary changes have not been implemented.

In theory target-specific C++ template specialisation could be supported for C++ source files but the necessary changes have not been added yet (see GCC bug).

The support for C source files in clang requires a function prototype for each multi-target function, and empty argument lists are not supported in the prototypes, a void argument is required. Neither are required for C++ source files in clang.

There is support for this in GCC on x86, PowerPC, armel/armhf, arm64, s390x and some other architectures. However, GCC support for PowerPC, MIPS and SPARC may be spotty.
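The following sketch applies target_clones to a loop the compiler can vectorise. It assumes x86 for the "avx2" clone and disables the attribute elsewhere through a hypothetical MULTIVERSION macro; the function dot is illustrative. The separate prototype with an explicit parameter list is there to satisfy the clang requirement for C source files.

```c
#include <stddef.h>

#if defined(__x86_64__) || defined(__i386__)
/* Build one AVX2 clone and one baseline clone; the best is picked at runtime. */
#define MULTIVERSION __attribute__((target_clones("avx2", "default")))
#else
/* Hypothetical fallback: no clones on other architectures in this sketch. */
#define MULTIVERSION
#endif

/* Prototype first; a function without parameters would be declared (void). */
long dot(const int *a, const int *b, size_t n);

MULTIVERSION
long dot(const int *a, const int *b, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += (long)a[i] * b[i];
    return total;
}
```

The source contains a single implementation; the compiler generates one copy per listed target plus the resolver that chooses between them.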

target attribute

The target attribute can allow you to write independent implementations of a function for multiple instruction targets and then select the best one at runtime:

int foo (void);

__attribute__ ((target ("default")))
int foo ()
{
  return 0;
}

__attribute__ ((target ("sse4.2")))
int foo ()
{
  return 1;
}

This is supported for C++ source files in GCC 4.8+ compilers.

In theory this could be supported for C source files in GCC but the necessary changes have not been added yet (see bug).

In theory target-specific C++ template specialisation could be supported for C++ source files in GCC but the necessary changes have not been added yet (see GCC bug).

This is supported for C and C++ source files in Clang 8+ compilers.

The support for C source files in clang requires a function prototype for each multi-target function, and empty argument lists are not supported in the prototypes, a void argument is required. Neither are required for C++ source files in clang.

There is support for this in GCC on x86, PowerPC, armel/armhf, arm64, s390x and some other architectures. However, GCC support for PowerPC, MIPS and SPARC may be spotty.

hwcaps

The hwcaps feature allows you to build an entire library for multiple instruction targets, install them into a different directory for each target and then select the best one at runtime.

glibc supports this for ELF libraries, and version 2.33 improved the mechanism significantly by introducing the glibc-hwcaps subdirectories.

Whole program

A wrapper script or program lets you build an entire program for multiple instruction targets, install each build into a different directory and then select the best one at runtime.

The Debian Med team has a simd-dispatch script that is an example of this approach. subarch-select also provides this, for x86 only. In future the isa-support source package may provide a simpler way to do this.
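As an illustration only, a wrapper written in C might look like the sketch below. It is not how simd-dispatch or subarch-select work; the directory layout (one subdirectory per variant under a package-private libdir), the variant names and the function names are all assumptions made for the example, and only x86 is handled.

```c
#include <stdio.h>
#include <unistd.h>

/* Name of the build variant to use on this CPU (hypothetical names). */
static const char *select_variant(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))
        return "avx2";
    if (__builtin_cpu_supports("sse4.2"))
        return "sse4.2";
#endif
    return "baseline";
}

/* Build "<libdir>/<variant>/<prog>" into buf.
 * Returns 0 on success, -1 if buf is too small. */
static int variant_path(char *buf, size_t len,
                        const char *libdir, const char *prog)
{
    int n = snprintf(buf, len, "%s/%s/%s", libdir, select_variant(), prog);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/* Replace the current process with the selected build.
 * Only returns (with -1) if the path is too long or exec fails. */
int exec_variant(const char *libdir, const char *prog, char *const argv[])
{
    char path[4096];

    if (variant_path(path, sizeof path, libdir, prog) != 0)
        return -1;
    execv(path, argv);
    perror(path);
    return -1;
}
```

The wrapper's main would call exec_variant with its own argv, so users keep invoking a single command name while the matching build does the work.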

/!\ Please note that using /proc/cpuinfo is not reliable, particularly in multiarch context and when using qemu-user.

Blocking unsupported systems

/!\ Since blocking use of software on unsupported systems isn't very user friendly, these options are to be used only as a last resort. Give a lot of thought to whether users AND ALSO Debian's continuous integration tests will be attempting to use your package on unsupported hardware.

/!\ Packages can allow installation but instead have scripts or programs that check the CPU they are currently running on and print an error message to stderr or show an error message using a graphical tool. The chromium wrapper script is an example of this approach. Asking on debian-devel for a baseline exception is recommended, since there may be use cases you hadn't considered.
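A check of this kind could be sketched in C as follows, assuming a hypothetical program that needs SSE4.2 on x86; the function name and the message wording are illustrative, and other architectures are treated as supported in this sketch.

```c
#include <stdio.h>

/* Returns 0 if the CPU meets the (hypothetical) SSE4.2 requirement.
 * Otherwise prints an explanation to `err` and returns 1. */
int check_cpu_or_explain(FILE *err, const char *prog)
{
#if defined(__x86_64__) || defined(__i386__)
    __builtin_cpu_init();
    if (!__builtin_cpu_supports("sse4.2")) {
        fprintf(err,
                "%s: this program requires a CPU with SSE4.2 support, "
                "which this system does not provide.\n", prog);
        return 1;
    }
#else
    (void)err;
    (void)prog;
#endif
    return 0;
}
```

Calling this at the top of main with stderr and using a nonzero result as the exit status gives users a clear message instead of an illegal-instruction crash.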

/!\ The binary packages of src:isa-support can be used to prevent installation of packages that require instructions that are not available on the current CPU.

/!\ This approach rules out many valid use cases where the system performing the installation has different CPU features than the system that runs it, such as creating installs on older systems and running them on newer systems.

/!\ Please note that using /proc/cpuinfo is not reliable, particularly in multiarch context and when using qemu-user.

Architectures

Partial

In theory it could be possible to add additional architectures with increased baselines, under different architecture names, that only build selected packages where the performance difference is noticeable. SIMDebian took this approach.

Change baselines

Increasing the CPU requirements of an architecture can increase its performance on newer or more capable hardware while preventing the port from being used on older or less capable hardware. Decreasing the CPU requirements does the opposite of course. Benchmarks are needed to determine exactly how useful a CPU requirements change could be.

Build flags

Users can use Debian build flags to rebuild individual packages for their own use. See the dpkg-buildflags manual page for the override mechanism and the compiler documentation for which build flags to select. Some packages will not properly inject the new flags into the build system, so inspect the build logs to find out if they do that correctly and file bugs for packages that don't inject build flags correctly. Benchmarks are needed to determine exactly how useful a build flags change could be.

Architecture Specific memo

Baselines are specified in this memo.

amd64

Instead of using individual instruction set compilation flags, software is encouraged to target a psABI micro-architecture level (Table 3.1: Micro-Architecture Levels). For instance, if you need AVX and AVX2, use:

__attribute__((target("arch=x86-64-v3")))

or the -march=x86-64-v3 GCC flag. Libraries built this way should be installed in /usr/lib/x86_64-linux-gnu/glibc-hwcaps/x86-64-v3

i386

As on amd64, software is encouraged to use a psABI micro-architecture level (Table 3.1: Micro-Architecture Levels).

The x86_64 baseline (arch=x86-64) may be used instead of specifying SSE2/SSE instruction support.