Enabling SIMD-using code to build on other architectures with the SIMD Everywhere header-only library

Debian source package https://tracker.debian.org/simde produces "libsimde-dev"

Upstream: https://github.com/simd-everywhere/simde

This page documents one option for increasing portability of CPU instruction selection. It can be combined with other features to achieve higher levels of portability for software that is not currently very portable.

When to use

A C/C++ codebase references {m,xm,em,pm,tm,sm,nm,im}mintrin.h or x86intrin.h and does not provide an alternative code path for non-x86 architectures.

Note: As of 2021-01-27 (SIMDe v0.7.2) there is complete coverage of MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, AVX, AVX2, FMA, GFNI, CLMUL, XOP, and SVML. The AVX512 variants are still in progress. Adding new functions is easy if you have access to real hardware that supports them: https://github.com/simd-everywhere/simde#contributing
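To make the criterion above concrete, here is a minimal sketch of a file that already provides another code route and so does not need SIMDe. The helper name sum_i32 is hypothetical and not taken from any package. A file that is a candidate for this page would typically contain only the intrinsics route, with the x86 header included unconditionally.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>   /* x86-only header, guarded so other architectures skip it */
#endif

/* Hypothetical helper: sum n 32-bit integers.
 * Assumes the total fits in int32_t (no overflow handling). */
int32_t sum_i32(const int32_t *v, size_t n)
{
    size_t i = 0;
    int32_t total = 0;

#if defined(__SSE2__)
    /* x86 route: add four lanes at a time. */
    __m128i acc = _mm_setzero_si128();
    for (; i + 4 <= n; i += 4) {
        acc = _mm_add_epi32(acc, _mm_loadu_si128((const __m128i *)(v + i)));
    }
    int32_t lanes[4];
    _mm_storeu_si128((__m128i *)lanes, acc);
    total = lanes[0] + lanes[1] + lanes[2] + lanes[3];
#endif

    /* Portable route: every element on other architectures,
     * only the leftover tail on x86. */
    for (; i < n; i++) {
        total += v[i];
    }
    return total;
}
```

If the #if guards and the scalar loop were missing, the file would fail to compile on any architecture that lacks emmintrin.h, which is the situation this page addresses.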

Approach

  1. Add libsimde-dev to the Build-Depends in debian/control
  2. Create a new patch to modify every source file that references the non-portable headers.
  3. In that patch, add #define SIMDE_ENABLE_NATIVE_ALIASES before the "simde" include and replace each non-portable include with its portable counterpart; use the table below to help pick the right portable header.

    non-portable header          portable header
    #include <mmintrin.h>        #include <simde/x86/mmx.h>
    #include <xmmintrin.h>       #include <simde/x86/sse.h>
    #include <emmintrin.h>       #include <simde/x86/sse2.h>
    #include <pmmintrin.h>       #include <simde/x86/sse3.h>
    #include <tmmintrin.h>       #include <simde/x86/ssse3.h>
    #include <smmintrin.h>       #include <simde/x86/sse4.1.h>
    #include <nmmintrin.h>       #include <simde/x86/sse4.2.h>
    #include <immintrin.h>       #include <simde/x86/avx.h>
    #include <immintrin.h>       #include <simde/x86/fma.h>
    #include <immintrin.h>       #include <simde/x86/avx2.h>
    #include <immintrin.h>       #include <simde/x86/avx512.h>
    #include <x86intrin.h>       #include <simde/x86/avx512.h>

  4. Modify debian/rules to add "-DSIMDE_ENABLE_OPENMP -fopenmp-simd" and "-O3" to CFLAGS/CXXFLAGS if they are not already set:
      export DEB_CFLAGS_MAINT_APPEND+=-DSIMDE_ENABLE_OPENMP -fopenmp-simd -O3
      export DEB_CXXFLAGS_MAINT_APPEND+=-DSIMDE_ENABLE_OPENMP -fopenmp-simd -O3
  5. Remove any -msse3, -march=native, or similar flags from upstream's build system if they are present; they are rejected on non-x86 architectures or tie the build to the build machine's CPU.

  6. Be mindful of Debian Policy 7.8: if the source package's license requires the full source code to be available (as the GPL does), you will need to add Built-Using to indicate which version of libsimde-dev was used.

    1. Add  Built-Using: ${simde:Built-Using}  to the appropriate binary package(s) in debian/control

    2. In debian/rules add:

          override_dh_gencontrol:
                  dh_gencontrol -- -Vsimde:Built-Using="$(shell dpkg-query -f '$${source:Package} (= $${source:Version}), ' -W "libsimde-dev")"
  7. The package should now build on every architecture, though you may still hit issues on 32-bit architectures and/or big-endian systems.
  8. Bonus: on i386/amd64 the architecture baseline should no longer be violated. For applications using "simde", consider compiling several times on i386 and amd64, once for each applicable set of SIMD compilation flags, and adding a dispatching script that picks the highest available version at runtime.

  9. Libraries using this technique will benefit from a built-in CPU feature dispatcher, but that is more advanced.
  10. Once the package builds successfully on more architectures, please discuss adopting this or another technique with upstream.
  11. Add your package as an example to the lists below.
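Steps 2 and 3 above can be sketched on a single source file as follows. This is an illustration under stated assumptions, not code from any of the packages listed on this page: the function count_byte and the SKETCH_HAVE_SIMDE macro are hypothetical, and the __has_include fallback exists only so the snippet still compiles on an x86 machine where libsimde-dev is not installed. A real Debian patch would normally keep just the #define and the SIMDe include, because Build-Depends guarantees the header is present.

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch-only assumption: detect whether the SIMDe header is installed. */
#if defined(__has_include)
#  if __has_include(<simde/x86/sse2.h>)
#    define SKETCH_HAVE_SIMDE 1
#  endif
#endif

#if defined(SKETCH_HAVE_SIMDE)
#  define SIMDE_ENABLE_NATIVE_ALIASES   /* must appear before the include */
#  include <simde/x86/sse2.h>           /* was: #include <emmintrin.h> */
#else
#  include <emmintrin.h>                /* x86-only fallback for this sketch */
#endif

/* Hypothetical function: count how many of the n bytes in buf equal c.
 * The body keeps using the original _mm_* names; with native aliases
 * enabled, SIMDe maps them to portable implementations where needed. */
size_t count_byte(const uint8_t *buf, size_t n, uint8_t c)
{
    size_t i = 0;
    size_t count = 0;
    const __m128i needle = _mm_set1_epi8((char)c);

    /* Compare 16 bytes per iteration and count the matching lanes. */
    for (; i + 16 <= n; i += 16) {
        __m128i chunk = _mm_loadu_si128((const __m128i *)(buf + i));
        int mask = _mm_movemask_epi8(_mm_cmpeq_epi8(chunk, needle));
        count += (size_t)__builtin_popcount((unsigned)mask);
    }

    /* Handle the remaining tail one byte at a time. */
    for (; i < n; i++) {
        count += (buf[i] == c);
    }
    return count;
}
```

The point of the sketch is that only the include lines change: the intrinsic calls in the function body are left untouched, which keeps the Debian patch small and easier to forward upstream. Note that __builtin_popcount is a GCC/Clang builtin, not part of SIMDe.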

Packages Status

As of 2023-01-22, the following incomplete list shows packages that use this technique, either through Debian patches or natively upstream.

Package       Native or Debian use of SIMDe?  Patches sent upstream?  Notes
examl         Debian patches                  Yes
last-align    Debian patches                  Yes
python-skbio  Native SIMDe                    N/A                     code copy of libssw 0.1.4
minimap2      Debian patches                  Yes                     code copy of ksw2
pbcopper      Native SIMDe                    N/A                     code copy of libssw 1.2
hisat2        Debian patches                  Yes
vg            Native SIMDe                    N/A
fermi-lite    Debian patches                  Yes                     code copy of ksw.c
bowtie2       Native SIMDe                    N/A
bwa           Debian patches                  Yes                     code copy of ksw.c
raxml         Debian patches                  Yes
kalign        Debian patches                  Yes
fasta3        Native SIMDe                    N/A
plink2        Native SIMDe                    N/A
mmseqs2       Native SIMDe                    N/A                     code copy of ksw.c
obs-studio    Native SIMDe                    N/A
libssw        Debian patches                  Yes
spoa          Native SIMDe                    N/A
emscripten    Native SIMDe                    N/A
onednn        Debian patches                  Not yet
wtdbg2        Debian patches                  Not yet
kmc           Debian patches                  Not yet
hhsuite       Native SIMDe                    N/A
abpoa         Native SIMDe                    N/A
rna-star      Native SIMDe                    N/A                     code copy of amalgamated avx2
parasail      Native SIMDe                    N/A
supertuxkart  Native SIMDe                    N/A
ngmlr         Debian patches                  Not yet
seqan-raptor  Native SIMDe                    N/A
rapmap        Debian patches                  Not yet                 code copy of ksw2
snap-aligner  Debian patches                  Yes
scrappie      Debian patches                  Not yet
z3            Debian patches                  Not yet
metaeuk       Native SIMDe                                            embeds mmseqs

Candidate packages

Some candidates can be found via  apt-cache rdepends sse2-support  or a similar query.