Enabling code that uses SIMD to build on other architectures using the SIMD Everywhere header-only library

Debian source package https://tracker.debian.org/simde produces "libsimde-dev"

Upstream: https://github.com/simd-everywhere/simde

This page documents one option for increasing portability of CPU instruction selection. It can be combined with [[InstructionSelection|other features]] to achieve higher levels of portability for software that is not currently very portable.

<>

= When to use =

A C/C++ codebase makes reference to {{{{m,xm,em,pm,tm,sm,nm,im}mintrin.h}}} or {{{x86intrin.h}}} and does not provide other code routes for non-x86 architectures.

Note: As of 2021-01-27 (SIMDe v0.7.2) there is complete coverage of MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, AVX, AVX2, FMA, GFNI, CLMUL, XOP, and SVML. The [[https://github.com/simd-everywhere/implementation-status/blob/main/x86.md#avx512bw|AVX512]] variants are still in progress. Adding new functions is easy if you have access to the real hardware that supports them: https://github.com/simd-everywhere/simde#contributing

= Approach =

 1. Add {{{libsimde-dev}}} to the Build-Depends in {{{debian/control}}}.
 1. Create a new patch to modify any source file that references the non-portable headers.
 1. Add {{{#define SIMDE_ENABLE_NATIVE_ALIASES}}} prior to the "simde" include; use the table below to help pick the right portable header to include.
 || '''non-portable header''' || '''portable header''' ||
 || {{{#include }}} || {{{#include }}} ||
 || {{{#include }}} || {{{#include }}} ||
 || {{{#include }}} || {{{#include }}} ||
 || {{{#include }}} || {{{#include }}} ||
 || {{{#include }}} || {{{#include }}} ||
 || {{{#include }}} || {{{#include }}} ||
 || {{{#include }}} || {{{#include }}} ||
 || {{{#include }}} || {{{#include }}} ||
 || {{{#include }}} || {{{#include }}} ||
 || {{{#include }}} || {{{#include }}} ||
 || {{{#include }}} || {{{#include }}} ||
 || {{{#include }}} || {{{#include }}} ||
 1. Modify {{{debian/rules}}} to add "-DSIMDE_ENABLE_OPENMP -fopenmp-simd" along with "-O3" to CFLAGS/CXXFLAGS if that isn't already defined:
 {{{
export DEB_CFLAGS_MAINT_APPEND+=-DSIMDE_ENABLE_OPENMP -fopenmp-simd -O3
export DEB_CXXFLAGS_MAINT_APPEND+=-DSIMDE_ENABLE_OPENMP -fopenmp-simd -O3
}}}
 1. Be sure to remove any {{{-msse3}}}, {{{-march=native}}} or similar flags from upstream's build system, if needed.
 1. Be mindful of [[https://www.debian.org/doc/debian-policy/ch-relationships.html#additional-source-packages-used-to-build-the-binary-built-using|Debian Policy 7.8]]: if the license of the source package requires the full source code to be available (like the GPL), then you'll need to add `Built-Using` to indicate which version of {{{libsimde-dev}}} was used.
  1. Add {{{ Built-Using: ${simde:Built-Using} }}} to the appropriate binary package(s) in {{{debian/control}}}.
  1. In {{{debian/rules}}} add
  {{{
override_dh_gencontrol:
	dh_gencontrol -- -Vsimde:Built-Using="$(shell dpkg-query -f '$${source:Package} (= $${source:Version}), ' -W "libsimde-dev")"
}}}
 1. The package should now build everywhere, though you may still have issues with 32-bit architectures and/or big-endian systems.
 1. Bonus: on {{{i386}}}/{{{amd64}}} the architecture baseline should no longer be violated. For applications using "simde", consider compiling multiple times on {{{i386}}} & {{{amd64}}}, once for each applicable set of SIMD compilation flags, and adding a dispatching script to pick the highest available version at runtime. Example: [[https://salsa.debian.org/med-team/mmseqs2/-/blob/6aa90afc8db6414362d4e2df15e7deedbcc6e1a5/debian/bin/simd-dispatch|simd-dispatch script]] & [[https://salsa.debian.org/med-team/mmseqs2/-/blob/6aa90afc8db6414362d4e2df15e7deedbcc6e1a5/debian/rules#L20|debian/rules]]
 1. Libraries using this technique will benefit from a built-in CPU feature dispatcher, but that is more advanced.
 1. Once the package builds successfully on more architectures, please discuss adopting this or another technique with upstream.
 1. Add your package as an example to the lists below.

= Packages Status =

As of 2023-01-22, here is an incomplete list of packages that are using this technique, either due to Debian patches, or natively.

|| Package || Native or Debian use of SIMDe? || Patches sent upstream? || Notes ||
|| [[https://tracker.debian.org/examl|examl]] || Debian patches || [[https://github.com/stamatak/ExaML/pull/16|Yes]] || ||
|| [[https://tracker.debian.org/last-align|last-align]] || Debian patches || [[https://groups.google.com/d/msg/last-align/yzqdAcch-h8/HrKyp0ViAwAJ|Yes]] || ||
|| [[https://tracker.debian.org/python-skbio|python-skbio]] || Native SIMDe || N/A || code copy of libssw 0.1.4 ||
|| [[https://tracker.debian.org/minimap2|minimap2]] || Debian patches || [[https://github.com/lh3/minimap2/pull/607|Yes]] || code copy of ksw2 ||
|| [[https://tracker.debian.org/pbcopper|pbcopper]] || Native SIMDe || N/A || code copy of libssw 1.2 ||
|| [[https://tracker.debian.org/hisat2|hisat2]] || Debian patches || [[https://github.com/DaehwanKimLab/hisat2/pull/251|Yes]] || ||
|| [[https://tracker.debian.org/vg|vg]] || Native SIMDe || N/A || ||
|| [[https://tracker.debian.org/fermi-lite|fermi-lite]] || Debian patches || [[https://github.com/lh3/fermi-lite/pull/14|Yes]] || code copy of ksw.c ||
|| [[https://tracker.debian.org/bowtie2|bowtie2]] || Native SIMDe || N/A || ||
|| [[https://tracker.debian.org/bwa|bwa]] || Debian patches || [[https://github.com/lh3/minimap2/pull/607|Yes]] || code copy of ksw.c ||
|| [[https://tracker.debian.org/raxml|raxml]] || Debian patches || [[https://github.com/stamatak/standard-RAxML/pull/50|Yes]] || ||
|| [[https://tracker.debian.org/kalign|kalign]] || Debian patches || [[https://github.com/TimoLassmann/kalign/pull/20|Yes]] || ||
|| [[https://tracker.debian.org/fasta3|fasta3]] || Debian patches || [[https://github.com/wrpearson/fasta36/pull/25|Yes]] || ||
|| [[https://tracker.debian.org/plink2|plink2]] || Native SIMDe || N/A || ||
|| [[https://tracker.debian.org/mmseqs2|mmseqs2]] || Native SIMDe || N/A || code copy of ksw.c ||
|| [[https://tracker.debian.org/obs-studio|obs-studio]] || Native SIMDe || N/A || ||
|| [[https://tracker.debian.org/libssw|libssw]] || Debian patches || [[https://github.com/mengyao/Complete-Striped-Smith-Waterman-Library/pull/69|Yes]] || ||
|| [[https://tracker.debian.org/spoa|spoa]] || Native SIMDe || N/A || ||
|| [[https://tracker.debian.org/emscripten|emscripten]] || Native SIMDe || N/A || ||
|| [[https://tracker.debian.org/onednn|onednn]] || Debian patches || [[https://salsa.debian.org/deeplearning-team/onednn/-/blob/master/debian/patches/simde|Not yet]] || ||
|| [[https://tracker.debian.org/wtdbg2|wtdbg2]] || Debian patches || [[https://salsa.debian.org/med-team/wtdbg2/-/blob/master/debian/patches/simde|Not yet]] || ||
|| [[https://tracker.debian.org/kmc|kmc]] || Debian patches || [[https://salsa.debian.org/med-team/kmc/-/blob/master/debian/patches/simde|Not yet]] || ||
|| [[https://tracker.debian.org/pkg/hhsuite|hhsuite]] || Native SIMDe || N/A || ||
|| [[https://tracker.debian.org/pkg/abpoa|abpoa]] || Native SIMDe || N/A || ||
|| [[https://tracker.debian.org/pkg/rna-star|rna-star]] || Native SIMDe || N/A || [[https://sources.debian.org/src/rna-star/2.7.10b%2Bdfsg-1/source/opal/simde_avx2.h/|Code copy of amalgamated avx2]] ||
|| [[https://tracker.debian.org/pkg/parasail|parasail]] || Native SIMDe || N/A || ||
|| [[https://tracker.debian.org/pkg/supertuxkart|supertuxkart]] || Native SIMDe || N/A || ||
|| [[https://tracker.debian.org/pkg/ngmlr|ngmlr]] || Debian patches || [[https://salsa.debian.org/med-team/ngmlr/-/blob/master/debian/patches/simde.patch|Not yet]] || ||
|| [[https://tracker.debian.org/pkg/seqan-raptor|seqan-raptor]] || Native SIMDe || N/A || ||
|| [[https://tracker.debian.org/rapmap|rapmap]] || Debian patches || [[https://salsa.debian.org/med-team/rapmap/-/blob/master/debian/patches/simde|Not yet]] || code copy of ksw2 ||
|| [[https://tracker.debian.org/pkg/snap-aligner|snap-aligner]] || Debian patches || [[https://github.com/amplab/snap/pull/130|Yes]] || ||
|| [[https://tracker.debian.org/pkg/scrappie|scrappie]] || Debian patches || [[https://salsa.debian.org/med-team/scrappie/-/blob/master/debian/patches/simde.patch|Not yet]] || ||
|| [[https://tracker.debian.org/pkg/z3|z3]] || Debian patches || [[https://salsa.debian.org/pkg-llvm-team/z3/-/blob/master/debian/patches/00-intrinsics.patch|Not yet]] || ||
|| [[https://tracker.debian.org/pkg/metaeuk|metaeuk]] || Native SIMDe || || Embeds mmseqs ||

= Candidate packages =

Some can be found via {{{ apt-cache rdepends sse2-support }}} or similar.

 * [[https://tracker.debian.org/iqtree|iqtree]] AMD64 & i386 package is -msse3; AMD64 package is also -mavx (violates the ABI); no non-x86 packages
 * [[https://tracker.debian.org/vowpal-wabbit|vowpal-wabbit]]
 * [[https://tracker.debian.org/pkg/vsearch|vsearch]]
 * [[https://tracker.debian.org/pkg/sortmerna|sortmerna]]
 * [[https://tracker.debian.org/pkg/mrbayes|mrbayes]]
 * [[https://tracker.debian.org/pkg/gmap|gmap]]
 * [[https://tracker.debian.org/pkg/hyphy|hyphy]]
 * [[https://salsa.debian.org/med-team/pufferfish|pufferfish]] (not yet in Debian) (code copy of ksw2)
 * [[https://tracker.debian.org/pkg/salmon|salmon]] (code copy of ksw2) Upstream just [[https://twitter.com/salmon_software/status/1329976578714071044|added SIMDe on their own]], check for a future release.
 * [[https://tracker.debian.org/pkg/sptag|sptag]] [[https://github.com/microsoft/SPTAG/compare/main...pabs3:SPTAG:use-simd-everywhere|patch prepared]] but it [[https://github.com/simd-everywhere/simde/issues/961|requires additional AVX512 support and other fixes in SIMDe]]
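
For packages that take up the "Bonus" step in the Approach section (several builds plus a runtime dispatcher), the following is a minimal sketch of what such a dispatching script might look like. It is not the mmseqs2 simd-dispatch script linked above. The function names, the {{{/usr/lib/<prog>/<prog>-<level>}}} layout and the list of levels are all assumptions made for illustration, and it relies on the Linux {{{/proc/cpuinfo}}} flag names.

```shell
#!/bin/sh
# Hypothetical dispatcher sketch: all names and paths here are illustrative.

# Print the most capable SIMD level present in a space-separated list of
# CPU flags (as found on the "flags" line of /proc/cpuinfo on x86 Linux).
# Levels are tried from most to least capable; "generic" is the fallback.
# SSE3 is left out because Linux reports it under the flag name "pni".
pick_simd_variant() {
    cpu_flags=" $1 "
    for level in avx2 avx sse4_1 ssse3 sse2; do
        case "$cpu_flags" in
            *" $level "*) echo "$level"; return 0 ;;
        esac
    done
    echo generic
}

# Print the flag list of the first CPU; prints nothing if unavailable.
read_cpu_flags() {
    sed -n 's/^flags[[:space:]]*:[[:space:]]*//p' /proc/cpuinfo 2>/dev/null | head -n 1
}

# Replace this process with the matching build of the program, assuming the
# package installed one binary per level as /usr/lib/<prog>/<prog>-<level>.
dispatch() {
    prog=$1
    shift
    variant=$(pick_simd_variant "$(read_cpu_flags)")
    exec "/usr/lib/$prog/$prog-$variant" "$@"
}
```

A wrapper installed as {{{/usr/bin/myprog}}} (a hypothetical name) would end with {{{dispatch myprog "$@"}}}. Padding the flag list with spaces keeps {{{sse}}} from matching {{{sse2}}} and {{{avx}}} from matching {{{avx2}}}. A real script would probably also check that the chosen binary exists before calling {{{exec}}}.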