Enabling SIMD-using code to build on other architectures with the SIMD Everywhere header-only library

Debian source package: https://tracker.debian.org/simde (produces "libsimde-dev")
Upstream: https://github.com/nemequ/simde

When to use:

The C/C++ codebase references {m,xm,em,pm,tm,sm,nm,im}mintrin.h and provides no alternative code path for non-x86 architectures.

Note: As of 2020-04-19 there is complete coverage of MMX, SSE, SSE2, SSE3, SSSE3, AVX, and FMA. AVX2 and the AVX512 variants are still in progress. Adding new functions is easy if you have access to real hardware that supports them: https://github.com/nemequ/simde#contributing

Approach:

  1. Add libsimde-dev to the Build-Depends in debian/control
  2. Create a new patch that modifies every source file referencing the non-portable headers.
  3. In each such file, replace the non-portable include with the matching portable header from the table below, and add "#define SIMDE_ENABLE_NATIVE_ALIASES" before that "simde" include.

    non-portable header         portable header
    #include <mmintrin.h>       #include "simde/x86/mmx.h"
    #include <xmmintrin.h>      #include "simde/x86/sse.h"
    #include <emmintrin.h>      #include "simde/x86/sse2.h"
    #include <pmmintrin.h>      #include "simde/x86/sse3.h"
    #include <tmmintrin.h>      #include "simde/x86/ssse3.h"
    #include <smmintrin.h>      #include "simde/x86/sse4.1.h"
    #include <nmmintrin.h>      #include "simde/x86/sse4.2.h"
    #include <immintrin.h>      #include "simde/x86/avx.h"
    #include <immintrin.h>      #include "simde/x86/fma.h"
    #include <immintrin.h>      #include "simde/x86/avx2.h"
    #include <immintrin.h>      #include "simde/x86/avx512bw.h"
    #include <immintrin.h>      #include "simde/x86/avx512f.h"

  4. Modify debian/rules to add "-DSIMDE_ENABLE_OPENMP -fopenmp-simd" along with "-O3" to CFLAGS/CXXFLAGS, if they are not already set there (no quotes around the value: make would pass them through literally):
      export DEB_CFLAGS_MAINT_APPEND += -DSIMDE_ENABLE_OPENMP -fopenmp-simd -O3
      export DEB_CXXFLAGS_MAINT_APPEND += -DSIMDE_ENABLE_OPENMP -fopenmp-simd -O3
  5. Remove any -msse3, -march=native, or similar flags from upstream's build system, if present.

  6. The resulting binaries should now build on all architectures, though you may still run into issues on 32-bit architectures and/or big-endian systems.
  7. Bonus: on i386/amd64 the architecture baseline should no longer be violated. For applications using "simde", consider compiling multiple times on i386 & amd64, once for each applicable set of SIMD compilation flags, and adding a dispatching script that picks the highest supported version at runtime.
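
As an illustration of steps 2 and 3, a patched source file might look roughly like the sketch below. The routine add4_i32 is a hypothetical stand-in for upstream code that previously included <emmintrin.h>. The __has_include guard is an extra assumption of this sketch and not part of the steps above: it falls back to the native header when the SIMDe headers are not found, which may make the same patch easier to offer upstream.

```c
#include <assert.h>
#include <stdint.h>

/* Optional guard (assumption of this sketch): use SIMDe when its headers
 * are found, otherwise fall back to the native x86 header. */
#if defined(__has_include)
#  if __has_include("simde/x86/sse2.h")
#    define USE_SIMDE 1
#  endif
#endif

#ifdef USE_SIMDE
#  define SIMDE_ENABLE_NATIVE_ALIASES   /* must come before the SIMDe include */
#  include "simde/x86/sse2.h"           /* replaces #include <emmintrin.h> */
#else
#  include <emmintrin.h>
#endif

/* Hypothetical upstream routine: adds four 32-bit integers at once.
 * The SSE2 intrinsic names are left unchanged; with native aliases
 * enabled, SIMDe provides them on non-x86 architectures. */
static void add4_i32(const int32_t *a, const int32_t *b, int32_t *out)
{
    assert(a && b && out);
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_add_epi32(va, vb));
}
```

Calling add4_i32 on {1, 2, 3, 4} and {10, 20, 30, 40} stores {11, 22, 33, 44}. Only the include lines differ from the original code; the body using the intrinsics stays as upstream wrote it.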
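
The runtime dispatch mentioned in step 7 can be sketched as follows. The step suggests a script; this sketch makes the same decision in C using the GCC/Clang builtin __builtin_cpu_supports. The variant names and the /usr/lib/<prog>/<prog>-<variant> layout are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Returns the name of the most capable build variant the running CPU
 * supports. On non-x86 builds, or with other compilers, only the
 * baseline build is reported. */
static const char *best_variant(void)
{
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))   return "avx2";
    if (__builtin_cpu_supports("avx"))    return "avx";
    if (__builtin_cpu_supports("sse4.2")) return "sse4.2";
    if (__builtin_cpu_supports("sse4.1")) return "sse4.1";
    if (__builtin_cpu_supports("ssse3"))  return "ssse3";
    if (__builtin_cpu_supports("sse3"))   return "sse3";
#endif
    return "baseline";
}

/* Writes the path of the chosen variant into buf (hypothetical layout).
 * Returns 0 on success, -1 if buf is too small. */
static int variant_path(char *buf, size_t len, const char *prog)
{
    assert(buf && prog);
    int n = snprintf(buf, len, "/usr/lib/%s/%s-%s", prog, prog, best_variant());
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}
```

A wrapper installed under the program's usual name could call variant_path and then exec the resulting binary, so each machine runs the fastest build it supports while the baseline build remains available everywhere.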