Handle different versions of BLAS and LAPACK
Description
Besides being implementations, BLAS and LAPACK are also API standards for basic linear algebra operations (such as vector and matrix multiplication).
Many implementations of these APIs exist. The reference implementations of BLAS and LAPACK are very stable but not as fast as optimized ones such as ATLAS and OpenBLAS.
Implementations of BLAS (both the Fortran and the C interface):
- libblas3 and libblas-dev - Reference implementation
- libatlas3-base and libatlas-base-dev - Automatically Tuned Linear Algebra Software (ATLAS)
- libopenblas-base and libopenblas-dev - OpenBLAS, an optimized BLAS based on GotoBLAS2
- libgslcblas0 and libgsl-dev - GNU Scientific Library (GSL); implements only the C interface, not the Fortran interface
- Intel MKL - Non-free
- AMD ACML - Non-free
- Sun Performance Library - Non-free
- ...
Implementations of LAPACK (Fortran interface):
- liblapack3 and liblapack-dev - Reference implementation
- ATLAS and OpenBLAS both provide an optimized subset of LAPACK
- LAPACK++
How to switch from one implementation to another
Switching is straightforward: BLAS and LAPACK are handled by update-alternatives, like any other software that uses the alternatives system.
BLAS
In jessie and stretch
update-alternatives --config libblas.so.3
In (upcoming) buster and unstable/sid
update-alternatives --config libblas.so.3-<multiarch>
where <multiarch> is the multiarch path for your architecture (e.g. x86_64-linux-gnu for amd64).
Example
There are 3 choices for the alternative libblas.so.3 (providing /usr/lib/libblas.so.3).

  Selection    Path                                      Priority   Status
------------------------------------------------------------
* 0            /usr/lib/openblas-base/libopenblas.so.0    40        auto mode
  1            /usr/lib/atlas-base/atlas/libblas.so.3     35        manual mode
  2            /usr/lib/libblas/libblas.so.3              10        manual mode
  3            /usr/lib/openblas-base/libopenblas.so.0    40        manual mode
libgslcblas0 is not in that list, because it does not implement the Fortran interface.
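The two naming schemes for the alternative (with and without the multiarch suffix) can be handled by a small helper in scripts. This is only a sketch: alt_name is a hypothetical function name, not part of update-alternatives, and it assumes the caller already knows whether the release uses the suffixed name.

```shell
# Build the name of an alternative, with an optional multiarch suffix.
#   $1: base name, e.g. libblas.so.3
#   $2: multiarch triplet, e.g. x86_64-linux-gnu; leave empty for the
#       unsuffixed naming used in jessie and stretch
# alt_name is a hypothetical helper introduced for illustration.
alt_name() {
    if [ -n "${2:-}" ]; then
        printf '%s-%s\n' "$1" "$2"
    else
        printf '%s\n' "$1"
    fi
}
```

On a system with dpkg-dev installed, the triplet can be obtained with dpkg-architecture -qDEB_HOST_MULTIARCH, so a call could look like: update-alternatives --config "$(alt_name libblas.so.3 "$(dpkg-architecture -qDEB_HOST_MULTIARCH)")"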
LAPACK
In jessie and stretch
update-alternatives --config liblapack.so.3
In (upcoming) buster and unstable/sid
update-alternatives --config liblapack.so.3-<multiarch>
where <multiarch> is the multiarch path for your architecture (e.g. x86_64-linux-gnu for amd64).
Example
There are 3 choices for the alternative liblapack.so.3 (providing /usr/lib/liblapack.so.3).

  Selection    Path                                      Priority   Status
------------------------------------------------------------
* 0            /usr/lib/openblas-base/liblapack.so.3      40        auto mode
  1            /usr/lib/atlas-base/atlas/liblapack.so.3   35        manual mode
  2            /usr/lib/lapack/liblapack.so.3             10        manual mode
  3            /usr/lib/openblas-base/liblapack.so.3      40        manual mode
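For scripts, the current selection can also be read without the interactive menu. The sketch below assumes the machine-readable output of update-alternatives --query, whose first stanza carries a "Value:" field with the selected path; current_alternative is a hypothetical helper name.

```shell
# Print the currently selected path from `update-alternatives --query`
# output read on standard input (the "Value:" field of the first stanza).
# current_alternative is a hypothetical helper introduced for illustration.
current_alternative() {
    awk '/^Value: / { sub(/^Value: /, ""); print; exit }'
}
```

It could be used as: update-alternatives --query liblapack.so.3 | current_alternative (with the multiarch-suffixed name on releases that use it). The same helper works for the BLAS alternative.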
Getting the best performance out of ATLAS and OpenBLAS
The binary packages of ATLAS and OpenBLAS distributed by Debian are generic packages, which are not optimized for your specific machine.
For ATLAS, the best results come from recompiling it locally on the machine where it will be used. See the README.Debian file for details on how to achieve that easily.
For OpenBLAS, if you are on amd64 or i386, there is no need to recompile it, since the binary includes kernels optimized for several CPU microarchitectures, and the selection is done at runtime. For non-x86 architectures, however, OpenBLAS should be recompiled locally for optimal performance; see the README.Debian file for instructions.
Using the GSL with an optimized BLAS
The GSL comes with its own BLAS implementation (libgslcblas0), which is not optimized for speed.
It is however possible to use an optimized BLAS implementation (OpenBLAS or ATLAS) as the backend for GSL matrix linear algebra when compiling your program.
Provided that you have installed libblas-dev, libatlas-base-dev or libopenblas-dev, you can use -lgsl -lblas at link time (instead of -lgsl -lgslcblas); or, if you are using pkg-config, you can get the right libraries with:
pkg-config --define-variable=GSL_CBLAS_LIB=-lblas --libs gsl
At runtime, the resulting binary will use the BLAS implementation selected by the alternatives system described above.
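The two ways of obtaining the link flags described above can be wrapped in one helper for build scripts. This is a minimal sketch: gsl_link_flags is a hypothetical function name, and the fallback branch is an assumption that simply repeats the -lgsl -lblas flags from the text and adds -lm, which pkg-config normally emits for GSL as well.

```shell
# Print linker flags for GSL with an external BLAS instead of libgslcblas.
# gsl_link_flags is a hypothetical helper introduced for illustration.
gsl_link_flags() {
    if command -v pkg-config >/dev/null 2>&1 && pkg-config --exists gsl 2>/dev/null; then
        # Override the CBLAS library recorded in GSL's pkg-config file.
        pkg-config --define-variable=GSL_CBLAS_LIB=-lblas --libs gsl
    else
        # Fallback (assumption): the flags named in the text, plus -lm.
        printf '%s\n' '-lgsl -lblas -lm'
    fi
}
```

A program could then be linked with something like gcc -o prog prog.c $(gsl_link_flags), where prog.c stands for your own source file. Running ldd on the resulting binary should list libblas.so.3 rather than libgslcblas.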
TODO
- Check for other implementations (CUBLAS, MKL in non-free, ACML in non-free)