cctbx packaging
peoples interested by this packaging effort
- Baptiste Carvello
- Radostan Riedel
you can find the git repository here cctbx repo
the current ITP is here 679905
upstream
the main website is here https://github.com/cctbx/cctbx_project
build dependencies
bundled
debhelper (>= 9),
- python-all-dev,
- libfftw3-dev,
libantlr3c-dev (< 3.4),
- libcbf-dev,
- libgl2ps-dev,
- libann-dev,
- libclipper-dev,
libboost-python-dev(>= 1.49),
- scons,
- libboost-thread-dev,
- python-setuptools,
- libgl1-mesa-dev,
- libglu1-mesa-dev,
- libtool
patch series status
Comments by Luc Bourhis (one of the leading cctbx developer):
- 0001-remove-hardcoded-libtbx_build-env.patch
This is sound but it hardcodes a path that is likely not to make sense on other Linux distros. Thus as is this patch should be applied at packaging time only. I would accept to apply such a patch upstream if it replaces else with elif and a check that this is Debian indeed, so that we could add other elif branches for other Linux distros as the need emerges.
- 0002-fix-opengl-header-missing-gltbx.patch
I understand that the cctbx package will have a dependency on the OpenGL package, therefore making the configuration code commented by this patch redundant. Redundant but not incorrect. Thus I am not sure whether this is necessary. Only to be applied at packaging time of course.
- 0003-correct-paths-in-dispatcher-creation.patch
Packaging time only.
- 0004-upstream-fix-for-declaration-errors-in-gcc4.7.patch
Already fixed upstream (Marat's fix, rev 15462)
- 0005-fix-for-gcc4.7-compilation-error.patch
Already fixed upstream (Luc's fix, rev 15576)
- 0006-options-for-system-libs-installtarget-and-prefix.patch
This patch is huge. Therefore applying it at packaging time only would be rather brittle. As for applying it upstream, I approve the spirit of it but it introduces masses of changes and this needs to stand the trial of cctbx nightly tests.
- 0007-adding-shlib-versioning.patch
The issue which has proved most controversial. See the section "Shared libraries" below for a comprehensive discussion.
- 0008-Fix-to-skip-pycbf-build.patch
Packaging time only.
- 0009-build-libann-statically.patch
Packaging time only.
- 0010-adding-setup_py.patch
Largely orthogonal to our code. I would happily accept this patch upstream. The code adding from __future__ import division should be removed though as we added such a line to every Python module upstream.
- 0011-fix-missing-python-lib-during-linking.patch
Since only scitbx/boost_python/SConscript needs the attribute env_etc.py_libs that this patch introduces in libtbx/SConscript, the fix should touch the former but not the latter. Then I would accept it upstream.
- 0012-fix-to-remove-cctbx.python-interpreter.patch
Acceptable upstream but uc1_2_reeke.py has been removed and there is now uc1_2_a.py that features cctbx.python too, so the patch won't apply as is.
- 0013-fix-to-support-LDFLAGS-in-use_enviroment_flags.patch
Looks good: acceptable upstream.
- 0014-Fix-to-append-CPPFLAGS-to-CXXFLAGS.patch
CPPFLAGS is added to CCFLAGS which is eventually used for the both of C and C++ by SCons. This patch is therefore incorrect.
- 0015-fix-cif-parser-to-work-with-antlr3c-3.2.patch
The cctbx needs a heavily patch version of the ANTLR runtime to work efficiently. It may be found in ucif/antlr3.
- 0016-autogenerate-pkgconfig-files.patch
Obsolete (c.f. the "Shared libraries" section)
- 0017-Fix-to-use-systems-include-path.patch
Packaging time only at the moment but this could go upstream in the future.
- 0018-Fix-to-skip-build-of-clipper-examples.patch
Packaging time only.
Shared Libraries
(Luc Bourhis' thoughts on the subject after extensive discussions with the people involved in this packaging effort)
I am not concerned with the Boost Python extensions here, only in the following libs that the cctbx builds:
- libcctbx.so
- libcctbx_sgtbx_asu.so
- libiotbx_mtz.so
- libiotbx_pdb.so
- libmmtbx_masks.so
- libomptbx.so
- libscitbx_boost_python.so
- libsmtbx_refinement_constraints.so
Boost Python extensions are linked against those libraries.
First let's list some of the typical ways the cctbx C++ code may be used by a developer.
1. He wants to build a C++ application. If he uses a cctbx API based on templates, he just #include cctbx headers. Otherwise, he #include cctbx headers and (a) either he compiles the associated cctbx .cpp files with his .cpp files or (b) he links against one of the above library.
2. He wants to create a new Boost Python extension whose implementation uses e.g. sgtbx::space group. Then he just needs to #include <cctbx/sgtbx/spacegroup.h> and compile his extension. When it will be loaded alongside the Boost Python extension cctbx_sgtbx_ext.so, the linker should sort the calls made to sgtbx::space group member functions from within that new Boost Python extension code.
In case 2, the shared libraries listed at the beginning of this section are not directly used. They are by all means an implementation detail as the Boost Python extensions could have been built from static version of those libs instead (or even directly from the object files going into those libs).
On the contrary, in case 1b, those shared libraries take centre stage and they need to be versioned to avoid well-known issues.
Upstream cctbx developers consider that case 1b is too marginal to justify the effort of versioning the shared libraries since cctbx-dev package can be provided with all the .cpp files, thus enabling case 1a. They therefore favour a packaging where the shared libraries are kept private, or where static libraries are used instead.
package organisation
As explain by the upstream
python module organisation generated using this script
- Red boxes : python modules,
- green boxes are others (C++ or external).
- Black arrows are run-time dependancies
- Red arrows are build-time dependancies
- Green arrows are optional dependancies.
libraries generated
- libann.a
- libcbf.a
- libccp4io.a
- libcctbx_sgtbx_asu.so
- libcctbx.so
- libiotbx_mtz.so
- libiotbx_pdb.so
- libmmtbx_masks.so
- libomptbx.so
- librstbx.so
- libscitbx_boost_python.so
- libscitbx_minpack.so
- libscitbx_slatec.so
- libsmtbx_refinement_constraints.so
- libspotfinder.so
Dispatcher scripts
There are 5 categories of dispatcher scripts:
- Pseudo python interpreters *.python
- Scripts conflicting the namespace of other packages
- like: boost.lcm
- Scripts which just show build or dist paths
- like: annlib.show_build_path
- Build related scripts
- like: libtbx.assert_line_count
- Everything else. (This could get usefull)
- like: cctbx.lattice_symmetry
python modules/extensions
solved questions
interfacing between scons and setup.py
We'll use a custom distutils "build_ext", "install_lib" and "install_data" commands which call the scons build system under the hood, by spawning a separate process. We'll also have custom "test" and "clean" commands. That way, "python2.x setup.py build / install" work as usual.
As the upstream build system needs to be run with the same version of python that was configured, we'll call scons with: "python2.x /usr/bin/scons"
importing the extensions
Upstream import the extensions a little bit differently from the typical python project. Usually, extensions .so files are located inside the package directory where they belong. In cctbx, all .so file are in a "lib" directory, which is added to PYTHONPATH, and then an import stub imports the objects to their final place.
The extensions are not meant to be imported directly by the user, and most of them are only imported from one place, however scitbx_array_family_shared_ext.so is imported from several places.
The import stubs features a function boost.python.import_ext, which is part of boost_adaptbx. This is defined in cctbx_sources/boost_adaptbx/boost/python.py. This function does 3 things:
- it checks for a libstdc++ mismatch with the python binary (useless in Debian)
- it beautifies the import exception messages (nice to have, but unimportant)
- it calls dlopen with RTLD_GLOBAL. People in the Boost community seem to think that this is necessary for correct operation in corner cases.
We'll split the extensions between the different debian packages, but keep otherwise the upstream way. This causes a runtime dependency on boost_adaptbx an a little bit of namespace pollution, but at least it doesn't introduce debian-specific problems.
Qnew and the related future import
The whole of the cctbx Python code shall be run with integer division. This has traditionally been achieved by passing the -Qnew option to the Python interpreter in each and every of the dispatcher scripts provided by the cctbx in the <build dir>/bin directory. Since such a global option is inacceptable for Debian, cctbx developers have added from __future__ import division to every Python module. They have also added to their pre-commit check a test rejecting a new Python module which would not feature that line.
in Baptiste's TODO list
unbundle "optik"
this is a patched version of a deprecated ancestor of stdlib's "optparse". It is only used in "libtbx/option_parser.py". I plan to change "libtbx/option_parser.py" to use "optparse" instead.
tests
cctbx has 2 kinds of tests: python scripts, which the upstream test system ("libtbx/test_utils.py") executes in-place in the source tree, and compiled test programs, which are built by scons in the build directory and executed from there. Also, many of the python tests are not part of an importable python package, so they are by default not taken into account by distutils.
The "test" command that we have right now knows how to
- execute python tests that are part of a package from the distutils build directory (thus using the version modified by "build_py")
- execute the compiled test programs from our scons build directory
I plan to add the python tests that are not part of a package as "package_data" in setup.py. Thus they will be copied by distutils to its build directory, and also installed by our packages.
When this is done, I will go through the still failing tests and see if the failure is our fault. I will also skip the tests that do not make sense in our case, and those that take an unreasonably long time.
open questions
- do we want to install the compiled test programs with one of our packages?
- ask the upstream for different naming schema between nighty and release source distribution. This way it will be possible to filter nighty built in the watch file.
TODO
- solve the API/ABI stability problems for libs
- license check of the remaining files
- package missing build-dep
- DFSG free the source package
- identify the bundeled libraries (boost ...)
- repack to remove all bundled libraries, it should save some space.
- create the get-orig-source target for this purpose
- We can delete scons, boost and most files in clipper directory. We just to need to keep the empty directories for boost, clipper etc. otherwise the build system is not working.
- remove the cctbx bundled in objcryst-fox once done
- run all tests at the end of the build (python setup.py check ?)
- what about the documentation ?
- provide pkg-config files for public shared libraries (patch for upstream)
Build system
- Modify configure.py to support "--use-system-libraries"
- skip shlibs build which are already available in Debian.
- link with the system libraries.
- use pkg-config as much as possible for library detection
- on Debian boost_python library is named according to the python version
- libboost_python-py26
- libboost_python-py27
- libboost_python-py32
- Build for multiple python versions but build shlibs only one time.
- version the shared libraries using scons
- support a --prefix option
- the install target should support DESTDIR
- Use upstreams "--use-environment-flags" option to set CFLAGS, CXXFLAGS, CPPFLAGS and LDFLAGS
- we need -O2 and -g for debugging symbols.
- add a setup.py for python modules (autogenerated ?)
- add a clean target ?
- construct a cctbx-dev package with dispatcher script and libtbx-build system to provide a build-dep for other packages. We will also need a python script that dynamically updates the libtbx_env pickle object.
Problems
- upstream is linking additional symbols into libcbf. libcbf is built statically so we don't need libcbf-dev
- can be backported into the cbf upstream ?
- check with the cbf Debian maintainer
- fftw3tbx is missing a header file.
- useless file in the current source package cctbx_sources/lapack_fem/xerbla.f.orig
- what should we do with all the dispatcher in the bin directory ?
Depend on cbflib python bindings but cbflib in debian is not providing a python package 681997
- Upstream is linking additional symbols from generated source into their libann.so. They call it ""Generating C++ files for self-inclusive approximate nearest neighbor"
ann
upstreams libann is build with with self_includes which makes it incompatible with libann in Debian.
$ grep "const ANNbool" /usr/include/ANN/ANN.h const ANNboolANN_ALLOW_SELF_MATCH= ANNtrue;
$ grep "const ANNbool" cctbx_sources/annlib/include/ANN/ANN.h const ANNboolANN_ALLOW_SELF_MATCH= ANNfalse;
annlib version:
- in cctbx is 1.1 Release date: 05/03/05
- in debian 1.1.2+doc-3 Release date: Jan 27, 2010 [1].
If you check ANNmanual_1.1.pdf Section 2.2.3 there is explained what this constant does. The file "cctbx_sources/annlib_adaptbx/ann/ann_adaptor.cpp" can clear things a little bit up.
So cctbx upstream ANNfalse and auto generates code with ANNtrue and namespace annself_include. So when we use annself_include it equals libann in debian.
usefull links
http://cci.lbl.gov/~hohn/scitbx-tour.html#Installation http://sig9.ecanews.org/pdfs/03%20Ralf%20Grosse-Kunstleve%20-%20library_aspects.pdf