cctbx packaging

people interested in this packaging effort

you can find the git repository here cctbx repo

the current ITP is here 679905

upstream

the main website is here https://github.com/cctbx/cctbx_project

build dependencies

mmdb

679982

gpp4

679988

clipper

679990

bundled

patch series status

Comments by Luc Bourhis (one of the leading cctbx developers):

This is sound but it hardcodes a path that is likely not to make sense on other Linux distros. Thus as is this patch should be applied at packaging time only. I would accept to apply such a patch upstream if it replaces else with elif and a check that this is Debian indeed, so that we could add other elif branches for other Linux distros as the need emerges.

I understand that the cctbx package will have a dependency on the OpenGL package, therefore making the configuration code commented out by this patch redundant. Redundant but not incorrect. Thus I am not sure whether this is necessary. Only to be applied at packaging time of course.

Packaging time only.

Already fixed upstream (Marat's fix, rev 15462)

Already fixed upstream (Luc's fix, rev 15576)

This patch is huge. Therefore applying it at packaging time only would be rather brittle. As for applying it upstream, I approve the spirit of it but it introduces masses of changes and this needs to stand the trial of cctbx nightly tests.

The issue which has proved most controversial. See the section "Shared libraries" below for a comprehensive discussion.

Packaging time only.

Packaging time only.

Largely orthogonal to our code. I would happily accept this patch upstream. The code adding from __future__ import division should be removed though as we added such a line to every Python module upstream.

Since only scitbx/boost_python/SConscript needs the attribute env_etc.py_libs that this patch introduces in libtbx/SConscript, the fix should touch the former but not the latter. Then I would accept it upstream.

Acceptable upstream but uc1_2_reeke.py has been removed and there is now uc1_2_a.py that features cctbx.python too, so the patch won't apply as is.

Looks good: acceptable upstream.

CPPFLAGS is added to CCFLAGS, which SCons eventually uses for both C and C++. This patch is therefore incorrect.

The cctbx needs a heavily patched version of the ANTLR runtime to work efficiently. It may be found in ucif/antlr3.

Obsolete (cf. the "Shared libraries" section)

Packaging time only at the moment but this could go upstream in the future.

Packaging time only.
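The first comment above asks for an elif branch guarded by a Debian check instead of a bare else. A minimal sketch of that shape follows. The function names and the returned path are hypothetical placeholders, not the contents of the actual patch; the only assumption about Debian is that it ships /etc/debian_version.

```python
import os

# Hypothetical placeholder: the real patch hardcodes its own path.
DEBIAN_PATH = "/usr/lib/hypothetical-debian-path"


def is_debian(root="/"):
    """Return True if the system rooted at `root` looks like Debian.

    Assumption: Debian and its derivatives ship /etc/debian_version.
    """
    return os.path.isfile(os.path.join(root, "etc", "debian_version"))


def choose_path(configured, root="/"):
    """Pick a path, leaving room for one elif branch per distribution."""
    if configured is not None:
        # An explicitly configured path always wins.
        return configured
    elif is_debian(root):
        return DEBIAN_PATH
    # Further elif branches for other distros would go here.
    return None
```

The `root` parameter only exists so that the check can be exercised against a scratch directory.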

Shared Libraries

(Luc Bourhis' thoughts on the subject after extensive discussions with the people involved in this packaging effort)

I am not concerned with the Boost Python extensions here, only with the following libs that the cctbx builds:

Boost Python extensions are linked against those libraries.

First let's list some of the typical ways the cctbx C++ code may be used by a developer.

1. He wants to build a C++ application. If he uses a cctbx API based on templates, he just #includes cctbx headers. Otherwise, he #includes cctbx headers and (a) either compiles the associated cctbx .cpp files with his .cpp files or (b) links against one of the above libraries.

2. He wants to create a new Boost Python extension whose implementation uses e.g. sgtbx::space_group. Then he just needs to #include <cctbx/sgtbx/spacegroup.h> and compile his extension. When it is loaded alongside the Boost Python extension cctbx_sgtbx_ext.so, the dynamic linker should resolve the calls made to sgtbx::space_group member functions from within that new Boost Python extension code.

In case 2, the shared libraries listed at the beginning of this section are not directly used. They are by all means an implementation detail, as the Boost Python extensions could have been built from static versions of those libs instead (or even directly from the object files going into those libs).

On the contrary, in case 1b, those shared libraries take centre stage and they need to be versioned to avoid well-known issues.

Upstream cctbx developers consider that case 1b is too marginal to justify the effort of versioning the shared libraries, since a cctbx-dev package can be provided with all the .cpp files, thus enabling case 1a. They therefore favour a packaging where the shared libraries are kept private, or where static libraries are used instead.

package organisation

As explained by upstream

python module organisation generated using this script cctbx-depends.png

libraries generated

Dispatcher scripts

There are 5 categories of dispatcher scripts:

python modules/extensions

solved questions

interfacing between scons and setup.py

We'll use custom distutils "build_ext", "install_lib" and "install_data" commands which call the scons build system under the hood, by spawning a separate process. We'll also have custom "test" and "clean" commands. That way, "python2.x setup.py build / install" work as usual.

As the upstream build system needs to be run with the same version of python that was configured, we'll call scons with: "python2.x /usr/bin/scons"
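A minimal sketch of how such a command could spawn scons with a chosen interpreter. The helper names, the `jobs` parameter and the example target are assumptions made for illustration and are not taken from the actual setup.py; `-j` is the standard scons option for parallel jobs.

```python
import subprocess
import sys

# Path of the scons script as quoted in the text above.
SCONS_SCRIPT = "/usr/bin/scons"


def scons_argv(python=sys.executable, targets=(), jobs=1):
    """Build the argument list "pythonX.Y /usr/bin/scons -j N targets...".

    Running scons through an explicit interpreter guarantees that the
    build uses the same Python version that was configured.
    """
    argv = [python, SCONS_SCRIPT, "-j", str(jobs)]
    argv.extend(targets)
    return argv


def run_scons(build_dir, **kwargs):
    """Spawn scons as a separate process inside build_dir.

    Raises subprocess.CalledProcessError if scons exits with an error,
    which a custom distutils command could turn into a build failure.
    """
    subprocess.check_call(scons_argv(**kwargs), cwd=build_dir)
```

The run() method of a custom "build_ext" command would then only need to call run_scons() with the interpreter that is executing setup.py.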

importing the extensions

Upstream imports the extensions a little bit differently from the typical python project. Usually, extension .so files are located inside the package directory where they belong. In cctbx, all .so files are in a "lib" directory, which is added to PYTHONPATH, and then an import stub imports the objects to their final place.

The extensions are not meant to be imported directly by the user, and most of them are only imported from one place, however scitbx_array_family_shared_ext.so is imported from several places.

The import stubs feature a function boost.python.import_ext, which is part of boost_adaptbx. This is defined in cctbx_sources/boost_adaptbx/boost/python.py. This function does 3 things:

We'll split the extensions between the different Debian packages, but otherwise keep the upstream way. This causes a runtime dependency on boost_adaptbx and a little bit of namespace pollution, but at least it doesn't introduce Debian-specific problems.
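The general pattern of a flat "lib" directory plus an import stub can be sketched as below. This is an illustration only: import_flat_ext and reexport are hypothetical helpers, not the actual boost.python.import_ext, and the sketch does not reproduce what that function does.

```python
import importlib
import sys


def import_flat_ext(name, lib_dir=None):
    """Import an extension module that lives in a flat "lib" directory.

    If lib_dir is given, it is put on sys.path first, mimicking the
    effect of adding the directory to PYTHONPATH.
    """
    if lib_dir is not None and lib_dir not in sys.path:
        sys.path.insert(0, lib_dir)
    importlib.invalidate_caches()
    return importlib.import_module(name)


def reexport(ext, namespace):
    """Copy the public names of ext into namespace.

    This is the job of an import stub: objects defined in the flat
    extension module end up in the package where users expect them.
    """
    for key, value in vars(ext).items():
        if not key.startswith("_"):
            namespace[key] = value
```

A stub module would call something like reexport(import_flat_ext("some_ext"), globals()), which is also where the namespace pollution mentioned above comes from.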

The whole of the cctbx Python code shall be run with true division (so that 1/2 evaluates to 0.5 rather than 0). This has traditionally been achieved by passing the -Qnew option to the Python interpreter in each and every one of the dispatcher scripts provided by the cctbx in the <build dir>/bin directory. Since such a global option is unacceptable for Debian, cctbx developers have added from __future__ import division to every Python module. They have also added to their pre-commit check a test rejecting any new Python module that does not feature that line.
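A check of that kind could look roughly like the sketch below. It is not the upstream pre-commit test; has_future_division is a hypothetical name, and the approach (parsing the module with the standard ast module) is one option among several.

```python
import ast


def has_future_division(source):
    """Return True if the module source imports division from __future__."""
    tree = ast.parse(source)
    for node in tree.body:
        # Only top-level "from __future__ import ..." statements count.
        if (isinstance(node, ast.ImportFrom)
                and node.module == "__future__"
                and any(alias.name == "division" for alias in node.names)):
            return True
    return False
```

A pre-commit hook would read each new .py file and reject the commit when the function returns False.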

in Baptiste's TODO list

unbundle "optik"

This is a patched version of a deprecated ancestor of stdlib's "optparse". It is only used in "libtbx/option_parser.py". I plan to change "libtbx/option_parser.py" to use "optparse" instead.
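For reference, a minimal use of the stdlib "optparse" module is sketched below. The option and the file name are made up for illustration and do not reflect what "libtbx/option_parser.py" actually defines.

```python
from optparse import OptionParser


def make_parser():
    """Build a small parser using only the stdlib optparse module."""
    parser = OptionParser(usage="%prog [options] file...")
    # Hypothetical option, for illustration only.
    parser.add_option("-v", "--verbose", action="store_true",
                      default=False, help="print more information")
    return parser


# Parse an explicit argument list instead of sys.argv.
options, args = make_parser().parse_args(["--verbose", "model.pdb"])
```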

tests

cctbx has 2 kinds of tests: python scripts, which the upstream test system ("libtbx/test_utils.py") executes in-place in the source tree, and compiled test programs, which are built by scons in the build directory and executed from there. Also, many of the python tests are not part of an importable python package, so they are by default not taken into account by distutils.

The "test" command that we have right now knows how to

I plan to add the python tests that are not part of a package as "package_data" in setup.py. Thus they will be copied by distutils to its build directory, and also installed by our packages.
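Finding those tests could be sketched as follows. The helper name and the "tst_" file-name prefix are assumptions made for the example; the sketch only lists candidate files and does not touch setup.py.

```python
import os


def unpackaged_tests(source_root, prefix="tst_"):
    """List test scripts living in directories that are not packages.

    A directory without __init__.py is not an importable package, so
    distutils would skip the scripts it contains by default.
    Returned paths are relative to source_root and sorted.
    """
    found = []
    for dirpath, dirnames, filenames in os.walk(source_root):
        if "__init__.py" in filenames:
            continue  # part of a package, already handled by distutils
        for name in filenames:
            if name.startswith(prefix) and name.endswith(".py"):
                full = os.path.join(dirpath, name)
                found.append(os.path.relpath(full, source_root))
    return sorted(found)
```

The resulting list could then be fed to the "package_data" argument in setup.py.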

When this is done, I will go through the still failing tests and see if the failure is our fault. I will also skip the tests that do not make sense in our case, and those that take an unreasonably long time.

open questions

TODO

Build system

Problems

ann

Upstream's libann is built with self_includes, which makes it incompatible with the libann in Debian.

$ grep "const ANNbool" /usr/include/ANN/ANN.h
const ANNbool ANN_ALLOW_SELF_MATCH = ANNtrue;

$ grep "const ANNbool" cctbx_sources/annlib/include/ANN/ANN.h
const ANNbool ANN_ALLOW_SELF_MATCH = ANNfalse;

annlib version:

Section 2.2.3 of ANNmanual_1.1.pdf explains what this constant does. The file "cctbx_sources/annlib_adaptbx/ann/ann_adaptor.cpp" clarifies things a little further.

So cctbx upstream sets ANNfalse, and auto-generates a second copy of the code with ANNtrue in the namespace annself_include. When we use annself_include, the behaviour therefore matches the libann in Debian.

http://cci.lbl.gov/~hohn/scitbx-tour.html#Installation
http://sig9.ecanews.org/pdfs/03%20Ralf%20Grosse-Kunstleve%20-%20library_aspects.pdf