Differences between revisions 39 and 41 (spanning 2 versions)
Revision 39 as of 2012-09-13 14:19:50
Size: 11466
Editor: picca
Comment:
Revision 41 as of 2012-10-04 22:20:57
Size: 15216
Editor: ?Luc Bourhis
Comment:
Deletions are marked like this. Additions are marked like this.
Line 43: Line 43:
(Luc could you elaborate) Comments by Luc Bourhis (one of the leading cctbx developer):
Line 46: Line 46:

This is sound but it hardcodes a path that is likely not to make sense on other Linux distros. Thus as is this patch should be applied at packaging time only. I would accept to apply such a patch upstream if it replaces {{{else}}} with {{{elif}}} and a check that this is Debian indeed, so that we could add other {{{elif}}} branches for other Linux distros as the need emerges.
 
Line 47: Line 50:

I understand that the cctbx package will have a dependency on the OpenGL package, therefore making the configuration code commented by this patch redundant. Redundant but not incorrect. Thus I am not sure whether this is necessary. Only to be applied at packaging time of course.
Line 48: Line 54:

Packaging time only.
Line 49: Line 58:

Already fixed upstream (Marat's fix, rev 15462)
Line 50: Line 62:

Already fixed upstream (Luc's fix, rev 15576)
Line 51: Line 66:

This patch is huge. Therefore applying it at packaging time only would be rather brittle. As for applying it upstream, I approve the spirit of it but it introduces masses of changes and this needs to stand the trial of cctbx nightly tests.
Line 52: Line 70:

The issue which has proved most controversial. See the section "Shared libraries" below for a comprehensive discussion.
Line 53: Line 74:

Packaging time only.
Line 54: Line 78:

Packaging time only.
Line 55: Line 82:

Largely orthogonal to our code. I would happily accept this patch upstream. The code adding {{{from __future__ import division}}} should be removed though as we added such a line to every Python module upstream.
Line 56: Line 86:

Since only scitbx/boost_python/SConscript needs the attribute env_etc.py_libs that this patch introduces in libtbx/SConscript, the fix should touch the former but not the latter. Then I would accept it upstream.
Line 57: Line 90:

Acceptable upstream but uc1_2_reeke.py has been removed and there is now uc1_2_a.py that features cctbx.python too, so the patch won't apply as is.
Line 58: Line 94:

Looks good: acceptable upstream.
Line 59: Line 98:

CPPFLAGS is added to CCFLAGS which is eventually used for the both of C and C++ by SCons. This patch is therefore incorrect.
Line 60: Line 102:

The cctbx needs a heavily patch version of the ANTLR runtime to work efficiently. It may be found in ucif/antlr3.
Line 61: Line 106:

Obsolete (c.f. the "Shared libraries" section)
Line 62: Line 110:

Packaging time only at the moment but this could go upstream in the future.
Line 63: Line 114:

Packaging time only.

=== Shared Libraries ===

(Luc Bourhis' thoughts on the subject after extensive discussions with the people involved in this packaging effort)

I am not concerned with the Boost Python extensions here, only in the following libs that the cctbx builds:

 * libcctbx.so
 * libcctbx_sgtbx_asu.so
 * libiotbx_mtz.so
 * libiotbx_pdb.so
 * libmmtbx_masks.so
 * libomptbx.so
 * libscitbx_boost_python.so
 * libsmtbx_refinement_constraints.so

Boost Python extensions are linked against those libraries.

First let's list some of the typical ways the cctbx C++ code may be used by a developer.

1. He wants to build a C++ application. If he uses a cctbx API based on templates, he just {{{#include}}} cctbx headers. Otherwise, he {{{#include}}} cctbx headers and (a) either he compiles the associated cctbx {{{.cpp}}} files with his {{{.cpp}}} files or (b) he links against one of the above library.

2. He wants to create a new Boost Python extension whose implementation uses e.g. {{{sgtbx::space group}}}. Then he just needs to {{{#include <cctbx/sgtbx/spacegroup.h>}}} and compile his extension. When it will be loaded alongside the Boost Python extension cctbx_sgtbx_ext.so, the linker should sort the calls made to {{{sgtbx::space group}}} member functions from within that new Boost Python extension code.

In case 2, the shared libraries listed at the beginning of this section are not directly used. They are by all means an implementation detail as the Boost Python extensions could have been built from static version of those libs instead (or even directly from the object files going into those libs).

On the contrary, in case 1b, those shared libraries take centre stage and they need to be versioned to avoid well-known issues.

Upstream cctbx developers consider that case 1b is too marginal to justify the effort of versioning the shared libraries since cctbx-dev package can be provided with all the .cpp files, thus enabling case 1a. They therefore favour a packaging where the shared libraries are kept private, or where static libraries are used instead.
Line 125: Line 207:
The import stubs use a wrapper around {{{__import__}}} called boost.python.import_ext, which is part of boost_adaptbx.
This is defined in {{{cctbx_sources/boost_adaptbx/boost/python.py}}}. This wrapper does 3 things:
The import stubs features a function boost.python.import_ext, which is part of boost_adaptbx.
This is defined in {{{cctbx_sources/boost_adaptbx/boost/python.py}}}. This function does 3 things:
Line 132: Line 214:
(luc add the floating point behaviour)
Line 138: Line 218:
Upstream builds and tests with a global option to the python interpreter: "-Qnew". This sets the int division to produce floats in all modules (i.e. this is the same as adding a "from {{{__}}}future{{{__}}} import division" line in all modules). Such a global option is inacceptable for Debian, so we would add the "{{{__}}}future{{{__}}}" line in cctbx modules instead.

Adding such lines automatically during build is possible, as is shown by our "build_py" distutils command. This commands adds the "{{{__}}}future{{{__}}}" line in non-empty modules as the first executable line, while respecting comments, whitespace and inline help that may exist at the start of the file.

That beeing said, upstream do not seem to actively remove the "{{{__}}}future{{{__}}}" lines in modules that contain them, and seem to avoid int division altogether. So maybe they are trying to keep their code working without "-Qnew" as well. If they could provide us with such a warranty, we would not need to change the modules, which would of course be the best.

So, when the packaging is ready, we'll have to ask upstream if working without "-Qnew" is part of their development goals. If so, we might be able to remove our "build_py" command.
The whole of the cctbx Python code shall be run with integer division. This has traditionally been achieved by passing the {{{-Qnew}}} option to the Python interpreter in each and every of the dispatcher scripts provided by the cctbx in the {{{<build dir>/bin}}} directory. Since such a global option is inacceptable for Debian, cctbx developers have added {{{from __future__ import division}}} to every Python module. They have also added to their pre-commit check a test rejecting a new Python module which would not feature that line.

cctbx packaging

peoples interested by this packaging effort

  • Baptiste Carvello
  • picca

  • Radostan Riedel

you can find the git repository here cctbx repo

the current ITP is here 679905

upstream

the main website is there http://cctbx.sourceforge.net/

build dependencies

mmdb

679982

gpp4

679988

clipper

679990

bundled

  • debhelper (>= 9),

  • python-all-dev,
  • libfftw3-dev,
  • libantlr3c-dev (< 3.4),

  • libcbf-dev,
  • libgl2ps-dev,
  • libann-dev,
  • libclipper-dev,
  • libboost-python-dev(>= 1.49),

  • scons,
  • libboost-thread-dev,
  • python-setuptools,
  • libgl1-mesa-dev,
  • libglu1-mesa-dev,
  • libtool

patch series status

Comments by Luc Bourhis (one of the leading cctbx developer):

  • 0001-remove-hardcoded-libtbx_build-env.patch

This is sound but it hardcodes a path that is likely not to make sense on other Linux distros. Thus as is this patch should be applied at packaging time only. I would accept to apply such a patch upstream if it replaces else with elif and a check that this is Debian indeed, so that we could add other elif branches for other Linux distros as the need emerges.

  • 0002-fix-opengl-header-missing-gltbx.patch

I understand that the cctbx package will have a dependency on the OpenGL package, therefore making the configuration code commented by this patch redundant. Redundant but not incorrect. Thus I am not sure whether this is necessary. Only to be applied at packaging time of course.

  • 0003-correct-paths-in-dispatcher-creation.patch

Packaging time only.

  • 0004-upstream-fix-for-declaration-errors-in-gcc4.7.patch

Already fixed upstream (Marat's fix, rev 15462)

  • 0005-fix-for-gcc4.7-compilation-error.patch

Already fixed upstream (Luc's fix, rev 15576)

  • 0006-options-for-system-libs-installtarget-and-prefix.patch

This patch is huge. Therefore applying it at packaging time only would be rather brittle. As for applying it upstream, I approve the spirit of it but it introduces masses of changes and this needs to stand the trial of cctbx nightly tests.

  • 0007-adding-shlib-versioning.patch

The issue which has proved most controversial. See the section "Shared libraries" below for a comprehensive discussion.

  • 0008-Fix-to-skip-pycbf-build.patch

Packaging time only.

  • 0009-build-libann-statically.patch

Packaging time only.

  • 0010-adding-setup_py.patch

Largely orthogonal to our code. I would happily accept this patch upstream. The code adding from __future__ import division should be removed though as we added such a line to every Python module upstream.

  • 0011-fix-missing-python-lib-during-linking.patch

Since only scitbx/boost_python/SConscript needs the attribute env_etc.py_libs that this patch introduces in libtbx/SConscript, the fix should touch the former but not the latter. Then I would accept it upstream.

  • 0012-fix-to-remove-cctbx.python-interpreter.patch

Acceptable upstream but uc1_2_reeke.py has been removed and there is now uc1_2_a.py that features cctbx.python too, so the patch won't apply as is.

  • 0013-fix-to-support-LDFLAGS-in-use_enviroment_flags.patch

Looks good: acceptable upstream.

  • 0014-Fix-to-append-CPPFLAGS-to-CXXFLAGS.patch

CPPFLAGS is added to CCFLAGS which is eventually used for the both of C and C++ by SCons. This patch is therefore incorrect.

  • 0015-fix-cif-parser-to-work-with-antlr3c-3.2.patch

The cctbx needs a heavily patch version of the ANTLR runtime to work efficiently. It may be found in ucif/antlr3.

  • 0016-autogenerate-pkgconfig-files.patch

Obsolete (c.f. the "Shared libraries" section)

  • 0017-Fix-to-use-systems-include-path.patch

Packaging time only at the moment but this could go upstream in the future.

  • 0018-Fix-to-skip-build-of-clipper-examples.patch

Packaging time only.

Shared Libraries

(Luc Bourhis' thoughts on the subject after extensive discussions with the people involved in this packaging effort)

I am not concerned with the Boost Python extensions here, only in the following libs that the cctbx builds:

  • libcctbx.so
  • libcctbx_sgtbx_asu.so
  • libiotbx_mtz.so
  • libiotbx_pdb.so
  • libmmtbx_masks.so
  • libomptbx.so
  • libscitbx_boost_python.so
  • libsmtbx_refinement_constraints.so

Boost Python extensions are linked against those libraries.

First let's list some of the typical ways the cctbx C++ code may be used by a developer.

1. He wants to build a C++ application. If he uses a cctbx API based on templates, he just #include cctbx headers. Otherwise, he #include cctbx headers and (a) either he compiles the associated cctbx .cpp files with his .cpp files or (b) he links against one of the above library.

2. He wants to create a new Boost Python extension whose implementation uses e.g. sgtbx::space group. Then he just needs to #include <cctbx/sgtbx/spacegroup.h> and compile his extension. When it will be loaded alongside the Boost Python extension cctbx_sgtbx_ext.so, the linker should sort the calls made to sgtbx::space group member functions from within that new Boost Python extension code.

In case 2, the shared libraries listed at the beginning of this section are not directly used. They are by all means an implementation detail as the Boost Python extensions could have been built from static version of those libs instead (or even directly from the object files going into those libs).

On the contrary, in case 1b, those shared libraries take centre stage and they need to be versioned to avoid well-known issues.

Upstream cctbx developers consider that case 1b is too marginal to justify the effort of versioning the shared libraries since cctbx-dev package can be provided with all the .cpp files, thus enabling case 1a. They therefore favour a packaging where the shared libraries are kept private, or where static libraries are used instead.

package organisation

As explain by the upstream

python module organisation generated using this script cctbx-depends.png

  • Red boxes : python modules,
  • green boxes are others (C++ or external).
  • Black arrows are run-time dependancies
  • Red arrows are build-time dependancies
  • Green arrows are optional dependancies.

libraries generated

  • libann.a
  • libcbf.a
  • libccp4io.a
  • libcctbx_sgtbx_asu.so
  • libcctbx.so
  • libiotbx_mtz.so
  • libiotbx_pdb.so
  • libmmtbx_masks.so
  • libomptbx.so
  • librstbx.so
  • libscitbx_boost_python.so
  • libscitbx_minpack.so
  • libscitbx_slatec.so
  • libsmtbx_refinement_constraints.so
  • libspotfinder.so

Dispatcher scripts

There are 5 categories of dispatcher scripts:

  • Pseudo python interpreters *.python
  • Scripts conflicting the namespace of other packages
    • like: boost.lcm
  • Scripts which just show build or dist paths
    • like: annlib.show_build_path
  • Build related scripts
    • like: libtbx.assert_line_count
  • Everything else. (This could get usefull)
    • like: cctbx.lattice_symmetry

python modules/extensions

solved questions

interfacing between scons and setup.py

We'll use a custom distutils "build_ext", "install_lib" and "install_data" commands which call the scons build system under the hood, by spawning a separate process. We'll also have custom "test" and "clean" commands. That way, "python2.x setup.py build / install" work as usual.

As the upstream build system needs to be run with the same version of python that was configured, we'll call scons with: "python2.x /usr/bin/scons"

importing the extensions

Upstream import the extensions a little bit differently from the typical python project. Usually, extensions .so files are located inside the package directory where they belong. In cctbx, all .so file are in a "lib" directory, which is added to PYTHONPATH, and then an import stub imports the objects to their final place.

The extensions are not meant to be imported directly by the user, and most of them are only imported from one place, however scitbx_array_family_shared_ext.so is imported from several places.

The import stubs features a function boost.python.import_ext, which is part of boost_adaptbx. This is defined in cctbx_sources/boost_adaptbx/boost/python.py. This function does 3 things:

  • it checks for a libstdc++ mismatch with the python binary (useless in Debian)
  • it beautifies the import exception messages (nice to have, but unimportant)
  • it calls dlopen with RTLD_GLOBAL. People in the Boost community seem to think that this is necessary for correct operation in corner cases.

We'll split the extensions between the different debian packages, but keep otherwise the upstream way. This causes a runtime dependency on boost_adaptbx an a little bit of namespace pollution, but at least it doesn't introduce debian-specific problems.

The whole of the cctbx Python code shall be run with integer division. This has traditionally been achieved by passing the -Qnew option to the Python interpreter in each and every of the dispatcher scripts provided by the cctbx in the <build dir>/bin directory. Since such a global option is inacceptable for Debian, cctbx developers have added from __future__ import division to every Python module. They have also added to their pre-commit check a test rejecting a new Python module which would not feature that line.

in Baptiste's TODO list

unbundle "optik"

this is a patched version of a deprecated ancestor of stdlib's "optparse". It is only used in "libtbx/option_parser.py". I plan to change "libtbx/option_parser.py" to use "optparse" instead.

tests

cctbx has 2 kinds of tests: python scripts, which the upstream test system ("libtbx/test_utils.py") executes in-place in the source tree, and compiled test programs, which are built by scons in the build directory and executed from there. Also, many of the python tests are not part of an importable python package, so they are by default not taken into account by distutils.

The "test" command that we have right now knows how to

  • execute python tests that are part of a package from the distutils build directory (thus using the version modified by "build_py")
  • execute the compiled test programs from our scons build directory

I plan to add the python tests that are not part of a package as "package_data" in setup.py. Thus they will be copied by distutils to its build directory, and also installed by our packages.

When this is done, I will go through the still failing tests and see if the failure is our fault. I will also skip the tests that do not make sense in our case, and those that take an unreasonably long time.

open questions

  • do we want to install the compiled test programs with one of our packages?
  • ask the upstream for different naming schema between nighty and release source distribution. This way it will be possible to filter nighty built in the watch file.

TODO

  • solve the API/ABI stability problems for libs
  • license check of the remaining files
  • package missing build-dep
  • DFSG free the source package
    • identify the bundeled libraries (boost ...)
    • repack to remove all bundled libraries, it should save some space.
    • create the get-orig-source target for this purpose
    • We can delete scons, boost and most files in clipper directory. We just to need to keep the empty directories for boost, clipper etc. otherwise the build system is not working.
  • remove the cctbx bundled in objcryst-fox once done
  • run all tests at the end of the build (python setup.py check ?)
  • what about the documentation ?
  • provide pkg-config files for public shared libraries (patch for upstream)

Build system

  • Modify configure.py to support "--use-system-libraries"
    • skip shlibs build which are already available in Debian.
    • link with the system libraries.
    • use pkg-config as much as possible for library detection
    • on Debian boost_python library is named according to the python version
      • libboost_python-py26
      • libboost_python-py27
      • libboost_python-py32
  • Build for multiple python versions but build shlibs only one time.
  • version the shared libraries using scons
  • support a --prefix option
  • the install target should support DESTDIR
  • Use upstreams "--use-environment-flags" option to set CFLAGS, CXXFLAGS, CPPFLAGS and LDFLAGS
    • we need -O2 and -g for debugging symbols.
  • add a setup.py for python modules (autogenerated ?)
  • add a clean target ?
  • construct a cctbx-dev package with dispatcher script and libtbx-build system to provide a build-dep for other packages. We will also need a python script that dynamically updates the libtbx_env pickle object.

Problems

  • upstream is linking additional symbols into libcbf. libcbf is built statically so we don't need libcbf-dev
    • can be backported into the cbf upstream ?
    • check with the cbf Debian maintainer
  • fftw3tbx is missing a header file.
  • useless file in the current source package cctbx_sources/lapack_fem/xerbla.f.orig
  • what should we do with all the dispatcher in the bin directory ?
  • Depend on cbflib python bindings but cbflib in debian is not providing a python package 681997

  • Upstream is linking additional symbols from generated source into their libann.so. They call it ""Generating C++ files for self-inclusive approximate nearest neighbor"

ann

upstreams libann is build with with self_includes which makes it incompatible with libann in Debian.

$ grep "const ANNbool" /usr/include/ANN/ANN.h
const ANNboolANN_ALLOW_SELF_MATCH= ANNtrue;

$ grep "const ANNbool" cctbx_sources/annlib/include/ANN/ANN.h
const ANNboolANN_ALLOW_SELF_MATCH= ANNfalse;

annlib version:

  • in cctbx is 1.1 Release date: 05/03/05
  • in debian 1.1.2+doc-3 Release date: Jan 27, 2010 [1].

If you check ANNmanual_1.1.pdf Section 2.2.3 there is explained what this constant does. The file "cctbx_sources/annlib_adaptbx/ann/ann_adaptor.cpp" can clear things a little bit up.

So cctbx upstream ANNfalse and auto generates code with ANNtrue and namespace annself_include. So when we use annself_include it equals libann in debian.

http://cci.lbl.gov/~hohn/scitbx-tour.html#Installation http://sig9.ecanews.org/pdfs/03%20Ralf%20Grosse-Kunstleve%20-%20library_aspects.pdf