Differences between revisions 60 and 61
Revision 60 as of 2009-03-16 03:35:47
Size: 15331
Editor: anonymous
Comment: converted to 1.6 markup
Revision 61 as of 2009-05-30 07:47:07
Size: 15388
Editor: FranklinPiat
Comment: svn checkout tools-ng in folder tools-ng. (plus formating improvements)

Debian CRAN/BioConductor/Omegahat package archive

The Alioth pkg-bioc group is working on a Debian package repository for the GNU R packages from the upstream sources at CRAN, BioConductor and Omegahat.

NOTE: The project moved to a subversion repository. That change went hand in hand with a considerable rewrite. The folder from which to start the scripts hence became tools-ng; the prior folder tools will be removed in the near future.

Advocacy

This effort is important for Debian for several reasons:

  • Ease of use for scientific researchers using Debian
    • The archive is aiming to be complete
    • The archive is aiming to be up to date
    • The archive is aiming to be easily updateable
  • Provide an improved link between Debian and the research communities as their upstream developers

It is also important for initiatives within and associated with Debian

  • Debian Pure Blends (like DebianMed) and derivatives like Quantian can use CRAN and BioConductor packages

  • Commercial efforts distributing Debian-derived Bioinformatics solutions are likely to benefit from the inclusion of BioConductor, CRAN and Omegahat.

PopCon stats

The Popularity Contest is an initiative to collect statistics about the number of installations that any package experiences in Debian. To be counted, the packages do not necessarily need to be distributed via Debian servers. Here are our numbers:

Contributions

May 2007 brought us a repository for Debian packages. We hope to announce it on this page as soon as its regular maintenance for updates is secured. To help in that process or with the further development of the R packages, browse the code in the SVN repository; you may also want to introduce yourself on the pkg-bioc-devel mailing list. The lines below explain how to automatically build the packages. If you want to contribute, you could have a look at the following TODO list.

The current development is actually done in the SVN sql branch, where the SQL version is brought to life. This branch is a copy of the tools-ng described below, but with the data stored in SQL, so we do not need to reread everything at each startup. The code is not yet finished, so this is not a completely working build system. Contributions are most welcome.

Using subversion to group-maintain packages

For pkg-bioc you should check out everything. The binaries produced are not part of the SVN repository; only the manually prepared files go there.

If you have an account on Alioth, then use it if you are interested in write access to the repository (you are)

svn co svn+ssh://aliothname-guest@svn.debian.org/svn/pkg-bioc/
  • You may have to enter the password multiple times. For read-only access do

svn co svn://svn.debian.org/pkg-bioc/

For the management of manually prepared R packages, which are partly maintained in the trunk/packages folder, the tool svn-buildpackage should be used. Other projects like DebianMed give respective instructions.

How to build the packages

This section explains how the .deb packages can be built locally from the R package archives. You will need about 1GB just for building the CRAN packages. All the build process scripts are developed for Lenny/Sid. The r_pkg_prepare.sh and r_pkg_update.pl scripts prepare the build of the Debian packages with the chosen Builder suite: pbuilder or cowbuilder. It is strongly recommended to follow these instructions as it is a very nice piece of technology that makes Debian as strong as it is. The r_pkg_ordering.pl script creates the dependency graph of all packages and then launches the build with the selected Builder. The default Builder platform is cowbuilder.

  1. Create an empty directory and cd in it.
  2. Prepare local Debian installation:

    aptitude install cdbs subversion debhelper devscripts libapt-pkg-perl libgraph-perl pbuilder r-base-core fakeroot cdebootstrap libdebian-installer4 libdebian-installer-extra4 sudo libwww-perl cowdancer lintian


    Note: cdebootstrap needs to be at least version 0.4.X; otherwise lenny is not known.

  3. Check out pkg-bioc's tools-ng module via SVN, either

    • anonymously without account on alioth.debian.org

      svn co svn://svn.debian.org/svn/pkg-bioc/trunk/tools-ng tools-ng
    • or (preferred) with one's account on alioth.debian.org

      svn co svn+ssh://developername@svn.debian.org/svn/pkg-bioc/trunk/tools-ng tools-ng
    • Only this second route allows contributing to the project.

and go to the new directory

cd tools-ng

4. You may want to edit the header of r_pkg_prepare.sh to set some options:

  • The default Builder is cowbuilder.

  • The default maximum size of the source package is 50MB.

5. Preparations for the packaging script: execute r_pkg_prepare.sh with the right option. It will

  • create the source and build directories and a directory for the changelogs to be passed across builds.
  • create the symlink.
  • prepare some apt-cache data needed for the build.
  • prepare and configure the options needed for the Builder.
  • generate a file autogenerate-variable.out with all the right settings in it.

  • It has to be run the first time, or whenever a change requires it. In theory, it could be run every time before r_pkg_update.pl.

    Note: for a list of all the available options try :  r_pkg_prepare.sh --help 

  • On the US side, try:

sh ./r_pkg_prepare.sh --create-all --us
  • On the European side (currently Germany), you might want to try the following for CRAN packages:

sh ./r_pkg_prepare.sh --create-all --eu

If you just want the CRAN mirror, use the option --create-cran in place of --create-all.

6. Preparation of the mirror and Builder: r_pkg_update.pl

  • download the packages from CRAN, BioConductor and Omegahat to the local directory structure. It retrieves the top-level files as well as the Descriptions dir and nothing else. Some really big packages are blacklisted. We need someone familiar with R to provide some R code that checks whether a file is bigger than XX MB before downloading it (see TODO list). (--doupdate)

  • create or update the Builder base image. (--dobuilderupdate)
  • clean obsolete files and directories from the build and source directories.
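
The size check wished for in the first bullet could look roughly like the following. Note that this is only a shell sketch and not the R code asked for in the TODO list; the function names content_length and exceeds_limit are made up for illustration, and it assumes that the mirror answers HTTP requests with a Content-Length header. The 50 used in the usage comment is the default maximum source package size mentioned in step 4.

```shell
# Sketch only: neither function exists in tools-ng.

# Read HTTP response headers on stdin and print the Content-Length in bytes.
# Prints nothing if the header is missing.
content_length() {
    tr -d '\r' | awk -F': *' 'tolower($1) == "content-length" { print $2; exit }'
}

# exceeds_limit BYTES LIMIT_MB
# Succeeds (exit status 0) if BYTES is larger than LIMIT_MB megabytes.
exceeds_limit() {
    [ "$1" -gt $(( $2 * 1024 * 1024 )) ]
}

# Possible usage (needs curl and network access; URL is a placeholder):
#   size=$(curl -sI "$URL" | content_length)
#   if [ -n "$size" ] && exceeds_limit "$size" 50; then
#       echo "skipping $URL: larger than 50MB"
#   fi
```

If the server sends no Content-Length, the size stays unknown and the sketch downloads nothing less and nothing more than before; the blacklist remains the fallback in that case.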

Note: for a list of all the available options try :  ./r_pkg_update.pl --help  or  ./r_pkg_update.pl --man 

./r_pkg_update.pl --doupdate --dobuilderupdate

Also note: --dobuilderupdate implies a chroot if the CowBuilder or PBuilder approaches are used. Either process needs to perform a chroot. One hence needs to allow the execution of pbuilder and/or cowbuilder as root via an entry in /etc/sudoers like

username ALL=(ALL) NOPASSWD: /usr/sbin/pbuilder

The editing should be performed via the visudo program to avoid syntax errors.
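
To avoid typing the entries by hand, a small helper can print them for both Builders. This is only a sketch: the function name sudoers_lines is made up, and the path /usr/sbin/cowbuilder is an assumption (check it with "which cowbuilder" on your system). The output is meant to be pasted into the file while it is open in visudo.

```shell
# Sketch only: prints sudoers entries, it does not modify /etc/sudoers.

# sudoers_lines USERNAME
# Print one NOPASSWD entry per Builder for the given user.
sudoers_lines() {
    user=$1
    # /usr/sbin/cowbuilder is an assumed install path, see the text above.
    for builder in /usr/sbin/pbuilder /usr/sbin/cowbuilder; do
        printf '%s ALL=(ALL) NOPASSWD: %s\n' "$user" "$builder"
    done
}

# Possible usage:
#   sudoers_lines "$(id -un)"
```

If you only use one of the two Builders, keep only the matching line.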

7. The dependency graph is constructed in this step. After the dependencies are resolved, the packages are built. If a package fails to build, all packages depending on it are marked as not buildable. The graph construction itself does not take long; what takes time is reading the DESCRIPTION files from disk. It takes about 150 seconds to build the dependency graph of cran, bioc and omegahat (on my old P4). This part will be optimized as soon as storage.pm exists (see TODO list).
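
The idea behind this step can be illustrated with standard tools. The following is a minimal sketch and not the code of r_pkg_ordering.pl (which is written in Perl): the edge-list file format and the function names build_order and unbuildable_after are assumptions made for the example. It uses tsort from coreutils for the ordering and a small awk loop to find everything that depends on a failed package.

```shell
# Sketch only: the edge file format and both function names are assumptions.
# Each line of the edge file reads "DEPENDENCY PACKAGE"; a package without
# any dependency is listed as "PACKAGE PACKAGE".

# build_order EDGEFILE
# Print the packages so that every dependency comes before its users.
build_order() {
    tsort "$1"
}

# unbuildable_after FAILED EDGEFILE
# Print every package that depends, directly or transitively, on FAILED.
unbuildable_after() {
    awk -v failed="$1" '
        { dep[NR] = $1; pkg[NR] = $2 }
        END {
            bad[failed] = 1
            # Repeat until no new dependent package is found.
            do {
                changed = 0
                for (i = 1; i <= NR; i++)
                    if ((dep[i] in bad) && !(pkg[i] in bad)) {
                        bad[pkg[i]] = 1
                        changed = 1
                    }
            } while (changed)
            delete bad[failed]
            for (p in bad) print p
        }' "$2" | sort
}
```

With an edge file containing the two lines "a b" and "b c", build_order prints a, b, c in that order, and unbuildable_after a prints b and c.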

Note: for a list of all the available options try :  ./r_pkg_ordering.pl --help  or  ./r_pkg_ordering.pl --man 

./r_pkg_ordering.pl

I highly suggest that you launch the previous command in a screen session.

Building takes time: for the full 3 repositories, about 2500 packages have to be built for a fresh build! So from time to time you can run, from another terminal in the tools-ng directory:

 ./stat.sh

This script is generated by r_pkg_prepare.sh.

If you want to stop the building process in a clean way,

touch stop

When the current package is finished, the script will stop the building loop and write the 2 summary files:

  • ../web/cannotbuild.html
  • ../web/duplicated.html

and exit cleanly.

Do not forget, of course, to remove your "stop" file before relaunching your ./r_pkg_ordering.pl.
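
For those curious how such a stop file works, here is a minimal sketch of the idea. It is not taken from r_pkg_ordering.pl: build_loop and build_one are made-up names, and build_one is only a placeholder for whatever really builds a package. The point is that the file is checked between two packages, so a running build is never interrupted.

```shell
# Sketch only: build_loop and build_one are hypothetical names.

# Placeholder for the real build of one package.
build_one() {
    echo "building $1"
}

# build_loop STOPFILE
# Read package names from stdin (one per line) and build them in turn.
# Before each package, leave the loop if STOPFILE exists.
build_loop() {
    stop_file=$1
    while read -r pkg; do
        if [ -e "$stop_file" ]; then
            echo "stop file found, not starting $pkg"
            break
        fi
        build_one "$pkg"
    done
}

# Possible usage:
#   printf 'pkgA\npkgB\n' | build_loop ./stop
```

Because the loop only looks for the file and never deletes it, a forgotten stop file ends the next run before the first package, which is why it has to be removed by hand.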

8. Upload the results to the shared repository, with the current directory still being the tools-ng directory. (You need access to our Alioth pkg-bioc group. See the pkg-bioc-devel mailing list if you want to contribute.) This step is not yet implemented. It will be implemented as soon as we have a build process producing Lintian-clean packages.

Note: If you want to connect using your ssh key :), you need to log in on Alioth and import your public key via the web interface (Import your ssh key).

Alternative Notes

  • The script gen_already_in_debian.sh generates the file (list.d/AlreadyIncludeInDebian.list) listing the already packaged R modules.

  • Two build mechanisms are supported: pbuilder or cowbuilder. Other build systems could be supported; it's just a matter of:

 cp PBuilder.pm mynewbuilder.pm
 edit mynewbuilder.pm to correct the call done inside.
 edit r_pkg_prepare to select the build install methods as mynewbuilder
 continue on step 5 : sh ./r_pkg_prepare.sh --create-all --somewhere

When your mynewbuilder is working, don't forget to commit it to the SVN if you are an Alioth contributor, or to send it to the mailing list.

Automated assignment of Debtags

This part needs some contributions (see TODO list)

The BioConductor developers annotate their packages with words from a controlled, hierarchically organised vocabulary. Details are laid out in the biocViews package. The Debian community has come up with Debian Tags, in short: Debtags (debtags.alioth.debian.org). The Perl script tools/r_debtags_update.pl performs an automated translation of the biocViews terms into entries of the Debtags initiative. It can be called without parameters and, as a start, reads through all DESCRIPTIONs of locally installed Debian packages. This is not perfect but it is something. The challenge now is to merge such automated efforts with manually created entries. Most of this has been merged into tools-ng/debtab.pm, which still needs some more work to make it work.

The SVN repository stores the latest list of tags in the file tools-ng/R.tags. Nothing has yet been decided on whether and how we should make this accessible as a source in /etc/debtags/sources.list. The DebianMed community provides a svn page to collect additions for the Debtags initiative.

Technical Details (Draft)

The 'r_pkg_prepare' script sets up folders to store results of the packaging process. It also sets a number of variables that are read by later scripts. Most obviously this comprises the decisions

  • which build method should be used
  • from where packages should be downloaded

and a range of further files are created to help cow/pbuilder to perform correctly.

The script 'r_pkg_update', called next, performs two actions. The first is represented by the option '--doupdate', which mirrors the source packages from CRAN, BioConductor or OmegaHat. The second, '--dobuilderupdate', updates the chroot in which the packages are built. That chroot is tarred away into a file and untarred again for every package that is due to be built. This overhead is reduced by using cowbuilder. Since the call to chroot demands superuser capabilities, the cowbuilder and pbuilder commands need to be added to the sudoers file.

The packaging itself is performed by the script 'r_pkg_ordering', which needs no additional options. It just builds them all while taking care of the build order, hence the name.

Acknowledgements

Directly or indirectly, this effort received support from the Institutes for Neuro- and Bioinformatics, Medical Informatics, and Medical Biometry and Statistics at the University of Lübeck, the KnowARC EU project, our friends and families. Please add who is missing.

http://www.inb.uni-luebeck.de/~moeller/bioc/pkg-bioc-logo-R.png