Debian eScience with myGrid and Taverna

Introduction

The term eScience (or e-Science) describes data- and CPU-intensive research that is typically carried out by integrating resources distributed across the Internet, for example departments of larger corporations or collaborating universities. The term is related to computational grids, but today it is associated more with web services. The United Kingdom has invested substantial resources in the development of an IT infrastructure for eScience applications, and other countries around the globe have followed suit. The most prominent outcome is the [http://www.mygrid.org.uk myGrid] effort with its workflow tool [http://taverna.sf.net Taverna].

This page describes the effort to package the software of the myGrid eScience project for the Debian Linux distribution. An ["Alioth"] project ([http://alioth.debian.org/projects/pkg-escience/ pkg-escience]) has been created for this purpose.

Motivation for Debian Packaging

The DebianScience special interest group describes and provides resources for scientific computing with Debian. Pkg-escience sees itself as a dedicated effort that contributes to that group. If all works out, scientific services become easier to provide, because Debian-based developments can be linked to the world via web services and myGrid. Conversely, all myGrid services will be available to researchers using Debian; the current focus is on [http://en.wikipedia.org/wiki/Bioinformatics bioinformatics], but nothing technically constrains it to that field.

The taverna package is already useful today, since one does not need to set up any local services to join Grid use and development.

Installation

To retrieve the packages created in this project on your local Debian machine (preferably running testing or unstable), add the following to /etc/apt/sources.list:

deb http://pkg-escience.alioth.debian.org/debian ./
deb-src http://pkg-escience.alioth.debian.org/debian ./

For the Sun JDK also add

deb http://ftp.de.debian-unofficial.org/debian/ unstable main contrib non-free
deb http://ftp.de.debian-unofficial.org/debian/ testing main contrib non-free

Then try apt-get install taverna. Problems may occur if you are running Debian stable. If so, you may want to check whether Debian ["Backports"] ([http://www.backports.org www.backports.org]) offers more recent libraries. Feedback on such problems is welcome.
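The repository lines above can also be added from a script. The following is only a minimal sketch: the helper name add_apt_source is hypothetical and not part of this project, and it merely avoids appending the same line twice.

```shell
#!/bin/sh
# Hypothetical helper (not part of pkg-escience): append an APT source
# line to a list file unless exactly that line is already present.
# Usage: add_apt_source FILE LINE
add_apt_source() {
    list_file=$1
    source_line=$2
    # -x: match the whole line, -F: treat the pattern as a fixed string
    if [ -f "$list_file" ] && grep -qxF -- "$source_line" "$list_file"; then
        return 0
    fi
    printf '%s\n' "$source_line" >> "$list_file"
}

# Example (run as root):
#   list=/etc/apt/sources.list
#   add_apt_source "$list" "deb http://pkg-escience.alioth.debian.org/debian ./"
#   add_apt_source "$list" "deb-src http://pkg-escience.alioth.debian.org/debian ./"
#   apt-get update && apt-get install taverna
```

Running the example twice leaves the file unchanged the second time, which makes it safe to use in setup scripts.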

Work to be done

Direct adoption of upstream packages

The sources provided by the upstream developers can be installed on Debian machines without difficulty, since Linux is a common operating system among them. In that form, however, they are far from acceptable for inclusion in the Debian main distribution. The most pragmatic way to adopt them for Debian is to package the direct results of compiling the upstream source.

Issues for compliance with DFSG and Debian Policy

Status of packages

Core packages

||Package||Status||Comments||
||taverna||initial version in svn||current Taverna 1.0 CVS, apparently works||
||mygrid|| || ||

Otherwise missing libraries

||Package||Status||Comments||
||ensj||local package||outdated||
||martj||local package||outdated||

How to contribute

Technical issues

Communication with upstream sources

In contrast to the general philosophy of Debian packaging, pkg-escience for now strives to track a reasonably recent upstream source.

Preparation of .orig.tar.gz

If a stable upstream release is used, the .orig.tar.gz is exactly the released tarball. Otherwise, it has to be generated from the upstream CVS checkout. The following script performs this task for taverna:

#!/bin/sh
# Update the upstream CVS checkout, which is expected to
# exist locally in cvs_source/taverna1.0, and build the
# .orig.tar.gz from it.

TARFILENAME=taverna_1.3.orig.tar.gz
CVSSOURCEDIR=cvs_source
TAVERNADIR=taverna1.0

(
        cd "$CVSSOURCEDIR" \
        && ( cd "$TAVERNADIR" && cvs update . ) \
        && tar czvf "$TARFILENAME" --exclude=CVS "$TAVERNADIR"
) && mv "$CVSSOURCEDIR/$TARFILENAME" .
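Before uploading such a tarball it can be worth confirming that no CVS metadata slipped in. The sketch below is an illustration only: the function names make_orig_tarball and tarball_is_clean are hypothetical, and the first function simply repeats the tar step of the script in a parameterised form.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; the names are not part of the project.

# Create a gzip-compressed tarball of DIR (located inside PARENT),
# leaving out all CVS directories.
# Usage: make_orig_tarball PARENT DIR TARBALL
make_orig_tarball() {
    parent=$1
    dir=$2
    tarball=$3
    tar -czf "$tarball" --exclude=CVS -C "$parent" "$dir"
}

# Succeed only if the tarball contains no path component named CVS.
# Usage: tarball_is_clean TARBALL
tarball_is_clean() {
    if tar -tzf "$1" | grep -E '(^|/)CVS(/|$)' > /dev/null; then
        return 1
    fi
    return 0
}

# Example:
#   make_orig_tarball cvs_source taverna1.0 taverna_1.3.orig.tar.gz
#   tarball_is_clean taverna_1.3.orig.tar.gz || echo "CVS files found" >&2
```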

Checkout of latest alioth svn changes

 svn co svn+ssh://youraliothID@svn.debian.org/svn/pkg-escience/taverna

Use of svn-buildpackage

Change into the directory of the package checkout and run {{{svn-buildpackage --svn-dont-purge --svn-dont-clean --svn-reuse -rfakeroot}}}
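The build step can be wrapped so that it refuses to run outside a package checkout. This is a sketch under assumptions: the function name build_from_checkout is hypothetical, and the check for debian/changelog is merely a convenient heuristic for "this directory holds the packaging".

```shell
#!/bin/sh
# Hypothetical wrapper for illustration: run svn-buildpackage inside a
# checkout, but only if the directory looks like a Debian package tree.
# Usage: build_from_checkout CHECKOUT_DIR
build_from_checkout() {
    checkout=$1
    if [ ! -f "$checkout/debian/changelog" ]; then
        echo "error: $checkout does not look like a package checkout" >&2
        return 1
    fi
    (
        cd "$checkout" \
        && svn-buildpackage --svn-dont-purge --svn-dont-clean \
               --svn-reuse -rfakeroot
    )
}

# Example (the path depends on how the checkout is laid out locally):
#   build_from_checkout path/to/checkout
```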

How to upload the packaging of a new package to svn

New packages are first submitted to the alioth svn

with svnrepos=svn+ssh://youraliothID@svn.debian.org/svn/pkg-escience.

And on alioth.debian.org

Maintenance of home page

The main description of the project should live on these wiki pages. If you nevertheless feel inclined to update the project home page, please do so by logging in via

$ ssh youraliothID@alioth.debian.org
$ [ -x pkg-escience ] || \
ln -s  \
  /org/alioth.debian.org/chroot/home/groups/pkg-escience .
$ cd pkg-escience/htdocs

That directory contains the [http://pkg-escience.alioth.debian.org index.html], which can be edited freely, and the subfolder debian, which holds the apt repository.

apt repository

As a start, only a repository for architecture-independent packages (Architecture: all) is planned. Help with setting this up properly across architectures is welcome. To upload new packages, first create a new subfolder for the package

$ ssh youraliothID@alioth.debian.org \
  "mkdir pkg-escience/htdocs/debian/packagename"

then scp the files to the destination

$ scp taverna_1.3.orig.tar.gz taverna_1.3-1.cvs20060423* \
  youraliothID@alioth.debian.org:pkg-escience/htdocs/debian/packagename/

and finally update the index files.

$ cat update.sh
#!/bin/bash
apt-ftparchive sources . | tee Sources | gzip -c > Sources.gz
apt-ftparchive packages . | tee Packages | gzip -c > Packages.gz
$ ./update.sh
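After update.sh has run, a quick consistency check can confirm that each compressed index matches its plain counterpart, since a stale .gz file would make apt see outdated package lists. The following is a minimal sketch with hypothetical function names; it only relies on gzip and cmp.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; the names are not part of the project.

# Succeed if NAME.gz in REPO_DIR decompresses to exactly the content of NAME.
# Usage: index_in_sync REPO_DIR NAME
index_in_sync() {
    repo_dir=$1
    name=$2
    [ -f "$repo_dir/$name" ] && [ -f "$repo_dir/$name.gz" ] || return 1
    gzip -dc "$repo_dir/$name.gz" | cmp -s - "$repo_dir/$name"
}

# Check both index files written by update.sh.
# Usage: repo_indexes_in_sync REPO_DIR
repo_indexes_in_sync() {
    index_in_sync "$1" Packages && index_in_sync "$1" Sources
}

# Example, run in the directory where update.sh was executed:
#   repo_indexes_in_sync . || echo "index files are out of date" >&2
```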


CategoryJava