Debian eScience with myGrid and Taverna
Introduction
The term eScience (or e-Science) describes data- and CPU-time-intensive research that is typically performed by integrating resources distributed across the Internet. The participants may be departments of larger corporations or collaborating universities. The term is related to computational grids, but today it is more commonly associated with web services. The United Kingdom has invested substantial resources in the development of an IT infrastructure for eScience applications, and other countries around the globe have followed suit. The most prominent outcome is the [http://www.mygrid.org.uk myGrid] (www.mygrid.org.uk) effort with its workflow tool [http://taverna.sf.net Taverna] ([http://taverna.sf.net taverna.sf.net]).
This page describes the effort to bring the developments of the myGrid eScience project to the Debian Linux distribution. An ["Alioth"] project ([http://alioth.debian.org/projects/pkg-escience/ pkg-escience]) has just been created.
Motivation for Debian Packaging
The DebianScience special interest group describes and provides resources for scientific computing with Debian. Pkg-escience sees itself as a dedicated effort contributing to that group. If all works out, scientific services become easier to provide by linking Debian-based developments to the world via web services and myGrid. Conversely, all myGrid services will be available to researchers using Debian; the focus is so far on [http://en.wikipedia.org/wiki/Bioinformatics bioinformatics] but is not technically constrained to it.
The taverna package is useful already, since one does not need to set up any local services to join in the use and development of the Grid.
Installation
To retrieve the packages created in this project for your local Debian machine (preferably running testing or unstable), please add the following to /etc/apt/sources.list:
deb http://pkg-escience.alioth.debian.org/debian ./
deb-src http://pkg-escience.alioth.debian.org/debian ./
For the Sun JDK also add
deb http://ftp.de.debian-unofficial.org/debian/ unstable main contrib non-free
deb http://ftp.de.debian-unofficial.org/debian/ testing main contrib non-free
Then try apt-get install taverna. Problems may occur if you are running Debian stable. If so, you may want to check whether Debian ["Backports"] ([http://www.backports.org www.backports.org]) provides more recent libraries. Feedback on your experience is welcome.
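The installation steps above can be sketched as a small shell helper. This is only a sketch: the function name write_escience_sources is hypothetical, and it assumes that the installed apt reads additional list files from /etc/apt/sources.list.d instead of requiring edits to /etc/apt/sources.list itself.

```shell
#!/bin/sh
# Hypothetical helper (not part of pkg-escience): write the two apt
# entries for the pkg-escience repository to the file given as $1.
write_escience_sources() {
    target=$1
    cat > "$target" <<'EOF'
deb http://pkg-escience.alioth.debian.org/debian ./
deb-src http://pkg-escience.alioth.debian.org/debian ./
EOF
}

# Example use, as root (the sources.list.d path is an assumption):
#   write_escience_sources /etc/apt/sources.list.d/pkg-escience.list
#   apt-get update && apt-get install taverna
```

Keeping the entries in a file of their own makes it easy to remove the repository again without touching the main sources.list.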
Work to be done
Direct adoption of upstream packages
The sources provided by the upstream developers can be installed on Debian machines without any difficulty, since Linux is a common operating system among them. The result is, however, far from acceptable for inclusion in the Debian main distribution. The most pragmatic way to adopt the software for Debian is to package the direct results of compiling the upstream source.
Issues for compliance with DFSG and Debian Policy
- Addition of new Debian packages. A considerable number of jar files is distributed without reference to the source
- through upstream CVS
- fetched at compile time as specified in build.xml
- Preparation of Documentation
- man pages
- preparation of packages for upstream documentation
Compatibility with free Java runtime environments: currently the Sun SDK from DebianUnofficial (www.debian-unofficial.org) is used.
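As a starting point for the jar file issue listed above, one could take an inventory of the jar files bundled in a source tree, so that each can be matched against an existing or a new Debian package. The following is a minimal sketch; the function name list_bundled_jars is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print every jar file below the directory $1,
# one per line, sorted by path.
list_bundled_jars() {
    find "$1" -type f -name '*.jar' | sort
}

# Example use on the local CVS checkout:
#   list_bundled_jars cvs_source/taverna1.0
#   list_bundled_jars cvs_source/taverna1.0 | grep -c '\.jar$'   # how many
```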
Status of packages
||||||Core packages||
||Package||Status||Comments||
||taverna||initial version in svn||current Taverna 1.0 CVS, apparently works||
||mygrid|| || ||
||||||Otherwise missing libraries||
||Package||Status||Comments||
||ensj||local package||outdated||
||martj||local package||outdated||
How to contribute
- Join as developer on Alioth (optional)
- Send patches or indicate URL with packages of interest
Technical issues
Communication with upstream sources
In marked contrast to the general philosophy of Debian packaging, pkg-escience for now strives to use a reasonably recent upstream source.
Preparation of .orig.tar.gz
If a stable release of the upstream work is used, the .orig.tar.gz is exactly that release. Otherwise, the file has to be created from the upstream CVS source. The following script performs this task for taverna:
# Script to update the upstream CVS source,
# which is expected to exist locally
# in cvs_source/taverna1.0,
# and to prepare the .orig.tar.gz from it.
TARFILENAME=taverna_1.3.orig.tar.gz
CVSSOURCEDIR=cvs_source
TAVERNADIR=taverna1.0
( cd "$CVSSOURCEDIR" \
  && ( cd "$TAVERNADIR" && cvs update . ) \
  && tar czvf "$TARFILENAME" --exclude=CVS "$TAVERNADIR" ) \
&& mv "$CVSSOURCEDIR/$TARFILENAME" .
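To confirm that the --exclude=CVS option did its job, the resulting tarball can be inspected before it is used for a build. The check below is a sketch only; the function name tarball_is_clean is hypothetical and not part of the project's scripts.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 if the tarball $1 lists no CVS
# directories, 1 if it does, and 2 if the tarball cannot be read.
tarball_is_clean() {
    listing=$(tar tzf "$1") || return 2
    if printf '%s\n' "$listing" | grep -Eq '(^|/)CVS(/|$)'; then
        echo "CVS metadata found in $1" >&2
        return 1
    fi
    return 0
}

# Example use:
#   tarball_is_clean taverna_1.3.orig.tar.gz && echo "tarball looks fine"
```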
Checkout of latest alioth svn changes
svn co svn+ssh://youraliothID@svn.debian.org/svn/pkg-escience/taverna
Use of svn-buildpackage
Change the current working directory to that of the checked-out package and run
{{{
svn-buildpackage --svn-dont-purge --svn-dont-clean \
    --svn-reuse -rfakeroot
}}}
How to upload the packaging of a new package to svn
New packages are first submitted to the Alioth svn. Once the package has been built successfully with dpkg-buildpackage, inject it with
svn-inject -v -o package.dsc $svnrepos
where svnrepos=svn+ssh://youraliothID@svn.debian.org/svn/pkg-escience.
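The order of the two steps (build first, inject second) can be enforced with a small guard. This is a sketch under assumptions: the function name inject_if_built is hypothetical, and the only check performed is that the .dsc file produced by the build exists.

```shell
#!/bin/sh
# Hypothetical guard: refuse to call svn-inject unless the .dsc file
# given as $1 exists, i.e. unless a build has produced it.
# $2 is the repository URL, e.g. the svnrepos value shown above.
inject_if_built() {
    dsc=$1
    svnrepos=$2
    if [ ! -f "$dsc" ]; then
        echo "missing $dsc; build the package with dpkg-buildpackage first" >&2
        return 1
    fi
    svn-inject -v -o "$dsc" "$svnrepos"
}

# Example use:
#   inject_if_built package.dsc \
#       svn+ssh://youraliothID@svn.debian.org/svn/pkg-escience
```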
And on alioth.debian.org
Maintenance of home page
The emphasis of the web pages describing the project should be on these wiki pages. If you feel inclined to update the project home page, then please do so by logging in via
$ ssh youraliothID@alioth.debian.org
$ [ -x pkg-escience ] || \
    ln -s \
    /org/alioth.debian.org/chroot/home/groups/pkg-escience .
$ cd pkg-escience/htdocs
That directory contains the [http://pkg-escience.alioth.debian.org index.html], which can be edited ad libitum, and the subfolder debian, which harbors the apt repository.
apt repository
As a start, only a repository for architecture-independent packages (Debian architecture all) is planned. Help with setting this up properly across architectures is welcome. To upload new packages, first create a new subfolder for the package
$ ssh youraliothID@alioth.debian.org \
    "mkdir pkg-escience/htdocs/debian/packagename"
then scp the files to the destination
$ scp taverna_1.3.orig.tar.gz taverna_1.3-1.cvs20060423* \
    youraliothID@alioth.debian.org:pkg-escience/htdocs/debian/packagename/
and finally update the index files.
$ cat update.sh
#!/bin/bash
apt-ftparchive sources . | tee Sources | gzip -c > Sources.gz
apt-ftparchive packages . | tee Packages | gzip -c > Packages.gz
$ ./update.sh
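After the index files have been regenerated, it may be worth checking that the uploaded package actually appears in them. The following sketch reads the compressed Packages index and prints one line per package; the function name list_indexed_packages is hypothetical, and the sketch assumes the usual stanza layout in which a Package: field precedes the Version: field.

```shell
#!/bin/sh
# Hypothetical helper: print "name version" for every stanza in the
# Packages.gz index found in the repository directory $1.
list_indexed_packages() {
    gzip -dc "$1/Packages.gz" | awk '
        /^Package:/ { name = $2 }
        /^Version:/ { print name, $2 }'
}

# Example use, in the directory where update.sh was run:
#   list_indexed_packages . | grep '^taverna '
```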
Related projects in the Debian community
["DebianScience"]
- ["pkg-bioc"]
- ["BOINC"]
[https://alioth.debian.org/projects/pkg-grid/ pkg-grid] Alioth project (appears dormant)