Debian eScience with myGrid and Taverna
Introduction
The term eScience (or e-Science) describes data- and CPU-intensive research that is typically performed by integrating resources across the Internet, for example between departments of a larger corporation or between collaborating universities. The term is related to computational grids, but today it is more commonly associated with web services. The United Kingdom has invested substantial resources in developing an IT infrastructure for eScience applications, and other countries around the globe have followed suit. The most prominent outcome is the [http://www.mygrid.org.uk myGrid] effort with its workflow tool [http://taverna.sf.net Taverna].
This page describes the effort to package the software of the myGrid eScience project for the Debian Linux distribution. An ["Alioth"] project ([http://alioth.debian.org/projects/pkg-escience/ pkg-escience]) has just been created.
Motivation for Debian Packaging
The DebianScience special interest group describes and provides resources for scientific computing with Debian. DebianMed, a CustomDebian distribution, strives to make Debian a one-stop shop for biomedical applications, including bioinformatics. Pkg-eScience sees itself as a dedicated effort contributing to both. If all works out, scientific services become easier to provide, because Debian-based developments are linked to the world via web services and myGrid. Conversely, all myGrid services will be available to researchers using Debian; their focus is so far on [http://en.wikipedia.org/wiki/Bioinformatics bioinformatics] but is not technically restricted to it.
The taverna package is already useful today, since one does not need to set up any local services to join in Grid use and development.
Installation
To retrieve the packages created in this project for your local Debian machine (preferably running testing or unstable), add the following to /etc/apt/sources.list:
deb http://pkg-escience.alioth.debian.org/debian ./
deb-src http://pkg-escience.alioth.debian.org/debian ./
For the Sun JDK, also add:
deb http://ftp.de.debian-unofficial.org/debian/ unstable main contrib non-free
deb http://ftp.de.debian-unofficial.org/debian/ testing main contrib non-free
Then run apt-get update followed by apt-get install taverna. Problems may occur if you are running Debian stable. If so, check whether Debian ["Backports"] ([http://www.backports.org www.backports.org]) offers more recent libraries, and please report what you find. To contribute to the packaging or to change the upstream sources, see the section "Installation from Source" on the Debian Wiki pages of ["BOINC"].
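The installation steps above can be sketched as a small shell helper. This is only a sketch under stated assumptions: the function name add_escience_sources is hypothetical, and the target file is passed as an argument so that the helper can be tried on a scratch file before it touches the real APT configuration.

```shell
#!/bin/sh
# Hypothetical helper (not part of pkg-escience): append the two
# pkg-escience APT entries quoted above to the file given as $1.
add_escience_sources() {
    cat >> "$1" <<'EOF'
deb http://pkg-escience.alioth.debian.org/debian ./
deb-src http://pkg-escience.alioth.debian.org/debian ./
EOF
}
```

Run as root, a call such as `add_escience_sources /etc/apt/sources.list` followed by `apt-get update` and `apt-get install taverna` would mirror the manual procedure. The entries for the Sun JDK are left out here and would have to be added in the same way.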
Work to be done
Direct adoption of upstream packages
The sources provided by the upstream developers can be installed on Debian machines without difficulty, since Linux is a common operating system among them. In that form, however, they are far from acceptable for inclusion in the Debian main distribution. The most pragmatic way to adopt them for Debian is to package the direct results of compiling the upstream source.
Issues for compliance with DFSG and Debian Policy
- Addition of new Debian packages. A considerable number of jar files are distributed without reference to their source:
- through upstream CVS
- fetched at compile time as specified in build.xml
- Preparation of Documentation
- man pages
- preparation of packages for upstream documentation
- Compatibility with free Java runtime environments. Currently the Sun JDK 1.5 from DebianUnofficial (www.debian-unofficial.org) is used.
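To get an overview of the jar files named in the list above, a source tree can be scanned with a short shell sketch. Both function names are hypothetical, and the second function assumes that downloads at compile time are expressed with Ant's `<get>` task in build.xml, which may not cover every case upstream.

```shell
#!/bin/sh
# Hypothetical helper: list every .jar file below the directory $1, sorted,
# so that each one can be matched against an existing or new Debian package.
list_bundled_jars() {
    find "$1" -type f -name '*.jar' | sort
}

# Hypothetical helper: show the lines of any build.xml below $1 that use
# Ant's <get> task, i.e. files fetched at compile time.
list_ant_downloads() {
    find "$1" -type f -name build.xml -exec grep -Hn '<get ' {} +
}
```

Calling both functions on an unpacked upstream source directory gives a first checklist of binaries that need a source reference or a separate package.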
Package-specific TODO list
Moved to ["pkg-escience/todo"].
Overview on status of packages
Core packages
||Package||apt||svn||Comments||DFSG||
||taverna||x||x||current Taverna 1.0 CVS, apparently works||no||
||mygrid||-||-|| || ||
Otherwise missing libraries
||Package||apt||svn||Comments||DFSG||
||ensj||x||x||compiles with Taverna||no||
||martj||x||x||compiles with Taverna||no||
||biojava||x||x||compiles with Taverna||almost||
||bytecode||x||x||compiles with Taverna||almost||
||freefluo||x||x||decided for wrong source||no||
||uddi4j||x||x||compiles with Taverna||almost||
||icu4j||x||x||works with Taverna||almost||
||wsdl4j||x||x||compiles with axis||almost||
||axis||x||x||untested||no||
||jastor||-||-||requires more recent Jena than distributed with upstream Taverna||jdk compatibility test remaining for "contrib"||
||jena||-||-||requires more libraries, unclear compatibility with Debian libxercesImpl, compiles with jastor||no||
How to contribute
- Join
- as developer on Alioth (optional)
- on the [http://lists.alioth.debian.org/mailman/listinfo/pkg-escience-devel mailing list]
- Send patches or point us to URLs of packages of interest
Guidelines for development
Moved to ["pkg-escience/develguide"]
Related projects
in the Debian community
- ["DebianScience"] Wiki page
- ["pkg-bioc"] Wiki page accompanying the BioConductor and R Debian packaging project
- [http://alioth.debian.org/projects/pkg-emboss/ pkg-emboss] Alioth project (dormant, sadly)
- ["BOINC"]
- [https://alioth.debian.org/projects/pkg-grid/ pkg-grid] Alioth project (appears dormant)
- [https://alioth.debian.org/projects/pkg-scicomp/ pkg-scicomp] Alioth project on scientific computing
and outside of Debian
- [http://www.mygrid.org.uk MyGrid.org.uk] - the upstream page
- [http://www.vl-e.com/ Virtual Laboratory for e-Science] - VL-Eers - are you reading this?
- [http://www.trianacode.org/ Triana] - Another workflow management environment with ties to several grids