Differences between revisions 6 and 7
Revision 6 as of 2016-02-17 14:18:33
Size: 4835
Comment: Added reference
Revision 7 as of 2016-02-18 20:15:43
Size: 4837
Comment:

The CommonWorkflowLanguage (CWL) is an effort to establish a formal specification of workflows that is accepted by a range of workflow editing tools, fostering the exchange of expertise between otherwise divided communities. The effort, now at its third draft (https://github.com/common-workflow-language/common-workflow-language/tree/master/draft-3), is slightly dominated by bioinformaticians and their wealth of smallish tools, services and data that interact in everyone's daily routine. Debian is a good collection of those tools, and DebianMed grants easy access to a bunch of constructive individuals, which makes Debian something of a co-anchor for the workflow community.

This page is meant as an open invitation to describe your personal approach towards the CWL. Let this be a collection of smallish tutorials and of issues you ran into, so these can be discussed and addressed. If we do this right, then this is community science at its best.

Installation

Prerequisites

sudo apt-get install git git-buildpackage debhelper dh-python python python-all python-setuptools python-rdflib-jsonld python-schema-salad python-shellescape

and as runtime requirements for the cwltool, also install

sudo apt-get install python-rdflib python-rdflib-jsonld python-requests python-schema-salad python-yaml

A Debian package of cwltool, the reference implementation of a CWL-compatible workflow engine, is about to be uploaded to the Debian repository. In the meantime, build and install it with

gbp clone https://anonscm.debian.org/git/debian-med/cwltool.git
cd cwltool
gbp buildpackage -rfakeroot
cd ..
sudo dpkg -i cwltool_*deb

For a range of example workflows and tool descriptions, look at https://github.com/common-workflow-language/workflows and get that whole repository to your local drive via

cd # no arguments, will change to home directory
git clone https://github.com/common-workflow-language/workflows.git

The tutorial will refer to this directory as ~/workflows.

Hello World

In ~/workflows/workflows/hello, a seemingly innocent 'Hello World' boosts motivation with a quick success. From anywhere, invoke:

cwltool ~/workflows/workflows/hello/hello.cwl#main

and you get something like

/usr/bin/cwltool 1.0.20160203221531
[job step0] /tmp/tmpIor1Ju$ echo 'Hello World' > /tmp/tmpIor1Ju/messageout.txt
[workflow main] outdir is /home/moeller
Final process status is success
{
    "output": {
        "path": "/home/moeller/messageout.txt", 
        "checksum": "sha1$648a6a6ffffdaa0badb23b8baf90b6168dd16b3a", 
        "class": "File", 
        "size": 12
    }
}

Indeed, the file 'messageout.txt' holds the output

Hello World
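
The checksum reported in the JSON output above can be cross-checked by hand. The following is a minimal sketch, assuming that sha1sum from coreutils is available and that the file holds exactly the 12 bytes written by echo (11 characters plus a newline); the variable names are chosen for illustration only.

```shell
#!/bin/sh
# Expected SHA-1, copied from the "checksum" field of the cwltool output
# shown above (without the "sha1$" prefix).
expected=648a6a6ffffdaa0badb23b8baf90b6168dd16b3a

# Recreate the 12 bytes that "echo 'Hello World'" writes:
# 11 characters plus the trailing newline.
actual=$(printf 'Hello World\n' | sha1sum | cut -d' ' -f1)

if [ "$actual" = "$expected" ]; then
    echo "checksum matches"
else
    echo "checksum differs: $actual"
fi
```

To check the real file instead of the recreated bytes, feed messageout.txt from the directory reported as outdir to sha1sum in place of the printf.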

The #main may seem somewhat unmotivated. It names the entry point within the workflow.

Questions

Q: When looking at a workflow, how do I know what binaries to install?

A: This is missing/work in progress. The ELIXIR database, a catalog of bioinformatics tools and services, knows about Debian packages, and the CWL will eventually refer to that database. Retrieving the Debian packages to install in an automated fashion is a bit of a workflow of its own. For the time being, it may be preferable to just run "apt-cache search" or "apt-file" on the names of the binaries stated in the workflow.
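
The manual lookup suggested above could be sketched roughly as follows. This is only an illustration under assumptions: the helper missing_binaries and the name no-such-tool-xyz are made up for this example, the binary names are assumed to have been copied by hand from the tool descriptions, and apt-file needs to be installed with "apt-file update" run once before its suggestions can be followed.

```shell
#!/bin/sh
# Hypothetical helper: print each given binary name that is not on the PATH.
missing_binaries() {
    for name in "$@"; do
        command -v "$name" >/dev/null 2>&1 || printf '%s\n' "$name"
    done
}

# Binary names as one would copy them by hand from a workflow's tool
# descriptions; "no-such-tool-xyz" is a made-up name for illustration.
for name in $(missing_binaries sh no-such-tool-xyz); do
    # Only print a suggestion; the actual search is left to the reader.
    echo "not installed: $name; try: apt-file search --regexp \"bin/$name\$\""
done
```

Printing the suggestion rather than running apt-file keeps the sketch usable on machines where apt-file is not yet set up.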

Q: Which workflow engines other than cwltool can I experiment with?

A: The cwltool lists all alternatives on its home page (http://commonwl.org). None of these is packaged up to Debian standards, and Java is tricky at times, but Bio-Linux has directly installable .debs of Taverna and Galaxy. Whoever aims at bringing more workflow tools to Debian will find help right here.

Q: How do I find a good workflow for my problem?

A: The CWL per se is a language in which workflows can be formulated. In particular, it invites the reuse of partial workflows and their adaptation to local problems, but it is not a workflow repository. Such repositories shall surface at different places. SeqAnswers should provide one, for instance, or we can look at transforming what myExperiment.org is collecting, among others. That said, the CWL community jointly maintains a series of workflows at https://github.com/common-workflow-language/workflows, relying on the search routines provided by GitHub.

References

These resources may be of interest to skim through

and emerging to be linked up