The CommonWorkflowLanguage (CWL) is an effort to establish a formal specification of workflows that is accepted by a range of workflow editing tools, fostering the exchange of expertise between otherwise divided communities. The effort, now at its first full release (http://www.commonwl.org/v1.0/), is driven largely by bioinformaticians, whose daily routine involves a wealth of small tools, services and data. Debian is a good collection of those tools, and DebianMed gives easy access to a group of constructive individuals, which helps anchor the workflow community in Debian.

This page is meant as an open invitation to describe your personal approach towards the CWL. Let this be a collection of small tutorials and of issues you ran into, so that these can be discussed and addressed. If we do this right, then this is community science at its best.

Installation

Prerequisites

sudo apt-get install cwltool

For a range of example workflows and tool descriptions, look at https://github.com/common-workflow-language/workflows and get that whole repository to your local drive via

cd # no arguments, will change to home directory
git clone https://github.com/common-workflow-language/workflows.git

The tutorial will refer to this directory as ~/workflows.

Hello World

In ~/workflows/workflows/hello, a seemingly innocent 'Hello World' boosts motivation with a quick success. From anywhere, invoke:

cwltool ~/workflows/workflows/hello/hello.cwl#main

and you get something like

/usr/bin/cwltool 1.0.20160203221531
[job step0] /tmp/tmpIor1Ju$ echo 'Hello World' > /tmp/tmpIor1Ju/messageout.txt
[workflow main] outdir is /home/moeller
Final process status is success
{
    "output": {
        "path": "/home/moeller/messageout.txt", 
        "checksum": "sha1$648a6a6ffffdaa0badb23b8baf90b6168dd16b3a", 
        "class": "File", 
        "size": 12
    }
}

Indeed, the file 'messageout.txt' holds the output as

Hello World
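
The "checksum" and "size" fields in the output above can be compared against the file on disk. The following is a minimal sketch, assuming sha1sum from coreutils is available; verify_cwl_output is a hypothetical helper name for this page and not part of cwltool.

```shell
# Sketch: check a file against the "checksum" and "size" values reported by cwltool.
# verify_cwl_output is a hypothetical helper, not a cwltool command.
verify_cwl_output() {
    vco_file=$1
    vco_expected=${2#'sha1$'}   # strip the "sha1$" prefix used in the output object
    vco_size=$3
    vco_actual=$(sha1sum "$vco_file" | cut -d' ' -f1)
    vco_actual_size=$(wc -c < "$vco_file" | tr -d ' ')
    [ "$vco_actual" = "$vco_expected" ] && [ "$vco_actual_size" = "$vco_size" ]
}

# Usage, with the values from the run shown above:
#   verify_cwl_output ~/messageout.txt 'sha1$648a6a6ffffdaa0badb23b8baf90b6168dd16b3a' 12 && echo OK
```

The function returns success only if both the SHA-1 digest and the byte count match.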

The #main may seem somewhat unmotivated. It names the entry point within the workflow file.

Packaging

CWL workflows have evolved to separate

To ease the organisation of workflows within Debian and avoid redundancies, we suggest shipping tool packages with a CWL description. These may be

These .cwl files are not centrally organised today. Besides what can be found in many dispersed git repositories, collections of CWL files are offered on:

Find an up-to-date list of sources for CWL tool descriptions via http://www.commonwl.org/#Repositories_of_CWL_Tools_and_Workflows

Once a series of tool descriptions is distributed with Debian source packages, which can be inspected on salsa.debian.org, Debian's source repository comes close to being a catalog of descriptions in its own right.

Tools that help with the creation of CWL tool descriptions, analogous to help2man, are emerging. Such auto-created CWL files shall follow the following naming schemes, so that they can be distinguished from curated interfaces and updated automatically.

Examples for shipping CWL files with packages are:

Authors of workflows will have their CWL files include these descriptions from the standard location /usr/share/commonwl.
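
To see which descriptions are installed, the standard location can simply be searched. This is a minimal sketch; list_cwl_descriptions is a hypothetical helper name, and the layout below /usr/share/commonwl is not assumed beyond files ending in .cwl.

```shell
# Sketch: list CWL descriptions below a directory (default: /usr/share/commonwl).
# list_cwl_descriptions is a hypothetical helper name used on this page only.
list_cwl_descriptions() {
    lcd_dir=${1:-/usr/share/commonwl}
    [ -d "$lcd_dir" ] || return 0   # directory absent: nothing installed yet
    find "$lcd_dir" -type f -name '*.cwl' | sort
}

# Usage:
#   list_cwl_descriptions
#   list_cwl_descriptions ~/workflows/tools
```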

There is no exact policy yet on how to package workflows that integrate these tool descriptions to perform analyses. Since such workflows cannot be assigned to individual tools, we propose embedding them individually in Debian packages.

Questions

Q: When looking at a workflow, how do I know what binaries to install?

A: This is missing and work in progress. The ELIXIR database, a catalog of bioinformatics tools and services, knows about Debian packages, and the CWL eventually refers to that database. Retrieving the Debian packages to install in an automated way is a bit of a workflow of its own. For the time being, it may be preferable to just use "apt-cache search" or "apt-file" on the names of the binaries stated in the workflow.
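
The manual lookup can be scripted roughly as follows. This is only a sketch under assumptions: it reads the binary names from inline "baseCommand:" entries (for example "baseCommand: echo" or "baseCommand: [tar, xf]"), it does not handle YAML block lists, and both function names are hypothetical. The second function needs apt-file with an up-to-date cache ("sudo apt-file update").

```shell
# Sketch: print the first word of each inline "baseCommand:" entry in a CWL file.
# Assumes inline forms only; YAML block lists are not handled.
cwl_base_commands() {
    sed -n -E 's/^[[:space:]]*baseCommand:[[:space:]]*//p' "$1" \
        | tr -d '[]"'"'" \
        | sed -E 's/[,[:space:]].*$//' \
        | grep -v '^$' \
        | sort -u
}

# Sketch: ask apt-file which packages ship a binary of that name.
cwl_suggest_packages() {
    cwl_base_commands "$1" | while read -r csp_cmd; do
        echo "== $csp_cmd"
        apt-file search --regexp "/bin/${csp_cmd}\$"
    done
}

# Usage:
#   cwl_suggest_packages path/to/tool.cwl
```

The result is a list of candidate packages per binary, which still has to be reviewed by hand.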

Q: Which workflow engines other than cwltool can I experiment with?

A: The CWL home page (http://commonwl.org) lists all alternatives. None of these is packaged to Debian standards, and Java is tricky at times, but Bio-Linux has directly installable .debs of Taverna and Galaxy. Whoever aims at bringing more workflow tools to Debian will find help right here.

Q: How do I find a good workflow for my problem?

A: The CWL per se is a language in which a workflow can be formulated. In particular, it invites reusing partial workflows and adapting them to local problems, but it is not a workflow repository. Those shall surface in different places, however. For instance, SeqAnswers should provide such workflows, or we can look at transforming what myExperiment.org is collecting, among others. That said, the CWL community jointly maintains a series of workflows at https://github.com/common-workflow-language/workflows, falling back on the search routines provided by GitHub.
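
With the repository cloned to ~/workflows as described under Installation, a local search is an alternative to the GitHub search. A minimal sketch, assuming GNU grep; find_cwl_by_keyword is a hypothetical helper name.

```shell
# Sketch: list .cwl files below a directory that mention a keyword (case-insensitive).
# Defaults to the local clone at ~/workflows.
find_cwl_by_keyword() {
    fck_keyword=$1
    fck_root=${2:-$HOME/workflows}
    grep -r -i -l --include='*.cwl' -e "$fck_keyword" "$fck_root" | sort
}

# Usage:
#   find_cwl_by_keyword samtools
```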

References

These resources may be of interest to skim through

and emerging to be linked up