Notes on the external tool discussion at the Debian Med meeting in January 2011.

External tool invocation discussion

Ways of calling tools are described by "use case descriptions". The use case descriptions can be held in a registry. For example,

http://usecase.taverna.org.uk/sharedRepository/index.php

The KnowARC project developed a plugin for Taverna that can read the use case descriptions and allows you to include calls to a tool in a workflow.

You need to know how/where to call the tool - the "invocation mechanism". There are currently three options for the invocation mechanism:

In the current plugin, the setting for how to call the tool is shared by all external tool services in a workflow, e.g. all the tools are run locally.

Planned improvement

Taverna will manage a set of invocation environments that are named and identified by a UUID, for example "fred" and "62A81F2F-4C3D-4C0C-ACF1-681327130328". (The UUID may be changed to being a URL.)

An external tool service in a workflow will state the invocation environment that the tool will be run in.

An invocation environment specifies the settings for the various invocation mechanisms and also which invocation mechanism is currently in use, e.g. ssh is configured to go to phoebus.cs.man.ac.uk but the environment is currently set to run locally. So a workflow can have some services using environment "fred", some "bob" and some "jim".

The invocation environment manager allows users to edit the settings and also to change which mechanism is used. This makes it easy to change where a set of tools will be run, e.g. to change all services using environment "bob" to run locally.
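The planned arrangement can be sketched roughly as follows. This is only an illustration of the idea, not Taverna code: the class names, field names and the "tool_a"/"tool_b" service names are assumptions made for the example.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class InvocationEnvironment:
    """A named invocation environment (all field names are hypothetical)."""
    name: str
    # Settings for each mechanism, e.g. {"ssh": {"host": "..."}, "local": {}}
    mechanisms: dict
    # The mechanism that is currently used for this environment
    active: str
    # Identifier; the notes say this may later become a URL instead
    id: str = field(default_factory=lambda: str(uuid.uuid4()))


class EnvironmentManager:
    """Holds the environments and lets the user switch mechanisms."""

    def __init__(self):
        self._by_name = {}

    def add(self, env):
        self._by_name[env.name] = env

    def get(self, name):
        return self._by_name[name]

    def set_active(self, name, mechanism):
        env = self._by_name[name]
        if mechanism not in env.mechanisms:
            raise ValueError(f"{name!r} has no settings for {mechanism!r}")
        env.active = mechanism


manager = EnvironmentManager()
manager.add(InvocationEnvironment(
    name="bob",
    mechanisms={"ssh": {"host": "phoebus.cs.man.ac.uk"}, "local": {}},
    active="ssh",
))

# Each service only names its environment (service names are made up).
services = {"tool_a": "bob", "tool_b": "bob"}

# One change moves every service that uses "bob" to local execution.
manager.set_active("bob", "local")
```

Because the services refer to the environment by name rather than carrying their own settings, the single call to set_active changes where both of them run.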

There is also a proposal to understand the idea of test and production invocation environments. So, "fred" can be set to run services locally during test and on a grid during production. The choice of whether to run a workflow in a test or production mode will be made when the workflow is run.
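As a rough sketch of the proposal, assuming each environment simply maps a run mode to a mechanism name (the RunMode enum, the table and the mechanism_for function are all hypothetical and not part of any existing Taverna API):

```python
from enum import Enum


class RunMode(Enum):
    TEST = "test"
    PRODUCTION = "production"


# Hypothetical per-environment table: which mechanism to use in each mode.
MODE_MECHANISMS = {
    "fred": {RunMode.TEST: "local", RunMode.PRODUCTION: "grid"},
}


def mechanism_for(environment, mode):
    """Resolve the mechanism for an environment when the workflow is run."""
    return MODE_MECHANISMS[environment][mode]


# The mode is chosen at run time, not stored in the workflow.
test_mechanism = mechanism_for("fred", RunMode.TEST)
production_mechanism = mechanism_for("fred", RunMode.PRODUCTION)
```

The point of the sketch is that the workflow itself stays unchanged between test and production; only the mode passed in at run time differs.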

It is not clear how the choice of mode will be shown to the user.

A workflow run may vary the data that it uses according to the run mode, e.g. to use different data during test and production. There needs to be explicit support for this in the workflow, but it is not clear how this would be done.

Ongoing issues

External tool description

There is an XML format for specifying the use case descriptions. We have also looked at:

The current plan is to translate the EMBOSS ACD descriptions and put them in a repository. Future work will be done to ensure that the external tool capability is sufficient.

Additional invocation mechanisms

For Taverna 2.3, we will include local and ssh invocation as part of the release. The KnowARC invocation will probably be made available as a plugin.

Need to look at cloud invocation soon.

Invocation environment checking

There are currently some limited ways of specifying what needs to be on the machine where a tool will be run, and also of checking whether the tool can be run there. Need to look at more general ways of specifying this.
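One very simple form of such a check, shown here only as an illustration, is to test whether the commands a tool needs are on the PATH. The sketch below assumes the check runs on the machine in question; for ssh or grid invocation the same test would have to be carried out remotely. The function name and the example command name are made up.

```python
import shutil
import sys


def missing_commands(required):
    """Return the required command names that cannot be found on PATH."""
    return [cmd for cmd in required if shutil.which(cmd) is None]


# The running Python interpreter is used here as a command known to exist;
# the second name is deliberately bogus.
missing = missing_commands([sys.executable, "no-such-tool-xyz-12345"])
can_run = not missing
```

A more general scheme would also need to cover things like library versions and data files, which a PATH lookup cannot express.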

Sensible handling of data

We want to minimize the transfer of data. So data stays, where possible, with the tools that will use it, and tools are invoked where the data is.

Need to extend Taverna's data handling mechanisms to deal with this. Need to improve some of the invocation mechanisms to better decide where to run the tools.
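To make the aim concrete, here is a minimal sketch of one possible placement rule: run the tool on the candidate host that requires the fewest bytes to be moved. It assumes that the location and size of each input are known in advance, which Taverna's data handling does not currently provide; the function, host names and sizes are invented for the example.

```python
def choose_host(candidate_hosts, inputs):
    """Pick the host that needs the least data transferred to it.

    inputs is a list of (host_where_data_lives, size_in_bytes) pairs.
    Ties are resolved in favour of the earlier candidate.
    """
    def bytes_to_move(host):
        # Only inputs that are not already on this host must be copied.
        return sum(size for location, size in inputs if location != host)

    return min(candidate_hosts, key=bytes_to_move)


# A large input already on the grid node and a small one held locally.
inputs = [("gridnode", 5_000_000_000), ("local", 10_000)]
best = choose_host(["local", "gridnode"], inputs)
```

In this example the tool would be run on "gridnode", since only the small local input then has to be copied.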