Debian Package Integration for Biomedical Research

Debian now provides a good collection of programs for working with molecular data, and the internet offers an enormous wealth of data to browse through. The tools are often designed to consume data from databases, or to produce data that is eventually stored in databases, but they do not work with each other. For researchers, however, it is the instant comparison of results from in silico analysis with wet-lab findings (local or from the web) that helps generate new ideas, which are then tested in the lab. Sadly, most currently available tools do not support such a back-and-forth between databases and algorithms. For instance, it is left to the researcher to memorize features of DNA (sequence variations, exon/intron boundaries, ...) and to keep track of how their locations shift between runs of computational tools.
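The bookkeeping that is currently left to the researcher can be illustrated with a small sketch. It assumes that a tool was given a cut-out region of a chromosome and reports positions relative to that region, using 1-based inclusive coordinates; the names Region and to_genomic are hypothetical and not taken from any existing tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A stretch of a chromosome that was cut out and handed to a tool (hypothetical helper)."""
    start: int       # 1-based genomic position of the first base
    end: int         # 1-based genomic position of the last base (inclusive)
    strand: int = 1  # +1 for the forward strand, -1 for the reverse complement


def to_genomic(region: Region, local_pos: int) -> int:
    """Translate a 1-based position within the region back to a genomic position."""
    length = region.end - region.start + 1
    if not 1 <= local_pos <= length:
        raise ValueError(f"position {local_pos} lies outside a region of length {length}")
    if region.strand >= 0:
        # Forward strand: local position 1 is the first base of the region.
        return region.start + local_pos - 1
    # Reverse strand: local position 1 is the last genomic base of the region.
    return region.end - local_pos + 1


if __name__ == "__main__":
    forward = Region(start=1_000_001, end=1_005_000)
    reverse = Region(start=1_000_001, end=1_005_000, strand=-1)
    print(to_genomic(forward, 250))
    print(to_genomic(reverse, 250))
```

If every tool reported genomic rather than local positions, or announced the region it was working on, this translation would no longer have to happen in the researcher's head.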

Workflow suites like Taverna have addressed that problem, but are not suitable for interactive use on the desktop. On the desktop, D-Bus (the desktop bus) handles communication between programs. The mobile community has extended that concept to describe more complex constructs, such as mobile phones with their resources (GPS, Wi-Fi, ...) and the data those devices deliver. The GSoC student shall investigate to what degree this concept can be adapted for the communication of bioinformatics data between applications. This project shall provide a proof of concept that a few lines of extra code can bring an enormous boost in productivity to wet-lab researchers.
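What exactly would travel between applications is part of the investigation. One conceivable shape is a small self-describing record per sequence feature. The sketch below only shows encoding and decoding such a record; the field names and the choice of JSON are illustrative assumptions, and the actual transport over D-Bus is deliberately left out.

```python
import json

# Fields assumed for this illustration; a real message format is still to be designed.
REQUIRED_FIELDS = {"seq_id", "start", "end", "kind", "source"}


def encode_feature(seq_id, start, end, kind, source):
    """Serialise one sequence feature into a string that a sender could publish."""
    record = {"seq_id": seq_id, "start": start, "end": end, "kind": kind, "source": source}
    return json.dumps(record, sort_keys=True)


def decode_feature(message):
    """Parse a received message and reject records with missing fields."""
    record = json.loads(message)
    missing = REQUIRED_FIELDS - set(record)
    if missing:
        raise ValueError(f"feature message lacks fields: {sorted(missing)}")
    return record


if __name__ == "__main__":
    message = encode_feature("chr7", 117_120_017, 117_120_201, "exon", "example-sender")
    print(message)
    print(decode_feature(message)["kind"])
```

A receiving program that understands such a record could, for example, highlight the announced region in its own view without the user re-entering coordinates.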

The student shall investigate the program dotter of the acedb project (Debian package prepared in Debian Med) and provide DNA feature annotation on the basis of the Ensembl database. Dotter provides an intuitive overview of putative DNA features. The annotation from databases then (possibly/hopefully) provides explanations of what is seen, which also educates the researcher and points to similarly obvious features that are still to be described. Time permitting, this effort shall then be generalized to arbitrary DAS (Distributed Annotation System) sources.
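To give an idea of the data involved, the following sketch extracts features from a DAS-style XML document using only the Python standard library. The sample document is hand-written and modelled on the DAS 1.x features response as far as remembered here (SEGMENT, FEATURE, TYPE, START, END, ORIENTATION); element names should be checked against the specification, and no server is contacted. The function name parse_das_features is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written sample, NOT real annotation data; the layout is an assumption
# modelled on a DAS 1.x "features" response.
SAMPLE = """
<DASGFF>
  <GFF version="1.0">
    <SEGMENT id="7" start="1000" stop="2000">
      <FEATURE id="feat1" label="first exon">
        <TYPE id="exon">exon</TYPE>
        <START>1100</START>
        <END>1250</END>
        <ORIENTATION>+</ORIENTATION>
      </FEATURE>
      <FEATURE id="feat2" label="a variation">
        <TYPE id="variation">variation</TYPE>
        <START>1400</START>
        <END>1400</END>
        <ORIENTATION>0</ORIENTATION>
      </FEATURE>
    </SEGMENT>
  </GFF>
</DASGFF>
"""


def parse_das_features(xml_text):
    """Return one dict per FEATURE element found in the document."""
    root = ET.fromstring(xml_text)
    features = []
    for segment in root.iter("SEGMENT"):
        for feat in segment.findall("FEATURE"):
            features.append({
                "segment": segment.get("id"),
                "id": feat.get("id"),
                "type": feat.findtext("TYPE", default="").strip(),
                # START and END are assumed to be present in every feature.
                "start": int(feat.findtext("START")),
                "end": int(feat.findtext("END")),
                "orientation": feat.findtext("ORIENTATION", default="0").strip(),
            })
    return features


if __name__ == "__main__":
    for feature in parse_das_features(SAMPLE):
        print(feature["segment"], feature["type"], feature["start"], feature["end"])
```

Records of this kind, whether obtained from Ensembl or from another DAS source, are what would be laid over the regions displayed by dotter.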