Linking Applications with Data
Mentor: SteffenMoeller
Summary: Further develop / integrate efforts to manage public data with Debian
Required skills:
- Perl + good general programming
- English
Description: Many applications in scientific environments ship publicly available data with their source code. This data may be
- directly related to the application's core functionality (e.g. a description of restriction enzymes' sequence specificity),
- part of the documentation (examples).
Such data should
- be installed only once,
- have updates supported and triggerable, e.g. by cron.
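The install-once and update-on-demand behaviour listed above could be sketched roughly as follows. This is only an illustration, not how getData or BioMaj work: the function names, the `.sha256` stamp file and the use of a local source file (standing in for a download) are all assumptions made for the example.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def install_or_update(source: Path, target_dir: Path) -> bool:
    """Copy `source` into `target_dir` unless an identical copy is recorded.

    Returns True if the data was installed or updated, False if nothing
    had to be done. The stamp file name is a hypothetical convention.
    """
    target_dir.mkdir(parents=True, exist_ok=True)
    stamp = target_dir / (source.name + ".sha256")
    new_sum = sha256_of(source)
    if stamp.exists() and stamp.read_text().strip() == new_sum:
        return False  # already installed, unchanged upstream
    shutil.copy2(source, target_dir / source.name)
    stamp.write_text(new_sum + "\n")
    return True
```

Because the function is a no-op when nothing has changed, a script calling it can be run from cron as often as desired; the return value tells the caller whether any downstream post-processing is needed.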
Many larger databases have a considerable number of tools associated with them. A data management tool needs to be aware of those dependencies and perform the respective post-processing with every update. The Debian Med community has developed getData [1] for this purpose. Recently, the tool BioMaj was added to the distribution. The student shall investigate a series of scenarios in which the UniProt [2] database is used and automate the respective downstream work with EMBOSS and NCBI BLAST. Further thinking shall involve
- how to install several versions in parallel
- how to prepare Debian packages for individual databases that may function as build/runtime dependencies
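One possible way to think about parallel versions and per-update post-processing is sketched below, under several assumptions: each release lives in its own directory, a `current` symlink selects the active one, and a small registry maps a database to the commands to run after an update. The directory layout, the registry and all function names are hypothetical. The `makeblastdb` call uses the NCBI BLAST+ options `-in`, `-dbtype` and `-out`; an EMBOSS indexing step would be registered in the same way and is left out here rather than guessing its invocation. The sketch only plans the commands and does not execute them.

```python
import os
from pathlib import Path

# Hypothetical registry: database name -> command templates to run after
# each update. Placeholders are filled in by plan_post_processing().
POST_PROCESSING = {
    "uniprot_sprot": [
        ["makeblastdb", "-in", "{fasta}", "-dbtype", "prot",
         "-out", "{release_dir}/uniprot_sprot"],
    ],
}


def release_dir(root: Path, database: str, version: str) -> Path:
    """Directory holding one release, e.g. <root>/uniprot_sprot/<version>."""
    return root / database / version


def plan_post_processing(database: str, fasta: Path, rel_dir: Path) -> list:
    """Return the concrete commands for one release, without running them."""
    return [
        [part.format(fasta=fasta, release_dir=rel_dir) for part in template]
        for template in POST_PROCESSING.get(database, [])
    ]


def activate(root: Path, database: str, version: str) -> Path:
    """Point the `current` symlink at `version`, leaving other releases intact."""
    if not release_dir(root, database, version).is_dir():
        raise FileNotFoundError(release_dir(root, database, version))
    link = root / database / "current"
    tmp = root / database / "current.tmp"
    if tmp.is_symlink():
        tmp.unlink()
    os.symlink(version, tmp)  # relative link to the sibling directory
    os.replace(tmp, link)     # rename over the old link in one step
    return link
```

Tools that need a fixed release can refer to the versioned directory directly, while everything else follows `current`. How such a layout would map onto Debian packages used as build or runtime dependencies is left open by this sketch.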
[1] http://wiki.debian.org/getData
[2] http://www.uniprot.org