Biological databases manager
Mentor: DebianMed
Summary: Download, process, manage and integrate biological databases into Debian.
Required skills:
- Familiarity with one programming or scripting language.
- Familiarity with Debian packaging.
- Bioinformatics.
Description:
Biological databases are typically too big and too volatile to fit the traditional source/binary packaging scheme of Debian. Bioinformatic programs that are distributed in Debian, especially blast and emboss, can index databases, but Debian lacks a tool to install or update datasets and keep their indexing in sync. The student will have to write a program with the following features:
- Download, update or remove a database from a remote repository.
- Use an official list of locations, and accept additional user-supplied locations.
- Detect which Debian packages are installed locally, and process the databases accordingly (mostly by indexing).
- Manage the dependencies between data sets (for instance, microRNAs from release X of miRBase are mapped on versions Y and Z of the human and mouse genomes; the program should propose to install the genomes at the relevant version when installing miRBase).
- Let users keep several versions of a database installed at the same time.
- Suggest that users install the Debian packages that are most useful for managing the databases they downloaded.
- Be aware of disk space issues, and let the user change the location of the databases.
- (optional) Have a graphical user interface.
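As an illustration of the dependency feature above, here is a minimal sketch of how the program could compute which data sets to install, and in which order. It is only one possible approach, not a required design. The catalog layout, the function name install_plan and the data set names are hypothetical, and the versions X, Y and Z are the placeholders used in the example above, not real release numbers.

```python
from typing import Dict, List, Set, Tuple

# A data set is identified by (name, version). Every entry below is a
# hypothetical placeholder, not a real release number.
DataSet = Tuple[str, str]

# Hypothetical catalog: each data set maps to the data sets it depends on.
CATALOG: Dict[DataSet, List[DataSet]] = {
    ("mirbase", "X"): [("human-genome", "Y"), ("mouse-genome", "Z")],
    ("human-genome", "Y"): [],
    ("mouse-genome", "Z"): [],
}


def install_plan(target: DataSet,
                 catalog: Dict[DataSet, List[DataSet]],
                 installed: Set[DataSet]) -> List[DataSet]:
    """Return the data sets to install for `target`, dependencies first.

    Data sets that are already installed are skipped.
    """
    plan: List[DataSet] = []
    visiting: Set[DataSet] = set()  # data sets on the current search path

    def visit(item: DataSet) -> None:
        if item in installed or item in plan:
            return
        if item in visiting:
            raise ValueError(f"dependency cycle at {item}")
        if item not in catalog:
            raise KeyError(f"unknown data set {item}")
        visiting.add(item)
        for dep in catalog[item]:
            visit(dep)
        visiting.discard(item)
        # Appended only after all of its dependencies are in the plan.
        plan.append(item)

    visit(target)
    return plan
```

Calling install_plan(("mirbase", "X"), CATALOG, set()) lists both genomes before miRBase itself. If one genome is already installed at the required version, it is left out of the plan. Because the version is part of the identifier, several versions of the same database can coexist in the catalog.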
To apply, please contact us at debian-med@lists.debian.org.