Mole

Mole is a work-in-progress QA project. See also ["CRMI"].

The goal of Mole is to have one central location where information about packages and other Debian-related objects (such as bugs or mirrors) can be stored.

Mole is currently being worked on by ["Jeroen"] van Wolffelaar as part of his [http://code.google.com/soc/debian/appinfo.html?csaid=31AA1D661D273528 Google Summer of Code project].

See ["Mole/Development"] for a wiki page listing the current development status.

What is Mole?

Mole is intended to be an easily accessible piece of infrastructure where anyone can add data repositories and have data submitted, in various easy ways, into readily available data storage types. All this data is then easily and efficiently available, both through programmatic microqueries or a web interface and as whole datasets, including replication. In addition, Mole provides infrastructure for initiating datamining: generating data by having specific code run over each result from another table, for example.
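The datamining idea described above (running specific code over each result from one table to generate another) can be illustrated with a minimal Python sketch. This is not Mole's actual interface: the in-memory `Table` type, the `derive_table` helper, the `count_depends` miner and the sample control data are all hypothetical names and values chosen for illustration only.

```python
from typing import Callable, Dict

# Hypothetical stand-in for a Mole table: maps a key (here a package name)
# to a stored value (here the text of a control stanza).
Table = Dict[str, str]


def derive_table(source: Table, miner: Callable[[str, str], str]) -> Table:
    """Run `miner` over every row of `source` and collect the results in a new table."""
    return {key: miner(key, value) for key, value in source.items()}


def count_depends(package: str, control: str) -> str:
    """Example miner: count the comma-separated entries on the Depends line."""
    for line in control.splitlines():
        if line.startswith("Depends:"):
            entries = line[len("Depends:"):].split(",")
            return str(len([e for e in entries if e.strip()]))
    return "0"


# Made-up sample data, not taken from the real archive.
controls: Table = {
    "foo": "Package: foo\nDepends: libc6, libbar1\n",
    "baz": "Package: baz\n",
}

# The "interesting bit" is only the miner; the loop over rows is generic.
depends_count = derive_table(controls, count_depends)
```

In this sketch the author of a new check only writes a function like `count_depends`; iterating over the source table and storing the results is the generic part that the text expects the infrastructure to take care of.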

Advantages

  1. Random ideas for archive-wide checks, datamining on all bugs, and so on become very easy for any DD to implement without having to program the 'boring' infrastructure around them; one only needs to program the interesting bits.
  2. Results of existing QA and other datamining efforts and archive checks are made easily available to anyone: to humans via the Mole web interface, and for further automatic processing via a couple of standard interfaces. This includes lintian results, results of various rebuild efforts, piuparts, but also bug summaries, extraction of changelog files, dependency checks, etc.
  3. Powerful new possibilities arise to combine existing information in new ways without the need to coerce information into compatible formats.
  4. Existing and future data gathering can easily be made to also process secondary archives, such as security.debian.org, volatile and backports, without the need to specifically target those archives.

Sorts of information available

There are several classes of information:

Storage formats

Multiple storage types are possible. At the moment two are defined, both for 'fixed' types of information (information that doesn't change over time), such as "the control file out of a source package", as opposed to, for example, the result of "rebuilding the package".
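What 'fixed' information implies for storage can be sketched as a write-once key/value store: once a value is recorded for a key, it may be read or re-submitted unchanged, but never replaced. The two storage types Mole actually defines are not described on this page, so the `FixedStore` class, its methods and the sample key below are purely hypothetical.

```python
class FixedStore:
    """Hypothetical write-once store for information that never changes over time."""

    _MISSING = object()  # sentinel so that None can still be stored as a value

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        """Store `value` under `key`; refuse to overwrite it with a different value."""
        existing = self._data.get(key, self._MISSING)
        if existing is not self._MISSING and existing != value:
            raise ValueError(f"fixed entry {key!r} already has a different value")
        self._data[key] = value

    def get(self, key):
        """Return the stored value; raises KeyError if the key is unknown."""
        return self._data[key]


# Made-up example: the control file of one specific source package version
# is 'fixed', because that version's contents never change.
store = FixedStore()
store.put(("hello", "1.0-1"), "Source: hello\n")
```

Output of something like a package rebuild would not fit this model, since repeating the rebuild later can legitimately give a different result for the same key.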

Examples

For the raw data, see: http://qa.debian.org/data/mole/db

For a very slim web interface, see: http://qa.debian.org/cgi-bin/mole

More information

The code is available to Debian Developers at merkel:/org/qa.debian.org/mole. It is also in Subversion: svn.debian.org, repository "qa", subdir "mole".

The primary author is ["Jeroen"] van Wolffelaar <jeroen@debian.org>