Mole development page

Each section here lists a development task. Tasks are listed roughly in the order in which they are planned to be worked on, and each should typically take at most one day of work.

See Mole for a general description of Mole.

Near future

Define terminology

Need to come up with a non-confusing terminology, particularly for datasets (dataset, database, table, ?) and for the collection of such tables that vary only in their parameters (e.g., all packagelists, as opposed to the packagelist for binaries in unstable).

Status: the terminology in use is currently inconsistent; ideas exist

Rest of Google SOC plan

Implement stacking

Most notably, tables that build upon others need a better mechanism than bluntly redoing their work periodically: the fact that input data changed needs to be propagated in some way.

Status: ideas are being pondered, nothing concrete coded yet
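One possible way to propagate changes, shown here only as a minimal sketch and not as the design Mole settled on: record which tables each table reads from, and when an input changes, walk that graph to find everything that needs rebuilding. The function name `stale_tables` and the table names are hypothetical.

```python
from collections import defaultdict, deque


def stale_tables(dependencies, changed):
    """Return every table that must be rebuilt after `changed` was updated.

    `dependencies` maps a table name to the list of tables it reads from.
    """
    # Invert the mapping: for each input, which tables build on it?
    dependents = defaultdict(set)
    for table, inputs in dependencies.items():
        for source in inputs:
            dependents[source].add(table)

    # Breadth-first walk from the changed table through its dependents.
    stale, queue = set(), deque([changed])
    while queue:
        current = queue.popleft()
        for table in dependents[current]:
            if table not in stale:
                stale.add(table)
                queue.append(table)
    return stale


# Hypothetical table names, for illustration only.
deps = {
    "lintian-results": ["packagelist"],
    "lintian-summary": ["lintian-results"],
    "popcon": [],
}
print(sorted(stale_tables(deps, "packagelist")))
```

With this approach only the tables downstream of a change are touched, instead of every stacked table being recomputed on a timer.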

Implement the web interface

Status: a dumping web interface exists; real work starts Monday, August 6th

Implement HTTP-submission with authentication

Implement HTTP-authentication on the work distributor

Integrate whatever other existing datasets are available in QA and elsewhere and are not yet in mole, as far as time permits

Write a user manual

Finalize the dev overview (generally made while programming)

Package mole and make it host-agnostic, FHS-compliant

Consider any-DD creation of databases without QA-group interference

Consider an email submission interface

More ideas

Add config checker

The config checker should also warn about unknown or unused configuration stanzas, as they might indicate typos.

Status: not started
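As a rough illustration of the typo check, the sketch below compares the keys found in a configuration against a set of known keys and suggests the closest known name. All option names here are made up for the example; Mole's real option names may differ.

```python
import difflib


def unknown_keys(config, known):
    """Map each unrecognised key in `config` to its closest match in `known`."""
    report = {}
    for key in config:
        if key not in known:
            # Empty list when nothing in `known` is similar enough.
            report[key] = difflib.get_close_matches(key, sorted(known), n=1)
    return report


# Hypothetical option names, for illustration only.
KNOWN = {"Dir::Base", "Dir::Log", "Log::Detail"}
config = {"Dir::Base": "/srv/mole", "Log::Detial": "debug"}

for key, guesses in unknown_keys(config, KNOWN).items():
    hint = f" (did you mean {guesses[0]}?)" if guesses else ""
    print(f"warning: unknown option {key}{hint}")
```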

Add logging configurability

For each dataset, it should be possible to define logging behaviour: log detail, and whether and for how long incoming update files are to be retained.
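A per-dataset logging policy could be modelled roughly as follows. This is a sketch under assumptions: the class name `LoggingPolicy`, the field names, and the detail levels are hypothetical, and using `None` to mean "do not retain update files" is just one possible convention.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class LoggingPolicy:
    # Hypothetical detail levels: "quiet", "info", "debug".
    detail: str = "info"
    # How long incoming update files are kept; None means not at all.
    retain_updates: Optional[timedelta] = None

    def should_delete(self, received: datetime, now: datetime) -> bool:
        """True if an update file received at `received` may be removed."""
        if self.retain_updates is None:
            return True
        return now - received > self.retain_updates


policy = LoggingPolicy(detail="debug", retain_updates=timedelta(days=7))
print(policy.should_delete(datetime(2007, 8, 1), datetime(2007, 8, 9)))
```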

Define configuration format

Need to come up with a good definition for the configuration file. It'll be dak-like (using apt's config file parser).

Status: definition-by-example mostly done; still need a proper definition and documentation

Implement configuration parser

The configuration needs to be parsed into a datastructure that the rest of mole can easily use.

Status: the current system works. A proper parser would still be nice to have, so that configuration handling is more reliable and replaceable, but it is not at all a priority
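apt-style configuration keys are hierarchical, with levels separated by `::`. A data structure that the rest of the code can use easily might be a nested dictionary built from such keys, as in the sketch below. The helper name `nest` and the example keys are hypothetical, and the sketch assumes that no key is both a value and a parent of other keys.

```python
def nest(flat):
    """Turn {"Dir::Log": "x"} into {"Dir": {"Log": "x"}}."""
    tree = {}
    for key, value in flat.items():
        *parents, leaf = key.split("::")
        node = tree
        # Descend through the parent levels, creating them as needed.
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return tree


# Hypothetical keys, for illustration only.
flat = {"Dir::Base": "/srv/mole", "Dir::Log": "/srv/mole/log", "Name": "mole"}
print(nest(flat))
```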

Do something about not-yet-created tables

When tables do not yet exist, you get backtraces when trying to read them (for example, in todo code, or various other places). Best would be to have some clean way to simply get an empty table in that case, and/or to make sure that tables are always created if mentioned in the config.

Status: with a hack in, not an urgent issue anymore
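The "empty table instead of a backtrace" behaviour can be sketched as below. SQLite is used here purely as a stand-in so that the example is runnable; nothing is assumed about Mole's actual storage backend, and `read_table` is a hypothetical helper.

```python
import sqlite3


def read_table(conn, table):
    """Return all rows of `table`, or an empty list if it was never created."""
    exists = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    ).fetchone()
    if exists is None:
        return []
    # Quote the identifier; embedded double quotes are doubled.
    quoted = '"' + table.replace('"', '""') + '"'
    return conn.execute(f"SELECT * FROM {quoted}").fetchall()


conn = sqlite3.connect(":memory:")
print(read_table(conn, "todo"))  # table missing: empty result, no exception
conn.execute("CREATE TABLE todo (key TEXT, value TEXT)")
conn.execute("INSERT INTO todo VALUES ('a', '1')")
print(read_table(conn, "todo"))
```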

Cleanup older cruft

Cruft can accumulate in several places. Ways to clean it up need to be added there:


But kept for historic reasons, for now.

Cleanup code

The current version in subversion contains a number of hacks and shortcuts; those need to be cleared away or generalized.

Status: mostly done; not worth a mention anymore

Make path configuration fully configurable

Ensure you do not need to edit the code if you install mole in a different location. Mole should start out by reading $HOME/.mole.conf (which can of course be a symlink), and all future paths are defined in that config file or files referenced by the config file.

Status: Done, for mole itself (exceptions are logrotate.conf and worker-config)
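Locating the top-level configuration file could look roughly like this. It is a sketch only: the function name `config_path` is hypothetical, and the fallback to `os.path.expanduser` when `HOME` is unset is an assumption of this example.

```python
import os


def config_path(environ=None):
    """Return the resolved location of $HOME/.mole.conf."""
    environ = os.environ if environ is None else environ
    home = environ.get("HOME") or os.path.expanduser("~")
    # realpath follows a symlinked .mole.conf to the file it points at.
    return os.path.realpath(os.path.join(home, ".mole.conf"))


print(config_path())
```

Every other path would then be read from that file, or from files it references, so that no location is hard-coded.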

Move package-specific code into mole jobs

Mole itself doesn't need to know about package files and archives; that work can (and should) be put into a separate mole job.

Status: done

Implement transitional datatypes

Some (many) datatypes are 'transitional', that is, the value for a given key can change over time, and it makes sense to keep a history of those changes.

Status: done, though the query interface is not complete yet
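To illustrate what keeping a history per key means, here is a minimal in-memory sketch. The class `History` and its methods are hypothetical and do not reflect how Mole stores transitional data; timestamps are plain numbers for simplicity.

```python
import bisect


class History:
    """Keeps every value a key has had, with the time each one took effect."""

    def __init__(self):
        self._times = {}   # key -> sorted list of timestamps
        self._values = {}  # key -> values, parallel to _times

    def set(self, key, value, when):
        times = self._times.setdefault(key, [])
        values = self._values.setdefault(key, [])
        # Insert in timestamp order so lookups can use binary search.
        index = bisect.bisect_right(times, when)
        times.insert(index, when)
        values.insert(index, value)

    def at(self, key, when):
        """Value of `key` as of time `when`, or None if it had none yet."""
        times = self._times.get(key, [])
        index = bisect.bisect_right(times, when)
        return self._values[key][index - 1] if index else None


h = History()
h.set("pkg", "1.0", when=10)
h.set("pkg", "1.1", when=20)
print(h.at("pkg", 15))
```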

Implement mass-submission

There needs to be a way to submit many key/values in one go, to reduce load.

Status: done
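On the submitting side, grouping key/value pairs into batches might look like the sketch below. The helper name `batches` and the batch size are hypothetical; how a batch is actually transmitted is left out.

```python
from itertools import islice


def batches(pairs, size):
    """Yield dicts of at most `size` key/value pairs taken from `pairs`."""
    iterator = iter(pairs)
    while True:
        chunk = dict(islice(iterator, size))
        if not chunk:
            return
        yield chunk


pairs = [("a", 1), ("b", 2), ("c", 3)]
for batch in batches(pairs, 2):
    print(batch)  # each batch would be sent as a single submission
```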

Deliver talk in Edinburgh

I'm going to give a talk in Edinburgh about mole, on Friday June 22nd.

Status: done

Implement everything else needed to at least get a database for lintian results

Status: Well, this works again.