Package Repository Analysis and Migration Automation
Mentor
Neil Williams codehelp@debian.org (codehelp on IRC, typically #emdebian, #debian-arm, #debian-uk and #debian-soc)
Synopsis
Emdebian uses a filter to select packages from the main Debian repositories that are considered useful to embedded devices, excluding the majority of packages. The results of processing the filter are automated but maintaining the filter list is manual. This project seeks to automate certain elements of the filtering process to cope with three specific conditions:
- Packages which have been removed from Debian need to be removed from the filter - on a per suite basis.
- Packages which have been added to Debian to meet the dependency requirements of other packages already in the filter need to be added to the filter.
- Packages need to migrate from unstable into testing in a manner that ensures that all dependencies are met in each suite.
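The first two conditions reduce to set operations on package names, evaluated once per suite. The following is a minimal sketch in Python, assuming the filter and the Debian package names have already been loaded into sets and that dependencies are available as a plain mapping. The function names and data shapes are hypothetical and are not part of the existing Emdebian tooling.

```python
def removals(filter_names, debian_names):
    """Names still in the filter that no longer exist in this Debian suite."""
    return sorted(filter_names - debian_names)


def additions(filter_names, debian_names, depends):
    """Names outside the filter that packages in the filter depend on.

    `depends` maps a package name to the set of names it depends on.
    The walk is transitive, so the dependencies of newly added packages
    are picked up as well.
    """
    wanted = set()
    queue = list(filter_names & debian_names)
    while queue:
        name = queue.pop()
        for dep in depends.get(name, set()):
            if dep in debian_names and dep not in filter_names and dep not in wanted:
                wanted.add(dep)
                queue.append(dep)
    return sorted(wanted)


# Toy data for a single suite (illustrative only).
suite = {"busybox", "libc6", "dropbear", "zlib1g"}
emdebian_filter = {"busybox", "dropbear", "oldpkg"}
deps = {"busybox": {"libc6"}, "dropbear": {"libc6", "zlib1g"}}

print(removals(emdebian_filter, suite))         # ['oldpkg']
print(additions(emdebian_filter, suite, deps))  # ['libc6', 'zlib1g']
```

Running the same two functions against each suite in turn would give the per-suite removal and addition lists.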
Benefits to Debian
The aim is to produce three daily lists:
- source package names which need to be removed from each suite,
- source package names which need to be added to each suite and
- source package names to migrate between suites.
This will allow Emdebian to manage its own package repository quickly and effectively. Right now too much manual effort goes into maintaining these lists. Automating the migration of packages will save hours of work that are better spent on improving other features of Debian.
Deliverables
A Debian package to run on the Emdebian server as a cron task.
Project schedule
May 24 - Begin designing application structure and necessary algorithms
June 7 - Begin coding data package parser portion of the application
June 14 - Rigorous testing and documentation of first portion of the application
June 16 - Submit midterm evaluations
June 21 - Begin coding package comparison portion of the project
June 28 - Rigorous testing and documentation of second portion of the application
July 5 - Begin coding migration validation portion of the project
July 12 - Rigorous testing and documentation of third portion of the application
July 19 - Begin coding dependency satisfaction portion of the project
July 26 - Rigorous testing and documentation of the final portion of the application
August 2 - Ensure validity of output produced and make necessary corrections
August 9 - Finish writing application and spend time improving documentation and writing any necessary user guides
August 20 - Submit final evaluation
Project details
A lot of work has been done by the EDOS project on modeling the dependencies between packages, using OCaml. A similar approach is needed to calculate the list of candidate packages which can be migrated at the same time.
Undoubtedly the most complex part of the project is to calculate the testing migrations where several criteria must be met:
- Version in Debian unstable must match version in Debian testing
- Version in Debian unstable must match version in Emdebian unstable
- Version in Emdebian unstable must be newer than Emdebian testing
- All architectures are compared, including source.
- All dependencies must migrate together, adding new packages to the filter where necessary.
(Emdebian versions use a suffix which needs to be handled before comparing version strings against Debian.)
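The suffix handling and version comparison could be sketched as follows. This is an illustration only: it assumes the Emdebian suffix takes the form "em" followed by digits at the end of the version string (the exact format is an assumption here), and it re-implements the Debian ordering rules (epoch, upstream version, revision, with "~" sorting before everything) in a simplified way. All function names are hypothetical.

```python
import re
from itertools import zip_longest

# Assumption: the Emdebian suffix looks like "em1" at the end of the version.
EMDEBIAN_SUFFIX = re.compile(r"em\d+$")


def strip_emdebian_suffix(version):
    """Return the version with a trailing Emdebian suffix removed."""
    return EMDEBIAN_SUFFIX.sub("", version)


def _order(ch):
    """Sort weight of one non-digit character: '~' first, then letters, then the rest."""
    if ch == "~":
        return -1
    if ch.isalpha():
        return ord(ch)
    return ord(ch) + 256


def _compare_part(a, b):
    """Compare two upstream or revision strings; returns -1, 0 or 1."""
    pairs_a = re.findall(r"(\D*)(\d*)", a)
    pairs_b = re.findall(r"(\D*)(\d*)", b)
    for (text_a, num_a), (text_b, num_b) in zip_longest(
        pairs_a, pairs_b, fillvalue=("", "")
    ):
        # Pad the shorter non-digit run with 0, the weight of "end of string".
        width = max(len(text_a), len(text_b))
        key_a = [_order(c) for c in text_a] + [0] * (width - len(text_a))
        key_b = [_order(c) for c in text_b] + [0] * (width - len(text_b))
        if key_a != key_b:
            return -1 if key_a < key_b else 1
        # Digit runs compare numerically; an empty run counts as 0.
        int_a, int_b = int(num_a or 0), int(num_b or 0)
        if int_a != int_b:
            return -1 if int_a < int_b else 1
    return 0


def _split(version):
    """Split a version into (epoch, upstream, revision)."""
    epoch, rest = version.split(":", 1) if ":" in version else ("0", version)
    upstream, sep, revision = rest.rpartition("-")
    if not sep:
        upstream, revision = rest, ""
    return int(epoch), upstream, revision


def compare_versions(a, b):
    """Return -1, 0 or 1 as version a is older than, equal to or newer than b."""
    epoch_a, up_a, rev_a = _split(a)
    epoch_b, up_b, rev_b = _split(b)
    if epoch_a != epoch_b:
        return -1 if epoch_a < epoch_b else 1
    return _compare_part(up_a, up_b) or _compare_part(rev_a, rev_b)


# An Emdebian version matches its Debian parent once the suffix is stripped.
print(compare_versions(strip_emdebian_suffix("1.2.3-1em1"), "1.2.3-1"))  # 0
print(compare_versions("1.10-1", "1.9-1"))                                # 1
```

For production use, a maintained implementation of the comparison (for example the one shipped in python-debian) would be preferable to a hand-written one; the sketch only shows where the suffix stripping fits in.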
The resulting code needs to run on a server as an automated task, using minimal resources and in a shorter time frame than can be achieved with the current Perl support.
So, the process by which packages are determined to be accepted is as follows:
- Emdebian uses Debian as its 'parent'. The first criterion is that the package has already satisfied the criteria for Debian, and this is a simple data parsing operation on the Packages files. If a package in file A is the same package as in file B, then that package has (at some point) previously met the criteria for Debian and is a candidate for us. The number of packages which fail this initial test varies according to where Emdebian is in the release cycle: after a release this number is very high; during a freeze it can be very low (single digits).
- Next, a simple check on our own Packages files (files C and D) shows whether we need to bother with this package or whether the work has already been done by a previous run. Many more packages drop out at this test. Debian has >20,000 packages and Emdebian only cares about ~2,000, so 90% of the work has gone by the time this criterion is completed.
- Whether we have the right package for a migration. The following check is performed:
  - Package in Debian unstable == Emdebian unstable, and
  - Package in Debian testing != Emdebian testing.
- Now that the packages have been filtered, the dependency solving maths needs to be done. The dependencies of the package in unstable need to exist at the correct versions in testing. If this test fails, the package drops through to the "missing dependency" output - once missing packages are sorted out, the next run will be able to migrate the package.
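The steps above can be sketched as a small pipeline. This is a simplified illustration under several assumptions: the four indexes stand for the Debian unstable, Debian testing, Emdebian unstable and Emdebian testing Packages files; Emdebian version suffixes have already been stripped; versions are compared for equality only; and the dependency test merely checks that the first alternative of each Depends clause is present in Emdebian testing, ignoring version constraints and co-migration. The function names and sample data are hypothetical.

```python
import re


def parse_packages(text):
    """Return {package name: {field: value}} from the text of a Packages file."""
    index = {}
    for block in text.strip().split("\n\n"):
        fields, key = {}, None
        for line in block.splitlines():
            if line[:1] in (" ", "\t") and key:
                fields[key] += "\n" + line.strip()  # continuation line
            elif ":" in line:
                key, _, value = line.partition(":")
                fields[key] = value.strip()
        if "Package" in fields:
            index[fields["Package"]] = fields
    return index


def depends_names(stanza):
    """Dependency names: first alternative only, version constraints dropped."""
    names = []
    for clause in stanza.get("Depends", "").split(","):
        first = clause.split("|")[0].strip()
        if first:
            names.append(re.split(r"[\s(:]", first, maxsplit=1)[0])
    return names


def version(index, name):
    stanza = index.get(name)
    return stanza["Version"] if stanza else None


def classify(deb_unstable, deb_testing, em_unstable, em_testing):
    """Return (names to migrate, {name: missing dependencies})."""
    migrate, missing = [], {}
    for name in sorted(em_unstable):
        du, dt = version(deb_unstable, name), version(deb_testing, name)
        eu, et = version(em_unstable, name), version(em_testing, name)
        if du is None or du != dt:
            continue  # has not (yet) met the criteria for Debian
        if et == dt:
            continue  # already handled by a previous run
        if eu != du:
            continue  # Emdebian unstable is out of step with Debian
        absent = [d for d in depends_names(em_unstable[name]) if d not in em_testing]
        if absent:
            missing[name] = absent  # the "missing dependency" output
        else:
            migrate.append(name)
    return migrate, missing


# Toy Packages data (illustrative only).
DEBIAN = """\
Package: busybox
Version: 1.15.3-1
Depends: libc6 (>= 2.7)

Package: dropbear
Version: 0.52-5
Depends: libc6 (>= 2.7), zlib1g
"""
EM_TESTING = """\
Package: busybox
Version: 1.14.2-2

Package: dropbear
Version: 0.52-4

Package: libc6
Version: 2.10.2-6
"""

deb = parse_packages(DEBIAN)
migrate, missing = classify(deb, deb, deb, parse_packages(EM_TESTING))
print(migrate)  # ['busybox']
print(missing)  # {'dropbear': ['zlib1g']}
```

Ordering the cheap equality checks first means most packages are discarded before any dependency work is done, which matches the filtering order described above.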
Application Information
Name: Ricardo O'Donell
Contact/Email: ricardo@odonell.ca, IRC Nick rodonell #debian-soc
Background: I’m completing my final year in a concurrent degree in Electrical Engineering and Computer Science at The University of Western Ontario in Canada. I have also completed a 16-month internship at 3M Canada as a programmer analyst, along with a summer working in the IT department at the university writing applications in Perl. Between the internship and the CS degree, the majority of my programming experience is in C, C++ and Java, but on my own I’ve dabbled in a variety of other platforms and languages.
Travel: It will be difficult to attend DebConf10 as I have a wedding to attend on August 1st. If possible I could leave on the 2nd and attend the rest of the conference.
Other summer plans: Besides minor weekend trips, nothing that would conflict with my work schedule.
Exams and other commitments: I have no conflicts or commitments.
If you are not a Debian Developer: Between work and school I've never had the time to sit down and actively contribute to an open-source project before. I see this as an opportunity to finally sit down and put my full effort into a project which has real benefits. I would like to continue to support the community and help mentor other students in the future.