Student Application Template
Name: Bogdan Purcăreață
IRC: dodgerblue on OFTC, Freenode
- bogdan.purcareata AT gmail DOT com
- Senior undergraduate student at the Computer Science and Automatic Control Faculty, Politehnica University of Bucharest
- Knowledge of C (8 years), as well as C++, Java and Python
- Knowledge of Algorithm Design, Data Structures and Project Development Workflow
- Knowledge of Compilers, Operating Systems, Networking, Distributed Systems
- Highly dependable and efficiency oriented professional
- Ambitious, focused and enduring individual
- Familiar with the concepts of Open Source Development - I've taken an Open Source Development Course, organized by ROSEdu.
- Refactoring, optimizing and unifying the metadata acquire system for APT would significantly improve the whole Debian user experience, as well as improve the Package Management System's consistency. To me, this is both a thrilling challenge and a great opportunity to analyze the Debian OS internals and core features.
Project title: Pluggable Acquire-System for APT
- Debian has developed several tools to manage packages, each one having its own way of handling metadata. This results in a mixed package management system, prone to inconsistencies and to a loss in overall OS performance. The aim of this project is to build a broad image of the package metadata locally, so that all the information is kept in one place and is updated at the same time. By choosing which tools to use, the user tells the manager what metadata to acquire, and therefore how detailed a local image of the Debian Archive they wish to interact with.
Several tools for package management - apt-get, debtags, apt-file.
- Each one handles its own set of package metadata, therefore its own view of the remote Debian Archive:
apt-get: Release, Packages, Sources, Translations.
debtags: Tags (facets and tags).
- Each one requires individual updating and interaction.
- Private parsing of the sources.list file in apt-get.
Performance: minimal bandwidth usage by efficient diffs and broad local package metadata.
Effectiveness: the user defines which components the acquire system will use, and the system only uses these components.
Scalability: the system is pluggable.
Forward compatibility: the system will have a generic design, open for future development.
Backward compatibility: the system should not break the existing interfaces, and the transition to the new system should be as transparent as possible to the user.
Openness: providing a public parser for the sources.list file, that other package management tools can use instead of inventing their own.
The Enhanced sources.list File:
- stores additional information for each URL, besides suite and area - e.g. the enabled plugins.
- remains compatible with the current private parser of apt-get update, which understands the present format only - this can be solved in the format of the sources.list itself, or with a patch for the current parser.
- another option to enhance the functionality of this static file is to store additional plugin info in a separate directory - e.g. /etc/sources.list.d/plugins/. The parser will scan the contents of this directory to fetch additional relevant info.
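A minimal sketch of the first option, where the plugin list is stored inline in each sources.list entry. This is only an illustration: the `plugins=` option name, the function name `parse_source_line` and the example URL are assumptions made for this sketch, not part of any existing APT format or API.

```python
def parse_source_line(line):
    """Parse one sources.list-style line into a dict.

    Returns None for blank lines and comments. The "plugins" option is a
    hypothetical extension used only for this sketch.
    """
    # Drop trailing comments and surrounding whitespace.
    line = line.split("#", 1)[0].strip()
    if not line:
        return None

    options = {}
    # Optional "[key=value ...]" block placed after the type field.
    if "[" in line and "]" in line:
        head, rest = line.split("[", 1)
        opt_text, tail = rest.split("]", 1)
        for item in opt_text.split():
            key, _, value = item.partition("=")
            options[key] = value
        line = head + tail

    fields = line.split()
    if len(fields) < 3:
        raise ValueError("expected: type uri suite [area ...]")
    kind, uri, suite, *areas = fields

    # Hypothetical option: comma-separated names of the enabled plugins.
    plugins = [p for p in options.pop("plugins", "").split(",") if p]
    return {
        "type": kind,
        "uri": uri,
        "suite": suite,
        "areas": areas,
        "plugins": plugins,
        "options": options,
    }


# Example entry; the URL is a placeholder.
entry = parse_source_line(
    "deb [plugins=debtags,apt-file] http://deb.example.org/debian stable main contrib"
)
print(entry["plugins"])
```

A line without the bracketed block parses to an entry with an empty plugin list, so entries in the present format stay readable by this sketch.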
The Enhanced Public Parser:
- is implemented using the libapt API.
- provides a parsing API for the package management tools frontends.
- represents the pluggable component of the system - plugins are registered at install time with a default configuration. The user may handle plugin management via an interface.
- comes with apt-get update old parser functionality by default.
- supports a generic plugin model for new types of metadata.
- there are two ways of categorizing the plugins:
I suggest the second one is used, due to finer granularity.
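The plugin registration described above could look roughly like the sketch below. All class and method names (`MetadataPlugin`, `PluginRegistry`, `metadata_kinds`, `wanted_metadata`) are hypothetical and chosen for illustration; the metadata names are the ones listed earlier for apt-get and debtags. The real implementation would be built on the libapt API.

```python
class MetadataPlugin:
    """Hypothetical plugin model: a plugin names the metadata it needs."""

    name = "base"

    def metadata_kinds(self):
        raise NotImplementedError


class AptGetPlugin(MetadataPlugin):
    name = "apt-get"

    def metadata_kinds(self):
        return ["Release", "Packages", "Sources", "Translations"]


class DebtagsPlugin(MetadataPlugin):
    name = "debtags"

    def metadata_kinds(self):
        return ["Tags"]


class PluginRegistry:
    """Keeps the registered plugins and which of them the user has enabled."""

    def __init__(self):
        self._plugins = {}
        self._enabled = set()

    def register(self, plugin, enabled=True):
        # Called at install time; the default configuration enables the plugin.
        self._plugins[plugin.name] = plugin
        if enabled:
            self._enabled.add(plugin.name)

    def set_enabled(self, name, enabled):
        # Called by the configuration interface when the user toggles a plugin.
        if name not in self._plugins:
            raise KeyError(name)
        if enabled:
            self._enabled.add(name)
        else:
            self._enabled.discard(name)

    def wanted_metadata(self):
        # Union of the metadata kinds of all enabled plugins, without duplicates.
        wanted = []
        for name in sorted(self._enabled):
            for kind in self._plugins[name].metadata_kinds():
                if kind not in wanted:
                    wanted.append(kind)
        return wanted


registry = PluginRegistry()
registry.register(AptGetPlugin())
registry.register(DebtagsPlugin())
registry.set_enabled("debtags", False)
print(registry.wanted_metadata())
```

With debtags disabled, only the apt-get metadata remains in the list, which matches the goal that the system uses only the components the user asked for.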
The Unified Metadata Backend:
- invokes the parser to build an index of the desired metadata to fetch from the Debian Archive.
- is responsible for fetching the metadata from the Debian Archive, processing it, and returning it to apt-get update.
- implements efficient transport mechanisms.
- implements security enforcement mechanisms.
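As a rough sketch of the first step of the backend, the function below turns parsed sources and the metadata requested by each enabled plugin into one deduplicated fetch index. The function name `build_fetch_index`, the dictionary layout, the "example-tool" plugin and the URL are assumptions for illustration; transport and security checks are left out.

```python
def build_fetch_index(sources, plugin_metadata):
    """Return a sorted list of (uri, suite, kind) tuples to fetch.

    sources: list of dicts with "uri", "suite" and "plugins" keys.
    plugin_metadata: maps a plugin name to the metadata kinds it needs.
    A metadata file requested by several plugins appears only once.
    """
    index = set()
    for src in sources:
        for plugin in src["plugins"]:
            # Unknown plugin names contribute nothing in this sketch.
            for kind in plugin_metadata.get(plugin, []):
                index.add((src["uri"], src["suite"], kind))
    return sorted(index)


# Placeholder data; "example-tool" is a made-up plugin that also needs Packages.
plugin_metadata = {
    "apt-get": ["Release", "Packages", "Sources", "Translations"],
    "debtags": ["Tags"],
    "example-tool": ["Packages"],
}
sources = [
    {
        "uri": "http://deb.example.org/debian",
        "suite": "stable",
        "plugins": ["apt-get", "debtags", "example-tool"],
    }
]
for item in build_fetch_index(sources, plugin_metadata):
    print(item)
```

Collecting the requests in a set is what lets several tools share one download of the same file, which is the bandwidth saving the unified backend aims for.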
Benefits to Debian OS:
- Improving the package management system translates into improving the fundamental layer of the Debian OS.
- Better bandwidth usage.
- Fewer configuration and temporary files, all kept in one place.
- Scalability of the metadata acquire system.
- Better metadata cohesion.
- Future package management tools won't be forced to build their own metadata framework - they will just have to come up with a plugin.
Benefits to Debian Community:
- Popularity through usability.
- Popularity through performance.
- Integration with other communities through AppStream.
A new and enhanced format for the sources.list file and the additional information (the enhanced sources.list file).
A public, pluggable parser, capable of understanding this new format (the enhanced public parser).
An insightful configuration interface for the parser's plugin management (the enhanced public parser interface)
An efficient and secure acquire logic (metadata backend).
A generic, extensible model for a plugin - what it handles, how it handles it, when the information changes (generic plugin).
Plugins for present tools - apt-get, apt-file, debtags, ... (specific plugins).
(Possible plugin for AppStream).
All of the above would result in a powerful apt-get update tool capable of handling all OS package management metadata in a structured and coherent way.
April 23 - May 21:
- Get in touch with the mentor.
- Install a local build environment.
- Get familiar with the Debian community and development model.
- Debian source code structure.
- C++ is a very powerful language - how many of its advanced features are used by Debian developers, and do I need to improve my knowledge of the language in order to understand the code?
- Security issues - authentication, authorization, types of attacks, data integrity.
- Efficiency issues - responsiveness, bandwidth usage, caching.
Research State of the Art
- The present package metadata acquire logic.
- The format of the configuration files.
- The relationships between different pieces of metadata.
- The Debian Archive format.
- The AppStream specs and metadata.
- Configuration model.
- Plugins model.
- Parser model.
May 21 - July 13
- Implement the configuration model and acquire logic - first draft and unit tests.
- Implement the metadata acquire backend - first draft and unit tests.
- Integrate apt-get update with the new metadata acquire backend - assert functionality.
- Implement the first supported plugin.
- Integrate apt-get update with the backend using this plugin - assert functionality through integration testing.
- Implement sources.list parser - first draft.
- Generate configuration file using the parser - assert functionality.
At this point there should be a complete proof of concept of the new model's functionality and a set of tests for each module.
July 13 - August 13
- Configuration interface for the user.
- Support for other plugins.
- Sources.list parser - final implementation.
- Configuration and acquire logic - final implementation.
- Metadata acquire backend - final implementation.
- Final testing and refactoring.
August 13 - hard deadline
- Final touches.
Exams and other commitments:
- During this period I'm developing my Bachelor Thesis project, which is expected to be finished by the 30th of May. After this date, my main focus will switch to GSoC.
Other summer plans:
- This is not certain, but I'm planning to leave the country for a few days on a trip this summer - a week at most. During this time, I won't be able to work on GSoC.
- My first real contact with Linux - I had a first try with Slackware and SUSE, but I didn't find them that easy to learn.
- Open Source - a powerful way to learn new technologies and meet new professionals.
- Used worldwide - currently one of the most popular Linux distros (along with Ubuntu, which is basically Debian as well).
- C/C++ - my first programming playground, and the one I'm most familiar with.
Are you applying for other projects in SoC? No.