Differences between revisions 23 and 24
Revision 23 as of 2013-05-17 02:05:09
Size: 8355
Editor: ?ClementSchreiner
Comment: add API to schedule
Revision 24 as of 2013-05-17 13:58:34
Size: 8752
Editor: ?ClementSchreiner
Comment:
PTS rewrite in Django

---IN PROGRESS---

I am familiar with free software, as a user since 2004 and a developer since 2010: I have made occasional contributions to the Weboob framework, which allows easy interaction between console or graphical applications and various websites.

I participated in the 2012 Summer of Code, working on debexpo. You can read my final report to get an idea of what I did. For a good part of the summer, I struggled with designing a good plugin system for debexpo's various sources of data. I think this will help me a lot with this project, since it needs to be modular.

Thus, I have experience with teamwork, VCS, bug trackers and mailing lists. My last Summer of Code taught me a lot about the Debian process and how to interact with packages in Python. Because I revamped the plugin architecture, I rewrote all the QA test plugins and learned how to retrieve data from the BTS, debtags, lintian, etc.

My work has not yet been merged into the live http://mentors.debian.net because of various problems with the existing codebase that I had to work around, but I am currently helping (when I can find the time) to rewrite the whole application with Django. I hope this project will give me experience that will help me work on debexpo's rewrite more efficiently.

  • Project title: PTS rewrite in Django

  • Project details:

I will use Celery to retrieve data asynchronously from external programs or other websites, and to schedule retrieval tasks that cannot be fully dynamic.

One of the most challenging parts of this project is to write a robust plugin architecture for easily defining information sources (how to retrieve the data, and how to display it on the web pages). It must be flexible enough to allow Debian derivatives to write their own plugins and seamlessly integrate their own tools into their PTS instance. I have a few existing libraries in mind that I could use for that, or at least take inspiration from: django-plugins, django-app-plugins, and another project also named django-plugins. The last one might be suitable for direct use in this project.

  • Email Interface

    • It could be useful to find or write a small library for defining forwarding rules, like we do for URL routing in Django. A project I should look into is lamson, although probably not for direct use (it acts as a full MTA/LDA, which I don't think is a good idea).

      • Tasks to do:
      • import information from emails into the database (receive_news.py)
      • forward emails depending on regexp (dispatch.pl)
      • interface with users (control.pl)
      • handle bounces (bounces_handler.py)
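
The idea of forwarding rules modelled on URL routing can be illustrated with a minimal, standard-library-only sketch. Everything in it is an assumption made for illustration: the `route` decorator, the handler names, the address patterns and the `example.org` addresses are hypothetical and are not taken from dispatch.pl, control.pl or lamson. It is also written for Python 3, whereas the rewrite itself would target whatever Python version Django runs on in Debian.

```python
import re
from email import message_from_string
from email.utils import parseaddr

# Hypothetical routing table, in the spirit of Django's urlpatterns:
# each entry pairs a compiled regexp on the recipient's local part
# with the handler to call when it matches.
_routes = []


def route(pattern):
    """Register the decorated function for local parts matching `pattern`."""
    compiled = re.compile(pattern)

    def decorator(func):
        _routes.append((compiled, func))
        return func

    return decorator


@route(r"^control$")
def handle_control(message):
    # Placeholder for the user-facing command interface.
    return ("control", None)


@route(r"^bounces\+(?P<token>.+)$")
def handle_bounce(message, token):
    # Placeholder for bounce handling; `token` would identify the subscriber.
    return ("bounce", token)


@route(r"^(?P<package>[a-z0-9][a-z0-9+.-]+)$")
def handle_package(message, package):
    # Placeholder for forwarding the mail to the package's subscribers.
    return ("forward", package)


def dispatch(message):
    """Call the first handler whose pattern matches the recipient address."""
    _, address = parseaddr(message["To"])
    local_part = address.split("@", 1)[0].lower()
    for pattern, handler in _routes:
        match = pattern.match(local_part)
        if match:
            return handler(message, **match.groupdict())
    return ("reject", local_part)


mail = message_from_string("To: dpkg@packages.example.org\nSubject: test\n\nbody")
result = dispatch(mail)  # handled by handle_package with package="dpkg"
```

As with URL routing, rules are tried in registration order, so the more specific addresses (`control`, `bounces+...`) have to be registered before the catch-all package rule.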
  • API

    • provide an equivalent to the former SOAP API for legacy
    • provide new APIs (JSON, REST, ...), if possible. Otherwise I'll do that after the summer of code is over.
  • Plugin Architecture

    • Plugins can import data into the database as news, todo or excuses items. They may also retrieve the number of bugs from the BTS and handle other small tasks that I find useful to modularize.
    • a plugin is specific to an information source: emails, RSS feeds, JSON APIs, apt repositories, etc.
    • a plugin can provide orm models, templates, and API commands
    • a plugin can provide celery tasks when it's useful
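
To make the plugin contract above more concrete, here is a minimal sketch of a self-registering plugin base class, written in plain Python 3 without Django or Celery. All names (`Plugin`, `BugCountPlugin`, `fetch`, `api_commands`, `collect_items`) and the canned bug data are hypothetical; the actual design may well end up using one of the existing plugin libraries instead.

```python
class Plugin:
    """Base class for information sources; subclasses register themselves."""

    registry = {}  # maps plugin name -> plugin class
    name = None    # unique identifier, must be set by subclasses

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        if cls.name is None:
            raise TypeError("plugin classes must set a name")
        Plugin.registry[cls.name] = cls

    def fetch(self, package):
        """Return a list of item dicts (news, todo or excuses) for a package."""
        raise NotImplementedError

    def api_commands(self):
        """Map API command names to callables; a plugin may provide none."""
        return {}


class BugCountPlugin(Plugin):
    name = "bug_count"

    # Canned data standing in for a real query to the BTS.
    _fake_bts = {"dpkg": 3}

    def fetch(self, package):
        count = self._fake_bts.get(package, 0)
        if count == 0:
            return []
        return [{"kind": "todo", "package": package,
                 "text": "%d open bugs" % count}]

    def api_commands(self):
        return {"bug_count": lambda package: self._fake_bts.get(package, 0)}


def collect_items(package):
    """Ask every registered plugin for its items, in a stable order."""
    items = []
    for name in sorted(Plugin.registry):
        items.extend(Plugin.registry[name]().fetch(package))
    return items


items = collect_items("dpkg")
```

Registering at class-definition time means a derivative only has to import its own plugin module for the plugin to become available; in a Django setting the ORM models, templates and Celery tasks would live alongside the class in the same app.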
  • Web pages for accessing the packages' information

    • only basic information about a package will be here
    • define 'mount points' for inserting data from the plugins' templates
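
One possible reading of 'mount points' is a set of named slots in the package page, each filled by whatever fragments the plugins contribute. The sketch below illustrates that reading in plain Python 3; the `mount` decorator, the `sidebar` slot and the two fragment functions are hypothetical, and a real implementation would more likely render Django templates than build strings by hand.

```python
from html import escape

# Hypothetical registry: mount point name -> list of render functions.
_mount_points = {}


def mount(point):
    """Attach the decorated render function to the named mount point."""
    def decorator(func):
        _mount_points.setdefault(point, []).append(func)
        return func
    return decorator


@mount("sidebar")
def bug_summary(package):
    return "<li>Bugs for %s</li>" % escape(package)


@mount("sidebar")
def lintian_summary(package):
    return "<li>Lintian report for %s</li>" % escape(package)


def render_mount_point(point, package):
    """Concatenate the fragments of all plugins mounted at `point`."""
    renderers = _mount_points.get(point, [])
    return "\n".join(render(package) for render in renderers)


sidebar_html = render_mount_point("sidebar", "dpkg")
```

The base page stays unaware of individual plugins: it only asks each mount point for its content, so a mount point with no plugins simply renders as an empty string.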
  • Caching

    • The current implementation of the PTS is very fast, since it serves exclusively static HTML files (generated with XSLT). Since this rewrite will be dynamic, a cache will be needed to achieve acceptable performance. Django's cache framework with memcached seems the easiest solution, but I will have to compare its performance with other technologies.
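
The caching pattern in question (render a page once, then serve the stored copy until it expires or is invalidated) can be sketched without any framework. The `TTLCache` class below is a hypothetical in-process stand-in written for illustration in Python 3; it is not Django's cache API, and in the project itself that role would be played by Django's cache framework backed by memcached or a similar store.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after `ttl_seconds`."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so that expiry can be tested
        self._store = {}    # key -> (expiry time, value)

    def get_or_compute(self, key, compute):
        """Return the cached value, calling `compute()` only on a miss."""
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = compute()
        self._store[key] = (now + self.ttl, value)
        return value

    def invalidate(self, key):
        """Drop an entry, e.g. when a plugin imports new data for a package."""
        self._store.pop(key, None)


render_calls = []


def render_dpkg_page():
    # Stands in for an expensive dynamic page rendering.
    render_calls.append("dpkg")
    return "<html>dpkg</html>"


page_cache = TTLCache(ttl_seconds=300)
first = page_cache.get_or_compute("page:dpkg", render_dpkg_page)
second = page_cache.get_or_compute("page:dpkg", render_dpkg_page)  # cache hit
```

Explicit invalidation matters here because package data changes when plugins import something new, not on a fixed timetable; the expiry time only bounds how stale a page can get if an invalidation is missed.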
  • Design

    • I will use Bootstrap for the design of the application. It could be interesting to develop base templates and CSS that other Debian Django web apps could readily use and extend.
  • Testing

    • I will write unit tests as I write features, with Django's testing framework.
  • Database migrations

    • I will use South for smooth migrations after changes in a model.

  • Tentative list of apps and their models

    • plugins: the plugin system
    • mail_interface: for former dispatch.pl, control.pl, bounces-handler.pl, etc.
    • packages: serves the webpages with information about packages. Depends on plugins.

    • profiles: the users' profiles, allowing them to edit their subscriptions
    • api: the SOAP API (and later JSON or others). Depends on plugins.

    • feeds: RSS/ATOM feeds for news about a package.
  • Communication

    • In addition to IRC and email, I will blog at least weekly to give a report of my progress.
  • Synopsis: The Package Tracking System (PTS) is currently a collection of scripts written in Perl, Python and shell, with static webpages generated from XML and XSLT templates. Although it does its job correctly, it is becoming difficult to maintain and improve. At the same time, an increasing number of Debian projects are switching to the Django web framework, which could allow future sharing of code among them. It is thus a great time to make the same move for the PTS.

  • Benefits to Debian

With this project, the PTS will use a consistent set of technologies that are also popular in the Debian community. That should help attract new contributors.

The PTS will also be more modular and thus useful to Debian derivatives, which could bring more interest and more contributors from the community.

  • Deliverables:

    • a modular replacement to the current PTS, using the Django framework
    • a set of plugins providing the same features as the current PTS
    • a web UI for displaying data from those plugins
    • an 'email controller' for processing/forwarding incoming emails
  • Project schedule:

    • (before GSoC)
      • more discussion with mentors and other developers with an interest in the PTS
      • research existing plugin manager implementations for inspiration or code reuse.
      • package any missing dependency (django 1.5, logutils) for wheezy-backports

    • June 17 - 24 (1 week): Design a simple plugin system

    • June 24 - July 8 (2 weeks): Then, in order to test and improve the plugin architecture in real situations, I will, in short iterations:

      1. Develop a basic web interface for displaying a package
      2. Write simple plugins importing some data into the database
      3. Improve plugin system if needed
      4. Repeat
    • July 8 - 15 (1 week): design and implement the API, using the plugins already written for testing

    • July 15 - 29 (2 weeks): Implement the mail interface and write plugins that import data from emails.

    • August 2 - 19 (2 1/2 weeks): Write all remaining plugins. This should be easy once the plugin system is well designed.

    • August 19 - September 9 (3 weeks): Implement the cache and study the resulting performance

    • September 9 - 16 (1 week): Improve the website's design

    • September 16 - 29 (2 weeks): Finalize documentation and write more tests.

      • If I have the time, I will also write plugins for features that have been requested, for example adding TODO items from debtags.
  • Exams and other commitments: thanks to GSoC starting one month later this year, my exams will be over and I will be free of any other commitment for the entire summer.

  • Other summer plans: None

  • Why Debian?: I have been a user of Debian for almost 10 years, and a small contributor since last summer.

  • I am not applying to other summer of code projects.