Revision 7 as of 2013-05-02 22:48:30
Size: 5493
Editor: BorisBobrov
  • Name: Boris Bobrov

  • Contact/Email: email: breton@cynicmansion.ru, breton in IRC, jabber: breton@jabber.ru

  • Background: I am a third-year student at the Tashkent branch of Moscow State University, in the applied math and informatics faculty. Experience:

    • 2.5 years of Python
    • 2 years of Django
    • 2.5 years of C in university
    • 1.5 years of C++ in university
    • 4 years of JavaScript (though I don't write much in it), both with jQuery and without it
    • 5 years of HTML and CSS
    • 1 year of Scheme and 6 months of Common Lisp
    • Basic knowledge of system administration (set up nginx, uwsgi etc, familiar with cron, shell scripting)
    • Familiar with unit tests and test-driven development techniques
    I use:
    • Debian GNU/Linux as my main OS for 4 years
    • git (also know mercurial)
    • vim (emacs for Lisp)
    git://github.com/bretonium/gsoc_metrics_task.git - the required code
  • Project title: Debian Metrics Portal

  • Project details:

The project is to create a Debian Metrics Portal: a central place for various metrics and stats. The portal will:

  1. Perform its own measurements, from various sources and in different ways;
  2. Collect ready-made stats from various places;
  3. Display the collected data in various ways (as text, as plots).

What must be described when creating a new metric

  • Name, category etc
  • Where measurement happens (local/remote)
  • If remote, what script will receive and save the data
  • If local, what script will collect and save the data
  • What script will return a JSON with required data (see "Why JSON" section below). This script will be invoked when a page with metrics is requested by a visitor

  • What fields can be selected for building graphs, what their types are, and what the type of the graph is (plot, histogram etc)
  • A Django template, which will be used to render the data (though other templating languages may be added in the future)

The author of the script decides how and where to store the data.
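As an illustration, the items above could be gathered into a single description object. This is only a sketch: the class `MetricDescription`, its field names, the script file names and the template path are all hypothetical and not part of the proposal.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MetricDescription:
    """Hypothetical container for the items a metric author has to describe."""
    name: str
    category: str
    location: str          # where measurement happens: "local" or "remote"
    data_script: str       # collects (local) or receives (remote) and saves the data
    json_script: str       # invoked on page request, returns JSON with the data
    graph_fields: Dict[str, str] = field(default_factory=dict)  # field name -> type
    graph_type: str = "plot"                 # e.g. "plot" or "histogram"
    template: str = "metrics/generic.html"   # assumed template path

    def validate(self) -> List[str]:
        """Return a list of human-readable problems; empty if none were found."""
        problems = []
        if not self.name:
            problems.append("name must not be empty")
        if self.location not in ("local", "remote"):
            problems.append("location must be 'local' or 'remote'")
        return problems


# Example description; every value here is made up for illustration.
bugs = MetricDescription(
    name="open_bugs",
    category="bugs",
    location="local",
    data_script="collect_bugs.py",
    json_script="bugs_json.py",
    graph_fields={"date": "string", "count": "int"},
)
```

Calling `bugs.validate()` returns an empty list for this description, while a description with an unknown location would be reported as a problem.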

Why JSON

  • Most Debian statistics collectors are written in either Python or Ruby. Both languages have good support for conversion to JSON, and the data we pass is rather simple

Django templates

  • A tag for inserting graphs into user templates will be made
  • Some default generic templates will be available

The metrics can be pretty simple. For example, the number of bugs as a function of time can be represented as a simple list of dicts. For these cases an even more generic approach can be used.
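A minimal sketch of such a list of dicts, serialized to JSON and turned back into points for a graph. The key names `date` and `open_bugs`, the helper `series` and all the numbers are placeholders invented for the example, not real Debian data.

```python
import json

# Placeholder samples: one dict per measurement (values are invented).
samples = [
    {"date": "2013-04-01", "open_bugs": 120},
    {"date": "2013-04-02", "open_bugs": 118},
    {"date": "2013-04-03", "open_bugs": 125},
]

# What a collector script would print for the portal.
payload = json.dumps(samples)


def series(rows, x_key, y_key):
    """Pick two keys out of every row to get (x, y) points for a graph."""
    return [(row[x_key], row[y_key]) for row in rows]


# What the portal could do with the payload before plotting.
points = series(json.loads(payload), "date", "open_bugs")
```

Because the data is a flat list of dicts with int and string values, the round trip through JSON loses nothing, which is what makes the generic approach possible.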

A simple metric

  • Name, category etc
  • Where measurement happens (local/remote)
  • If local, what script generates the JSON with data (and how often it should be called)
  • Whether the data is a delta from the previous measurement or the script regenerates the whole sample
  • What keys the data has and what their types are (int or string)
  • What keys can be used for graphing
  • A template (with an option to use a generic template)

This data will be saved in an internal table; the script author does not need to care where the measurements are stored.
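A rough sketch of how the portal might handle a simple metric's output: check the rows against the declared key types, then either append them (delta) or replace what is stored (whole sample). The functions `check_rows` and `store_sample` are hypothetical, and a plain Python list stands in for the internal table.

```python
# Mapping from the type names used in a metric description to Python types.
KEY_TYPES = {"int": int, "string": str}


def check_rows(rows, declared):
    """Raise ValueError if a row's keys or value types differ from the declaration."""
    for row in rows:
        if set(row) != set(declared):
            raise ValueError("unexpected keys: %r" % sorted(row))
        for key, type_name in declared.items():
            if not isinstance(row[key], KEY_TYPES[type_name]):
                raise ValueError("key %r is not of type %s" % (key, type_name))


def store_sample(stored, sample, is_delta):
    """Return the new stored rows after one run of the metric script."""
    if is_delta:
        # The script reported only new measurements: keep the old ones.
        return stored + list(sample)
    # The script regenerated the whole sample: replace everything.
    return list(sample)


declared = {"date": "string", "count": "int"}
first = [{"date": "2013-05-01", "count": 3}]
second = [{"date": "2013-05-02", "count": 5}]

check_rows(first, declared)
check_rows(second, declared)
as_delta = store_sample(store_sample([], first, True), second, True)
as_whole = store_sample(store_sample([], first, False), second, False)
```

With these inputs `as_delta` holds both rows, while `as_whole` holds only the most recent sample.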

For remote measurements

  • The remote script will send a request to the portal with a JSON document containing the data
  • The JSON will be passed to a defined script (via stdin)
  • The output of the script will be returned in the reply
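The portal side of this flow could look roughly like the following sketch. It uses the standard `subprocess` module; the function `pass_to_handler` and the inline handler are hypothetical stand-ins for the portal code and for the script defined by the metric author.

```python
import json
import subprocess
import sys


def pass_to_handler(handler_argv, payload):
    """Feed the received JSON to the handler's stdin and return its stdout."""
    result = subprocess.run(
        handler_argv,
        input=json.dumps(payload),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the handler fails
    )
    return result.stdout


# Stand-in for the author's script: reads JSON from stdin, saves nothing,
# and reports how many rows it received.
HANDLER = [
    sys.executable,
    "-c",
    "import json, sys; data = json.load(sys.stdin); "
    "print(json.dumps({'received': len(data)}))",
]

# The string the portal would send back to the remote script.
reply = pass_to_handler(HANDLER, [{"date": "2013-05-01", "count": 3}])
```

Passing the data via stdin keeps the handler independent of the portal's language, so it could equally be written in Ruby or shell.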

Some other notes

  • A nice example of graphs layout: https://metrics.torproject.org/

  • A visitor can choose which data should be plotted (if allowed by the metric's author)
    • Maybe even without page reload, with Ajax.
  • The "simple metric" will be based on the "generic metric" and new types of metrics can be added, if required.
    • For example, a "UDD metric", where the author will be required only to select tables and fields from UDD


  • Synopsis: building a Debian metrics portal with a uniform (Web) interface to peruse Debian metrics, as well as a uniform (programming) interface to maintain them.

  • Benefits to Debian:

    • A single place for all statistics and metrics
    • A possibility to easily create simple metrics
  • Deliverables:

    • standardized interface to add/remove metrics to be graphed (possibly with different sampling rate)
    • integration of existing graphs in the metrics infrastructure
    • web interface to show daily (or more frequently) updated graphs of the various metrics
    • dynamic web interface to graph, on demand, specific metrics (possibly more than one at a time) over specific time period
    • a portal with modular architecture
      • where generic tasks could be simplified
      • suitable for use by users of different programming languages
  • Project schedule: TODO

  • Exams and other commitments:

    • Possibly (not yet confirmed) a two-week trip to a Moscow State University Summer School. During these 2 weeks I will be able to work on GSoC for ~4h a day
    • Maybe 2 exams in the middle of June, with a day off for each one.
    • My next semester begins on the 3rd of September, so most of the work needs to be done before that.
  • Other summer plans: None

  • Why Debian?: I have used Debian GNU/Linux for ~4 years and see GSoC as a good way to become more closely involved in the community.

  • Applications to other orgs: Possibly, though if I am accepted by more than one org, I'd like to join Debian.