Differences between revisions 2 and 35 (spanning 33 versions)
Revision 2 as of 2011-05-30 07:11:57
Size: 1172
Editor: PaulWise
Comment: add packages integration
Revision 35 as of 2019-04-12 01:54:33
Size: 6955
Editor: PaulWise
Comment: update popcon status/TODO
Line 2: Line 2:

The code being written to help integrate the census into Debian infrastructure is available in git under the deriv-team project:

{{{git clone https://salsa.debian.org/deriv-team/census.git}}}<<BR>>
https://salsa.debian.org/deriv-team/census

The data output by the census code is available here:

http://deriv.debian.net/

The census code's [[https://salsa.debian.org/deriv-team/census/blob/master/doc/README|README]] documents extensively how the codebase works.

<<TableOfContents()>>

== Status ==

Currently the census is only minimally used for integration of derivatives with Debian:

 * [[#planet-debian-deriv|planet.debian.org/deriv]]: aggregates news feeds and developer blogs from Debian derivatives

Some information about derivatives has been manually integrated:

 * PTS: contains information about Ubuntu bugs, version numbers and patches
 * DDPO: contains information about Ubuntu bugs, version numbers and patches
 * UDD: contains information about:
  * packages from aptosid, skolelinux, Ubuntu
  * bugs from Ubuntu
  * popcon from Ubuntu
 * website: contains information about some derivatives with a few links to the census
 * wiki: some JavaScript detects the status of bugs linked to from the wiki. It currently supports the Debian bug tracker and the Launchpad bug tracker.
 * debtags: imports packages from Ubuntu and allows them to be tagged
 * screenshots: imports packages from Ubuntu and allows them to be screenshotted
 * popcon: derivatives users submit vendor information

<<Anchor(planet-debian-deriv)>>
=== Planet Debian derivatives ===

Aggregates all the developer and main blogs of all derivatives in one location on https://planet.debian.org/deriv/. The derivatives census scripts perform the following steps:

 1. download the census pages (scripts [[https://salsa.debian.org/deriv-team/census/blob/master/bin/get-wiki-text|1]] [[https://salsa.debian.org/deriv-team/census/blob/master/bin/get-url-eol-fix|2]])
 1. look for items related to blogs (scripts [[https://salsa.debian.org/deriv-team/census/blob/master/bin/wiki-text-to-blogs-list|1]] [[https://salsa.debian.org/deriv-team/census/blob/master/bin/wiki-text-to-dev-blogs-list|2]])
 1. look for a logo ([[https://salsa.debian.org/deriv-team/census/blob/master/bin/wiki-text-to-logo|script]])
 1. check whether those pages offer RSS feeds ([[https://salsa.debian.org/deriv-team/census/blob/master/bin/get-blogs-rss|script]])
 1. download the logo ([[https://salsa.debian.org/deriv-team/census/blob/master/bin/get-logo-img|script]])
 1. generate a planet config snippet ([[https://salsa.debian.org/deriv-team/census/blob/master/bin/derivative-to-planet-config|script]])
 1. resize the logo into a head ([[https://salsa.debian.org/deriv-team/census/blob/master/bin/logo-to-planet-head|script]])
 1. aggregate the [[http://deriv.debian.net/planet-config|planet config]] snippets and [[http://deriv.debian.net/planet-heads/|planet heads]] (scripts [[https://salsa.debian.org/deriv-team/census/blob/master/bin/aggregate-planet-config|1]] [[https://salsa.debian.org/deriv-team/census/blob/master/bin/aggregate-planet-heads|2]])
 1. download the [[https://salsa.debian.org/planet-team/config|Planet Debian git repository]] ([[https://salsa.debian.org/deriv-team/census/blob/master/bin/get-planet-git|script]])
 1. compare the new planet config file and heads against the planet [[https://salsa.debian.org/planet-team/config/raw/master/config/config.ini.deriv|config]] and [[https://salsa.debian.org/planet-team/config/tree/master/heads/deriv|heads]] in git, mail the results via cron output (scripts [[https://salsa.debian.org/deriv-team/census/blob/master/bin/compare-planet-config|1]] [[https://salsa.debian.org/deriv-team/census/blob/master/bin/compare-planet-heads|2]])
 1. A human should then review the cron mails and when necessary run the [[https://salsa.debian.org/deriv-team/census/blob/master/bin/review-planet-config-heads|review script]] using {{{cd var ; make review}}} and commit any changes that are valid and useful to [[PlanetDebian|Planet Debian]] git.
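
Two of the steps above, finding feeds on a blog page and generating a config snippet, can be sketched in a few lines. This is only an illustration and not the census code itself: the function names, the set of accepted feed MIME types and the {{{name}}}/{{{face}}} keys of the snippet are assumptions made for the example.

```python
import configparser
import io
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumption: feeds are advertised with one of these MIME types.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkParser(HTMLParser):
    """Collect feed URLs advertised via <link rel="alternate"> elements."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        mime = (attr.get("type") or "").lower()
        href = attr.get("href")
        if "alternate" in rel and mime in FEED_TYPES and href:
            # Resolve relative links against the page URL.
            self.feeds.append(urljoin(self.base_url, href))


def find_feeds(html, base_url):
    """Return the feed URLs advertised in an HTML page."""
    parser = FeedLinkParser(base_url)
    parser.feed(html)
    return parser.feeds


def planet_snippet(feed_url, name, face=None):
    """Render one feed as an INI section (hypothetical snippet layout)."""
    config = configparser.ConfigParser(interpolation=None)
    config[feed_url] = {"name": name}
    if face:
        config[feed_url]["face"] = face
    out = io.StringIO()
    config.write(out)
    return out.getvalue()
```

Calling {{{find_feeds(page_html, page_url)}}} on a downloaded blog page and passing each result to {{{planet_snippet()}}} would give one section per feed, which a later step could concatenate.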

<<Anchor(patches)>>
=== Patches ===

A script runs on a machine that holds a mirror of [[https://snapshot.debian.org/|Debian snapshot]] and a database with the metadata. It [[https://salsa.debian.org/deriv-team/census/blob/master/bin/get-package-lists|updates apt]] and then runs the [[https://salsa.debian.org/deriv-team/census/blob/master/bin/compare-source-package-list|compare-source-package-list]] script once for each derivative.

For each derivative, compare-source-package-list iterates through all the Sources files and through all the source packages they list, downloading source packages when needed. It checks whether each package was ever in Debian. If it was not, the script applies some heuristics to figure out which Debian source package it was derived from, runs debdiff against the relevant packages and outputs some metadata.
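
The flow described above can be sketched as follows. This is a simplified illustration, not the actual compare-source-package-list script: the function names are hypothetical, the set of versions ever seen in Debian is passed in as a plain dictionary instead of being queried from the snapshot database, and the only heuristic shown is stripping a vendor suffix such as {{{ubuntu1}}} from the version.

```python
import re


def parse_sources(text):
    """Yield one dict of fields per stanza of a Sources index."""
    for stanza in re.split(r"\n\s*\n", text.strip()):
        fields, key = {}, None
        for line in stanza.splitlines():
            if line[:1] in (" ", "\t") and key:
                # Continuation line of a multi-line field such as Files.
                fields[key] += "\n" + line.strip()
            elif ":" in line:
                key, value = line.split(":", 1)
                fields[key] = value.strip()
        if fields:
            yield fields


def guess_base_version(version, vendor):
    """Hypothetical heuristic: drop a trailing vendor suffix like 'ubuntu1'."""
    return re.sub(r"[+~]?" + re.escape(vendor) + r"[0-9.]*$", "", version)


def plan_debdiffs(sources_text, debian_versions, vendor):
    """Return (package, debian_version, derivative_version) tuples to diff.

    debian_versions maps a source package name to the set of versions
    that were ever in Debian (an assumption standing in for the database).
    """
    plans = []
    for src in parse_sources(sources_text):
        package, version = src["Package"], src["Version"]
        known = debian_versions.get(package, set())
        if version in known:
            continue  # this exact version was in Debian, nothing to diff
        base = guess_base_version(version, vendor)
        if base in known:
            plans.append((package, base, version))
    return plans


def debdiff_command(package, old_version, new_version):
    """Build a debdiff invocation on two .dsc files (epochs are ignored)."""
    return ["debdiff",
            f"{package}_{old_version}.dsc",
            f"{package}_{new_version}.dsc"]
```

The command list returned by {{{debdiff_command()}}} could be passed to {{{subprocess.run()}}} once both source packages have been downloaded.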

The patches are [[http://deriv.debian.net/patches/|available]] via HTTP and shell on [[DebianMachine:lw08|lw08]]. Some metadata about them is [[http://deriv.debian.net/sources.patches|available]] (warning, very large).

The patches are linked from the patches panel of the Debian PTS, but [[DebianBug:779400|not yet]] from the package tracker.
Line 14: Line 75:
 * qa.debian.org/madison.php
Line 15: Line 77:
Deleted: === Planet Debian Downstream ===
Added: === Bugs ===
Line 17: Line 79:
Deleted: Aggregate all the developer and main blogs of all derivatives in one location on http://planet.debian.org/downstream/. Will require a script to:
Added: Link to the derivatives bug trackers from the PTS so that Debian folks can find potential problems in their packages. This might be hard since it can be rare for derivatives to provide mappings between a bug report and the source package it applies to.
Line 19: Line 81:
Deleted:
 1. download the census pages
 1. look for items containing the word blog
 1. find if there are RSS feeds on those pages
 1. generate a new planet config file
 1. compare with old planet config file
 1. print a list of changes
Added: === Screenshots ===
Line 26: Line 83:
Deleted: Then a human should review the feeds and make any changes needed.
Added: Allow folks to upload screenshots for packages that are only in derivatives. This would rely on some way to differentiate between free and non-free sections in our derivatives, since screenshots.d.n does not accept screenshots for contrib/non-free.

=== Popcon ===

Publish stats about dpkg vendors in the data exports.

Add links from the [[https://popcon.debian.org/|Debian Popularity Contest]] statistics to the corresponding census pages ([[https://lists.debian.org/msgid-search/51cbf614107873f2b0cdfff572e5107cf0b997ce.camel@debian.org|discussion]])

Import popcon information from derivatives, present it on graphs and link to those from the Debian graphs.

We now have a list of derivatives and some information about them. A list of derivatives hiding away in a corner of the wiki is not that useful. Better would be to aggregate the available information about derivatives and present it in Debian infrastructure in a way that makes the derivatives visible to Debian contributors and useful to them. Got an idea about how we can do that? Add it below!

The code being written to help integrate the census into Debian infrastructure is available in git under the deriv-team project:

git clone https://salsa.debian.org/deriv-team/census.git
https://salsa.debian.org/deriv-team/census

The data output by the census code is available here:

http://deriv.debian.net/

The census code's README documents extensively how the codebase works.

Status

Currently the census is only minimally used for integration of derivatives with Debian:

  • planet.debian.org/deriv: aggregates news feeds and developer blogs from Debian derivatives

Some information about derivatives has been manually integrated:

  • PTS: contains information about Ubuntu bugs, version numbers and patches
  • DDPO: contains information about Ubuntu bugs, version numbers and patches
  • UDD: contains information about:
    • packages from aptosid, skolelinux, Ubuntu
    • bugs from Ubuntu
    • popcon from Ubuntu
  • website: contains information about some derivatives with a few links to the census
  • wiki: some JavaScript detects the status of bugs linked to from the wiki. It currently supports the Debian bug tracker and the Launchpad bug tracker.
  • debtags: imports packages from Ubuntu and allows them to be tagged
  • screenshots: imports packages from Ubuntu and allows them to be screenshotted
  • popcon: derivatives users submit vendor information

Planet Debian derivatives

Aggregates all the developer and main blogs of all derivatives in one location on https://planet.debian.org/deriv/. The derivatives census scripts perform the following steps:

  1. download the census pages (scripts 1 2)

  2. look for items related to blogs (scripts 1 2)

  3. look for a logo (script)

  4. check whether those pages offer RSS feeds (script)

  5. download the logo (script)

  6. generate a planet config snippet (script)

  7. resize the logo into a head (script)

  8. aggregate the planet config snippets and planet heads (scripts 1 2)

  9. download the Planet Debian git repository (script)

  10. compare the new planet config file and heads against the planet config and heads in git, mail the results via cron output (scripts 1 2)

  11. A human should then review the cron mails and when necessary run the review script using cd var ; make review and commit any changes that are valid and useful to Planet Debian git.
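
The comparison in step 10 can be pictured as a plain text diff between the config kept in git and the freshly aggregated one. The sketch below uses Python's difflib and is an assumption about how such a check could look, not a copy of the census scripts. The file labels in the diff header are made up for the example.

```python
import difflib


def config_changes(current_text, generated_text):
    """Return a unified diff, or an empty string when nothing changed."""
    diff = difflib.unified_diff(
        current_text.splitlines(),
        generated_text.splitlines(),
        fromfile="git/config.ini.deriv",       # hypothetical label
        tofile="generated/planet-config",      # hypothetical label
        lineterm="",
    )
    return "\n".join(diff)
```

Printing the result only when it is non-empty fits the cron workflow described above, since cron sends mail only when a job produces output.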

Patches

A script runs on a machine that holds a mirror of Debian snapshot and a database with the metadata. It updates apt and then runs the compare-source-package-list script once for each derivative.

For each derivative, compare-source-package-list iterates through all the Sources files and through all the source packages they list, downloading source packages when needed. It checks whether each package was ever in Debian. If it was not, the script applies some heuristics to figure out which Debian source package it was derived from, runs debdiff against the relevant packages and outputs some metadata.

The patches are available via HTTP and shell on lw08. Some metadata about them is available (warning, very large).

The patches are linked from the patches panel of the Debian PTS, but not yet from the package tracker.

Ideas

Packages

Aggregate all the derivatives' sources.list snippets and integrate them into the following:

  • udd.debian.org
  • packages.debian.org
  • packages.qa.debian.org
  • patch-tracker.debian.org
  • qa.debian.org/developer.php
  • qa.debian.org/madison.php
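
Aggregating the snippets would start with parsing them into a structured form that the services above could consume. Here is a minimal sketch, assuming the snippets use the classic one-line sources.list format (type, optional bracketed options, URI, suite, components). The function names and the shape of the returned dictionaries are invented for the example.

```python
def parse_sources_list(text):
    """Parse one-line-style sources.list entries into dictionaries."""
    entries = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        parts = line.split()
        if parts[0] not in ("deb", "deb-src"):
            continue  # not a repository line
        rest = parts[1:]
        options = []
        if rest and rest[0].startswith("["):
            # Options may be written as [a=b] or spread out as [ a=b c=d ].
            while rest:
                token = rest.pop(0)
                options.append(token.strip("[]"))
                if token.endswith("]"):
                    break
            options = [opt for opt in options if opt]
        if len(rest) < 2:
            continue  # malformed entry: URI or suite missing
        uri, suite, *components = rest
        entries.append({
            "type": parts[0],
            "options": options,
            "uri": uri,
            "suite": suite,
            "components": components,
        })
    return entries


def aggregate(snippets):
    """Merge per-derivative snippets into one list tagged by derivative."""
    merged = []
    for derivative, text in sorted(snippets.items()):
        for entry in parse_sources_list(text):
            merged.append({"derivative": derivative, **entry})
    return merged
```

Here {{{snippets}}} is assumed to be a mapping from derivative name to the text of its sources.list snippet.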

Bugs

Link to the derivatives bug trackers from the PTS so that Debian folks can find potential problems in their packages. This might be hard since it can be rare for derivatives to provide mappings between a bug report and the source package it applies to.

Screenshots

Allow folks to upload screenshots for packages that are only in derivatives. This would rely on some way to differentiate between free and non-free sections in our derivatives, since screenshots.d.n does not accept screenshots for contrib/non-free.

Popcon

Publish stats about dpkg vendors in the data exports.
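
As a rough illustration of what such stats could look like, the sketch below counts vendors across popcon submission header lines. It assumes each header is a single line of space-separated KEY:value tokens that begins with POPULARITY-CONTEST- and includes a VENDOR field. That layout and the function name are assumptions for the example, so check them against the real submission format before relying on this.

```python
from collections import Counter


def vendor_counts(header_lines):
    """Count submissions per vendor from popcon-style header lines."""
    counts = Counter()
    for line in header_lines:
        if not line.startswith("POPULARITY-CONTEST-"):
            continue  # skip anything that is not a submission header
        tokens = line.split()[1:]
        fields = dict(tok.split(":", 1) for tok in tokens if ":" in tok)
        # Submissions without a vendor field are grouped as "unknown".
        counts[fields.get("VENDOR", "unknown")] += 1
    return counts
```

The resulting counter could be serialised as is into the data exports.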

Add links from the Debian Popularity Contest statistics to the corresponding census pages (discussion)

Import popcon information from derivatives, present it on graphs and link to those from the Debian graphs.