Differences between revisions 3 and 4
Revision 3 as of 2011-07-21 16:16:40
Size: 14935
Editor: EnricoZini
Comment: More cleanup
Revision 4 as of 2011-07-21 16:39:28
Size: 10976
Editor: EnricoZini
Comment: Done cleaning up
Line 95: Line 95:
==== Is there a debtags-policy? === === Is there a debtags-policy? ===
Line 153: Line 153:
===How can maintainers interact with debtags? ===

They can go in their DDPO page (http://qa.debian.org/developer.php) and
=== How can maintainers tag their packages? ===

They can go in their [[http://qa.debian.org/developer.php|DDPO page]] and
Line 157: Line 157:
their packages and edit the categories.


=== When is Debtags going to be integrated into apt or aptitude or ...? ===

That is still a bit out of reach at the moment.

There are proof-of-concept implementations inside the ``debtags`` tool: you can
do ``debtags search``, which is like ``apt-cache search`` but also shows tags.
or ``debtags grep`` which shows packages matching a certain tag expression (try
``debtags grep 'use::editing && media::rasterimage'`` and even ``debtags install``
that does the same as ``debtags grep`` but also invokes ``apt-get`` to install
the resulting packages.

The hope lays in `libept`_, which is still in the making but will
provide a unique interface to all kinds of package metadata. It will hopefully
be a solid and complete foundation to be used by package managers, and will
also make Debtags information available to them.

In the meantime, if you want a graphical interface to look for packages you can
use packagebrowser_ or `debtags-edit`_.
their packages and edit their tags.
Line 182: Line 162:
Everyone can tag new packages using the packagebrowser_ or `debtags-edit`_, but
you can see the new data in apt-cache only after Enrico manually reviews them.


=== When are the tags going to move in the control file? ===
New packages automatically get tagged using a set of heuristics. That is better than nothing, but obviously it is not enough, so they are marked with {{{special::not-yet-tagged}}} tags in the web interface, waiting for a human to have a look.

Once a human does proper tagging of a package and removes the {{{special::not-yet-tagged}}} tags, the automatic tagging heuristics will not touch it anymore.


=== How do the tags reach the Debian archives? ===
Line 192: Line 173:
Tags are added to the override file after manual review. Think of the tags in
the override file as the "stable" tags and the ones in the debtags database as
the "unstable" tags.

Allowing the maintainers to specify tags in the control file could be
difficult for many reasons:

 * Some tags are more easily added by people who are not the DD. The
   maintainer can add ``made-of::*`` and ``interface::*``, but some other
   person could add ``works-with::*`` and ``accessibility::*``.
 * Sometimes we do a reorganization (for example, moving ``protocol::icq`` to
   ``protocol::im:icq``) and we can't ask all maintainers to handle those
   changes, and these reorganisations tend to happen quite often.


=== What should I do if I'm [also] packaging for derivative distributions? ===

You can provide tag information specific to your target group of users.
See `Can I create my own set of tags and add them to all the packages I want?`_.

Since new versions of debtags (>= 1.7.3), you can create a package that
installs the tag data somewhere (say, in /usr/share/mydistro/tags) and
installs a file under /etc/debtags/sources.list.d/ to automatically get
debtags to use them.

Tags are added to the override file after manual review, which should happen about once a month.

You can think of the tags in the override file as the "stable" tags and the ones in the debtags database as the "unstable" tags.


=== Why don't you just ask the maintainers to tag their own packages? ===

Giving the maintainers ultimate responsibility over the tags of their packages is difficult for many reasons:

 * Some tags are more easily added by people who are not the maintainer. The maintainer can surely add {{{role::*}}}, {{{made-of::*}}} and {{{interface::*}}}, but some other person could be more qualified for adding {{{works-with::*}}} and {{{accessibility::*}}}.
 * We cannot expect every maintainer to understand all the 600+ tags in the debtags vocabulary.
 * Sometimes we do some reorganization of the tags (for example, moving {{{protocol::icq}}} to {{{protocol::im:icq}}}), and we cannot ask all maintainers to handle those changes.
Line 222: Line 189:
I don't think they'll be dropped, as they're serving a different purpose at the
moment (that is, splitting the archive somehow). In my view, they should be
ignored by package managers, using debtags instead.


=== How come there are different sets of tags in the Packages file and in ``/var/lib/debtags``? ===

There are a few reasons:

 * Debtags supports merging different tag sources: for example, iterating.org
   provides a tag source with package rankings and debtags is able to download
   it and merge it to the other tags. Tag sources are listed in
   ``/etc/debtags/sources.list``. This also allows some of us to use the
   unreviewed tags on Alioth instead of the ones in the Package database.
 * For many applications the tags are easier to access when aggregated on a
   small file rather than by parsing the very large package database
 * Finally, the debtags database in ``/var/lib/debtags`` is also indexed for
   fast access.

Have the Packages file as the primary tag storage has never been the main idea,
although it's turned out to be useful to allow tags to be useable in
software such as apt-cache, aptitude and grep-dctrl without them having to be
modified to access an extra database.


=== The Packages file has tags like ``network::{client,server,service}`` and this breaks ``grep-dctrl`` ===

Those compressed tags are there because APT does not like long lines.

You can use ``debtags dumpavail`` or ``ept-cache dumpavail`` to feed data to
grep-dctrl without the compressed tags.

``debtags dumpavail`` also supports tag expressions, so you can even run
commands like::
No, because they are useful for some archive management purposes.


=== How come there can be different sets of tags in the Packages file and in {{{/var/lib/debtags}}}? ===

Debtags supports merging different tag sources: for example, [[http://www.miriamruiz.es/weblog/?p=155]] is an effort to provide parental rating for games in Debian, as tags to be downloaded and merged into the system. Tag sources are listed in {{{/etc/debtags/sources.list}}}. This also allows some of us to use the unreviewed tags on Alioth instead of the ones in the Package database.

Having the Packages file as the primary tag storage was never the main idea, although it has turned out to be useful: it lets software such as apt-cache, aptitude and grep-dctrl use tags without being modified to access an extra database.


=== Why do I see mangled tags in the Packages file like "network::{client,server,service}"? ===

Those tags are compressed because APT does not like long lines.

You can use {{{debtags dumpavail}}} to feed data to {{{grep-dctrl}}} without the compressed tags.

{{{debtags dumpavail}}} also supports tag expressions, so you can even run
commands like:
{{{
Line 258: Line 209:

``ept-cache dumpavail`` instead supports all ``ept-cache`` search and sort
options, so you can do something like::

        ept-cache dumpavail -t gui image editor -s p | grep-dctrl <options>
}}}
Line 270: Line 217:
http://debtags.alioth.debian.org/todo.html or
http://debtags.alioth.debian.org/edit.html, choose a package, then click on the
`[help]`__ link on top of the page.

__ http://debtags.alioth.debian.org/edit-help.html
[[http://debtags.alioth.debian.org/todo.html]] or
[[http://debtags.alioth.debian.org/edit.html]], choose a package, then click on the
[[http://debtags.alioth.debian.org/edit-help.html|help]] link on top of the page.
Line 280: Line 225:
http://debtags.alioth.debian.org/tags/tags-current.gz [[http://debtags.alioth.debian.org/tags/tags-current.gz]]
Line 283: Line 228:
commits the reviewed updates to ``svn://svn.debian.org/debtags/tagdb/tags``,
which also gets uploaded to Debian.
commits the reviewed updates to {{{svn://svn.debian.org/debtags/tagdb/tags}}},
which gets uploaded to Debian.
Line 290: Line 235:
=== How can I experiment writing applications using debtags? ===

One way to start is reading the `apt-xapian-index`_ introduction and follow to
the next posts that show how to use the index.

For C++, have a look at `libept-dev`_, which allows access to both
debtags and apt package data.

For Python, the `python-debian`_ package has a good ``debtags`` module and
various interesting code examples.

Otherwise, you just access the data files directly: when the ``debtags``
package is installed, you can find them in ``/var/lib/debtags``.

And of course don't forget to subscribe to the `debtags-devel mailing list`_,
where you can ask for help.
=== How can I experiment writing applications that use debtags? ===

The best way is to look at [[http://www.enricozini.org/sw/apt-xapian-index/|apt-xapian-index]], which is accessible from most programming languages via the [[http://xapian.org/|Xapian]] libraries.

You can find a [[http://www.enricozini.org/2007/debtags/apt-xapian-index/|tutorial]] published as a series of blog posts, each post links to the next one.

For Python, the `python-debian`_ package has a ``debtags`` module and various code examples.

Otherwise, you just access the data files directly: when the {{{debtags}}} package is installed, you can find them in {{{/var/lib/debtags}}}.

And of course don't forget to subscribe to the [[http://lists.alioth.debian.org/mailman/listinfo/debtags-devel|debtags-devel]] mailing list, where you can ask for help.
Line 310: Line 250:
There are three main things needing help:

 1. You can take care of the website__, and keep it updated with the news
    that happen in the list.

 2. You can try to use debtags functions (you can now do it from C++, Python
    and Perl!), and ask questions that could then be turned into Doxygen
    comments, HOWTOs, tutorials, FAQs, example code and other forms of
    documentation.

 3. If you have knowledge of some specific field and a twist on categorization,
    you can help `improving the vocabulary`__

__ http://debtags.alioth.debian.org
__ http://debtags.alioth.debian.org/vocabulary.html


Here are other things that would be needed, but might be a bit more difficult:

 * Help maintain library bindings to languages different than C++
 * Help improve the GUI tools
 * Help packaging all the various Debian packages related to Debtags
 * Help writing more C++ test cases for the libraries
 * Help with i18n/l10n issues, to take Debtags on a trip outside of the C
   locale
 * Use libtagcoll1 to bring the Debtags faceted classification approach to
   domains different than Debian packages: think browser bookmarks, multimedia
   repositories, mp3 archives, documentation, launcher menus... the approach
   has big potential in so many fields!


== Older questions ==

=== Aren't debram and debtags duplicating the same effort? ===

Yes, but only up to some point: they started as two parallel projects that
didn't know about each others. Debtags has a more solid `theorical foundation`_,
while debram_ has data for the entire set of packages in Sarge.

Thaddeus H. Black, the author of debram_, intends to converge to debtags and is
an active poster in the `debtags-devel mailing list`_. For this reason
the debram_ package suggests debtags: like saying "yes, I'm ok, but you might
want to look at debtags as well".



.. _adept: http://web.ekhis.org/adept.html
.. _debram: http://packages.debian.org/unstable/admin/debram
.. _debtags-devel mailing list: http://lists.alioth.debian.org/mailman/listinfo/debtags-devel
.. _debtags-edit: http://packages.debian.org/unstable/misc/debtags-edit
.. _libept: http://packages.qa.debian.org/libe/libept.html
.. _libdebtags1-dev: http://packages.debian.org/libdebtags1-dev
.. _libapt-front-dev: http://packages.debian.org/libapt-front-dev
.. _libept-dev: http://packages.debian.org/libept-dev
.. _packagebrowser: http://debian.vitavonni.de/packagebrowser/
.. _packagesearch: http://packagesearch.sourceforge.net/
.. _theorical foundation: http://debtags.alioth.debian.org/faceted.html
.. _Alioth project: http://debtags.alioth.debian.org/
.. _python-debian: http://packages.debian.org/python-debian
.. _apt-xapian-index: http://www.enricozini.org/2007/debtags/apt-xapian-index.html
You can [[http://debtags.alioth.debian.org/todo.html|help tagging packages]].

You can write cool package interfaces using Debtags.

You can help to take care of this wiki.

If you have knowledge of some specific field, you can help improving the vocabulary.

Debtags Frequently Asked Questions

General

What is Debtags?

Debtags is a set of categories to describe Debian packages.

It provides a vocabulary of categories as well as tag information for the packages.

Where can I find information about debtags?

On the Debtags wiki page you can find an index.

Where are debtags used?

Higher-level package managers such as software-center, synaptic and aptitude all support debtags in some way.

If you would like to create a high-level package tool, you could build on apt-xapian-index, a package information backend that supports debtags and much more.

Why aren't debtags integrated with apt?

They probably don't need to be integrated with apt, whose main purpose is to resolve dependencies and figure out what packages to install.

apt-cache, however, does show the tags of a package.

What are future plans and perspectives?

debtags is rather mature, and not much is expected to happen besides regular maintenance.

If you are looking for exciting new work you can look at the Appstream project.

Is there a debtags mailinglist?

Yes: debtags-devel, on Alioth, can be used for anything about debtags, not just development. You are more than welcome to subscribe to it.

Using the data

What is a facet?

A facet is a group of tags which describe the same quality of a package. For more information, see Debtags/FacetedClassification.

What does a notation such as "works-with::image:raster" mean?

It means that the facet (the point of view from which we look at the packages) is works-with, and that the tag (what kind of data this package can handle) is image:raster.

In other words, works-with::image:raster should be read as "Looking at what kind of data a package can handle, this package handles raster images".
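
As a small illustration of the notation, here is a minimal Python sketch that splits a full tag name into its facet and tag parts. The function name split_tag is hypothetical and not part of any debtags tool; the only rule it relies on is the one described above, namely that the facet comes before the first "::".

```python
def split_tag(full_tag):
    """Split a 'facet::tag' name into (facet, tag) at the first '::'."""
    facet, sep, tag = full_tag.partition("::")
    if not sep or not facet or not tag:
        raise ValueError(f"not a facet::tag name: {full_tag!r}")
    return facet, tag


facet, tag = split_tag("works-with::image:raster")
print(facet)  # works-with
print(tag)    # image:raster
```

Note that the single colon in image:raster is left alone: it is simply part of the tag name.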

Is debtags a hierarchy of tags?

No, there are only 2 levels: the facets and the tags. Some tags are written with a colon (:) in them, such as image:raster in the works-with facet, but that is just part of the tag name.

Tag names are ugly. Are there nice descriptions?

Yes, the debtags vocabulary contains short and long descriptions for each tag. In its raw form it can be downloaded at http://debtags.alioth.debian.org/tags/vocabulary.gz

If you have debtags installed in your system, you can also access the tag vocabulary locally at /var/lib/debtags/vocabulary
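
To give an idea of how the descriptions could be looked up from a program, here is a rough Python sketch. It assumes that the vocabulary is a sequence of blank-line-separated stanzas with "Facet:", "Tag:" and "Description:" fields, where the first line of the description is the short one; check this against your local file before relying on it. The sample text and its descriptions are made up for the example.

```python
# Made-up sample in the assumed stanza format; not copied from the real vocabulary.
SAMPLE = """\
Facet: works-with
Description: Works with
 What kind of data the package can work with.

Tag: works-with::image:raster
Description: Raster Image
 Images made of dots, such as photos and scans.
"""


def parse_vocabulary(text):
    """Return {facet or tag name: short description} for each stanza."""
    descriptions = {}
    for stanza in text.split("\n\n"):
        name = None
        short = None
        for line in stanza.splitlines():
            if line.startswith((" ", "\t")):
                continue  # continuation line of the long description
            key, _, value = line.partition(":")
            if key in ("Facet", "Tag"):
                name = value.strip()
            elif key == "Description":
                short = value.strip()
        if name and short:
            descriptions[name] = short
    return descriptions


vocabulary = parse_vocabulary(SAMPLE)
print(vocabulary["works-with::image:raster"])  # Raster Image
```

To try it on real data, pass the contents of /var/lib/debtags/vocabulary instead of SAMPLE.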

Providing new data

Where can I add tags in my packages?

Anyone can add tags to any package, using the interface at http://debtags.alioth.debian.org/todo.html and http://debtags.alioth.debian.org/edit.html

What if I feel like I need a new tag?

Post to debtags-devel@lists.alioth.debian.org asking for it to be added.

Before doing that, please read What makes a tag good for being added to the vocabulary?

Is there a debtags-policy?

No, there hasn't been a need for it yet.

Can I create my own set of tags and add them to all the packages I want?

Yes: see Debtags/CustomTags. This has been used, for example, for parental ratings for games.

What makes a tag good for being added to the vocabulary?

This is a list of rule-of-thumb criteria:

  • It should represent a clear, atomic concept
  • It should have a facet to fit in
  • There should be more than 6 or 7 packages in Debian that can make use of it

Remember that categorisation in Debtags happens with a combination of tags; this means that instead of having a "dvdplayer" tag, we have the combination use::playing, works-with::video, hardware::storage:dvd.

These combinations also make it possible to create reasonable approximations of tags that should not be added because they are not yet used by many packages. For example, the tag devel::lang:brainfuck should not yet be added because the corresponding packages in Debian are too few, but it can be reasonably approximated using combinations of devel::interpreter, devel::compiler and use::entertaining.
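
The idea of categorising by combination can be sketched in a few lines of Python: a package matches when its set of tags contains every tag asked for. The package names and the PACKAGE_TAGS table below are invented for illustration; only the tag names come from this FAQ.

```python
# Hypothetical packages; the tags are the ones used as examples in this FAQ.
PACKAGE_TAGS = {
    "hypothetical-dvd-player": {
        "use::playing", "works-with::video", "hardware::storage:dvd",
    },
    "hypothetical-video-editor": {"use::editing", "works-with::video"},
    "hypothetical-image-editor": {"use::editing", "works-with::image:raster"},
}


def packages_with_all(package_tags, wanted):
    """Return the sorted names of packages that carry every tag in 'wanted'."""
    wanted = set(wanted)
    return sorted(
        name for name, tags in package_tags.items() if wanted <= tags
    )


# Instead of a single "dvdplayer" tag, ask for the combination:
print(packages_with_all(
    PACKAGE_TAGS,
    ["use::playing", "works-with::video", "hardware::storage:dvd"],
))
# ['hypothetical-dvd-player']
```

Dropping one tag from the query widens the result: asking only for works-with::video also returns the hypothetical video editor.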

Do you have tips for tagging?

Justin says:

  • The following tools have been particularly useful for working out what unfamiliar packages are all about.
    • apt-cache (obviously; but nb "apt-cache rdepends")
    • apt-file ("does it put anything in /usr/bin? In init.d/?")
    • debman, in debian-goodies ("what does its man page say?")
    • surfraw (instant lookups of packages.debian.org/foo)

Any reason why there are no license:: tags in debtags?

It has been tried, but we had discouraging replies.

The main problem is that licensing information for a package is too complex to be represented in a single tag.

Please also read this thread in debian-devel for a discussion of other ways to implement this.

Integration in Debian

How can maintainers tag their packages?

They can go to their DDPO page and click on the "Reports: debtags" link to view the Debtags situation of their packages and edit their tags.

When does a new package get tagged?

New packages automatically get tagged using a set of heuristics. That is better than nothing, but obviously it is not enough, so they are marked with special::not-yet-tagged tags in the web interface, waiting for a human to have a look.

Once a human does proper tagging of a package and removes the special::not-yet-tagged tags, the automatic tagging heuristics will not touch it anymore.

How do the tags reach the Debian archives?

Good tags are copied into the Packages file by means of an "override" file, which adds or overrides a field from the control file written by the package maintainers.

Tags are added to the override file after manual review, which should happen about once a month.

You can think of the tags in the override file as the "stable" tags and the ones in the debtags database as the "unstable" tags.

Why don't you just ask the maintainers to tag their own packages?

Giving the maintainers ultimate responsibility over the tags of their packages is difficult for many reasons:

  • Some tags are more easily added by people who are not the maintainer. The maintainer can surely add role::*, made-of::* and interface::*, but some other person could be more qualified for adding works-with::* and accessibility::*.

  • We cannot expect every maintainer to understand all the 600+ tags in the debtags vocabulary.
  • Sometimes we do some reorganization of the tags (for example, moving protocol::icq to protocol::im:icq), and we cannot ask all maintainers to handle those changes.

Is there any plan to drop "Section:" ?

No, because sections are useful for some archive management purposes.

How come there can be different sets of tags in the Packages file and in /var/lib/debtags?

Debtags supports merging different tag sources: for example, http://www.miriamruiz.es/weblog/?p=155 is an effort to provide parental rating for games in Debian, as tags to be downloaded and merged into the system. Tag sources are listed in /etc/debtags/sources.list. This also allows some of us to use the unreviewed tags on Alioth instead of the ones in the Package database.

Having the Packages file as the primary tag storage was never the main idea, although it has turned out to be useful: it lets software such as apt-cache, aptitude and grep-dctrl use tags without being modified to access an extra database.

Why do I see mangled tags in the Packages file like "network::{client,server,service}"?

Those tags are compressed because APT does not like long lines.

You can use debtags dumpavail to feed data to grep-dctrl without the compressed tags.

debtags dumpavail also supports tag expressions, so you can even run commands like:

        debtags dumpavail 'role::program && game::*' | grep-dctrl <options>
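
If you are reading the Packages file yourself, expanding the compressed form is not hard. The Python sketch below assumes that each compressed item has a single {...} group at its end, as in the example from the question, and that groups are not nested; the function names are hypothetical and this is not how debtags itself does it.

```python
import re

# One item of the form prefix{alt1,alt2,...}, e.g. network::{client,server}
_GROUP = re.compile(r"^(?P<prefix>[^{}]*)\{(?P<alts>[^{}]*)\}$")


def split_top_level(field):
    """Split a tag list on the commas that are not inside {...}."""
    items, current, depth = [], [], 0
    for ch in field:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        if ch == "," and depth == 0:
            items.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    items.append("".join(current).strip())
    return [item for item in items if item]


def expand_tags(field):
    """Expand compressed items into a flat list of full tag names."""
    expanded = []
    for item in split_top_level(field):
        match = _GROUP.match(item)
        if match:
            prefix = match["prefix"]
            expanded.extend(
                prefix + alt.strip() for alt in match["alts"].split(",")
            )
        else:
            expanded.append(item)
    return expanded


print(expand_tags("network::{client,server,service}, role::program"))
# ['network::client', 'network::server', 'network::service', 'role::program']
```

For everyday use, debtags dumpavail remains the simpler route, since it gives you the uncompressed tags directly.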

Web interface

How does the web interface work?

It is explained in the web interface itself: go to http://debtags.alioth.debian.org/todo.html or http://debtags.alioth.debian.org/edit.html, choose a package, then click on the help link on top of the page.

Where do the tags added through the web interface get stored?

They are stored in a file on Alioth, which you can download at http://debtags.alioth.debian.org/tags/tags-current.gz

Enrico regularly fetches the updates to that file, does a manual review, then commits the reviewed updates to svn://svn.debian.org/debtags/tagdb/tags, which gets uploaded to Debian.

Development

How can I experiment writing applications that use debtags?

The best way is to look at apt-xapian-index, which is accessible from most programming languages via the Xapian libraries.

You can find a tutorial published as a series of blog posts; each post links to the next one.

For Python, the python-debian package has a debtags module and various code examples.

Otherwise, you can just access the data files directly: when the debtags package is installed, you can find them in /var/lib/debtags.
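
As a starting point for reading the data files directly, here is a minimal Python sketch. It assumes a plain-text tag database with one line per package in the form "package: tag1, tag2, ..."; both this line format and the exact file name under /var/lib/debtags are assumptions to verify on your system. The sample data and function names are made up.

```python
import io

# Made-up sample in the assumed "package: tag1, tag2" line format.
SAMPLE = io.StringIO(
    "hypothetical-game: role::program, game::arcade\n"
    "hypothetical-editor: role::program, use::editing\n"
)


def read_package_tags(lines):
    """Return {package: set of tags} from 'package: tag1, tag2' lines."""
    db = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        package, _, tag_list = line.partition(":")
        tags = {tag.strip() for tag in tag_list.split(",") if tag.strip()}
        db.setdefault(package.strip(), set()).update(tags)
    return db


def packages_by_tag(db):
    """Invert {package: tags} into {tag: set of packages}."""
    index = {}
    for package, tags in db.items():
        for tag in tags:
            index.setdefault(tag, set()).add(package)
    return index


db = read_package_tags(SAMPLE)
index = packages_by_tag(db)
print(sorted(index["role::program"]))
# ['hypothetical-editor', 'hypothetical-game']
```

To try it on real data, pass an open file from /var/lib/debtags in place of SAMPLE.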

And of course don't forget to subscribe to the debtags-devel mailing list, where you can ask for help.

How can I help?

You can help tag packages.

You can write cool package interfaces using Debtags.

You can help to take care of this wiki.

If you have knowledge of some specific field, you can help improve the vocabulary.