DebTags and Debian Science

The possibilities opened up by a more extensive use of DebTags should be explored. Enhancing task pages -- for example by sectioning task pages according to DebTags -- is one possibility; another would be Keeping Task Pages Up-to-date through regular comparison of tagged packages with task files. Both possibilities were first proposed on the ProblemsToWorkOn page.

A precondition for both would be to Review the Current Status of DebTags and tagging practices, and to propose amendments where necessary. This would be beneficial regardless of whether the above goals are pursued. The review should be followed by formulating Best Tagging Practices and then Performing QA on the current packages.

Plan

The following is a proposed course of action:

1. Review the Current Status of DebTags

The current tags and tagging practices should be reviewed by Debian Science members to identify any deficiencies. Problems, proposed solutions and counterpoints should simply be collected below, to avoid excessive formality.

Once a consensus has been reached, we should communicate our results to the DebTags team.

2. Formulate "Tagging Best Practices"

Devise a general strategy for tagging, and add a section to the policy page.

3. Perform QA

Post-review (and after any necessary changes have been implemented), we should perform QA on packages according to the newly formulated tagging practices.

4. Enhance Task Pages

TBD (volunteers)?

5. Keeping Task Pages Up-to-date

TBD (volunteers)?


Implementation

1. Review the Current Status of DebTags

The following is a list of potential issues (add at will).


1.1 Representation of Debian Science Tasks

How do we represent the various tasks in the form of tags?

Proposal #1: The field:: facet mirrors most, but not all, of the tasks. Add the following tags:

The tasks astronomy-dev, engineering-dev and meteorology-dev can be represented by combining their field:: with the devel:: facet. Tasks typesetting and viewing are represented by use::typesetting and use::viewing, respectively. Task image analysis is represented by use::analysis and works-with::images.

Proposal #2: Add the tags use::data-acquisition and use::numerical-computation to the use:: facet. These topics are more indicative of an activity than a field.

Under proposals #1 and #2, there would be no direct (1:1) mapping between tasks and a DebTags facet. This shouldn't be much of a problem, though: each mapping could simply be defined by a set of tags, for example:

task mathematics

field::mathematics

task mathematics-dev

field::mathematics AND devel::library

task numerical-computation

use::numerical-computation
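
The mappings above could be expressed as sets of tags that a package must carry in full. The following is only a minimal sketch in Python, assuming that a package's tags are available as a plain set of strings; the names TASK_TAGS and tasks_for are hypothetical, and the package tag sets at the end are invented for illustration.

```python
# Hypothetical mapping: task name -> tags a package must carry (AND semantics).
TASK_TAGS = {
    "mathematics": {"field::mathematics"},
    "mathematics-dev": {"field::mathematics", "devel::library"},
    # use::numerical-computation is the tag proposed on this page (Proposal #2).
    "numerical-computation": {"use::numerical-computation"},
}


def tasks_for(package_tags):
    """Return the sorted names of all tasks whose required tags are all present."""
    tags = set(package_tags)
    return sorted(
        task for task, required in TASK_TAGS.items()
        if required <= tags  # subset test: every required tag is present
    )


# Illustrative tag sets, not taken from real packages.
print(tasks_for({"field::mathematics", "devel::library"}))
print(tasks_for({"use::numerical-computation", "interface::commandline"}))
```

With plain AND semantics, a package that matches mathematics-dev also matches mathematics; whether that is desirable would still need to be decided.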


This problem is relevant to DebTags in general.

There are two kinds of relationships between packages and binary data:

  1. Packages working with data ("App Package")

  2. Packages providing data ("Data Package")

App Packages can currently be classified using the following facets:

works-with-format::
A specific binary format the package supports (xml, zip, png)
works-with::
A generic type for data the package can work with (db, graphs, image:vector)

Data packages can currently be classified using the following facet:

made-of::
The focus is on the binary format of the data -- i.e., this complements works-with-format:: for App Packages

Issue #1: Data packages currently lack the means to classify the type of their contents. Such a classification could be either generic (images, ...) or domain-specific ('maps', 'wallpapers', 'dna-sequences').

Proposal #1: Add a new facet data:: to indicate the kind of data the package contains. There are two possible approaches: either a generic type (complementing works-with:: for App Packages, which would then have to be extended), or a more domain-specific type, e.g.:

For a Data Package, the latter approach appears more worthwhile, as the domain-specific content of the package is its defining characteristic. This level of detail is currently not present for App Packages, but it is probably not necessary either, as they are usually approached via field::.

Counterpoint: This would result in an asymmetry between App Package tags and Data Package tags, which may be counterintuitive: works-with-format:: would complement made-of::, but works-with:: (generic) would not complement data:: (specific).

Alternative to Proposal #1: A simpler, though less accurate, alternative to proposal #1 might be to just add a tag made-of::data and use this tag in combination with field::, e.g.:


1.3 Facet biology::

Issue #1: The facet biology:: appears both misplaced and redundant to field::biology.

Proposal #1: Remove this facet, and migrate current tags as follows:


1.4 Facet science::

Issue #1: This facet combines the aspects of two other facets - field:: and use:: - which violates the facet approach (“[...] mutually exclusive aspects”).

Proposal #1:

Note that field:: qualifications are intentionally not listed in the above list. A plotting tool would usually be tagged just as use::plotting; were it to provide its functionality only to a specific area, it could be further qualified by a field:: tag.


1.5 Other Smaller issues

Proposal #1: Remove tag field::medicine:imaging, and replace its occurrence with field::medicine and use::imaging.
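
The replacement described in this proposal could be applied mechanically to a package's tag set. The following is a rough sketch in Python, not an existing DebTags tool; the names REPLACEMENTS and migrate are hypothetical, and the example tag set is invented for illustration.

```python
# Hypothetical replacement table: obsolete tag -> tags that take its place.
REPLACEMENTS = {
    "field::medicine:imaging": {"field::medicine", "use::imaging"},
}


def migrate(tags):
    """Return a new tag set with every obsolete tag swapped for its replacements."""
    result = set()
    for tag in tags:
        # Tags without an entry in the table are kept unchanged.
        result |= REPLACEMENTS.get(tag, {tag})
    return result


# Illustrative tag set, not taken from a real package.
print(sorted(migrate({"field::medicine:imaging", "interface::x11"})))
# prints ['field::medicine', 'interface::x11', 'use::imaging']
```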

Proposal #2: Add tag implemented-in::octave.


2. Formulate "Tagging Best Practices"

TBD


3. Perform QA

TBD


4. Enhance Task Pages

TBD


5. Keeping Task Pages Up-to-date

TBD
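
As a starting point for discussion, the comparison of tagged packages with task files mentioned in the introduction could be reduced to two set differences. This is only a sketch in Python: it assumes that the package names listed in a task file and the package names matching the task's tags have already been extracted by some other means, and the name compare_task and the package names are hypothetical.

```python
def compare_task(task_packages, tagged_packages):
    """Compare a task's package list with the packages matching its tags.

    Returns a pair of sorted lists:
    - packages that carry the tags but are not listed in the task file
    - packages listed in the task file that lack the expected tags
    """
    listed = set(task_packages)
    tagged = set(tagged_packages)
    return sorted(tagged - listed), sorted(listed - tagged)


# Invented package names, for illustration only.
missing_from_task, missing_tags = compare_task(
    {"pkg-a", "pkg-b"},  # packages listed in the task file
    {"pkg-b", "pkg-c"},  # packages matching the task's tags
)
print("candidates to add to the task:", missing_from_task)
print("packages to review for tagging:", missing_tags)
```

Both result lists are only suggestions for a human reviewer; a mismatch can mean that either the task file or the tagging needs to be corrected.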