DebTags and Debian Science
The possibilities opened up by an extended use of DebTags should be explored. Enhancing task pages -- for example by sectioning task pages according to DebTags -- is one possibility; another would be Keeping Task Pages Up-to-date through regular comparison of tagged packages with task files. These possibilities were first proposed in the ProblemsToWorkOn page.
A precondition for these would be to Review the Current Status of DebTags and tagging practices, and propose amendments where necessary. This would be beneficial regardless of whether the above goals are pursued. This should be followed by formulating Best Tagging Practices and then Performing QA on the current packages.
Plan
The following is a proposed course of action:
1. Review the Current Status of DebTags
The current tags and tagging practices should be reviewed by Debian Science members to identify any possible deficiencies. To avoid excessive formality, problems, proposed solutions and counterpoints should simply be collected below.
Once a consensus has been reached, we should communicate our results to the DebTags team.
2. Formulate "Tagging Best Practices"
Devise a general strategy for tagging, and add a section to the policy page.
3. Perform QA
Post-review (and after any necessary changes have been implemented), we should perform QA on packages according to the newly formulated tagging practices.
4. Enhance Task Pages
TBD (volunteers)?
5. Keeping Task Pages Up-to-date
TBD (volunteers)?
Implementation
1. Review the Current Status of DebTags
The following is a list of potential issues (add at will)
1.1 Representation of Debian Science Tasks
How do we represent the various tasks in the form of tags?
Proposal #1: The field:: facet mirrors most, but not all, of the tasks. Add the following tags:
field::engineering
field::cognitive-neuroscience
field::machine-learning
field::robotics
The tasks astronomy-dev, engineering-dev and meteorology-dev can be represented by combining their field:: with the devel:: facet. Tasks typesetting and viewing are represented by use::typesetting and use::viewing, respectively. Task image analysis is represented by use::analysis and works-with::images.
Proposal #2: Add the tags use::data-acquisition and use::numerical-computation to the use:: facet. These topics are more indicative of an activity than a field.
Under proposals #1 and #2, there would be no direct (1:1) mapping between tasks and a DebTags facet. This shouldn't be much of a problem, though: each mapping could simply be defined by a set of tags, for example:
- task mathematics
field::mathematics
- task mathematics-dev
field::mathematics AND devel::library
- task numerical-computation
use::numerical-computation
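The mappings above could be sketched roughly as follows. This is only an illustration: the TASK_TAGS table and the tasks_for helper are hypothetical names, and the example package's tag set is made up. A package is assumed to belong to a task when it carries every tag in that task's set.

```python
# Hypothetical table: each task is defined by the set of tags a package
# must carry (logical AND), following the three examples above.
TASK_TAGS = {
    "mathematics": {"field::mathematics"},
    "mathematics-dev": {"field::mathematics", "devel::library"},
    "numerical-computation": {"use::numerical-computation"},
}


def tasks_for(package_tags):
    """Return the sorted names of all tasks whose tag set is contained in package_tags."""
    tags = set(package_tags)
    return sorted(task for task, required in TASK_TAGS.items() if required <= tags)


# Made-up tag set for an imaginary mathematics library package.
example = {"field::mathematics", "devel::library", "implemented-in::c"}
print(tasks_for(example))  # ['mathematics', 'mathematics-dev']
```

Note that with plain tag sets a package matching mathematics-dev also matches mathematics; whether that is wanted, or whether the mathematics mapping should exclude devel::library, would have to be decided as part of the mapping definition.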
1.2 Facets related to binary data
This problem is relevant to DebTags in general.
There are two kinds of relationships between packages and binary data:
Packages working with data ("App Package")
Packages providing data ("Data Package")
App Packages can currently be classified using the following facets:
- works-with-format::
- A specific binary format the package supports (xml, zip, png)
- works-with::
- A generic type for data the package can work with (db, graphs, image:vector)
Data packages can currently be classified using the following facet:
- made-of::
Focus is on the binary format of the data -- i.e., this complements works-with-format:: for App Packages
Issue #1: Data packages currently lack the means to classify the type of their contents. Such a classification could be either generic (images, ...) or domain-specific ('maps', 'wallpapers', 'dna-sequences').
Proposal #1: add a new facet data:: to indicate the kind of data the package contains. There are two possible approaches: either a generic type (complementing works-with:: for App Packages, which would then have to be extended), or a more domain-specific type, eg:
data::biology
data::biology:nuceleic-acids
data::games:maps
data::images:wallpapers
For a Data Package, the latter approach appears more worthwhile, as the domain-specific content of the package is its defining characteristic. This level of detail is currently not present for App Packages, but it is also probably not necessary, as they are usually approached via field::.
Counterpoint: This would result in an asymmetry between App Package tags and Data Package tags, which may be counterintuitive: works-with-format:: would complement made-of::, but works-with:: (generic) would not complement data:: (specific).
Alternative to Proposal #1: A simpler, though less accurate, alternative to proposal #1 might be to just add a tag made-of::data and use this tag in combination with field::, e.g.:
foopackage: field::biology, made-of::data
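Under this alternative, finding the Data Packages of a given field reduces to requiring two tags at once. A minimal sketch, assuming tag data is available as lines in the same "package: tag, tag" form as the example above; the helper names and the sample lines other than foopackage are made up for illustration:

```python
def parse_tag_line(line):
    """Split 'package: tag1, tag2' into (package, set_of_tags)."""
    # Split at the first colon only; the '::' inside tags stays intact.
    package, _, tag_list = line.partition(":")
    tags = {t.strip() for t in tag_list.split(",") if t.strip()}
    return package.strip(), tags


def data_packages(lines, field):
    """Return packages tagged with both made-of::data and field::<field>."""
    wanted = {"made-of::data", "field::" + field}
    result = []
    for line in lines:
        package, tags = parse_tag_line(line)
        if wanted <= tags:
            result.append(package)
    return result


lines = [
    "foopackage: field::biology, made-of::data",
    "barviewer: field::biology, use::viewing",   # an App Package, not data
    "bazmaps: made-of::data",                    # data, but no field:: tag
]
print(data_packages(lines, "biology"))  # ['foopackage']
```

The loss of accuracy mentioned above shows up here as well: the query can say that foopackage holds biology data, but not what kind.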
1.3 Facet biology::
Issue #1: The facet biology:: appears both misplaced and redundant to field::biology.
Proposal #1: Remove this facet, and migrate current tags as follows:
biology::emboss:
either suite::emboss or none at all
biology::format:aln:
create a tag works-with-format::aln
biology::format:fasta:
create a tag works-with-format::fasta
biology::nuceleic-acids, biology::peptidic:
see 1.2 Facets related to binary data above
1.4 Facet science::
Issue #1: This facet combines the aspects of two other facets - field:: and use:: - which violates the facet approach (“[...] mutually exclusive aspects”).
Proposal #1:
Migrate the activity aspect to the use:: facet (creating a new tag when necessary). The science:: aspect is already represented by the field:: facet.
science::bibliography:
create a tag use::bibliography
science::modelling:
create a tag use::modeling
science::publishing:
create a tag use::publishing
science::plotting:
create a tag use::plotting
science::calculation:
replace occurrences with use::calculating
science::visualization:
replace occurrences with use::viewing
science::data-acquisition:
replace occurrences with use::data-acquisition
Note that field:: qualifications are intentionally omitted from the above list. A plotting tool would usually be tagged just as use::plotting; were it to provide its functionality only to a specific area, it could be further qualified by a field:: tag.
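The science:: migration could be applied to a package's tag list roughly as follows. This is a sketch only: the replacement table is taken from the list above, while the migrate_tags helper and the example tag list are hypothetical.

```python
# Replacement table from Proposal #1 (science:: tag -> use:: tag).
SCIENCE_TO_USE = {
    "science::bibliography": "use::bibliography",
    "science::modelling": "use::modeling",
    "science::publishing": "use::publishing",
    "science::plotting": "use::plotting",
    "science::calculation": "use::calculating",
    "science::visualization": "use::viewing",
    "science::data-acquisition": "use::data-acquisition",
}


def migrate_tags(tags):
    """Replace science:: tags per the table; leave all other tags untouched.

    Building a set removes the duplicate that arises when a package
    already carries the use:: tag a science:: tag is replaced with.
    """
    return sorted({SCIENCE_TO_USE.get(tag, tag) for tag in tags})


# Made-up tag list for an imaginary plotting tool.
old = ["science::plotting", "field::physics", "use::viewing", "science::visualization"]
print(migrate_tags(old))  # ['field::physics', 'use::plotting', 'use::viewing']
```

Existing field:: tags pass through unchanged, in line with the note above.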
1.5 Other Smaller Issues
Proposal #1: Remove tag field::medicine:imaging, and replace its occurrences with field::medicine and use::imaging.
Proposal #2: Add tag implemented-in::octave
2. Formulate "Tagging Best Practices"
TBD
3. Perform QA
TBD
4. Enhance Task Pages
TBD
5. Keeping Task Pages Up-to-date
TBD