Problems to work on in Debian Science

At the Debian Science round table at DebConf 10, some problems were discussed that did not find an immediate solution. This page summarises that discussion so that it can be continued.

Make smaller tasks

Probably the simplest way to handle a task that has grown to contain a lot of packages is to split it. The main argument against such a split is that some packages might fit into all of the new tasks and would thus need to be mentioned in each of them. The answer has been given on the mailing lists several times: there is no reason why a package should not be mentioned in more than one task. We are not doing an exclusive classification; we are providing packages for certain tasks. If a package is useful for more than one task, it makes perfect sense to mention it in all of these tasks.

Start a new Blend

Andreas Tille once proposed creating a general Debian Science Blend as an umbrella for those sciences which do not have enough supporters to run a specific Blend. While chemistry is covered by DebiChem and microbiology is part of Debian Med, other sciences have no dedicated Blend of their own and have remained under this umbrella up to now. While this is perfectly fine, splitting off a Debian Mathematics and a Debian Physics Blend should be considered. Both sciences could build more fine-grained tasks, several of which might have quite similar content. The main question is whether there are enough supporters for such an attempt, because the success of a Blend depends heavily on the people who are involved. The experience of the Debian Med and DebiChem people has shown that they all maintain a strong connection to Debian Science anyway, so there is no real danger of simply "losing" the supporters of Debian Mathematics or Debian Physics from the general sciences, because there are several common topics (see the other sections on this wiki page).

Make better use of DebTags and find a better way to visualise DebTags

DebTags is another way to categorise packages in Debian - there are even people who regard DebTags as a technique "competing" with metapackages in Debian. In fact, installing packages via ept-cache/axi-cache gives you functionality similar to installing a metapackage. Both methods have pros and cons, but these should not be discussed here, and a Blend is about more than installing some metapackages.

If Debian Science manages to define a reasonable set of DebTags that enables a sensible separation of the packages within one task, there should be ways to visualise this, for instance on the tasks pages. One option that comes to mind is dividing a tasks page into sections according to DebTags.
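The sectioning idea can be illustrated with a minimal Python sketch. It assumes the tags of each package in a task are already available as a plain mapping; the package names are invented, the function name `section_by_tag` is hypothetical, and the choice of the `field::` facet is only an assumption about which tags would be useful for sectioning.

```python
from collections import defaultdict


def section_by_tag(package_tags, facet="field::"):
    """Group package names under each of their tags in the given facet.

    Packages without any tag in that facet end up in an "untagged" section.
    """
    sections = defaultdict(list)
    for package, tags in sorted(package_tags.items()):
        matching = [tag for tag in tags if tag.startswith(facet)]
        for tag in matching or ["untagged"]:
            sections[tag].append(package)
    return dict(sections)


# Hypothetical packages and tags, for illustration only.
example_task = {
    "pkg-a": {"field::physics", "role::program"},
    "pkg-b": {"field::mathematics", "field::physics"},
    "pkg-c": {"role::program"},
}

for tag, packages in sorted(section_by_tag(example_task).items()):
    print(tag, packages)
```

A package carrying several tags of the facet appears in several sections, which matches the non-exclusive view of tasks described earlier on this page.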

As a side note, DebTags might also be useful to verify whether all interesting packages are really mentioned in the tasks files. Not all maintainers of scientific software know about Debian Science, and some might intentionally ignore it, so there might be software in Debian which is interesting for some task but is not listed in it. Running a DebTags-based search and comparing the result with the content of the corresponding task could therefore be a good idea.
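The comparison itself is a simple set difference. The following Python sketch assumes that the result of a DebTags search and the package list of a task have already been obtained by some other means; the function name `missing_from_task` and all package names are hypothetical.

```python
def missing_from_task(tagged_packages, task_packages):
    """Return packages found by the tag search that the task does not list."""
    return sorted(set(tagged_packages) - set(task_packages))


# Hypothetical input data, for illustration only.
found_by_tag_search = ["pkg-a", "pkg-b", "pkg-d"]
listed_in_task = ["pkg-a", "pkg-c"]

print(missing_from_task(found_by_tag_search, listed_in_task))
```

The output is only a list of candidates; whether a package really belongs in the task remains a decision for the people maintaining it.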

Pursuing the above proposals will additionally require at least a review of the current DebTags and tagging practices for relevant packages. To facilitate the discussion of design and implementation details for these proposals, a separate subpage ProblemsToWorkOn/DebTags has been created.

Giving credit to upstream

For the moment there are two ways to give upstream some credit on the tasks pages: on the one hand we publish popcon data, on the other hand we provide a Registration URL in case such a thing exists.

(FIXME: Other methods were discussed in the BOF - please add them here to complete the list.)

Enable pinning to defined versions of programs

One problem of using software in scientific research is that you sometimes need to reproduce absolutely identical results from the same data, and this is sometimes only possible with identical versions of a certain piece of software. There are two approaches which might address this problem.

Providing and using test suites

We should try to convince upstream to provide a reasonable set of test data which can be processed during the package build and must reproduce a corresponding result set. This might be approached by calling a (to be written) dh_runtestsuite (or something like this), which can be switched on and off by some variable (to reduce stress on weak architecture autobuilders) and which calls a maintainer-provided script debian/<pkg>.testsuite.
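The core of such a check can be sketched in Python under a few assumptions: the test program writes its result to standard output, the expected result is stored as a SHA-256 checksum, and the switch is an environment variable. The function name `run_testsuite` and the variable name `SKIP_TESTSUITE` are hypothetical and are not part of any existing tool.

```python
import hashlib
import os
import subprocess
import sys


def run_testsuite(command, expected_sha256, env=None):
    """Run a test command and compare a checksum of its output.

    Returns None if the check is switched off, otherwise True or False.
    """
    env = os.environ if env is None else env
    # Hypothetical switch, e.g. for weak architecture autobuilders.
    if env.get("SKIP_TESTSUITE") == "1":
        return None
    completed = subprocess.run(command, capture_output=True, check=True)
    return hashlib.sha256(completed.stdout).hexdigest() == expected_sha256


# Stand-in for a real program processing upstream test data.
demo_command = [sys.executable, "-c", "import sys; sys.stdout.write('42')"]
demo_expected = hashlib.sha256(b"42").hexdigest()
print(run_testsuite(demo_command, demo_expected))
```

Comparing checksums only works if the program output is bit-for-bit reproducible; results containing timestamps or floating point noise would need a more tolerant comparison.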

snapshot.debian.org

We should consider using http://snapshot.debian.org/ to pin packages to a certain version which is not necessarily available in any current release (neither stable, nor testing or unstable). Even http://backports.debian.org/ would not really help with the problem above, because the intent of backports is to provide the latest versions to users of stable, whereas scientists rather want to use older versions of a program in a more recent installation.

This suggestion has a lot of technical implications (for example, whether the snapshot version will work with the installed libraries), and there is no clear suggestion how to technically establish the pinning to a certain version. For the moment it should be discussed whether this idea makes sense at all and would really help to solve the problem above.
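As input for that discussion, one possible direction (not an agreed solution) is to combine a dated snapshot archive with an APT preferences entry. The Python sketch below only renders the two text fragments; the helper names, the package name, the version and the timestamp are hypothetical, and library compatibility is not addressed at all.

```python
def snapshot_source_line(timestamp, suite="unstable", component="main"):
    """Render a sources.list line for a dated snapshot of the Debian archive."""
    return (
        "deb http://snapshot.debian.org/archive/debian/"
        f"{timestamp}/ {suite} {component}"
    )


def pin_stanza(package, version, priority=1001):
    """Render an APT preferences stanza pinning a package to one version.

    A priority above 1000 lets APT install the version even as a downgrade.
    """
    return (
        f"Package: {package}\n"
        f"Pin: version {version}\n"
        f"Pin-Priority: {priority}\n"
    )


# Hypothetical package, version and snapshot date, for illustration only.
print(snapshot_source_line("20100801T000000Z"))
print(pin_stanza("some-science-tool", "1.2.3-1"))
```

In practice APT may also have to be told to accept the expired Release file of an old snapshot, which is a further detail such a mechanism would need to settle.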

DebTags

There is a separate page to coordinate the DebTags effort.

Duplicated names

Name conflicts between special programs occur frequently, and Debian Policy requires that an executable be renamed in such a case. Since scientists expect some programs to have certain names (and some of their scripts might rely on these names), we could provide the following solution in the Blends framework:

  1. install the binary with the original name to /usr/lib/blends/<blendname>/bin

  2. symlink to this location with the non-conflicting name in /usr/bin

  3. metapackage <blendname>-common carries a script /etc/profile.d/<blend>-setpath.sh which prepends the dir above to the PATH variable
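The three steps above can be modelled in a short Python sketch that computes the resulting file locations and the effect of prepending the directory to PATH. The function names, the blend name and the program names are hypothetical; a real implementation of step 3 would be a shell script, so this only illustrates the intended logic.

```python
import posixpath


def blend_layout(blend, original_name, renamed):
    """Compute the locations used in steps 1 to 3 for one renamed program."""
    bindir = posixpath.join("/usr/lib/blends", blend, "bin")
    return {
        "bindir": bindir,
        "binary": posixpath.join(bindir, original_name),    # step 1
        "symlink": posixpath.join("/usr/bin", renamed),     # step 2
        "profile_script": f"/etc/profile.d/{blend}-setpath.sh",  # step 3
    }


def prepend_path(path, directory):
    """Put directory first in a PATH string, dropping older duplicates."""
    parts = [p for p in path.split(":") if p and p != directory]
    return ":".join([directory] + parts)


# Hypothetical blend and program names, for illustration only.
layout = blend_layout("science", "foo", "foo-science")
print(layout["binary"], "<-", layout["symlink"])
print(prepend_path("/usr/local/bin:/usr/bin:/bin", layout["bindir"]))
```

Dropping duplicates keeps PATH from growing when the profile script is sourced more than once, for example in nested login shells.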