Differences between revisions 6 and 7
Revision 6 as of 2009-02-02 02:22:19
Size: 7741
Editor: RussAllbery
Comment: Update projects, add Adam
Revision 7 as of 2009-02-14 04:33:39
Size: 5436
Editor: RussAllbery
Comment: Edit the DebConf7 notes to remove things that are out of date
Line 38: Line 38:
Line 39: Line 40:
=== DebConf7 Discussion ===
Accumulated notes from the Deb{{{}}}Conf7 lintian BOF. Russ promised to, and still intends to, turn this into a post, but hasn't had time to do so yet. Please feel free to re-edit, format into a more coherent structure, etc.

This is a somewhat random collection of additional notes about Lintian, mostly from the Deb{{{}}}Conf7 Lintian BOF.
Line 48: Line 49:
 * Biggest current limitation is squashing all information about checks into a three-tier system of error, warning, and info. In practice, people treat error and warning as largely interchangeable and many people never look at info. lintian instead needs to classify tags by source (Policy, Developer's Reference, obvious brokenness, documentation of tools used, aesthetics, etc.), severity (package won't work, Policy must/should/recommended, will be confusing, best practice, good style, minor nit, etc.), and certainty (lintian is sure this problem is really present, mostly sure, many false positives, wild guess, experimental). Some of the machinery for things like this is already present, but this is a significant amount of work and will require significant restructuring of the tag handling.
 * lintian's architecture uses an unpack script to get information about the source or binary package (possibly including unpacking it entirely) and then collect scripts to analyze the results of the unpack script and store that information in various files. All this information is kept on disk and Lintian reads and writes small files to pass information between the unpack and collect scripts and the checks, although increasingly the Lintian::Collect modules are used to cache that information so that it's not read more than once.
Line 50: Line 51:
 * lintian tag long descriptions need editing for consistency, check references, make sure that everything that should have a reference does, remove double-apostrophe quotes in favor of ASCII quotes, and so forth. The docbook manual also needs attention, and the man pages should probably be re-written in POD for better *roff output (since lintian is written in Perl anyway).

 * lintian's architecture uses an unpack script to get information about the source or binary package (possibly including unpacking it entirely) and then collect scripts to analyze the results of the unpack script and store that information in various files. All this information is kept on disk and the lintian code is full of reading and writing small files to pass information between the unpack and collect scripts and the checks. The checks are now all run directly in lintian's process rather than as separate scripts and the unpack and collect scripts could be as well, which means that some things could be passed in memory for a possible speed improvement. It's also not clear if anything uses the lower levels of unpack; in theory, you can run lintian with a lower unpack level on large packages and it will skip the full unpack and run only the tests it can, but in practice it's not clear that anyone uses this or that the additional complexity is worth it.

 * ''Foreword: linda was dropped in March 2008, Bug:469039''. [[BR]]lintian is the main package checker. linda is still maintained, but only changes at a fairly slow rate. It's unfortunate that there are two separate programs for doing the same thing with slightly different checks, but the internal architecture is quite a bit different and neither maintainer has the time or much interest in trying to merge the programs. linda is written in Python, which some prefer. lintian probably could run Python checks, although it's gaining additional benefit from having everything in Perl so that it can all be loaded into the same Perl interpreter; eventually, information can be shared that way between modules.
 * It's also not clear if anything uses the lower levels of unpack; in theory, you can run lintian with a lower unpack level on large packages and it will skip the full unpack and run only the tests it can, but in practice it's not clear that anyone uses this or that the additional complexity is worth it.
Line 58: Line 55:
 * It would be great to have DAK run lintian and reject packages on the basis of lintian diagnosis, but this depends on the restructuring and reclassification of tags since DAK could only reject for things that lintian is absolutely certain about and which are from Policy or some similar authoritative source.
 * There has been some discussion of having DAK run lintian and reject packages on the basis of lintian diagnosis. Lintian now has support for only reporting a specific set of tags.

Lintian

Infrastructure

Interacting with the team

Usual roles

  • Russ Allbery (rra) commits patches, fixes bugs, does infrastructure development, and watches over the version on gluck.debian.org for http://lintian.debian.org/

  • Adam D. Barratt commits patches, fixes bugs, and does infrastructure development
  • Frank Lichtenheld (djpig) does bug fixing and commits
  • Marc 'HE' Brockschmidt does some bug fixing and commits and was the Google Summer of Code mentor
  • Colin Watson keeps an eye on Ubuntu integration and man page checks

Task description

Lintian is a comprehensive package checker for Debian packages. It primarily tries to check for Debian Policy violations and violations of various sub-policies, but it also checks for best practices, common mistakes, and problems that maintainers like to catch before uploads.

Lintian by design only performs checks internal to a single package which can be done without external information other than Lintian itself. This allows stability of output: Lintian will produce the same report for a given package and a given version of Lintian each time it's run. Cross-repository checks and checks for consistency and linkages between packages should be done by other tools.

The team also maintains http://lintian.debian.org/ as a presentation of the results of running Lintian on the entire archive. Currently, due to limitations in both disk space and available CPU, as well as limitations in Lintian's archive-wide configuration abilities, these checks are only done on the main category of the archive and only for arch: i386 and arch: all packages.

Get involved

Easy Projects

  • Pick a fully specified wishlist bug for a Lintian check and implement it.
  • Look at t/COVERAGE and write a test case for a tag that's not currently tested.
  • Convert the Lintian man pages to POD.
  • Edit all tag descriptions for consistency and add appropriate markup.

Larger Projects

  • Rewrite the Lintian manual. It should probably be converted to DocBook and is somewhat out of date.

  • Add documentation of what Lintian extracts into the laboratory. A rewritten Lintian manual would be a good place to put this information.

Hard Projects

  • Help the team with refactoring the existing code to share more common modules. (Please discuss this on the team list before starting. RussAllbery has some ideas for what could be done.)

More stuff

This is a somewhat random collection of additional notes about Lintian, mostly from the DebConf7 Lintian BOF.

  • Role of lintian: package checker without external dependencies besides lintian (and its Depends) and the package. Does not look at other packages, the rest of the archive, external web sites like the BTS, and so forth, so that if neither lintian nor the package has changed, the results will be the same as the last time the check was run.
  • lintian wontfix bugs accumulate ideas for archive-wide checks that are outside lintian's scope but may be interesting for other tools.
  • lintian has grown organically and has a wide variety of different Perl coding styles and structure for its tests. It could use a cleanup. Russ removes unused code and refactors from time to time, but hasn't made any comprehensive changes.
  • lintian's architecture uses an unpack script to get information about the source or binary package (possibly including unpacking it entirely) and then collect scripts to analyze the results of the unpack script and store that information in various files. All this information is kept on disk and Lintian reads and writes small files to pass information between the unpack and collect scripts and the checks, although increasingly the Lintian::Collect modules are used to cache that information so that it's not read more than once.
  • It's also not clear if anything uses the lower levels of unpack; in theory, you can run lintian with a lower unpack level on large packages and it will skip the full unpack and run only the tests it can, but in practice it's not clear that anyone uses this or that the additional complexity is worth it.
  • Manoj has some ideas for more closely integrating package checking with Policy and putting checking pseudocode into Policy as part of the Policy constraint. Some of the program analysis and proof languages may be useful for doing something like this, but it needs someone interested in this sort of thing to work on it.
  • There has been some discussion of having DAK run lintian and reject packages on the basis of lintian diagnosis. Lintian now has support for only reporting a specific set of tags.
  • Anibal suggested adding an extra column called lintian to the Debian Developer's Packages Overview page.
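
As a rough illustration of the DAK idea in the notes above, the following Python sketch shows how an archive tool might decide to reject an upload based on a fixed set of tags. It is only a sketch under stated assumptions: it post-filters Lintian's usual one-line output (`E: package: tag extra`) rather than using Lintian's own tag-selection support, the tag names are hypothetical, and `rejecting_findings` is an invented helper, not part of Lintian or DAK.

```python
import re
from typing import List, NamedTuple, Optional, Set

# Assumed shape of one Lintian output line:
#   "<code>: <package>[ source|udeb]: <tag>[ <extra>]"
LINE_RE = re.compile(
    r"^(?P<code>[A-Z]): (?P<package>\S+)(?: (?:source|udeb))?: "
    r"(?P<tag>\S+)(?: (?P<extra>.*))?$"
)


class Finding(NamedTuple):
    code: str      # E, W, I, ...
    package: str
    tag: str
    extra: str     # free-form details, possibly empty


def parse_line(line: str) -> Optional[Finding]:
    """Parse one output line; return None for notes and unrecognised lines."""
    match = LINE_RE.match(line.strip())
    if match is None or match.group("code") == "N":
        return None
    return Finding(
        match.group("code"),
        match.group("package"),
        match.group("tag"),
        match.group("extra") or "",
    )


def rejecting_findings(output: str, reject_tags: Set[str]) -> List[Finding]:
    """Return the findings whose tag is in the (hypothetical) reject list."""
    parsed = (parse_line(line) for line in output.splitlines())
    return [f for f in parsed if f is not None and f.tag in reject_tags]


if __name__ == "__main__":
    # Hypothetical tag names, used only to exercise the filter.
    sample = "\n".join([
        "E: hello: hypothetical-fatal-tag usr/bin/hello",
        "W: hello source: hypothetical-style-tag",
        "N: 2 tags overridden",
        "I: hello: hypothetical-info-tag",
    ])
    for finding in rejecting_findings(sample, {"hypothetical-fatal-tag"}):
        print(f"reject {finding.package}: {finding.tag} {finding.extra}")
```

Keeping the reject list explicit and small matches the concern raised at the BOF: an automatic reject is only reasonable for tags that Lintian is certain about and that come from an authoritative source such as Policy.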