Differences between revisions 11 and 12
Revision 11 as of 2007-08-03 17:46:06
Size: 7204
Comment:
Revision 12 as of 2007-10-15 21:09:39
Size: 7040
Comment: update to latest status
Line 4: Line 4:
 * '''Status''': implemented
Line 54: Line 55:
 * Prototype
  
 * Discussion on -devel:
   * http://lists.debian.org/debian-devel/2007/05/msg00922.html
   * http://lists.debian.org/debian-devel/2007/06/msg00197.html
   * http://lists.debian.org/debian-devel/2007/08/msg00235.html
   * http://lists.debian.org/debian-devel/2007/08/msg00915.html
   * http://lists.debian.org/debian-devel-announce/2007/09/msg00004.html
Line 60: Line 63:
   * Ongoing work in git://git.debian.org/git/dpkg/dpkg.git on a branch dpkg-shlibdeps-buxy
   * [http://git.debian.org/?p=dpkg/dpkg.git;a=shortlog;h=dpkg-shlibdeps-buxy Browse latest changes]

With Git 1.5, you can do the following to grab the latest version of the work:
{{{
$ git clone git://git.debian.org/git/dpkg/dpkg.git
$ cd dpkg
$ git checkout --track -b dpkg-shlibdeps-buxy origin/dpkg-shlibdeps-buxy
}}}
If you already have a repository with the branch, you can simply get updates with:
{{{
$ git pull
}}}
   * Merged in the master branch since 2007-10-08.
   * Integrated in dpkg 1.14.8.
Line 86: Line 78:
   * RaphaelHertzog: If something needs to be done for that, it's probably a hack in dpkg-shlibdeps that detects the hashing method used and adds the corresponding dependency. I don't think it makes sense to integrate the notion of hashing in the "symbols" file. Update: all affected packages got recompiled to have both hashing methods.
  • Created: 2007-05-13T15:45:54Z

  • Contributors: RaphaelHertzog

  • Packages affected: dpkg-dev, debhelper, all libraries

  • Status: implemented

Summary

The goal is to improve dpkg-shlibdeps so that it generates the minimal dependency required to make the application work instead of blindly using the dependency provided by the shlibs file.

Rationale

In many cases, the generated dependency is too strict, because the application does not necessarily use the newly added symbols that justified the dependency bump in the shlibs file. This has several consequences: it can block the propagation of a package into testing while it waits for the new version of the library, even though it would work perfectly well with the version already in testing. The same can even be true between unstable and stable.

Use Cases

  • rdesktop doesn't work at all in etch, but the bug is fixed in sid. Thanks to the improved dependency generation, the fixed package doesn't depend on the newer libc6, and users can grab the corrected version directly from sid without having to upgrade libc6 at the same time.
  • Thanks to the improved dependencies, library transitions block far fewer packages and the release managers have less work. Lenny is thus released on time and with higher quality. :-)

Design

Summary

Library packages should provide a new file <package>.symbols alongside the traditional <package>.shlibs. In theory, this file records, for each symbol, the version of the library in which it was introduced. However, since we're not going to do historical research to find the right version, we can use whatever first version we want. In general, it probably makes sense to initialize that file with the symbols from the stable version of the library (when the soname hasn't changed).

dpkg-shlibdeps will then be modified to extract the list of symbols used by each application and will identify the first version of the library that provides all required symbols.

Storage of the symbols file

The format of the symbols file is the following:

<soname> <main dependency template>
[| <alternative dependency template>]
[ as many alternative dependency templates as needed ]
    <symbol> <first-version>[ <id of dependency template>]
    [ as many symbols as needed ]

A dependency template is a full dependency that may contain the placeholder "#MINVER#". The placeholder is replaced by the appropriate "(>= <min-version>)" when the real dependency is generated. The minimal version is the highest <first-version> among all the symbols used by the application that are assigned to this dependency template. Note that #MINVER# can be empty if the application doesn't use any symbols from a library that it is still linked with.

If a symbol has no explicit <id of dependency template>, it is assumed to be assigned to the main dependency template (id=0). Otherwise the number refers to the <n>'th alternative dependency template.
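
The format and the #MINVER# substitution described above can be sketched in a few lines of Python. This is only an illustration, not the dpkg-shlibdeps implementation: the library name, package names and symbols in SAMPLE are hypothetical, version_key is a deliberately simplified ordering (dpkg has its own version comparison rules), and the choice to emit an alternative template only when one of its symbols is used is an assumption.

```python
import re

# Hypothetical symbols file following the format described above.
SAMPLE = """\
libfoo.so.1 libfoo1 #MINVER#
| libfoo1-extras #MINVER#
    foo_init 1.0
    foo_run 1.2
    foo_extra 1.4 1
"""

def version_key(version):
    """Simplified ordering key (NOT dpkg's real version comparison):
    digit runs compare numerically, everything else as text."""
    parts = re.findall(r"\d+|[^\d]+", version)
    return [(0, int(p), "") if p.isdigit() else (1, 0, p) for p in parts]

def parse_symbols(text):
    """Return {soname: {"templates": [...], "symbols": {name: (version, id)}}}."""
    libs, current = {}, None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line.startswith("|"):
            # Alternative dependency template; its id is its position in the list.
            current["templates"].append(line[1:].strip())
        elif line[0].isspace():
            # Symbol line: <symbol> <first-version>[ <id of dependency template>]
            fields = line.split()
            template_id = int(fields[2]) if len(fields) > 2 else 0
            current["symbols"][fields[0]] = (fields[1], template_id)
        else:
            # Header line: <soname> <main dependency template>
            soname, template = line.split(None, 1)
            current = {"templates": [template.strip()], "symbols": {}}
            libs[soname] = current
    return libs

def dependencies(lib, used_symbols):
    """Expand each template with the highest first-version among used symbols."""
    minver = {}
    for name in used_symbols:
        if name not in lib["symbols"]:
            continue
        first_version, template_id = lib["symbols"][name]
        best = minver.get(template_id)
        if best is None or version_key(first_version) > version_key(best):
            minver[template_id] = first_version
    result = []
    for template_id, template in enumerate(lib["templates"]):
        version = minver.get(template_id)
        # Assumption: alternative templates are skipped when none of their
        # symbols is used; the main template is always emitted.
        if version is None and template_id != 0:
            continue
        constraint = "(>= %s)" % version if version else ""
        result.append(" ".join(template.replace("#MINVER#", constraint).split()))
    return result
```

With this sample, an application using only foo_init and foo_run would get a dependency on libfoo1 (>= 1.2), while an application linked against the library without using any of its symbols would get a plain libfoo1 dependency, matching the empty #MINVER# case.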

The file is distributed as one of dpkg's control files in /var/lib/dpkg/info, exactly like the current shlibs file.

However, we need to keep a copy of that file from one version to the next: regenerating it from scratch each time would be pointless, since every symbol would then always list a dependency on the latest version of the library. It seems logical to store it in the source package itself.

Implementation Plan

  • Create the tool that generates the symbols file.
  • Modify dpkg-shlibdeps.
  • Find a nice way to integrate the call to this tool in the package generation (dh_makeshlibs is a logical place).

Implementation

Outstanding Issues

  • Many libraries export private symbols which generate noise in the symbols file.
  • A full-source build could auto-update a symbols file in the debian/ directory but binary-only builds can't and we can't assume that the set of symbols is the same on all arches.
    • Idea 1:
      • The full-source build generates a debian/package.symbols.default file used by all architectures.

      • We could have debian/package.symbols.arch manually handled by the maintainer and the binary-build could FTBFS if there's any major discrepancy. The build log would include the details of symbols which are not present, as well as the supplementary symbols.

    • Idea 2:
      • We could have a script update-symbols that downloads the latest version of the symbols file from a Debian server and puts it in debian/. That script would have to be run once each time the maintainer packages a new upstream version.
      • On the server side, the package would simply extract the latest version of the symbols file from the real .deb. (This can easily be automated with Mole.)
      • During a binary build, the symbols file from the debian directory is used as a starting point, the newer symbols are merged in automatically, and the result is shipped in the package.
  • We must pay attention to symbol hashing. Currently a package compiled on sid won't run on etch even if the version in etch provides all needed symbols. The reason is that symbols are no longer hashed in the same way, and etch's ld.so can't cope with the symbols in packages compiled on sid. The glibc shlibs has been bumped (and slightly abused) to make sure that recently compiled packages depend on a glibc new enough to understand the new hashing mechanism.
    • RaphaelHertzog: If something needs to be done for that, it's probably a hack in dpkg-shlibdeps that detects the hashing method used and adds the corresponding dependency. I don't think it makes sense to integrate the notion of hashing in the "symbols" file. Update: all affected packages got recompiled to have both hashing methods.
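
The merge step from Idea 2 and the discrepancy report from Idea 1 can be illustrated with a small Python sketch. It is a hypothetical helper, not code from dpkg: merge_symbols and its arguments are names invented for this example, and it assumes the symbols exported by the freshly built library have already been extracted into a set.

```python
def merge_symbols(known, exported, current_version):
    """Merge newly exported symbols into a known {symbol: first_version} map.

    known:           symbols recorded in the file kept in debian/
    exported:        symbols exported by the library that was just built
    current_version: package version being built, used for new symbols

    Returns (merged, added, missing). Symbols already known keep their
    recorded first version; symbols that disappeared are reported but kept,
    so that the caller decides whether the build should fail.
    """
    added = sorted(set(exported) - set(known))
    missing = sorted(set(known) - set(exported))
    merged = dict(known)
    for name in added:
        merged[name] = current_version
    return merged, added, missing
```

A binary build could print the added and missing lists in the build log and FTBFS when missing is not empty, which is one possible reading of the "major discrepancy" mentioned in Idea 1.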

Possible extensions

  • If the system has enough knowledge at the symbol level, it can be used to detect errors and make a package FTBFS, for example if a symbol is dropped while the soname is unchanged. Of course, an override file could then be used to explicitly ignore this (some private symbols can be dropped at any time).

Alternative

Guillem Jover has worked on another solution which would work only for libraries providing versioned symbols. dpkg-shlibdeps would extract the version from all symbols used and use that as a base to compute the minimal dependency. The maintainer still has to supply a mapping between "version of symbols" and "version of package".
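
As a rough illustration of this alternative, the Python sketch below picks the minimal package version from a maintainer-supplied mapping. Everything in it is an assumption made for the example: the function name, the shape of the mapping, and the simplified version_key ordering, which is not dpkg's real version comparison.

```python
import re

def version_key(version):
    """Simplified ordering key (NOT dpkg's real version comparison)."""
    parts = re.findall(r"\d+|[^\d]+", version)
    return [(0, int(p), "") if p.isdigit() else (1, 0, p) for p in parts]

def minimal_package_version(used_symbol_versions, mapping):
    """Return the highest package version required by the symbol versions used.

    used_symbol_versions: version tags found on the symbols the application uses
    mapping:              maintainer-supplied {symbol version: package version}
    Returns None when no used symbol version appears in the mapping.
    """
    candidates = [mapping[v] for v in used_symbol_versions if v in mapping]
    return max(candidates, key=version_key) if candidates else None
```

For example, with a hypothetical mapping of FOO_1.0 to 1.0-1 and FOO_1.2 to 1.2-3, an application using symbols from both versions would need at least package version 1.2-3.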


CategorySpec