dedup.debian.net

The Debian duplication detector is a service that scans binary Debian packages and records hashes of the regular files they contain. It can then discover files that are shipped in multiple packages, or multiple times in one package, and that could possibly be replaced by links to save space. Another use case is discovering embedded copies in scripting languages. Note that a similar service called clonewise is being worked on; it looks at source packages, so it might be better suited for discovering embedded copies.
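
To illustrate the idea of detecting duplicates by hashing, here is a minimal local sketch. It is not how the service itself is implemented. It assumes GNU coreutils and findutils and an already unpacked package tree; the function name find_duplicates and the file name mypackage.deb are hypothetical.

```shell
#!/bin/sh
# Sketch only: list groups of byte-identical, non-empty regular files in a tree.
# Assumes GNU find, sha256sum, sort and uniq (for -w and --all-repeated).
find_duplicates() {
    # Hash every non-empty regular file, sort so that equal digests are
    # adjacent, then keep only lines whose first 64 characters (the SHA-256
    # digest) occur more than once. Groups are separated by a blank line.
    find "$1" -type f -size +0c -exec sha256sum {} + \
        | sort \
        | uniq -w 64 --all-repeated=separate
}

# Possible usage (mypackage.deb is a hypothetical file name):
#   dpkg-deb -x mypackage.deb unpacked/
#   find_duplicates unpacked/
```

Empty files are skipped here because they all share one hash and would otherwise show up as a single large group.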

Tips for reducing duplication in packages

Within a single binary package

If the software accessing the duplicate files supports symlinks, you can run the following commands from debian/rules after the files are installed by make install or similar.

# Replace duplicate files with symlinks
rdfind -outputname /dev/null -makesymlinks true debian/mypackage/
# Fix those symlinks to make them relative
symlinks -r -s -c debian/mypackage/
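
As a sanity check after the two commands above, you may want to confirm that no symlink with an absolute target is left in the package tree. The following is a small sketch assuming GNU find (for -lname); the function name list_absolute_symlinks is hypothetical.

```shell
#!/bin/sh
# Sketch only: print every symlink under a tree whose target starts with "/".
# Assumes GNU find, which provides the -lname test.
list_absolute_symlinks() {
    find "$1" -type l -lname '/*'
}

# Possible usage from the package build directory:
#   list_absolute_symlinks debian/mypackage/
# No output means every remaining symlink has a relative target.
```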

Within multiple binary packages from a single source package

If the duplicated files are significant, you might want to pool them in a foo-common package and have the other binary packages depend on that.
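
To judge which files are candidates for such a common package, it can help to list the files that are identical across two binary package trees. The sketch below does this by joining two hash lists. It assumes GNU coreutils and findutils and file names without whitespace; the function names hash_tree and shared_between are hypothetical.

```shell
#!/bin/sh
# Sketch only: show files with identical content in two package trees.
# Assumes GNU find, sha256sum, sort and join, and file names without whitespace.

# Print "digest  ./relative/path" for every non-empty regular file, sorted.
hash_tree() {
    (cd "$1" && find . -type f -size +0c -exec sha256sum {} + | LC_ALL=C sort)
}

# Print one line per matching pair: "digest path-in-first path-in-second".
shared_between() {
    a=$(mktemp)
    b=$(mktemp)
    hash_tree "$1" > "$a"
    hash_tree "$2" > "$b"
    # join matches lines on their first field, which is the digest.
    LC_ALL=C join "$a" "$b"
    rm -f "$a" "$b"
}

# Possible usage from the package build directory (package names are examples):
#   shared_between debian/foo debian/foo-extra
```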

Within multiple binary packages from multiple source packages

You should co-ordinate with the maintainers of the source packages and come up with a solution.

Where the files are from embedded copies of other projects, the other projects should be packaged separately and the packages containing them should drop the files and depend on the new packages.

The dh-linktree helper can assist with replacing embedded copies by symbolic links to files in other packages.