Debian Policy Manual: 4.13. Embedded code copies
Some software packages include in their release distributions "convenience" copies of code from other software packages, generally so that users compiling from source don’t have to download multiple archives. Debian packages should not make use of these copies unless the included package is explicitly intended to be used in this way. If the included code is already in the Debian archive in the form of a library, the Debian packaging should ensure that binary packages reference the libraries already in Debian and not the embedded copy. If the included code is not already in Debian, it should be packaged separately as a prerequisite dependency, if possible.

Embedded copies of code, data, fonts or other things should be removed from the upstream VCS and source tarballs. Instead of embedding copies, upstream might embed them only in the binary packages they distribute, script the installation of their dependencies, and/or bundle the dependencies into a single but separate source tarball. Once upstream has fixed the issue, the Debian package can be updated to the fixed version. If upstream refuses to remove the embedded copies, Debian should either repack the upstream tarball using Files-Excluded (if there is a DFSG or size issue) or remove the files in the clean target of debian/rules and/or very early in the build target, so that there is no chance of them being used by the build process.
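
As a rough illustration of removing the files before the build can see them, the removal step could be a small shell helper invoked from the clean target. This is only a sketch: the function name and the example paths (third_party/zlib, vendor/libpng) are hypothetical placeholders, not taken from any real package.

```shell
#!/bin/sh
# Sketch: delete embedded copies so the build cannot pick them up.
# The function name and the paths passed to it are hypothetical.
remove_embedded_copies() {
    for copy in "$@"; do
        # Skip paths that are already gone so the target can be re-run.
        if [ -e "$copy" ]; then
            rm -rf -- "$copy"
            echo "removed embedded copy: $copy"
        fi
    done
}

# Example call, e.g. from the clean target of debian/rules:
# remove_embedded_copies third_party/zlib vendor/libpng
```

Skipping missing paths keeps the helper safe to run repeatedly, which matters because the clean target may be run more than once.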

The list of packages that embed copies (including unused ones) of other projects is maintained in the security-tracker git repository. This list also contains information about forks so that the security team can check if all forks contain the same vulnerabilities.

All Debian members have commit access to the security-tracker repository, and others can send suggestions or additions to the debian-security-tracker mailing list.

Tools

Lintian

Lintian detects embedding of

Others

The Debian Sources service allows searching for specific hashes and ctags throughout all Debian source code, which may be useful for detecting duplication of source code and data.
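
To search by hash you first need the checksum of the file in question. A minimal sketch, assuming the service indexes SHA-256 checksums and that sha256sum from coreutils is installed; the function name and the example file name are hypothetical.

```shell
#!/bin/sh
# Sketch: print only the SHA-256 checksum of a file, ready to paste
# into a hash search. Assumes sha256sum (coreutils) is available.
file_sha256() {
    # Reading from stdin keeps the file name out of the output.
    sha256sum < "$1" | cut -d' ' -f1
}

# Example call (hypothetical file name):
# file_sha256 src/embedded/md5.c
```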

If you have a particular file with some interesting aspect (a security issue, for example), you can likely find other copies using Debian Code Search or a similar external service, such as Black Duck Open Hub, SourceGraph Public Code Search or GitHub Search.

If a file has a fairly unique name, you can often find copies of that file by searching the contents of Debian binary or source packages using apt-file:

apt-file search uniquename.py

or

apt-file search -I dsc uniquename.c

Tracking

Various Debian contributors use usertags to keep track of the embedded copies they have found:

rbrito@ime.usp.br jwilk@debian.org mbehrle@debian.org pabs@debian.org sramacher@debian.org dr@jones.dk

See also

These wiki pages mention embedded copies: arc4random

