Here are some details about issues that prevented packages from building reproducibly and how they have been fixed or worked around in Debian. This is mainly intended as a reference for developers and packagers working on other projects.

Files in data.tar.gz contain build paths

The build path is embedded in the DWARF sections of ELF files, among other kinds of files generated during builds. Removing it after the path has been captured has proven to be a real headache.

We are thus making the build path part of the build environment and recording it in .buildinfo files.
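
To see which files in an unpacked package have captured the build path, a plain byte search is often enough. The following is a minimal sketch, not part of any Debian tooling: the function name `files_embedding_path` and the example paths are hypothetical, and the search will miss paths stored in compressed sections.

```python
from pathlib import Path


def files_embedding_path(root: str, build_path: str) -> list[str]:
    """Return files under root whose raw bytes contain build_path.

    Paths are returned relative to root, in sorted order.
    """
    needle = build_path.encode()
    hits = []
    for path in sorted(Path(root).rglob("*")):
        # Only regular files are read; directories are skipped.
        if path.is_file() and needle in path.read_bytes():
            hits.append(str(path.relative_to(root)))
    return hits
```

Running it against the unpacked contents of data.tar with the directory the package was built in (for example a hypothetical `/build/foo-1.0`) lists the files that would differ if the same source were built elsewhere.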

Generation of files in data.tar depends on (pseudo-)randomness

Now fixed:

{data,control}.tar.{gz,xz,bz2} will store files in filesystem order

Changes to dpkg are discussed in 719845. There is a test case patch for pkg-tests, as well as patches that fork `sort` to obtain a stable order for files in the control and data archives.
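
The idea of sorting member names before archiving can be illustrated outside dpkg. The sketch below uses Python's `tarfile` module rather than forking `sort`, so it is not the dpkg patch itself; `sorted_members` and `write_tar` are hypothetical helper names. It addresses member order only: timestamps and ownership are left as found on disk.

```python
import os
import tarfile


def sorted_members(root: str) -> list[str]:
    """List every path under root, relative to root, in sorted order."""
    members = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            members.append(os.path.relpath(full, root))
    # Sorting the names removes any dependence on readdir() order.
    return sorted(members)


def write_tar(root: str, output: str) -> None:
    """Write an uncompressed tar of root with members in sorted order."""
    with tarfile.open(output, "w") as tar:
        for rel in sorted_members(root):
            # recursive=False: each directory is added as a single entry,
            # so its children appear only where the sorted list puts them.
            tar.add(os.path.join(root, rel), arcname=rel, recursive=False)
```

Two directories with identical contents then yield the same member sequence, regardless of the order in which the filesystem returns directory entries.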

Files generated by debhelper depend on filesystem order

Several components of debhelper generate output which depends on the filesystem order. See 774100, 774102, and 775020.

Randomness in control file

Now fixed:

.deb ar-archive header contains a timestamp

.deb files are ar-archives. The header currently contains the “current time”.

759999 contains patches against dpkg that will preset the timestamp to the time of the latest entry of debian/changelog when a package is built using dpkg-buildpackage.
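
Deriving a timestamp from the latest changelog entry can be sketched as follows. This is an illustration and not the code from the dpkg patches: it assumes the usual trailer line format of debian/changelog (` -- Name <email>  date`, with an RFC 2822 style date), and `latest_changelog_timestamp` is a hypothetical name.

```python
import re
from email.utils import parsedate_to_datetime

# Trailer line of a changelog entry: " -- Name <email>  date"
TRAILER = re.compile(r"^ -- .*>  (.+)$")


def latest_changelog_timestamp(changelog_text: str) -> int:
    """Return the Unix time of the first (most recent) changelog entry."""
    for line in changelog_text.splitlines():
        match = TRAILER.match(line)
        if match:
            # Entries are listed newest first, so the first trailer wins.
            return int(parsedate_to_datetime(match.group(1)).timestamp())
    raise ValueError("no changelog trailer line found")
```

Because the value depends only on the source package, rebuilding the same source at a later date yields the same timestamp.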

XSLT generate-id() is non-deterministic

XSLT's generate-id() function is explicitly allowed by the XSLT spec to be non-deterministic, and is frequently implemented using the memory addresses of XML nodes, which are of course non-deterministic thanks to ASLR. Consequently, files generated by XSLT (typically documentation) that include the result of generate-id() in their output do not build deterministically.

A tentative fix is in the pu/reproducible_builds branch.
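
The difference between address-based and position-based identifiers can be shown with a small sketch. It uses Python's `xml.etree.ElementTree` purely for illustration and makes no claim about what the branch mentioned above does; both function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def address_based_ids(root: ET.Element) -> list[str]:
    """IDs derived from object addresses: may differ from run to run."""
    return [f"id{id(node):x}" for node in root.iter()]


def position_based_ids(root: ET.Element) -> list[str]:
    """IDs derived from document order: identical for identical input."""
    return [f"id{index}" for index, _ in enumerate(root.iter(), start=1)]
```

Parsing the same document twice gives the same list from `position_based_ids`, while `address_based_ids` depends on where the nodes happen to be allocated.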

Files in data.tar contain timestamps