Differences between revisions 1 and 2
Revision 1 as of 2013-08-15 23:40:25
Size: 3244
Comment: initial page
Revision 2 as of 2013-08-15 23:43:39
Size: 3245
Comment: Put a space before the asterisk for the bullet
Line 42:
 * Someone needs to document Lunar's script here: http://people.debian.org/~paulproteus/lunar-verify-script.rb

The goal

It should be possible to reproduce, byte for byte, every build of every package in Debian.

For now, we will start with a few maintainers who want to opt in to this goal as we flesh out the details of what will make it possible. This page tracks our progress.

Status

  • hello package: Contents of data.tar.gz and control.tar.gz can be made reproducible when 'gzip' is replaced by 'gzip -n' in debian/rules. (#xyz)
  • Waiting on fixes for a few dpkg bugs that introduce timestamps and inconsistent file order in {data,control}.tar.gz (or .xz)
  • 5 maintainers have opted in with 5 packages, of which 0 so far have reproducible contents of {data,control}.tar.gz
  • You can use a script to rebuild a package, with the same build-depends that were used by the build daemons. See "How to reproduce a build" below.
  • Things that need further investigation (by e.g. you!)
    • Document how to use Lunar's script to reproduce a build.
    • Find out if {control,data}.tar.gz files created by dpkg 1.17.1+ have a timestamp embedded.

Use cases

  • If the Debian build daemons are compromised, end users can assure themselves that their binaries are OK if they can regenerate them (and their build dependencies). (You could use a more complicated equivalence test than "do the hashes match?" but if the hashes do match, this is simple.)
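
The simple "do the hashes match?" test could look roughly like the Python sketch below. It is only an illustration: the function names and the choice of SHA-256 are assumptions made for this example, not part of any existing Debian tool.

```python
import hashlib


def file_digest(path, chunk_size=1 << 16):
    """Return the SHA-256 hex digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large packages do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def builds_match(official_deb, rebuilt_deb):
    """True if both files have the same SHA-256 digest."""
    return file_digest(official_deb) == file_digest(rebuilt_deb)
```

A user would call builds_match() with the path of the package from the archive and the path of their own rebuild. A mismatch does not prove a compromise; it only means the build was not reproduced byte for byte.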

Detailed package status list

  • alpine (Asheesh Laroia)
    • Status: Untested
  • haveged (Lunar)
    • Status: Unknown
  • iotop (pabs)
    • Status: Unknown
  • debhelper (joeyh)
    • Status: Unknown
  • magit (lindi)
    • Status: Unknown

How to reproduce a build

Known bugs we are waiting on

  • dpkg: some bug #xxx about gzip timestamps
  • dpkg: some other bug #xxx about tar directory order

Different problems, and their solutions

Non-problems

  • You might think ELF binaries (e.g. /usr/bin/hello in the hello package) have embedded timestamps. Luckily, they don't!

Data files in data.tar.gz have timestamps

  • Recommended solution:
    • Use the timestamp of the most recent debian/changelog entry as the reference.
    • touch all files to the reference timestamp before building the binary packages.
    • Use gzip -n whenever gzipping anything.
    • Get rid of any remaining non-determinism (yup...)
  • Alternate solutions:
    • libfaketime (probably breaks some things) (sudo apt-get install faketime)
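
The first two steps of the recommended solution can be sketched in Python as follows. This is a rough illustration, not a tested packaging helper: changelog_timestamp() and touch_tree() are hypothetical names, and the regular expression assumes the usual " -- Name <email>  date" trailer line of debian/changelog.

```python
import os
import re
from email.utils import parsedate_to_datetime

# Trailer line of a changelog entry: " -- Name <email>  RFC 2822 date"
TRAILER = re.compile(r"^ -- .*<.*>  (.+)$")


def changelog_timestamp(changelog_text):
    """Return the Unix time of the most recent (topmost) changelog entry."""
    for line in changelog_text.splitlines():
        match = TRAILER.match(line)
        if match:
            return int(parsedate_to_datetime(match.group(1)).timestamp())
    raise ValueError("no changelog trailer line found")


def touch_tree(root, timestamp):
    """Set atime and mtime of everything under `root` to `timestamp`."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # this sketch leaves symlinks alone
            os.utime(path, (timestamp, timestamp))
    os.utime(root, (timestamp, timestamp))
```

One would read debian/changelog, pass its text to changelog_timestamp(), and then call touch_tree() on the staging directory just before the binary package is assembled.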

{data,control}.tar.{gz,xz,bz2} may have timestamps

  • It is not yet known whether dpkg 1.17.1 stores a timestamp in the .gz versions of these files.
  • *.xz and *.bz2 seem to provide no ability to store a timestamp.
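
To check whether a given .gz file carries a timestamp, one can read the 4-byte MTIME field of the gzip header (bytes 4 to 7, little-endian; 0 means "no timestamp"). The Python sketch below does that and also shows a compression helper that zeroes the field and omits the file name, roughly what 'gzip -n' does. The function names are made up for this example.

```python
import gzip
import io
import struct


def gzip_mtime(data):
    """Return the MTIME header field of a gzip stream (0 means none stored)."""
    if data[:2] != b"\x1f\x8b":
        raise ValueError("not a gzip stream")
    (mtime,) = struct.unpack("<I", data[4:8])
    return mtime


def gzip_without_timestamp(payload):
    """Compress `payload` with MTIME set to 0 and no embedded file name."""
    buffer = io.BytesIO()
    with gzip.GzipFile(filename="", mode="wb", fileobj=buffer, mtime=0) as stream:
        stream.write(payload)
    return buffer.getvalue()
```

To inspect a real package, the members would first have to be extracted from the .deb (for example with ar x), and the bytes of data.tar.gz or control.tar.gz passed to gzip_mtime().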

{data,control}.tar.{gz,xz,bz2} will store files in readdir order

This order depends on an accident of filesystem layout at build time, so the resulting archives are not always reproducible.

We should probably fix this in dpkg by sorting the contents of the tar files.
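
The idea of sorting the tar contents can be illustrated with a small Python sketch. This is not how dpkg works or should be patched; it only shows that adding members in sorted order, with fixed metadata, yields identical archives regardless of the order the files were discovered in. The function name and the {name: bytes} input format are assumptions made for this example.

```python
import io
import tarfile


def deterministic_tar(files, mtime=0):
    """Build an uncompressed tar archive from {name: bytes}, sorted by name."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w", format=tarfile.GNU_FORMAT) as archive:
        for name in sorted(files):
            info = tarfile.TarInfo(name)
            info.size = len(files[name])
            # Fixed metadata, so nothing depends on the build environment.
            info.mtime = mtime
            info.uid = info.gid = 0
            info.uname = info.gname = "root"
            archive.addfile(info, io.BytesIO(files[name]))
    return buffer.getvalue()
```

Calling deterministic_tar() twice with the same files, supplied in a different order, should return the same bytes.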

References

  • Mike Perry's discussion of how it took him eight weeks to make the Tor Browser Bundle have this feature: http://people.debian.org/~paulproteus/mike-perry-reproducible-tbb.txt