Attachment 'dc13-bof-reproducible-builds.txt'

   1 apt-get install gobby-infinote
   2 gobby -c gobby.debian.org -n
   3 debconf13/bof/reproducible-builds
   4 
   5 Byte-for-byte identical reproducible builds?
   6 ============================================
   7 
   8 BoF at DebConf13 / Vaumarcus, Switzerland; chair: Lunar
   9 
  10 Abstract:
  11 
  12     The Bitcoin client and the upcoming Tor Browser Bundle 3.0 series
  13     are using a build system that produces “deterministic builds” —
  14     packages which are byte-for-byte identical no matter who actually
  15     builds them, or what hardware they use. The idea is that current
  16     popular software development practices simply cannot survive
  17     targeted attacks of the scale and scope that we are seeing today.
  18     With “deterministic builds”, any individual can use an anonymity
  19     network to download publicly signed and audited source code and
  20     reproduce the builds exactly, without being subject to such
  21     targeted attacks. If they notice any differences, they can alert
  22     the public builders/signers, hopefully anonymously.
  23 
  24     Are such ideas applicable to Debian? To what extent? What would be
  25     the first stones to pave the way toward reproducible builds of
  26     Debian packages?
  27 
  28 Foreword
  29 --------
  30 
  31 Huge, huge thanks to Asheesh for helping me prepare this BoF.
  32 
  33 Agenda
  34 ------
  35 
  36     “Good news everyone! We are going to get pwned!”
  37                                          — Professor Farnsworth
  38 
  39 1. Go around: why do you care? (5-10 min.)
  40 2. Mike Perry's work on the Tor Browser Bundle (5 min.)
  41 3. Asheesh's experiments (5 min.)
  42 4. On the technical side, there are two aspects to the problem:
  43    a. at the package level: How do we guarantee that given the same
  44       source package and the same build environment, we get the same
  45       binary results? (5-10 min.)
  46    b. at the archive level: How to record the build environment of a
  47       package (and enable its reproduction at a later time)?
  48       (5 min.)
  49 5. What's next? (15 min.)
  50 
  51 Experience from making the Tor Browser Bundle builds reproducible
  52 -----------------------------------------------------------------
  53 
  54 Mike Perry worked on making the Tor Browser Bundle builds
  55 reproducible. That's hard work: Tor Browser is based on Firefox
  56 (huge code base) and is built for Linux, Mac OS X and Windows.
  57 
  58   - How:
  59     - Uses Gitian from Bitcoin
  60       - Thin layer around Ubuntu virtualization tools
  61       - Spins up an Ubuntu VM with fixed hostname, username,
  62         path, and fake timestamps (via faketime)
  63       - List packages and architecture
  64       - Runs a bash script you specify
  65       - Cross compiles for Windows (mingw-w64) and Mac (toolchain4)
  66     - Took about 3-4 days per OS to write a working descriptor set
  67       for Tor, Firefox and bundling/localization
  68     - 2 weeks after starting, I was producing matching repeat builds
  69       on my own hardware
  70       - Issues:
  71         - FIPS-140 mode has non-deterministic sigs on Linux
  72         - Millisecond timestamps encoded by Firefox
  73         - Mystery 3 bytes of randomness on Windows. Bitstomped
  74     - 6 more weeks of work to get the builds to match externally
  75       - Filesystem reordering
  76         - Affects Zip, Tar, .a, and even aspects of Firefox scripts
  77           - created wrappers for archives
  78           - Firefox ordering enforced via sorting inputs in Firefox scripts
  79       - Localization LC_ALL leaks
  80         - Alters sort order
  81       - Permissions differences
  82         - Even though I set umask...
  83 
  84 To sum it up: the key things that need to be controlled are the hostname,
  85 username, build path, OS locale, uname output, toolchain version,
  86 and time. We can either make everything deterministic, or record these values on the first build and then replay them on subsequent builds.
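
The "record on first build, replay later" option can be sketched as follows. This is only an illustration, not something shown at the BoF: the function names and the choice of fields are assumptions, and toolchain version and time are left out for brevity.

```python
import json
import os
import platform


def capture_environment():
    """Collect some of the build-environment facts listed above into a dict."""
    return {
        "hostname": platform.node(),
        "username": os.environ.get("USER", ""),
        "build_path": os.getcwd(),
        "locale": os.environ.get("LC_ALL", ""),
        "uname": f"{platform.system()} {platform.release()} {platform.machine()}",
    }


def differing_keys(recorded, current):
    """Return the sorted names of facts that differ between two captures."""
    keys = set(recorded) | set(current)
    return sorted(k for k in keys if recorded.get(k) != current.get(k))


if __name__ == "__main__":
    # First build: record. A later build would load this record and compare.
    print(json.dumps(capture_environment(), indent=2, sort_keys=True))
```

A rebuild would call `differing_keys(recorded, capture_environment())` and refuse to claim reproducibility unless the result is empty.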
  87 
  88 Results from Asheesh's experiments
  89 ----------------------------------
  90 
  91 Asheesh jumped on the idea and played with the hello package.
  92 Rebuilt using faketime on top of fakeroot.
  93 
  94 * When you rebuild that way, the data.tar.gz of the built Debian
  95   package has the same contents
  96 * Same with control.tar.gz
  97 
  98 However, the data.tar.gz and control.tar.gz files themselves *both* fail
  99 to match byte-for-byte across rebuilds. This is because of a semi-bug in dpkg: we need to convince dpkg
 100 to fix the 'not calling gzip -n' issue.
 101 
 102 * ELF binaries like /usr/bin/hello in the "hello" package
 103   contain *no* timestamp that needs to be stripped.
 104 * gzip files need '-n' to be passed to gzip for avoiding embedding a
 105   timestamp.
 106 * xz and bzip2 don't have this problem. I'm too pressed for time to
 107   write a test script, but I did test it.
 108 * dedup.debian.net can be used to detect duplicates, especially if
 109   we hack it to detect files that change between uploads of a package,
 110   rather than just between packages.
 111   - future work: ssdeep hashes, which could be useful for finding files
 112     that should be duplicates but aren't
 113 
 114 NOTE that this might instead be because of the *timestamps* of files within
 115 control.tar.gz and data.tar.gz. I have not finished
 116 testing this theory, sadly, but here is a shell script I use to set up a lab:
 117 
 118 http://rose.makesad.us/~paulproteus/tmp/extract_both.sh
 119  - please provide an index for the PTS :)
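
The gzip timestamp issue can be shown in a few lines. This is a minimal sketch using Python's standard `gzip` module rather than the `gzip` command that dpkg calls; the helper names and the sample timestamps are made up for the example. Fixing the header time to 0 plays the role that `-n` plays for the timestamp.

```python
import gzip
import io
import struct


def gzip_bytes(data, mtime):
    """Compress data in memory, writing the given mtime into the gzip header."""
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb", mtime=mtime) as f:
        f.write(data)
    return buf.getvalue()


def header_mtime(blob):
    """Read the 4-byte little-endian MTIME field at offset 4 of a gzip stream."""
    return struct.unpack("<I", blob[4:8])[0]


payload = b"hello, world\n"
build_1 = gzip_bytes(payload, mtime=1376733600)
build_2 = gzip_bytes(payload, mtime=1376733601)  # "rebuilt" one second later
fixed_1 = gzip_bytes(payload, mtime=0)
fixed_2 = gzip_bytes(payload, mtime=0)

print(build_1 == build_2)  # False: only the header timestamp differs
print(fixed_1 == fixed_2)  # True: no build time leaks into the output
```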
 120 
 121 Package level issues
 122 --------------------
 123 
 124 ### time
 125 
 126  * Remove/strip the timestamps for build results.
 127  * Use faketime (reports faked system time to programs). Time could be
 128    automatically set to the time of the last debian/changelog entry.
 129  * Base timestamps on timestamps of the source code, which should be unchanged
 130  * Record times on first build and replay them later (see below).
 131 
 132 (In most cases, recording the time of the build is actually
 133 wrong. For documentation, what matters is the time of the last
 134 change in the source package and not the time of the build
 135 itself.)
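
Deriving the build time from the last debian/changelog entry could look roughly like this. It is a sketch only: the regular expression covers just the usual trailer line format, and the sample changelog entries (names, versions, dates) are invented for the example.

```python
import re
from email.utils import parsedate_to_datetime

# Trailer line of a changelog entry: " -- Name <email>  RFC 2822 date"
TRAILER = re.compile(r"^ -- .*<[^>]*>  (?P<date>.+)$")


def last_changelog_timestamp(changelog_text):
    """Return the Unix time of the newest (first) entry in a debian/changelog."""
    for line in changelog_text.splitlines():
        match = TRAILER.match(line)
        if match:
            return int(parsedate_to_datetime(match.group("date")).timestamp())
    raise ValueError("no changelog trailer line found")


# Hypothetical entries, made up for this example.
SAMPLE = """\
hello (2.8-4) unstable; urgency=low

  * Example entry.

 -- Jane Doe <jane@example.org>  Sat, 17 Aug 2013 12:00:00 +0200

hello (2.8-3) unstable; urgency=low

  * Older entry.

 -- Jane Doe <jane@example.org>  Tue, 01 Jan 2013 09:00:00 +0000
"""

print(last_changelog_timestamp(SAMPLE))
```

The resulting value is the kind of fixed time that could then be handed to faketime or written into generated files instead of the wall-clock build time.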
 136 
 137 ### Build path
 138 
 139  * Debian buildds use per-build temporary path names, so that any paths accidentally embedded in binaries do not exist on end-user systems (potential security issue).
 140  * Stripping the path with debugedit (???)
 141  * Correct solution: patch out where path appears -> use paths relative to the builddir
 142    instead of having a common build directory for everyone.
 143    (Because having encoded paths can hide real bugs, anyway.)
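
Before patching anything, it helps to know which built files embed the build path at all. The following is a rough diagnostic sketch, not an existing tool: the function name, the example build path and the fake file contents are all assumptions.

```python
import os
import tempfile


def files_embedding_path(root, build_path):
    """Return relative paths of files under root whose bytes contain build_path."""
    needle = os.fsencode(build_path)
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit directories in a stable order
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            with open(full, "rb") as f:
                if needle in f.read():
                    hits.append(os.path.relpath(full, root))
    return hits


# Tiny demonstration tree with one "binary" that leaks a hypothetical build path.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "usr", "bin"))
    with open(os.path.join(root, "usr", "bin", "leaky"), "wb") as f:
        f.write(b"\x7fELF...debug: /tmp/buildd/hello-2.8/src/hello.c\x00")
    with open(os.path.join(root, "usr", "bin", "clean"), "wb") as f:
        f.write(b"\x7fELF...debug: src/hello.c\x00")
    demo_hits = files_embedding_path(root, "/tmp/buildd/hello-2.8")

print(demo_hits)
```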
 144 
 145 ### OS locale
 146 
 147  * Use LANG=C.UTF-8 ? -> LC_ALL=C.UTF-8
 148  * Let's make dpkg-buildpackage export this value
 149    (or another wrapper? because dpkg-buildpackage is not
 150    the policy canonical way to build all packages;
 151    but debian/rules is painful)
 152    Let's make this an option so that users see translated messages
 153    and the buildds all build with English.
 154  * Change the policy to make dpkg-buildpackage the canonical
 155    way to build packages.
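
Whatever wrapper ends up exporting the locale, the effect would be something like the sketch below. The helper name is an assumption; the `locale` parameter stands in for the "make this an option" idea above.

```python
import os


def build_environment(base=None, locale="C.UTF-8"):
    """Return a copy of the environment with every locale variable pinned."""
    env = dict(os.environ if base is None else base)
    # Drop anything that could override or leak the caller's locale.
    for name in list(env):
        if name in ("LANG", "LANGUAGE") or name.startswith("LC_"):
            del env[name]
    env["LC_ALL"] = locale
    return env


print(build_environment({"LANG": "fr_CH.UTF-8", "PATH": "/usr/bin"}))
```

A build driver could pass the result as the `env` argument of `subprocess.run` when it launches the package build.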
 156 
 157 ### hostname, uname output, username
 158 
 159 liblietome?
 160 
 161 But kernel version is part of the build environment, so
 162 we might need to record that somewhere else. Are kernels used on buildds always available? Or are some using non-standard kernels?
 163 
 164 ### toolchain version
 165 
 166  * part of the system state and build info
 167 
 168 ### file ordering issues
 169 
 170 Need to patch the build systems to add proper `sort` calls.
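
As an illustration of what such sorting buys, here is a sketch of an archive wrapper in the spirit of the ones written for the Tor Browser Bundle (this is not Mike Perry's code). It sorts the member list and also flattens timestamps, ownership and permissions; the function name and the normalization choices are assumptions.

```python
import io
import os
import tarfile
import tempfile


def deterministic_tar(root, mtime=0):
    """Pack root into an uncompressed tar with stable order and metadata."""
    def normalize(info):
        info.mtime = mtime
        info.uid = info.gid = 0
        info.uname = info.gname = ""  # numeric id 0 only, no names
        # Keep only the executable distinction, so umask does not leak in.
        executable = info.isdir() or (info.mode & 0o100)
        info.mode = 0o755 if executable else 0o644
        return info

    paths = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            paths.append(os.path.join(dirpath, name))

    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        for path in sorted(paths):  # the "proper sort call"
            tar.add(path, arcname=os.path.relpath(path, root),
                    recursive=False, filter=normalize)
    return buf.getvalue()


def _populate(root, names):
    for name in names:
        with open(os.path.join(root, name), "w") as f:
            f.write("content of " + name + "\n")


with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
    _populate(a, ["b.txt", "a.txt"])
    _populate(b, ["a.txt", "b.txt"])  # same files, created in another order
    tar_a = deterministic_tar(a)
    tar_b = deterministic_tar(b)

print(tar_a == tar_b)
```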
 171 
 172 ### Randomisation
 173 
 174  * Define seed?
 175  * ASLR?
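
For the "define seed" option, the idea is that any randomized step in a build tool takes an explicit seed instead of drawing from global or system randomness. A toy sketch (the function is hypothetical and says nothing about ASLR):

```python
import random


def shuffled_ids(count, seed):
    """Return 0..count-1 in a pseudo-random order fully determined by seed."""
    rng = random.Random(seed)  # private generator, independent of global state
    ids = list(range(count))
    rng.shuffle(ids)
    return ids


print(shuffled_ids(10, seed=42) == shuffled_ids(10, seed=42))
```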
 176 
 177 ### pid numbers
 178 
 179 Let's patch that out if needed.
 180 
 181 ### Other issues?
 182 
 183 
 184 Archive level issues
 185 --------------------
 186 
 187 Not all packages are built on the buildds so the build environment isn't going to be the same (for now).
 188 
 189 .changes files are not currently kept except on mailing lists.
 190 
 191 We want .changes files: they are signed by the maintainer.
 192 
 193 If we keep .changes files, we can add an `XC-Built-Environment` field.
 194 It would add to the .changes files something like:
 195 
 196 Built-Environment:
 197  apt (= 0.9.9.4), aptitude (= 0.6.8.2-1), aptitude-common (= 0.6.8.2-1),
 198  base-files (= 7.2), base-passwd (= 3.5.26), bash (= 4.2+dfsg-1),
 199  binutils (= 2.23.52.20130727-1), bsdutils (= 1:2.20.1-5.5),
 200  build-essential (= 11.6), bzip2 (= 1.0.6-4), ccache (= 3.1.9-1),
 201  coreutils (= 8.21-1), cpp (= 4:4.8.1-2), cpp-4.6 (= 4.6.4-4),
 202  cpp-4.7 (= 4.7.3-6), cpp-4.8 (= 4.8.1-8), dash (= 0.5.7-3),
 203  debconf (= 1.5.50), debconf-i18n (= 1.5.50),
 204  debian-archive-keyring (= 2012.4), debianutils (= 4.4),
 205  diffutils (= 1:3.2-8), dpkg (= 1.17.1), dpkg-dev (= 1.17.1),
 206  e2fslibs (= 1.42.8-1), e2fsprogs (= 1.42.8-1), fakeroot (= 1.19-2),
 207  findutils (= 4.4.2-6), g++ (= 4:4.8.1-2), g++-4.6 (= 4.6.4-4),
 208  g++-4.8 (= 4.8.1-8), gcc (= 4:4.8.1-2), gcc-4.4-base (= 4.4.7-4),
 209  gcc-4.5-base (= 4.5.4-1), gcc-4.6 (= 4.6.4-4), gcc-4.6-base (= 4.6.4-4),
 210  gcc-4.7 (= 4.7.3-6), gcc-4.7-base (= 4.7.3-6), gcc-4.8 (= 4.8.1-8),
 211  gcc-4.8-base (= 4.8.1-8), gnupg (= 1.4.14-1), gpgv (= 1.4.14-1),
 212  grep (= 2.14-2), gzip (= 1.6-1), hostname (= 3.13),
 213  initscripts (= 2.88dsf-43), insserv (= 1.14.0-5), less (= 458-2),
 214  libacl1 (= 2.2.52-1), libapt-pkg4.12 (= 0.9.9.4), libasan0 (= 4.8.1-8),
 215  libatomic1 (= 4.8.1-8), libattr1 (= 1:2.4.47-1), libblkid1 (= 2.20.1-5.5),
 216  libboost-iostreams1.49.0 (= 1.49.0-4), libbz2-1.0 (= 1.0.6-4),
 217  libc-bin (= 2.17-92), libc-dev-bin (= 2.17-92), libc6 (= 2.17-92),
 218  libc6-dev (= 2.17-92), libcap2 (= 1:2.22-1.2),
 219  libclass-isa-perl (= 0.36-5), libcloog-isl4 (= 0.18.0-2),
 220  libcloog-ppl1 (= 0.16.1-3), libcomerr2 (= 1.42.8-1),
 221  libcwidget3 (= 0.5.16-3.4), libdb5.1 (= 5.1.29-6), libdpkg-perl (= 1.17.1),
 222  libept1.4.12 (= 1.0.9), libfile-fcntllock-perl (= 0.14-2),
 223  libgcc-4.7-dev (= 4.7.3-6), libgcc-4.8-dev (= 4.8.1-8),
 224  libgcc1 (= 1:4.8.1-8), libgdbm3 (= 1.8.3-12), libgmp10 (= 2:5.1.2+dfsg-2),
 225  libgmpxx4ldbl (= 2:5.1.2+dfsg-2), libgomp1 (= 4.8.1-8),
 226  libgpm2 (= 1.20.4-6.1), libisl10 (= 0.11.2-1), libitm1 (= 4.8.1-8),
 227  liblocale-gettext-perl (= 1.05-7+b1), liblzma5 (= 5.1.1alpha+20120614-2),
 228  libmount1 (= 2.20.1-5.5), libmpc2 (= 0.9-4), libmpc3 (= 1.0.1-1),
 229  libmpfr4 (= 3.1.1-1), libncurses5 (= 5.9+20130608-1),
 230  libncursesw5 (= 5.9+20130608-1), libpam-modules (= 1.1.3-9),
 231  libpam-modules-bin (= 1.1.3-9), libpam-runtime (= 1.1.3-9),
 232  libpam0g (= 1.1.3-9), libpcre3 (= 1:8.31-2), libppl-c4 (= 1:1.0-7),
 233  libppl12 (= 1:1.0-7), libquadmath0 (= 4.8.1-8),
 234  libreadline6 (= 6.2+dfsg-0.1), libselinux1 (= 2.1.13-2),
 235  libsemanage-common (= 2.1.10-2), libsemanage1 (= 2.1.10-2),
 236  libsepol1 (= 2.1.9-2), libsigc++-2.0-0c2a (= 2.2.10-0.2),
 237  libslang2 (= 2.2.4-15), libsqlite3-0 (= 3.7.17-1),
 238  libss2 (= 1.42.8-1), libstdc++-4.8-dev (= 4.8.1-8),
 239  libstdc++6 (= 4.8.1-8), libstdc++6-4.6-dev (= 4.6.4-4),
 240  libswitch-perl (= 2.16-2), libtext-charwidth-perl (= 0.04-7+b1),
 241  libtext-iconv-perl (= 1.7-5), libtext-wrapi18n-perl (= 0.06-7),
 242  libtimedate-perl (= 1.2000-1), libtinfo5 (= 5.9+20130608-1),
 243  libtsan0 (= 4.8.1-8), libusb-0.1-4 (= 2:0.1.12-23.2),
 244  libustr-1.0-1 (= 1.0.4-3), libuuid1 (= 2.20.1-5.5),
 245  libxapian22 (= 1.2.15-2), linux-libc-dev (= 3.10.3-1),
 246  login (= 1:4.1.5.1-1), lsb-base (= 4.1+Debian12),
 247  make (= 3.81-8.2), mawk (= 1.3.3-17), mount (= 2.20.1-5.5),
 248  multiarch-support (= 2.17-92), ncurses-base (= 5.9+20130608-1),
 249  ncurses-bin (= 5.9+20130608-1), passwd (= 1:4.1.5.1-1), patch (= 2.7.1-3),
 250  perl (= 5.14.2-21),
 251  perl-base (= 5.14.2-21), perl-modules (= 5.14.2-21),
 252  readline-common (= 6.2+dfsg-0.1), screen (= 4.1.0~20120320gitdb59704-9),
 253  sed (= 4.2.2-2), sensible-utils (= 0.0.9), sysv-rc (= 2.88dsf-43),
 254  sysvinit (= 2.88dsf-43), sysvinit-utils (= 2.88dsf-43),
 255  tar (= 1.26+dfsg-6), tzdata (= 2013d-1), ucf (= 3.0027+nmu1),
 256  util-linux (= 2.20.1-5.5), vim (= 2:7.3.923-3), vim-common (= 2:7.3.923-3),
 257  vim-runtime (= 2:7.3.923-3), xz-utils (= 5.1.1alpha+20120614-2),
 258  zlib1g (= 1:1.2.8.dfsg-1)
 259 
 260    (Example naively generated by taking all packages installed
 261     by pbuilder when building the `hello` package.)
 262 
 263  * Do we want to trim this list? How?
 264     -> use the access time to files in the various packages
 265        to determine what was used or not (or another mechanism
 266        to be notified of packages that matters)
 267  * Do we want to include arch (eg. `:amd64`) in there? Yes - multiarch means we can have cross-arch deps (but not yet - britney needs work)
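
A field value like the one above, with architecture qualifiers, could be generated along these lines. This is a sketch under assumptions: the input is taken to be lines of "name version architecture", for example as printed by `dpkg-query -W -f '${Package} ${Version} ${Architecture}\n'`, and no filtering on install state or on "packages that matter" is attempted.

```python
def built_environment_field(dpkg_query_output, with_arch=True):
    """Format 'name version arch' lines as a Built-Environment field value."""
    entries = []
    for line in dpkg_query_output.splitlines():
        if not line.strip():
            continue
        name, version, arch = line.split()
        # Architecture-independent packages need no qualifier.
        if with_arch and arch != "all":
            name = f"{name}:{arch}"
        entries.append(f"{name} (= {version})")
    return ", ".join(sorted(entries))


SAMPLE_LINES = """\
make 3.81-8.2 amd64
tzdata 2013d-1 all
gzip 1.6-1 amd64
"""
print(built_environment_field(SAMPLE_LINES))
```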
 268 
 269 Then, the good news: snapshot.debian.org keeps binary packages! but not .changes
 270 
 271 make (= 3.81-8.2)
 272   => http://snapshot.debian.org/package/make-dfsg/3.81-8.2/#make_3.81-8.2
 273 
 274 Is there an easy way to script installing a specific set of
 275 binary packages from snapshot? Yes - use a specific date in your sources.list:
 276 
 277 deb     http://snapshot.debian.org/archive/debian/20091004T111800Z/ lenny main
 278 deb-src http://snapshot.debian.org/archive/debian/20091004T111800Z/ lenny main
 279 deb     http://snapshot.debian.org/archive/debian-security/20091004T121501Z/ lenny/updates main
 280 deb-src http://snapshot.debian.org/archive/debian-security/20091004T121501Z/ lenny/updates main
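
With a snapshot date in sources.list, the recorded list still has to be turned into exact install requests. A minimal sketch, assuming the field format shown earlier and the `package=version` syntax accepted by `apt-get install`; the function name and the regular expression are illustrative and not a full parser for dependency fields.

```python
import re

ENTRY = re.compile(r"^(?P<name>[a-z0-9][a-z0-9+.:-]*) \(= (?P<version>[^)\s]+)\)$")


def apt_pins(field_value):
    """Turn 'pkg (= version), ...' into ['pkg=version', ...] for apt-get install."""
    pins = []
    for raw in field_value.split(","):
        entry = " ".join(raw.split())  # collapse line folding and extra spaces
        if not entry:
            continue
        match = ENTRY.match(entry)
        if match is None:
            raise ValueError(f"cannot parse entry: {entry!r}")
        pins.append(f"{match.group('name')}={match.group('version')}")
    return pins


FIELD = """
 apt (= 0.9.9.4), bash (= 4.2+dfsg-1),
 make (= 3.81-8.2), libstdc++6 (= 4.8.1-8)
"""
print(" ".join(apt_pins(FIELD)))
```

The printed words are meant to be appended to an `apt-get install` command line run inside the chroot that uses the snapshot sources above.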
 281 
 282 What's next?
 283 ------------
 284 
 285  * Do we have a “Champion”?… looks like not. :(
 286  * Fill up a page on the wiki
 287  * Who wants to have their package build reproducible?
 288    - Asheesh: alpine
 289    - Lunar: haveged
 290    - pabs: iotop (python based)
 291    - joeyh: debhelper :D
 292    - lindi: magit
 293  * [Asheesh] Convince dpkg to fix the 'not calling gzip -n' issue.
 294  * Another change needed in dpkg: tar --numeric-owner --owner=0
 295  * [Asheesh, Helmut] Attempt to code a downstream version of dedup.debian.net
 296    that lets us detect when files change between uploads of a package,
 297    and then run it on the archive.
 298  * Automated archive-wide testing of this issue and export to the PTS
 299  * [rbalint, lindi] libfaketime updates?
 300    advancing time in faketime with each time() call: https://github.com/wolfcw/libfaketime/pull/20
 301    [rbalint] replaying timestamp needs bigger changes in faketime, I'm working on those
 302  * [fil] talk to Ganneff about keeping .changes - hash chain from the Release files needed
 303  * Script to transform the "Built-Environment" list to
 304    links to files in the snapshot archives.
 305  * pbuilder-like script that installs all the packages in a
 306    chroot and rebuilds the package there.
 307  * How about a sprint‽ Yes!
 308    Together with Multi-Arch friends? Sponsorship from ARM?
 309 
 310 Other ideas:
 311 
 312  * Research other distros (NixOS?)
 313  * Research
 314    https://build.opensuse.org/package/show/openSUSE:Factory/build-compare
 315  * Deterministic virtual machines
 316    "ReVirt: Enabling Intrusion Analysis through Virtual-Machine Logging and Replay" http://www.eecs.umich.edu/virtual/papers/dunlap02.pdf (HTTP 403 currently :-()
 317    "Debugging operating systems with time-traveling virtual machines" http://www.eecs.umich.edu/virtual/papers/king05_1.pdf (HTTP 403 currently :-()
 318    "A Particular Bug Trap: Execution Replay Using Virtual Machines" http://arxiv.org/pdf/cs.DC/0310030
 319    "ReTrace: Collecting Execution Trace with Virtual Machine Deterministic Replay"
 320    "Execution Replay for Multiprocessor Virtual Machines" http://www.eecs.umich.edu/~pmchen/papers/dunlap08.slides.ppt
 321 
 322 
 323 
 324 More post-BoF experiments
 325 -------------------------
 326 
 327 diff --git a/debian/control b/debian/control
 328 index 1ef9ccd..50b5221 100644
 329 --- a/debian/control
 330 +++ b/debian/control
 331 @@ -7,6 +7,7 @@ Standards-Version: 3.9.4
 332  Homepage: http://www.issihosts.com/haveged/
 333  Vcs-Git: git://git.debian.org/git/collab-maint/haveged.git
 334  Vcs-Browser: http://git.debian.org/?p=collab-maint/haveged.git
 335 +XC-Build-Environment: ${misc:Build-Environment}
 336  
 337  Package: haveged
 338  Architecture: linux-any
 339 diff --git a/debian/rules b/debian/rules
 340 index 04d6fcc..cb2cdf3 100755
 341 --- a/debian/rules
 342 +++ b/debian/rules
 343 @@ -15,3 +15,10 @@ override_dh_auto_configure:
 344  
 345  override_dh_strip:
 346         dh_strip --dbg-package=libhavege1-dbg
 347 +
 348 +override_dh_gencontrol:
 349 +       COLUMNS=999 dpkg -l | awk ' \
 350 +                       BEGIN { printf "misc:Build-Environment=" } \
 351 +                       /^ii/ { ORS=", "; print $$2 " (= " $$3 ")" }' | \
 352 +               sed -e 's/, $$//' >> debian/substvars
 353 +       dh_gencontrol
 354 
 355 
 356 This does not work as `dpkg-genchanges` does not substitute
 357 the variable before adding the field in debian/changes! :(
 358   — Lunar
 359 
 360 But it is a trivial patch against dpkg:
 361 
 362 diff --git a/scripts/dpkg-genchanges.pl b/scripts/dpkg-genchanges.pl
 363 index 0b004c7..13cedd6 100755
 364 --- a/scripts/dpkg-genchanges.pl
 365 +++ b/scripts/dpkg-genchanges.pl
 366 @@ -516,4 +516,5 @@ for my $f (keys %remove) {
 367      delete $fields->{$f};
 368  }
 369  
 370 -$fields->output(\*STDOUT); # Note: no substitution of variables
 371 +$fields->apply_substvars($substvars);
 372 +$fields->output(\*STDOUT);
 373 
 374 
 375 
 376 --------------------------------------------------------
 377 
 378 -----------------------------------------------------------

  • [get | view] (2015-01-12 14:43:29, 15.7 KB) [[attachment:dc13-bof-reproducible-builds.txt]]