Attachment 'dc13-bof-reproducible-builds.txt'
1 apt-get install gobby-infinote
2 gobby -c gobby.debian.org -n
3 debconf13/bof/reproducible-builds
4
5 Byte-for-byte identical reproducible builds?
6 ============================================
7
8 BoF at DebConf13 / Vaumarcus, Switzerland; chair: Lunar
9
10 Abstract:
11
12 The Bitcoin client and the upcoming Tor Browser Bundle 3.0 series
13 are using a build system that produces “deterministic builds” —
14 packages which are byte-for-byte identical no matter who actually
15 builds them, or what hardware they use. The idea is that current
16 popular software development practices simply cannot survive
17 targeted attacks of the scale and scope that we are seeing today.
18 With “deterministic builds”, any individual can use an anonymity
19 network to download publicly signed and audited source code and
20 reproduce the builds exactly, without being subject to such
21 targeted attacks. If they notice any differences, they can alert
22 the public builders/signers, hopefully anonymously.
23
24 Are such ideas applicable to Debian? To what extent? What would be
25 the first stones to pave the way toward reproducible builds of
26 Debian packages?
27
28 Foreword
29 --------
30
31 Huge, huge thanks to Asheesh for helping me prepare this BoF.
32
33 Agenda
34 ------
35
36 “Good news everyone! We are going to get pwned!”
37 — Professor Farnsworth
38
39 1. Go around: why do you care? (5-10 min.)
40 2. Mike Perry's work on the Tor Browser Bundle (5 min.)
41 3. Asheesh's experiments (5 min.)
42 4. On the technical side, there are two aspects to the problem:
43 a. at the package level: How do we guarantee that given the same
44 source package and the same build environment, we get the same
45 binary results? (5-10 min.)
46 b. at the archive level: How to record the build environment of a
47 package (and enable its reproduction at a later time)?
48 (5 min.)
49 5. What's next? (15 min.)
50
51 Experience from making the Tor Browser Bundle builds reproducible
52 -----------------------------------------------------------------
53
54 Mike Perry worked on making the Tor Browser Bundle builds
55 reproducible. That's hard work: Tor Browser is based on Firefox
56 (huge code base) and is built for Linux, Mac OS X and Windows.
57
58 - How:
59 - Uses Gitian from Bitcoin
60 - Thin layer around Ubuntu virtualization tools
61 - Spins up an Ubuntu VM with fixed hostname, username,
62 path, and fake timestamps (via faketime)
63 - List packages and architecture
64 - Runs a bash script you specify
65 - Cross compiles for Windows (mingw-w64) and Mac (toolchain4)
66 - Took about 3-4 days per OS to write a working descriptor set
67 for Tor, Firefox and bundling/localization
68 - 2 weeks after starting, I was producing matching repeat builds
69 on my own hardware
70 - Issues:
71 - FIPS-140 mode has non-deterministic sigs on Linux
72 - Millisecond timestamps encoded by Firefox
73 - Mystery 3 bytes of randomness on Windows. Bitstomped
74 - 6 more weeks of work to get the builds to match externally
75 - Filesystem reordering
76 - Affects Zip, Tar, .a, and even aspects of Firefox scripts
77 - created wrappers for archives
78 - Firefox ordering enforced via sorting inputs in Firefox scripts
79 - Localization LC_ALL leaks
80 - Alters sort order
81 - Permissions differences
82 - Even though I set umask...
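
The archive-related issues above (member ordering, timestamps, permissions) can be illustrated with a minimal sketch. It is not the wrapper used for the Tor Browser Bundle; the function name `deterministic_tar` and the chosen metadata values are assumptions made for illustration only.

```python
import io
import tarfile


def deterministic_tar(files, mtime=0):
    """Pack {path: bytes} into an uncompressed tar with normalized metadata.

    Hypothetical helper: the fixed values below are illustrative choices.
    """
    buf = io.BytesIO()
    # Pin the archive format so the result does not depend on the library default.
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        for name in sorted(files):        # fixed member order, whatever the input order
            data = files[name]
            info = tarfile.TarInfo(name)
            info.size = len(data)
            info.mtime = mtime            # fixed timestamp instead of the build time
            info.mode = 0o644             # fixed permissions, independent of umask
            info.uid = info.gid = 0       # numeric owner 0
            info.uname = info.gname = ""  # no user or group names recorded
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


first = deterministic_tar({"b.txt": b"two", "a.txt": b"one"})
second = deterministic_tar({"a.txt": b"one", "b.txt": b"two"})
print(first == second)
```

Every field that could vary between builders is set explicitly rather than read from the filesystem, which is the same idea as wrapping the archive tools.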
83
84 To sum it up: the key things that need to be controlled are the hostname,
85 username, build path, OS locale, uname output, toolchain version,
86 and time. We can either make everything deterministic, or record these values on the first build and then replay them on subsequent builds.
87
88 Results from Asheesh's experiments
89 ----------------------------------
90
91 Asheesh jumped on the idea and played with the hello package.
92 Rebuilt using faketime on top of fakeroot.
93
94 * When you rebuild that way, the data.tar.gz of the built Debian
95 package has the same contents
96 * Same with control.tar.gz
97
98 However, neither data.tar.gz nor control.tar.gz is byte-for-byte identical
99 between the two builds. This is because of a semi-bug in dpkg; we need to convince dpkg
100 to fix the 'not calling gzip -n' issue.
101
102 * ELF binaries like /usr/bin/hello in the "hello" package
103 contain *no* timestamp that needs to be stripped.
104 * gzip files need '-n' to be passed to gzip for avoiding embedding a
105 timestamp.
106 * xz and bzip2 don't have this problem. I'm too pressed for time to
107 write a test script, but I did test it.
108 * dedup.debian.net can be used to detect duplicates, especially if
109 we hack it to detect files that change between uploads of a package,
110 rather than just between packages.
111 - future work: ssdeep hashes, which could be useful for finding files
112 that should be duplicates but aren't
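
The gzip timestamp issue can be reproduced with Python's standard `gzip` module, as a rough sketch. The helper names `gzip_bytes` and `header_mtime` are hypothetical. Setting `mtime=0` here plays the role of the timestamp half of `gzip -n` (which also omits the original file name).

```python
import gzip
import io
import struct


def gzip_bytes(data, mtime):
    """Compress data in memory, writing the given mtime into the gzip header."""
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb", mtime=mtime) as gz:
        gz.write(data)
    return buf.getvalue()


def header_mtime(blob):
    """Read the 4-byte little-endian MTIME field at offset 4 of a gzip stream."""
    return struct.unpack("<I", blob[4:8])[0]


payload = b"same contents in both builds\n"
build_a = gzip_bytes(payload, mtime=1000)  # pretend this build ran at t=1000
build_b = gzip_bytes(payload, mtime=2000)  # pretend this build ran at t=2000
print(build_a == build_b)                  # compressed files differ
print(gzip_bytes(payload, 0) == gzip_bytes(payload, 0))
```

The decompressed contents are identical in every case; only the header field differs, which matches the observation that the tarballs have the same contents but do not match as files.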
113
114 NOTE that the mismatch might instead be caused by the *timestamps* of the files within
115 control.tar.gz and data.tar.gz. I have not finished
116 testing this theory, sadly, but here is a shell script I use to set up a lab:
117
118 http://rose.makesad.us/~paulproteus/tmp/extract_both.sh
119 - please provide an index for the PTS :)
120
121 Package level issues
122 --------------------
123
124 ### time
125
126 * Remove/strip the timestamps for build results.
127 * Use faketime (reports faked system time to programs). Time could be
128 automatically set to the time of the last debian/changelog entry.
129 * Base timestamps on timestamps of the source code, which should be unchanged
130 * Record time on first build and replay them later (see below).
131
132 (In most cases, recording the time of the build is actually
133 wrong. For documentation, what matters is the time of the last
134 change in the source package and not the time of the build
135 itself.)
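
As a sketch of the "time of the last debian/changelog entry" idea, the snippet below extracts the date from the first trailer line of a changelog and converts it to a Unix timestamp. The sample changelog text, the regular expression and the function name `last_changelog_epoch` are assumptions for illustration; real changelogs may need more careful parsing.

```python
import re
from email.utils import parsedate_to_datetime

# Hypothetical changelog excerpt; names, versions and dates are made up.
SAMPLE_CHANGELOG = """\
hello (2.8-4) unstable; urgency=low

  * Example entry.

 -- Jane Doe <jane@example.org>  Sun, 11 Aug 2013 12:00:00 +0200

hello (2.8-3) unstable; urgency=low

  * Older entry.

 -- Jane Doe <jane@example.org>  Sat, 01 Jun 2013 09:30:00 +0000
"""

# Trailer lines look like " -- Name <email>  RFC 2822 date".
TRAILER = re.compile(r"^ -- .*>  (.+)$", re.MULTILINE)


def last_changelog_epoch(text):
    """Return the Unix time of the most recent (topmost) changelog entry."""
    match = TRAILER.search(text)
    if match is None:
        raise ValueError("no changelog trailer line found")
    return int(parsedate_to_datetime(match.group(1)).timestamp())


print(last_changelog_epoch(SAMPLE_CHANGELOG))
```

The resulting value could then be handed to faketime or used as the timestamp when stripping build results, so that every rebuild of the same source sees the same time.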
136
137 ### Build path
138
139 * Debian buildds use per-build temporary path names, so that any paths accidentally embedded in binaries do not exist on end-user systems (potential security issue).
140 * Stripping the path with debugedit (???)
141 * Correct solution: patch out where path appears -> use paths relative to the builddir
142 instead of having a common build directory for everyone.
143 (Because having encoded paths can hide real bugs, anyway.)
144
145 ### OS locale
146
147 * Use LANG=C.UTF-8 ? -> LC_ALL=C.UTF-8
148 * Let's make dpkg-buildpackage export this value
149 (or another wrapper? because dpkg-buildpackage is not
150 the policy canonical way to build all packages;
151 but debian/rules is painful)
152 Let's make this an option so that users see translated messages
153 and the buildds all build with English
154 * Change the policy to make dpkg-buildpackage the canonical
155 way to build packages.
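
A build wrapper could pin the locale roughly as follows. This is only a sketch: the function name `build_environment` and the `force_locale` switch are hypothetical, and dropping `LANG`, `LANGUAGE` and the individual `LC_*` variables is an assumption about what a wrapper might want, not something decided at the BoF.

```python
import os


def build_environment(base=None, force_locale=True):
    """Return a copy of the environment with the locale pinned to C.UTF-8.

    With force_locale=False the environment is returned unchanged, so that
    users building by hand still see translated messages.
    """
    env = dict(os.environ if base is None else base)
    if not force_locale:
        return env
    for name in list(env):
        # Assumption: remove every variable that could still influence the locale.
        if name in ("LANG", "LANGUAGE") or name.startswith("LC_"):
            del env[name]
    env["LC_ALL"] = "C.UTF-8"
    return env


example = build_environment({"LANG": "fr_CH.UTF-8", "LC_COLLATE": "de_DE.UTF-8", "HOME": "/tmp"})
print(sorted(example.items()))
```

The returned dictionary can be passed as `env=` to `subprocess.run` when invoking the build, so that sort order and messages no longer depend on the builder's settings.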
156
157 ### hostname, uname output, username
158
159 liblietome?
160
161 But kernel version is part of the build environment, so
162 we might need to record that somewhere else. Are kernels used on buildds always available? Or are some using non-standard kernels?
163
164 ### toolchain version
165
166 * part of the system state and build info
167
168 ### file ordering issues
169
170 Need to patch the build systems to add proper `sort` calls.
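
For build steps written in Python, the equivalent of adding `sort` calls is to sort directory listings before using them, since the order returned by the filesystem is not guaranteed. A minimal sketch, with a hypothetical helper name `sorted_files`:

```python
import os
import tempfile


def sorted_files(root):
    """List files under root in an order that does not depend on the filesystem."""
    result = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # sorting in place makes os.walk descend in sorted order
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            result.append(os.path.relpath(full, root))
    return result


with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "sub"))
    for rel in ("b", "a", os.path.join("sub", "c"), os.path.join("sub", "a")):
        with open(os.path.join(root, rel), "w") as handle:
            handle.write("x")
    print(sorted_files(root))
```

Python compares strings by code point, so this ordering is also unaffected by the locale.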
171
172 ### Randomisation
173
174 * Define seed?
175 * ASLR?
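
One reading of "define seed" is that any build step using a random number generator gets a fixed, recorded seed. The sketch below shows the idea for a Python build step; the function name `shuffled` and the seed value are assumptions, and it does not address ASLR.

```python
import random


def shuffled(items, seed):
    """Shuffle items with a private generator seeded from a recorded value."""
    rng = random.Random(seed)  # private generator: no dependence on global state
    out = list(items)
    rng.shuffle(out)
    return out


first = shuffled(range(10), seed=2013)
second = shuffled(range(10), seed=2013)
print(first == second)
```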
176
177 ### pid numbers
178
179 Let's patch that out if needed.
180
181 ### Other issues?
182
183
184 Archive level issues
185 --------------------
186
187 Not all packages are built on the buildds so the build environment isn't going to be the same (for now).
188
189 .changes files are not currently kept except on mailing lists.
190
191 We want .changes files: they are signed by the maintainer.
192
193 If we keep .changes files, we can add a `XC-Built-Environment` field.
194 It would add to the .changes files something like:
195
196 Built-Environment:
197 apt (= 0.9.9.4), aptitude (= 0.6.8.2-1), aptitude-common (= 0.6.8.2-1),
198 base-files (= 7.2), base-passwd (= 3.5.26), bash (= 4.2+dfsg-1),
199 binutils (= 2.23.52.20130727-1), bsdutils (= 1:2.20.1-5.5),
200 build-essential (= 11.6), bzip2 (= 1.0.6-4), ccache (= 3.1.9-1),
201 coreutils (= 8.21-1), cpp (= 4:4.8.1-2), cpp-4.6 (= 4.6.4-4),
202 cpp-4.7 (= 4.7.3-6), cpp-4.8 (= 4.8.1-8), dash (= 0.5.7-3),
203 debconf (= 1.5.50), debconf-i18n (= 1.5.50),
204 debian-archive-keyring (= 2012.4), debianutils (= 4.4),
205 diffutils (= 1:3.2-8), dpkg (= 1.17.1), dpkg-dev (= 1.17.1),
206 e2fslibs (= 1.42.8-1), e2fsprogs (= 1.42.8-1), fakeroot (= 1.19-2),
207 findutils (= 4.4.2-6), g++ (= 4:4.8.1-2), g++-4.6 (= 4.6.4-4),
208 g++-4.8 (= 4.8.1-8), gcc (= 4:4.8.1-2), gcc-4.4-base (= 4.4.7-4),
209 gcc-4.5-base (= 4.5.4-1), gcc-4.6 (= 4.6.4-4), gcc-4.6-base (= 4.6.4-4),
210 gcc-4.7 (= 4.7.3-6), gcc-4.7-base (= 4.7.3-6), gcc-4.8 (= 4.8.1-8),
211 gcc-4.8-base (= 4.8.1-8), gnupg (= 1.4.14-1), gpgv (= 1.4.14-1),
212 grep (= 2.14-2), gzip (= 1.6-1), hostname (= 3.13),
213 initscripts (= 2.88dsf-43), insserv (= 1.14.0-5), less (= 458-2),
214 libacl1 (= 2.2.52-1), libapt-pkg4.12 (= 0.9.9.4), libasan0 (= 4.8.1-8),
215 libatomic1 (= 4.8.1-8), libattr1 (= 1:2.4.47-1), libblkid1 (= 2.20.1-5.5),
216 libboost-iostreams1.49.0 (= 1.49.0-4), libbz2-1.0 (= 1.0.6-4),
217 libc-bin (= 2.17-92), libc-dev-bin (= 2.17-92), libc6 (= 2.17-92),
218 libc6-dev (= 2.17-92), libcap2 (= 1:2.22-1.2),
219 libclass-isa-perl (= 0.36-5), libcloog-isl4 (= 0.18.0-2),
220 libcloog-ppl1 (= 0.16.1-3), libcomerr2 (= 1.42.8-1),
221 libcwidget3 (= 0.5.16-3.4), libdb5.1 (= 5.1.29-6), libdpkg-perl (= 1.17.1),
222 libept1.4.12 (= 1.0.9), libfile-fcntllock-perl (= 0.14-2),
223 libgcc-4.7-dev (= 4.7.3-6), libgcc-4.8-dev (= 4.8.1-8),
224 libgcc1 (= 1:4.8.1-8), libgdbm3 (= 1.8.3-12), libgmp10 (= 2:5.1.2+dfsg-2),
225 libgmpxx4ldbl (= 2:5.1.2+dfsg-2), libgomp1 (= 4.8.1-8),
226 libgpm2 (= 1.20.4-6.1), libisl10 (= 0.11.2-1), libitm1 (= 4.8.1-8),
227 liblocale-gettext-perl (= 1.05-7+b1), liblzma5 (= 5.1.1alpha+20120614-2),
228 libmount1 (= 2.20.1-5.5), libmpc2 (= 0.9-4), libmpc3 (= 1.0.1-1),
229 libmpfr4 (= 3.1.1-1), libncurses5 (= 5.9+20130608-1),
230 libncursesw5 (= 5.9+20130608-1), libpam-modules (= 1.1.3-9),
231 libpam-modules-bin (= 1.1.3-9), libpam-runtime (= 1.1.3-9),
232 libpam0g (= 1.1.3-9), libpcre3 (= 1:8.31-2), libppl-c4 (= 1:1.0-7),
233 libppl12 (= 1:1.0-7), libquadmath0 (= 4.8.1-8),
234 libreadline6 (= 6.2+dfsg-0.1), libselinux1 (= 2.1.13-2),
235 libsemanage-common (= 2.1.10-2), libsemanage1 (= 2.1.10-2),
236 libsepol1 (= 2.1.9-2), libsigc++-2.0-0c2a (= 2.2.10-0.2),
237 libslang2 (= 2.2.4-15), libsqlite3-0 (= 3.7.17-1),
238 libss2 (= 1.42.8-1), libstdc++-4.8-dev (= 4.8.1-8),
239 libstdc++6 (= 4.8.1-8), libstdc++6-4.6-dev (= 4.6.4-4),
240 libswitch-perl (= 2.16-2), libtext-charwidth-perl (= 0.04-7+b1),
241 libtext-iconv-perl (= 1.7-5), libtext-wrapi18n-perl (= 0.06-7),
242 libtimedate-perl (= 1.2000-1), libtinfo5 (= 5.9+20130608-1),
243 libtsan0 (= 4.8.1-8), libusb-0.1-4 (= 2:0.1.12-23.2),
244 libustr-1.0-1 (= 1.0.4-3), libuuid1 (= 2.20.1-5.5),
245 libxapian22 (= 1.2.15-2), linux-libc-dev (= 3.10.3-1),
246 login (= 1:4.1.5.1-1), lsb-base (= 4.1+Debian12),
247 make (= 3.81-8.2), mawk (= 1.3.3-17), mount (= 2.20.1-5.5),
248 multiarch-support (= 2.17-92), ncurses-base (= 5.9+20130608-1),
249 ncurses-bin (= 5.9+20130608-1), passwd (= 1:4.1.5.1-1), patch (= 2.7.1-3),
250 perl (= 5.14.2-21),
251 perl-base (= 5.14.2-21), perl-modules (= 5.14.2-21),
252 readline-common (= 6.2+dfsg-0.1), screen (= 4.1.0~20120320gitdb59704-9),
253 sed (= 4.2.2-2), sensible-utils (= 0.0.9), sysv-rc (= 2.88dsf-43),
254 sysvinit (= 2.88dsf-43), sysvinit-utils (= 2.88dsf-43),
255 tar (= 1.26+dfsg-6), tzdata (= 2013d-1), ucf (= 3.0027+nmu1),
256 util-linux (= 2.20.1-5.5), vim (= 2:7.3.923-3), vim-common (= 2:7.3.923-3),
257 vim-runtime (= 2:7.3.923-3), xz-utils (= 5.1.1alpha+20120614-2),
258 zlib1g (= 1:1.2.8.dfsg-1)
259
260 (Example naively generated by taking all packages installed
261 by pbuilder when building the `hello` package.)
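
A field like the one above could be generated from a plain "package version" listing, for instance the output of `dpkg-query -W -f='${Package} ${Version}\n'`. The following is a sketch under that assumption; the function names, the line width and the sorting are illustrative choices, not an agreed format.

```python
def parse_installed(listing):
    """Parse lines of 'package version' into a sorted list of (name, version)."""
    pairs = []
    for line in listing.splitlines():
        if line.strip():
            name, version = line.split()
            pairs.append((name, version))
    return sorted(pairs)


def format_built_environment(pairs, width=78):
    """Render the pairs as a wrapped 'Built-Environment:' field."""
    items = ["%s (= %s)" % pair for pair in pairs]
    lines, current = [], ""
    for index, item in enumerate(items):
        token = item + ("," if index < len(items) - 1 else "")
        candidate = token if not current else current + " " + token
        # +1 accounts for the leading space of a continuation line.
        if current and len(candidate) + 1 > width:
            lines.append(current)
            current = token
        else:
            current = candidate
    if current:
        lines.append(current)
    return "Built-Environment:\n" + "\n".join(" " + line for line in lines)


sample = "make 3.81-8.2\ngzip 1.6-1\n"
print(format_built_environment(parse_installed(sample)))
```

Sorting the pairs keeps the field itself reproducible, independent of the order in which the package database is read.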
262
263 * Do we want to trim this list? How?
264 -> use the access time to files in the various packages
265 to determine what was used or not (or another mechanism
266 to be notified of packages that matter)
267 * Do we want to include arch (e.g. `:amd64`) in there? Yes - multiarch means we can have cross-arch deps (but not yet - britney needs work)
268
269 Then, the good news: snapshot.debian.org keeps binary packages (but not .changes files)!
270
271 make (= 3.81-8.2)
272 => http://snapshot.debian.org/package/make-dfsg/3.81-8.2/#make_3.81-8.2
273
274 Is there an easy way to script installing a specific set of
275 binary packages from snapshot? Yes - use a specific date in your sources.list:
276
277 deb http://snapshot.debian.org/archive/debian/20091004T111800Z/ lenny main
278 deb-src http://snapshot.debian.org/archive/debian/20091004T111800Z/ lenny main
279 deb http://snapshot.debian.org/archive/debian-security/20091004T121501Z/ lenny/updates main
280 deb-src http://snapshot.debian.org/archive/debian-security/20091004T121501Z/ lenny/updates main
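
Generating such entries is easy to script. A small sketch, assuming the URL layout shown above; the function name `snapshot_sources` and its parameters are hypothetical:

```python
import re

SNAPSHOT = "http://snapshot.debian.org/archive"
TIMESTAMP = re.compile(r"^\d{8}T\d{6}Z$")  # e.g. 20091004T111800Z


def snapshot_sources(archive, timestamp, suite, components=("main",)):
    """Return the deb and deb-src lines for one archive at one point in time."""
    if not TIMESTAMP.match(timestamp):
        raise ValueError("unexpected timestamp format: %r" % timestamp)
    url = "%s/%s/%s/" % (SNAPSHOT, archive, timestamp)
    tail = "%s %s" % (suite, " ".join(components))
    return ["deb %s %s" % (url, tail), "deb-src %s %s" % (url, tail)]


for line in snapshot_sources("debian", "20091004T111800Z", "lenny"):
    print(line)
for line in snapshot_sources("debian-security", "20091004T121501Z", "lenny/updates"):
    print(line)
```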
281
282 What's next?
283 ------------
284
285 * Do we have a “Champion”?… looks like not. :(
286 * Fill up a page on the wiki
287 * Who wants to make their package build reproducibly?
288 - Asheesh: alpine
289 - Lunar: haveged
290 - pabs: iotop (python based)
291 - joeyh: debhelper :D
292 - lindi: magit
293 * [Asheesh] Convince dpkg to fix the 'not calling gzip -n' issue.
294 * Another change needed in dpkg: tar --numeric-owner --owner=0
295 * [Asheesh, Helmut] Attempt to code a downstream version of dedup.debian.net
296 that lets us detect when files change between uploads of a package,
297 and then run it on the archive.
298 * Automated archive-wide testing of this issue and export to the PTS
299 * [rbalint, lindi] libfaketime updates?
300 advancing time in faketime with each time() call: https://github.com/wolfcw/libfaketime/pull/20
301 [rbalint] replaying timestamp needs bigger changes in faketime, I'm working on those
302 * [fil] talk to Ganneff about keeping .changes - hash chain from the Release files needed
303 * Script to transform the "Built-Environment" list to
304 links to files in the snapshot archives.
305 * pbuilder-like script that installs all the packages in a
306 chroot and rebuilds the package there.
307 * How about a sprint‽ Yes!
308 Together with Multi-Arch friends? Sponsorship from ARM?
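
For the item about detecting files that change between uploads, a first approximation is to unpack two builds of the same package and compare per-file hashes. This sketch is not how dedup.debian.net works; the function names are hypothetical and the two directory arguments are assumed to hold already extracted package contents.

```python
import hashlib
import os


def tree_digests(root):
    """Map each file's path relative to root to the SHA-256 of its contents."""
    digests = {}
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            with open(full, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            digests[os.path.relpath(full, root)] = digest
    return digests


def differing_files(build_a, build_b):
    """Return paths whose contents differ, or which exist in only one build."""
    a, b = tree_digests(build_a), tree_digests(build_b)
    return sorted(path for path in set(a) | set(b) if a.get(path) != b.get(path))
```

Calling `differing_files("build1", "build2")` on two extracted builds would list the files worth inspecting for embedded timestamps, paths or ordering differences.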
309
310 Other ideas:
311
312 * Research other distros (NixOS?)
313 * Research
314 https://build.opensuse.org/package/show/openSUSE:Factory/build-compare
315 * Deterministic virtual machines
316 "ReVirt: Enabling Intrusion Analysis through Virtual-Machine Logging and Replay" http://www.eecs.umich.edu/virtual/papers/dunlap02.pdf (HTTP 403 currently :-()
317 "Debugging operating systems with time-traveling virtual machines" http://www.eecs.umich.edu/virtual/papers/king05_1.pdf (HTTP 403 currently :-()
318 "A Particular Bug Trap: Execution Replay Using Virtual Machines" http://arxiv.org/pdf/cs.DC/0310030
319 "ReTrace: Collecting Execution Trace with Virtual Machine Deterministic Replay"
320 "Execution Replay for Multiprocessor Virtual Machines" http://www.eecs.umich.edu/~pmchen/papers/dunlap08.slides.ppt
321
322
323
324 More post-BoF experiments
325 -------------------------
326
327 diff --git a/debian/control b/debian/control
328 index 1ef9ccd..50b5221 100644
329 --- a/debian/control
330 +++ b/debian/control
331 @@ -7,6 +7,7 @@ Standards-Version: 3.9.4
332 Homepage: http://www.issihosts.com/haveged/
333 Vcs-Git: git://git.debian.org/git/collab-maint/haveged.git
334 Vcs-Browser: http://git.debian.org/?p=collab-maint/haveged.git
335 +XC-Build-Environment: ${misc:Build-Environment}
336
337 Package: haveged
338 Architecture: linux-any
339 diff --git a/debian/rules b/debian/rules
340 index 04d6fcc..cb2cdf3 100755
341 --- a/debian/rules
342 +++ b/debian/rules
343 @@ -15,3 +15,10 @@ override_dh_auto_configure:
344
345 override_dh_strip:
346 dh_strip --dbg-package=libhavege1-dbg
347 +
348 +override_dh_gencontrol:
349 + COLUMNS=999 dpkg -l | awk ' \
350 + BEGIN { printf "misc:Build-Environment=" } \
351 + /^ii/ { ORS=", "; print $$2 " (= " $$3 ")" }' | \
352 + sed -e 's/, $$//' >> debian/substvars
353 + dh_gencontrol
354
355
356 This does not work, as `dpkg-genchanges` does not substitute
357 the variable before adding the field to the .changes file! :(
358 — Lunar
359
360 But it is a trivial patch against dpkg:
361
362 diff --git a/scripts/dpkg-genchanges.pl b/scripts/dpkg-genchanges.pl
363 index 0b004c7..13cedd6 100755
364 --- a/scripts/dpkg-genchanges.pl
365 +++ b/scripts/dpkg-genchanges.pl
366 @@ -516,4 +516,5 @@ for my $f (keys %remove) {
367 delete $fields->{$f};
368 }
369
370 -$fields->output(\*STDOUT); # Note: no substitution of variables
371 +$fields->apply_substvars($substvars);
372 +$fields->output(\*STDOUT);
373
374
375
376 --------------------------------------------------------
377
378 -----------------------------------------------------------