Differences between revisions 159 and 160
Revision 159 as of 2017-06-12 15:22:46
Size: 46442
Editor: ?DanielShahaf
Comment: grammar: add missing hyphen
Revision 160 as of 2017-10-03 15:14:39
Size: 46556
Editor: Infinity0
Comment: mention steven's talk, mention that this and srebuild are WIP

With free software, anyone can inspect the source code for malicious flaws. But Debian provides binary packages to its users. The idea of “deterministic” or “reproducible” builds is to empower anyone to verify that no flaws have been introduced during the build process by reproducing byte-for-byte identical binary packages from a given source.

More information about reproducible builds in general is available at reproducible-builds.org.

Why do we want reproducible builds?

  • Allow independent verification that a binary matches what the source is intended to produce.
    • Should reproducible uploads become mandatory, an attacker has less incentive to compromise the system of a developer with upload rights, because the developer can no longer upload a binary that does not match the uploaded sources.
    • Additionally, the incentive for this kind of attack is further lowered because an attacker would have to compromise every machine that can check the reproducibility of the uploaded source.
    • Finally, with a sufficiently large body of independent (geographically and administratively) machines, reproducible builds can help find systems that have been compromised so as to produce binaries with altered functionality.
  • Help co-installation of Multi-Arch: same packages (every file shipped at the same path must be byte-identical across architectures).

  • Be able to generate debug symbols for packages which do not have a “debug package”.
  • Ensure packages can be built from source. The archive could be made to accept only reproducible uploads: the maintainer would stop uploading .deb files but keep them referenced in the .changes file. The source would then be rebuilt, and the upload accepted only if the resulting hashes match those in the .changes file.
  • Allow file-level deduplication on Debian mirror sites, or maybe snapshot.debian.org, of .deb files whose contents didn't really change between versions.
  • Allow .deb deltas to be smaller.
  • Packages with build profiles must offer the exact same functionality for all profiles. Reproducible builds could be used to verify that this is the case.

  • Make sure that Architecture: all packages are built identically on different build architectures.

  • Validate cross-builds against native builds.

  • Find release-critical bugs.

  • Find embedded code copies (when packages should be reproducible because a toolchain package got fixed but are not because they use an embedded copy instead)
  • Run builds in environments that trace things like system calls, file system or network access for QA or general analytical purposes. Reproducible builds help to ensure that the used tracing method had no influence on the produced binary.
  • Allow diverse double compilation to verify compiler integrity: compile gcc with gcc and with clang (and with any other compiler in Debian capable of compiling gcc), then recompile gcc with each of the gcc compiler packages created in the first step. The resulting packages should be bit-for-bit identical.

  • Diverse double compile (bootstrap) a base Debian on different distributions to make sure that secondary build inputs such as coreutils and the C library are not affected by the Ken Thompson problem (the “trusting trust” attack) either. Using cross compilation on different platforms, this also makes it possible to verify that hardware is not compromised.
  • Allow proprietary binary blobs to verify that they are used by unmodified free software (example: the Firefox Encrypted Media Extensions binary blob requires an untampered-with Firefox).
  • Allow Debian package maintainers to verify that their packaging-related changes to a source package (like switching the build system to debhelper/dh or upgrading the compat level) do not introduce unexpected side effects.
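
The acceptance check described in the list above (rebuild the source, then accept the upload only if the hashes match) can be sketched as follows. This is a minimal illustration, not how the Debian archive implements it: the function names `sha256_of` and `rebuild_matches` are hypothetical, and SHA-256 is assumed as the hash.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def rebuild_matches(declared_sha256: str, rebuilt_package: Path) -> bool:
    """True if the rebuilt package hashes to the value declared by the upload.

    `declared_sha256` stands for the checksum the uploader referenced
    (hypothetical input; extracting it from a .changes file is not shown).
    """
    return sha256_of(rebuilt_package) == declared_sha256.lower()
```

A byte-for-byte identical rebuild yields the same digest, so a single mismatch is enough to reject the upload or to flag the build as not reproducible.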

Reproducing builds

There are two sides to the problem: the build environment needs to be recorded during the initial build, and the same environment needs to be reproduced for later rebuilds.

Recording the environment

Information on a build will be recorded in a new control file with extension `.buildinfo`.
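
As a rough illustration of what recording the environment makes possible, the sketch below reads the pinned build dependencies out of a `.buildinfo` file. It assumes the file is an unsigned, single-paragraph control file and that the installed packages are listed in an `Installed-Build-Depends` field as `name (= version)` entries. The parser is a simplified stand-in written for this example, and the sample values are made up.

```python
def parse_control_paragraph(text: str) -> dict[str, str]:
    """Parse one control-file paragraph into {field name: value}.

    Continuation lines (starting with a space or tab) are appended to the
    preceding field. PGP armor and multiple paragraphs are not handled.
    """
    fields: dict[str, str] = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0] in " \t":
            if current is not None:
                fields[current] += "\n" + line.strip()
            continue
        name, _, value = line.partition(":")
        current = name.strip()
        fields[current] = value.strip()
    return fields


def installed_build_depends(fields: dict[str, str]) -> dict[str, str]:
    """Return {package: version} from the Installed-Build-Depends field."""
    pinned: dict[str, str] = {}
    raw = fields.get("Installed-Build-Depends", "").replace("\n", " ")
    for entry in raw.split(","):
        entry = entry.strip()
        if not entry:
            continue
        name, _, rest = entry.partition("(")
        # "= 6.3.0-18)" -> "6.3.0-18"
        version = rest.rstrip(")").strip().lstrip("=").strip()
        pinned[name.strip()] = version
    return pinned


# Made-up sample, for illustration only.
SAMPLE = (
    "Format: 1.0\n"
    "Source: hello\n"
    "Version: 2.10-1\n"
    "Installed-Build-Depends:\n"
    " autoconf (= 2.69-10),\n"
    " gcc-6 (= 6.3.0-18)\n"
)
```

Calling `installed_build_depends(parse_control_paragraph(SAMPLE))` gives the exact package versions a later rebuild would need to install.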

Reproducing the build environment

This is work-in-progress.

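See Fun with buildinfo (2017), https://debconf17.debconf.org/talks/91/.

See also srebuild (2015), https://lists.alioth.debian.org/pipermail/reproducible-builds/Week-of-Mon-20141229/000613.html. The srebuild program is an sbuild wrapper which finds a timestamp on snapshot.debian.org at which all versions of the binary packages listed in a .buildinfo file were available, and then carries out the build with the right versions installed.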
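
The timestamp search can be pictured as an interval intersection. The sketch below is not srebuild's actual implementation and does not query snapshot.debian.org. It assumes that, for each pinned package version, the first and last time it was present in the archive are already known (hypothetical input), and returns a timestamp at which all of them were available together.

```python
from datetime import datetime
from typing import Optional

# (first seen, last seen) for one pinned package version; hypothetical data.
Interval = tuple[datetime, datetime]


def common_timestamp(availability: dict[str, Interval]) -> Optional[datetime]:
    """Return a timestamp at which every pinned version was available.

    The intervals overlap exactly when the latest "first seen" is not after
    the earliest "last seen"; that latest start is then a valid answer.
    Returns None if there is no common timestamp or no input.
    """
    if not availability:
        return None
    latest_start = max(first for first, _ in availability.values())
    earliest_end = min(last for _, last in availability.values())
    return latest_start if latest_start <= earliest_end else None
```

If no single timestamp exists, a real tool would have to combine several snapshots or give up. This sketch simply returns `None` in that case.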

References

Presentations

Publicity

This section lists URLs, people, and dates for when other people have publicly expressed interest in, or shared information about, the project.

Weekly reports

Stretch cycle

GSoC 2015: akira

GSoC 2015: Dhole

Related projects

  • CARE monitors the execution of the specified command to create an archive that contains all the material required to re-execute it in the same context.

Further work

Having reproducible builds allows us to trust binary packages better, because it becomes easier to have:

  • diversity of build location and jurisdiction - build packages in more than one location, including the developer's
  • diversity of build hardware, in case of hardware bugs or malicious implants - a mix of VMs, some real hardware, different CPU manufacturers, different dates of manufacture and suppliers
  • diversity of people - multiple signatures on a .changes file
  • diversity of kernels, explained below
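
With several independent builders, comparing their results becomes a simple bookkeeping exercise. The sketch below assumes each builder reports one hash for the same source package. The builder names and the strict-majority rule are choices made for this example, not project policy. A disagreement only marks a builder for investigation; it does not prove which side is compromised.

```python
from collections import Counter


def dissenting_builders(results: dict[str, str]) -> list[str]:
    """Return the builders whose hash differs from the strict-majority hash.

    `results` maps a builder name to the hash of the package it produced.
    Raises ValueError if no hash is reported by more than half the builders.
    """
    if not results:
        return []
    top_hash, top_count = Counter(results.values()).most_common(1)[0]
    if top_count * 2 <= len(results):
        raise ValueError("no strict majority among builders")
    return sorted(name for name, digest in results.items() if digest != top_hash)
```

For example, with three builders of which two agree, the function returns the name of the third.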

Kernel packages

Special features of kernel packages (including bootloaders and hypervisors) - GRUB2, Xen, linux, kfreebsd...

  • we put huge trust in them - kernels are the ultimate target of any rootkit, able to completely hide from userland
  • a kernel image built for amd64, if the build system is portable and reproducible enough, will be the same whether built from linux-amd64 or kfreebsd-amd64
  • or maybe from different kernel versions - for example, a jessie build chroot on a wheezy host system

Then we would be better protected from something that could affect many systems at once, such as a kernel vulnerability or a widespread rootkit infection: such a rootkit would have to be compatible with more than one type of kernel to go unnoticed.