Currently, this page is a high-level overview of our design for components that use buildinfo files. Working together, these components provide various security guarantees about publicly distributed binaries, relating to reproducibility.

Most of the details are yet to be worked out, and this page will be updated once that happens. In particular, discussion with ftpmasters is happening in 763822.

Overview

The characters in our story:

Servers:

Clients:

Guide to verbs:

Figure: actors.png (source: actors.dot)

Details

Archive

Initially, we plan to collect all buildinfo files for a given architecture into one Buildinfos-$arch.xz file. This is an easy approach that the archive mirror network can cope with. Later on, we might think about dividing this up so that one can get the data in a more fine-grained manner, but the initial proposal seems able to satisfy our other goals without too much overhead.

The main concern here is that any rebuilder who wants to rebuild a single binary package will need to download the whole Buildinfos file for their architecture (or "all" if they are building an arch:all package). We have made some measurements and this is about 9MB for each architecture with xz (about 250MB uncompressed, about 20MB with gzip). So it is not too much of a burden: it should take less than a minute to download on an average modern internet connection.
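As a rough check of the "less than a minute" estimate, the sketch below converts the measured sizes into transfer times. The link speeds are hypothetical values chosen only for illustration, and protocol overhead and mirror load are ignored.

```python
def download_seconds(size_mb: float, link_mbit_per_s: float) -> float:
    """Seconds needed to transfer size_mb megabytes over a link of
    link_mbit_per_s megabits per second (8 bits per byte, no overhead)."""
    return size_mb * 8 / link_mbit_per_s


# Sizes quoted above for one architecture's Buildinfos file, in MB.
SIZES_MB = {"xz": 9, "gzip": 20, "uncompressed": 250}

# Hypothetical link speeds in Mbit/s; these are assumptions, not measurements.
LINK_SPEEDS = (2, 10, 50)

for speed in LINK_SPEEDS:
    times = {name: round(download_seconds(mb, speed), 1)
             for name, mb in SIZES_MB.items()}
    print(f"{speed} Mbit/s: {times}")
```

Under these assumptions the 9MB xz file stays under a minute on any link faster than about 1.2 Mbit/s, since 9MB is 72 megabits.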

If we collected all architectures into one file, this download would be much larger, and would likely discourage typical users from performing rebuilds. In the other direction, splitting these into per-source-package Buildinfos-$src-$ver.xz files might put extra resource strain on the mirror network due to the large number of files that rsync must stat. There is a good chance that the mirror network will be able to cope with this, but it requires more discussion between different teams, so for now we've chosen to leave this for future work: 9MB seems small enough.

TODO: import dkg's suggested "validation" steps

The archive should make a strongly-attributable statement that (a) "this is all of the buildinfo files for this release", by including the names and hashes of the Buildinfos-$arch.xz in the Release file, and (b) "these are the buildinfo files for this package", by including the names and hashes of the signed-buildinfo files in the Packages indices.
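A minimal sketch of how a client might check statement (a), assuming the Buildinfos files are listed in the existing SHA256 section of the Release file (one "hash size path" entry per indented line). The path main/Buildinfos-amd64.xz and the helper names are hypothetical, and verifying the signature on the Release file itself is out of scope here.

```python
import hashlib


def parse_sha256_section(release_text: str) -> dict[str, tuple[str, int]]:
    """Return {path: (hex_digest, size)} from the SHA256 field of a Release file."""
    entries = {}
    in_section = False
    for line in release_text.splitlines():
        if line.startswith("SHA256:"):
            in_section = True
            continue
        if in_section:
            if not line.startswith(" "):
                in_section = False  # an unindented line starts the next field
                continue
            digest, size, path = line.split()
            entries[path] = (digest, int(size))
    return entries


def verify(path: str, data: bytes, entries: dict[str, tuple[str, int]]) -> bool:
    """True if data matches the size and SHA256 that the Release file lists for path."""
    if path not in entries:
        return False
    digest, size = entries[path]
    return len(data) == size and hashlib.sha256(data).hexdigest() == digest


# Self-contained demo with made-up content standing in for a real download.
buildinfos = b"stand-in for the contents of Buildinfos-amd64.xz"
release = (
    "Suite: unstable\n"
    "SHA256:\n"
    f" {hashlib.sha256(buildinfos).hexdigest()} {len(buildinfos)} main/Buildinfos-amd64.xz\n"
)
entries = parse_sha256_section(release)
print(verify("main/Buildinfos-amd64.xz", buildinfos, entries))
print(verify("main/Buildinfos-amd64.xz", buildinfos + b"tampered", entries))
```

The same check would apply to statement (b), with the hashes taken from the Packages indices instead; how those entries would be encoded is not yet specified.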

(Hashing the signed file allows for lookup in a sig-repo later, and is a strongly-attributable statement that the archive processed the buildinfo file as signed by that particular developer/buildd and not by someone else. This does not imply the buildinfo file is correct or true; clients (non-building and rebuilders) are still free to request as many buildinfo files that match a given binary hash (of the build output) as they want from a sig-repo, to verify the claim or to find independent parties that agree with it.)

For more background information on the Debian archive, see DebianRepository/Format.

Signed-buildinfo repository

TBD

Adapt a keyserver?

Developer

TBD

dput should be patched to upload a Buildinfo file to a sig-repo.

Rebuilders

TBD

This functionality could be added to reprotest, e.g. as a background service.

Non-building clients

TBD

One could configure apt to contact sig-repos and ask for a minimum number of independent signatures before installing packages.
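One possible shape for such a check is sketched below. It is only an illustration: the Attestation record, the function names, and the simplification that "independent" means "distinct trusted signing keys" are all assumptions, and signatures are taken to have been verified already. It is not a proposal for what apt or a sig-repo should actually implement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attestation:
    """A signed buildinfo reduced to the two facts this policy needs (hypothetical type)."""
    signer: str         # fingerprint of the key that signed the buildinfo
    binary_sha256: str  # hash of the build output the buildinfo claims


def independent_signers(attestations, binary_sha256, trusted):
    """Distinct trusted signers whose buildinfo claims exactly this binary hash."""
    return {
        a.signer
        for a in attestations
        if a.binary_sha256 == binary_sha256 and a.signer in trusted
    }


def may_install(attestations, binary_sha256, trusted, minimum):
    """True if at least `minimum` distinct trusted signers reproduced the binary."""
    return len(independent_signers(attestations, binary_sha256, trusted)) >= minimum


# Demo with made-up fingerprints and hashes.
answers = [
    Attestation("KEY-A", "abc123"),
    Attestation("KEY-B", "abc123"),
    Attestation("KEY-B", "abc123"),  # a duplicate from the same signer counts once
    Attestation("KEY-C", "ffff00"),  # a different build output does not count
]
trusted = {"KEY-A", "KEY-B", "KEY-C"}
print(may_install(answers, "abc123", trusted, minimum=2))
print(may_install(answers, "abc123", trusted, minimum=3))
```

Counting signers as a set means one party cannot raise the count by uploading the same buildinfo repeatedly, although this does nothing against one party controlling several keys.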

Security considerations

Due to resource constraints - i.e. a design optimised for binary distribution but not signature distribution - archives are not expected to store buildinfo signatures by the original developer. Instead, they can re-sign a whole batch of buildinfo files at once, after doing basic sanity checks on them - e.g. to check that the developer isn't lying about them - and publish this. One may object that this batch is redundant given the sig-repos, but it can actually help to avoid some attacks.

To prevent the archive from framing them for generating a false or bad buildinfo file, developers MUST publish their own signed-buildinfos to one or several sig-repos. Developers must do this directly from their own machines, rather than relying on the archive to forward it, since a malicious archive could simply drop it. It is true that nobody can forge the developer's signature on a buildinfo file, but if there is no signed version in public distribution, then there is no way to distinguish a legitimate file from a fake one.

We do not yet attempt to define what sort of logic non-building clients should perform in order to classify a binary as "safe" or "unsafe". This (a) does not affect the rest of our system, and (b) is a hard problem that would require more real-world data and research. The strictness of the policy will depend on the user's security needs.

  1. "One" means in the sense of who controls the keys; there might be a CDN or mirror network that duplicates the contents.