Differences between revisions 4 and 5
Revision 4 as of 2016-10-10 14:14:57
Size: 3827
Editor: Infinity0
Comment: add a diagram
Revision 5 as of 2016-10-10 16:13:56
Size: 5804
Editor: Infinity0
Comment: add more details


Discussion with ftpmasters is happening in Debian bug #763822.

Overview

The characters in our story:

Servers:

  • One [1] archive. This accepts signed-buildinfo files from developers, and publishes a small collection of re-signed-buildinfo files to rebuilders and non-building clients.

  • Several sig-repos ("signed-buildinfo repository"). These accept signed-buildinfo files from developers and rebuilders, and publish all of those that they accept to clients. They are very similar to PGP keyservers, but instead of holding signatures-on-keys (and keys), they hold signed-buildinfo files. Eventually these could submit their contents to, or be, one or more transparency logs.

Clients:

  • Many developers, who generate signed-buildinfo files and push them to an archive and the sig-repos.

  • Many rebuilders. These pull data from the archive, attempt to reproduce the builds, then generate signed-buildinfo files and push them to the sig-repos (cf. the workflow of developers). Some might be continuous integration services, some might be run manually.

  • Many non-building clients. These pull data from the archive and the sig-repos, to gain confidence that what they install is reproducible, without rebuilding it themselves.

Guide to verbs:

  • Push-servers accept things from clients that push to them.

  • Pull-servers publish things to clients that pull from them.

Figure: actors.png (source: actors.dot)

Details

Archive

Initially, we plan to collect all buildinfo files for a given architecture into one Buildinfos-$arch.xz file. This is an easy approach that the archive mirror network can cope with. Later on, we might divide this up so that one can get the data in a more fine-grained manner, but the initial proposal seems able to satisfy our other goals without too much overhead.

The main concern here is that any rebuilder who wants to rebuild one binary package will need to download the whole Buildinfos file corresponding to their architecture (or "all" if they are building an arch:all package). We have made some measurements: the xz-compressed file is about 9MB for each architecture (about 250MB uncompressed, about 20MB with gzip). So it is not too much of a burden - it should take less than a minute to download on an average modern internet connection.
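The download-time claim can be checked with simple arithmetic. The sketch below uses the three sizes quoted above; the 10 Mbit/s link speed is an assumption chosen only for illustration, and the calculation ignores protocol overhead and latency.

```python
# Sizes reported above for one architecture's Buildinfos file, in megabytes.
SIZES_MB = {"xz": 9, "gzip": 20, "uncompressed": 250}


def download_seconds(size_mb, link_mbit_per_s):
    """Idealised transfer time: megabytes -> megabits, divided by link speed."""
    return size_mb * 8 / link_mbit_per_s


def min_link_mbit_per_s(size_mb, max_seconds):
    """Slowest link that still finishes the download within max_seconds."""
    return size_mb * 8 / max_seconds


if __name__ == "__main__":
    assumed_link = 10  # Mbit/s; hypothetical value, not a measurement
    for name, size in SIZES_MB.items():
        print(f"{name}: {download_seconds(size, assumed_link):.1f} s")
    print(f"link needed for xz in 60 s: {min_link_mbit_per_s(9, 60):.1f} Mbit/s")
```

Under these assumptions the 9MB xz file takes roughly 7 seconds, and the "less than a minute" estimate holds for any link faster than about 1.2 Mbit/s.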

If we collect all architectures into one file, this download would be much larger, and would likely discourage typical users from performing rebuilds. In the other direction, collecting these into a per-source-package Buildinfos-$src-$ver.xz might put extra resource strain on the mirror network due to the large number of files that rsync must stat. There is a good chance that the mirror network will be able to cope with this, but it requires more discussion between different teams, so for now we've chosen to leave this for future work: 9MB seems small enough.

TODO: import dkg's suggested "validation" steps

Signed-buildinfo repository

TBD

Adapt a keyserver?

Developer

TBD

dput should be patched to upload a buildinfo file to a sig-repo.

Rebuilders

TBD

This functionality could be added to reprotest, e.g. as a background service.

Non-building clients

TBD

One could configure apt to contact sig-repos and ask for a minimum number of independent signatures before installing packages.
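A minimal sketch of such a threshold check follows. It is not an apt feature or API: the function names and the `(signer_id, binary_hash)` data shape are hypothetical, signatures are assumed to have been verified already, and "independent" is approximated as "distinct signer identity", which is a simplification.

```python
def agreeing_signers(attestations, archive_hash):
    """Return the set of distinct signers whose buildinfo reports archive_hash.

    attestations: iterable of (signer_id, binary_hash) pairs, assumed to come
    from buildinfo files whose signatures were already verified.
    """
    return {signer for signer, binary_hash in attestations
            if binary_hash == archive_hash}


def may_install(attestations, archive_hash, min_signers):
    """Illustrative policy: require at least min_signers distinct signers."""
    return len(agreeing_signers(attestations, archive_hash)) >= min_signers


if __name__ == "__main__":
    seen = [
        ("alice", "h1"),
        ("bob", "h1"),
        ("alice", "h1"),   # a repeated signer is counted once
        ("carol", "h2"),   # a rebuild that did not reproduce the binary
    ]
    print(may_install(seen, "h1", 2))
    print(may_install(seen, "h1", 3))
```

This only counts agreement. How to treat contradicting signatures, such as the last entry in the example, is a policy question that this page deliberately leaves open.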

Security considerations

Due to resource constraints - i.e. a design optimised for binary distribution but not signature distribution - archives are not expected to store buildinfo signatures by the original developer. Instead, they can re-sign a whole batch of buildinfo files at once, after doing basic sanity checks on them - e.g. to check that the developer isn't lying about them - and publish this. One may raise the point that this batch is redundant given the sig-repos, but actually it can help to avoid some attacks:

  • Fake buildinfo files presented as "from the archive". Yes, these would not be signed by the archive - but if the archive does not officially publish a signed version, there is no way to distinguish a legitimate one vs a fake one.

  • Sig-repos getting DoSed by buildinfo files for junk. They may instead filter only for buildinfo files that build a source package that was actually published by the archive. Of course, they MUST accept (subject to self-protection vs DoS) buildinfo files with binary hashes that contradict what the archive said - that is the whole point of reproducible-builds.
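The filtering rule in the second point can be sketched as follows. The `Buildinfo` record and `accept_submission` function are hypothetical names used for illustration, and the real buildinfo format carries more fields. The key property is that the decision depends only on whether the archive published the source package, never on the reported binary hashes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Buildinfo:
    """Hypothetical, heavily reduced view of a signed-buildinfo file."""
    source: str
    version: str
    binary_hashes: tuple = ()


def accept_submission(buildinfo, published_sources):
    """Accept only buildinfo files for sources the archive actually published.

    published_sources: set of (source, version) pairs taken from the archive's
    re-signed batch. binary_hashes is deliberately ignored, so results that
    contradict the archive are still accepted.
    """
    return (buildinfo.source, buildinfo.version) in published_sources


if __name__ == "__main__":
    published = {("hello", "2.10-1")}
    contradicting = Buildinfo("hello", "2.10-1", ("different-hash",))
    junk = Buildinfo("no-such-package", "0.0", ("whatever",))
    print(accept_submission(contradicting, published))
    print(accept_submission(junk, published))
```

Rate limiting and other self-protection against DoS would sit in front of this check and are not shown.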

To prevent the archive framing them for generating a false or bad buildinfo file, developers MUST publish their own signed-buildinfos to one or more sig-repos. Developers must do this directly from their own machines, rather than relying on the archive to forward this - since the archive could just drop it, if it is being malicious. Again, yes, nobody can forge the developer's signature on a buildinfo file, but if there is no signed version in public distribution, then there is no way to distinguish a legitimate one vs a fake one.

We do not yet attempt to define what sort of logic non-building clients should perform, in order to classify a "safe" vs an "unsafe" binary. This (a) does not affect the rest of our system, and (b) is a hard problem to solve, and would require more real-world data and research. The strictness of the policy will depend on the user's security needs.

  1. "One" means in the sense of who controls the keys; there might be a CDN or mirror network that duplicates the contents.