Differences between revisions 8 and 9
Revision 8 as of 2015-01-06 00:18:07
Size: 10681
Editor: HolgerLevsen
Comment: explain how i'm not a bottleneck here ;-) (and how this could be fixed)
Revision 9 as of 2015-01-06 00:22:45
Size: 10786
Editor: Lunar
Comment: reword task suggestions
Line 10: Line 10:
= Useful things you (yes, you!) can do = = Task suggestions =
Line 12: Line 12:
 * If you maintain a package for debian, you can make sure that your package uses a modern debhelper style (e.g. one-liner `debian/rules` with overrides as needed). We aim to fix many causes of non-deterministic builds in the debhelper suite directly, so packages that use debhelper will be much easier to make reproducible with just an upgrade of the toolchain.
 * Look at the last 24h of results from [[https://reproducible.debian.net/userContent/index_last_24h.html|Jenkins reproducible jobs]], pick a package, look at the `debbindiff` output and investigate.
 * Find a way to prevent javadoc from writing timestamps.
 * Find a way to prevent Epydoc from writing timestamps and output links in filesystem order.
 * Find a way to get reproducible PE binaries.
 * If you maintain a package for Debian, you can make sure that your package uses a [[http://anonscm.debian.org/cgit/debhelper/debhelper.git/tree/dh#n77|modern debhelper style]] (e.g. one-liner `debian/rules` with overrides as needed). We aim to fix many causes of non-deterministic builds in the debhelper suite directly, so packages that use debhelper will be much easier to make reproducible with just an upgrade of the toolchain.
 * [[#Inventorying_issues|Inventory issues]] found by the continuous integration platform.
 * [[#Fixing_issues|Fix known reproducibility issues]]. Bonus point for finding ways to:
   * prevent javadoc from writing timestamps;
   * prevent Epydoc from writing timestamps and get links in a stable order;
   * prevent maven from writing timestamps;
   * prevent timestamps in PHP registry files;
   * prevent yard from writing timestamps;
   * get reproducible PE binaries.
Line 18: Line 22:
 * Research about other distributions: NixOS, SUSE (see [[https://build.opensuse.org/package/show/openSUSE:Factory/build-compare|build-compare]]), then write about it on your blog and link to it on this wiki page.
 * make the `perl` package build entirely without calling perl at all
 * Research about other distributions: NixOS, SUSE (see [[https://build.opensuse.org/package/show/openSUSE:Factory/build-compare|build-compare]]), then write about it on a blog, this wiki or the mailing list.
Line 21: Line 24:
If you want to help with this, feel free to ping the mailing list or edit this wiki page. To get help, feel free to ask on the IRC channel or the mailing list. We want to be friendly, supportive, and have fun experimenting together.

Stay in touch

Task suggestions

  • If you maintain a package for Debian, you can make sure that your package uses a modern debhelper style (e.g. one-liner debian/rules with overrides as needed). We aim to fix many causes of non-deterministic builds in the debhelper suite directly, so packages that use debhelper will be much easier to make reproducible with just an upgrade of the toolchain.

  • Inventory issues found by the continuous integration platform.

  • Fix known reproducibility issues. Bonus points for finding ways to:

    • prevent javadoc from writing timestamps;
    • prevent Epydoc from writing timestamps and get links in a stable order;
    • prevent maven from writing timestamps;
    • prevent timestamps in PHP registry files;
    • prevent yard from writing timestamps;
    • get reproducible PE binaries.
  • Create a patch for pbuilder to build packages in /usr/src/debian/hello-2.8-1 instead of /tmp/buildd.

  • Research how other distributions approach reproducible builds: NixOS, SUSE (see build-compare), then write about it on a blog, this wiki or the mailing list.

To get help, feel free to ask on the IRC channel or the mailing list. We want to be friendly, supportive, and have fun experimenting together.

How to report bugs

All bugs relevant to the reproducible builds project should use usertags with user reproducible-builds@lists.alioth.debian.org. Also use X-Debbugs-Cc to notify the list.

Current usertags in use:

toolchain
    affects a tool used by other package build systems
infrastructure
    affects the whole Debian infrastructure or policies
timestamps
    time of build is recorded during the build process
fileordering
    build output varies with readdir() order
buildpath
    path of sources is recorded during the build process
username
    username is recorded during the build process
hostname
    hostname is recorded during the build process
uname
    uname output is recorded during the build process
randomness
    some build aspects are dependent on (pseudo-)randomness
cpu
    some build aspects are dependent on CPU features or computation speed
buildinfo
    issues related to .buildinfo control files
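
To make the timestamps and fileordering categories more concrete, here is a minimal Python sketch of an archive step that avoids both problems. It is an illustration only and not taken from any Debian tool: the function name deterministic_tar and the choice of a fixed mtime of 0 are assumptions made for this example. Entries are added in sorted order instead of readdir() order, and the modification time and owner recorded in each entry are replaced with fixed values.

```python
import io
import os
import tarfile


def deterministic_tar(src_dir, mtime=0):
    """Return the bytes of a tar archive of src_dir that does not depend on
    directory listing order, file mtimes, or the owner of the files.
    Hypothetical helper for illustration; file modes are left untouched."""

    def normalize(info):
        # Replace values that would otherwise leak the build environment.
        info.mtime = mtime
        info.uid = info.gid = 0
        info.uname = info.gname = ""
        return info

    # Collect every path relative to src_dir, then sort so the order is stable.
    paths = []
    for root, dirs, files in os.walk(src_dir):
        for name in dirs + files:
            paths.append(os.path.relpath(os.path.join(root, name), src_dir))

    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for rel in sorted(paths):
            tar.add(os.path.join(src_dir, rel), arcname=rel,
                    recursive=False, filter=normalize)
    return buf.getvalue()
```

Two source trees with the same contents but different file creation order and timestamps should then produce byte-identical archives.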

Control commands to update the view on the BTS.

Example email to submit a patch:

From: J. Random Hacker <jrhacker@example.org>
To: submit@bugs.debian.org
Subject: foo: please make the build reproducible
X-Debbugs-Cc: reproducible-builds@lists.alioth.debian.org

Source: foo
Version: 1.0-1
Severity: wishlist
Tags: patch
User: reproducible-builds@lists.alioth.debian.org
Usertags: timestamps fileordering

Hi!

While working on the “reproducible builds” effort [1], we have noticed
that foo could not be built reproducibly.

The attached patch removes extra timestamps from the build system and
ensures a stable file order when creating the source archive. Once applied,
foo can be built reproducibly in our current experimental framework.

 [1]: https://wiki.debian.org/ReproducibleBuilds
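
The structure of the example email above (mail headers, then BTS pseudo-headers, then free text) can be sketched in Python with the standard email package. This is only an illustration: build_report and KNOWN_USERTAGS are names invented for this sketch, the tag set is copied from the list on this page, and attaching the patch and actually sending the message are left out.

```python
from email.message import EmailMessage

LIST_ADDRESS = "reproducible-builds@lists.alioth.debian.org"

# Usertags listed on this page; anything else is rejected by this sketch.
KNOWN_USERTAGS = {
    "toolchain", "infrastructure", "timestamps", "fileordering", "buildpath",
    "username", "hostname", "uname", "randomness", "cpu", "buildinfo",
}


def build_report(sender, source, version, usertags, text):
    """Build (but do not send) a bug report shaped like the example email.
    Hypothetical helper for illustration only."""
    unknown = sorted(set(usertags) - KNOWN_USERTAGS)
    if unknown:
        raise ValueError("unknown usertags: " + ", ".join(unknown))

    # Pseudo-headers go at the top of the body, before the free text.
    pseudo_headers = [
        f"Source: {source}",
        f"Version: {version}",
        "Severity: wishlist",
        "Tags: patch",
        f"User: {LIST_ADDRESS}",
        "Usertags: " + " ".join(usertags),
    ]

    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = "submit@bugs.debian.org"
    msg["Subject"] = f"{source}: please make the build reproducible"
    msg["X-Debbugs-Cc"] = LIST_ADDRESS
    msg.set_content("\n".join(pseudo_headers) + "\n\n" + text + "\n")
    return msg
```

Rejecting unknown tags early is a design choice of this sketch, meant to catch typos before a report reaches the BTS.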

Inventorying issues

The easiest way to find issues is to examine the list of packages failing to build reproducibly as found by continuous integration. The first packages in the list are the ones that have been tested most recently.

Notes about packages are kept in the notes Git repository in packages.yml. The list of known common issues is kept in the issues.yml file.

The page for a given package should open on the debbindiff output. Read the list of known issues to get an idea of what you may find. Here is some more advice:

The clean-notes script in the misc repository will detect outdated notes and re-order packages by alphabetical order. It should be run before committing changes to the notes repository.

Fixing issues

Fixing reproducibility issues falls into two categories: either the problem is specific to a single package, or the cause is the output of another package (then referred to as a “toolchain” package).

Fixing a single package

The usual steps are:

  1. Use debcheckout or apt-get source to retrieve the source code.

  2. Make the changes. With packages using the 3.0 (quilt) format, dpkg-source --commit can be useful.

  3. Update debian/changelog. The new version is usually the original version with .0~reproducible1 appended.

  4. Use dpkg-buildpackage -S to create the source package.

  5. Use the prebuilder script to test reproducibility. If the package is not reproducible, examine the debbindiff output in logs/PACKAGE.debbindiff.html or compare the build logs logs/PACKAGE.build1 and logs/PACKAGE.build2, then repeat from step 2 unless the issue comes from another package. In that case, see the section on “toolchain” packages below.

  6. Use debdiff or git format-patch to create patches.

  7. Create a new bug report, and don't forget to attach the patch!

  8. Add an entry or reference the bug in packages.yml in notes.git.
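
The version convention from step 3 can be expressed as a small Python helper. This is a sketch under assumptions: the function name reproducible_version is made up for this example, and bumping the trailing number when the suffix is already present is a guess at a sensible behaviour, not something this page specifies.

```python
import re

# Matches a version that already carries the ".0~reproducibleN" suffix.
_SUFFIX = re.compile(r"^(?P<base>.*)\.0~reproducible(?P<n>\d+)$")


def reproducible_version(version):
    """Return the changelog version for a test build.
    Appends ".0~reproducible1" to the original version; if the suffix is
    already there, increments its number (an assumption of this sketch)."""
    match = _SUFFIX.match(version)
    if match:
        return f"{match.group('base')}.0~reproducible{int(match.group('n')) + 1}"
    return f"{version}.0~reproducible1"
```

The tilde matters in Debian version ordering: a version with a ~ suffix sorts before the same version without it, so a suffix of this form should keep the test version below the next maintainer upload.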

Fixing a toolchain package

Fixing an issue in a package that affects the reproducibility of other packages requires a few more steps, but the general process is the same:

  1. Use debcheckout or apt-get source to retrieve the source code.

  2. Make the changes. With packages using the 3.0 (quilt) format, dpkg-source --commit can be useful.

  3. Update debian/changelog. The new version is usually the original version with .0~reproducible1 appended.

  4. Use pdebuild or gbp buildpackage to build the package.

  5. Backup base-reproducible.tgz.

  6. Use pbuilder --login --save-after-exec --basetgz base-reproducible.tgz to install the newly built package.

  7. Test an affected package with prebuilder. If the issue is still not fixed, repeat from step 2.

  8. If the package is in Git, use SSH to log in to alioth.debian.org. Go to /git/reproducible. Use ./setup-repository to create a new repository. Push your changes to a (rebasable) pu/reproducible_builds branch.

  9. Subscribe to the upload-source notification for the package on the Package Tracking System. This is needed so you don't forget to update the custom package when a new version hits the archive.

  10. Upload the package to the reproducible APT repository.

  11. Document the changes on the wiki.

  12. Reference the bug in issues.yml in notes.git and on the wiki page about the issue if there's one.

  13. Ask h01ger to reschedule affected packages. (Usually via the #debian-reproducible IRC channel; sometimes he might ask for emails instead. In theory we could also allow GPG-signed emails to trigger that, but so far manual rescheduling on demand has proven to work well.)

Working on the continuous integration platform

Several jobs have been created to regularly test packages (from sid main) on jenkins.debian.net. The results are published in the reproducible build overview of packages.

The setup is currently explained only in this blog post, which is somewhat outdated by now and needs to be amended.

See the various reproducible_* scripts in the Jenkins Git repository.

Bash script to analyze images

Deterministic images (raw images, qcow2 images, ISOs) are the next logical evolution. There is an analyze_image bash script that creates sha512 hashes of all files included within an image and records access rights, symlinks, partition table, bootloader and more. Running it on two images that should match and comparing the reports it creates can help to identify sources of non-determinism in images.


The script does not have ISO support yet. The author (Patrick Schleizer) is interested in generalizing the script for more generic Debian use cases.
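
The report-and-compare idea can be sketched in Python as follows. This is not the analyze_image script and does not reproduce its output format: it assumes the image contents are already available as a directory tree (for example a mounted image), covers only file hashes, access rights and symlink targets, and ignores the partition table and bootloader. The names tree_report and diff_reports are invented for this sketch.

```python
import hashlib
import os
import stat


def _sha512(path):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tree_report(root):
    """Map each path below root to (access rights, detail), where detail is
    the sha512 for regular files and the target for symlinks."""
    report = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat: describe symlinks, do not follow them
            if stat.S_ISLNK(st.st_mode):
                detail = "-> " + os.readlink(path)
            elif stat.S_ISREG(st.st_mode):
                detail = _sha512(path)
            else:
                detail = ""
            report[os.path.relpath(path, root)] = (stat.filemode(st.st_mode), detail)
    return report


def diff_reports(first, second):
    """Return the sorted paths that are missing on one side or differ."""
    return sorted(p for p in first.keys() | second.keys()
                  if first.get(p) != second.get(p))
```

Calling diff_reports(tree_report(a), tree_report(b)) on two trees that should match gives an empty list when they agree, and otherwise the paths worth investigating.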