Instead of having this list hang on my fridge, I'll maintain my personal Debian TODO list here.
have a service like codesearch but index the built source package directories. This makes it possible to search the files generated during the build process. For example, the glib/gobject function g_signal_list_ids returns an array in random order (relevant for ReproducibleBuilds), but its invocations are generated automatically when valac turns Vala code into C, which makes it impossible to find all affected packages by searching the unbuilt source.
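A minimal sketch of the kind of search this would enable, assuming a built source tree is available on disk. The function name `find_identifier` and the default suffix list are made up for illustration; a real service would use a proper index instead of walking the tree.

```python
import os


def find_identifier(root, identifier, suffixes=(".c", ".h")):
    """Return sorted (relative path, line number) pairs mentioning identifier.

    Walks a built source tree, so files generated during the build
    (for example C code emitted by valac) are searched as well.
    """
    needle = identifier.encode()
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(suffixes):
                continue
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                for lineno, line in enumerate(fh, 1):
                    if needle in line:
                        hits.append((os.path.relpath(path, root), lineno))
    return sorted(hits)
```

Calling `find_identifier("/path/to/built/tree", "g_signal_list_ids")` would list the generated C files that call the function.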
- have a tool like apt-file but let it search the files the binary package creates *after* installation. This is useful for tools which use maintainer scripts to create symlinks or files.
extend binarycontrol.d.n with contrib and non-free and hopefully merge it into codesearch.d.n at some point
- extend jenkins bin/reproducible_create_meta_pkg_sets.sh to use more ceve once a new dose3 version with support for source packages arrives in unstable
extend jenkins bin/find_dpkg_trigger_cycles.sh with a heuristic based on maintainer scripts and then test whether a cycle actually exists, as documented in 774803. This will fix problems like the one in 778695
add a jenkins test checking the validity of Debian architecture wildcards: https://github.com/josch/findarchwildcardproblems
check for unused build dependencies: https://github.com/josch/findunusedbd
- in the process, check for writes outside the build directory (like in $HOME)
- in the process, check for source packages with a non-functional clean target, meaning that the source package cannot be rebuilt in the same tree because of dpkg-source errors
- in the process, check whether the source package can be built without network access
- publish the generated file access statistics generated by fatrace for possible other uses
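The check for writes outside the build directory could look roughly like the sketch below. It assumes fatrace log lines of the form `name(pid): EVENTS /path`, where the event letters include `W` for writes. The function name `writes_outside` is made up for illustration and is not part of findunusedbd.

```python
import re
from pathlib import PurePosixPath

# Assumed line shape: "name(pid): EVENTS /path", for example "cc1(101): CW /build/x.o"
FATRACE_LINE = re.compile(
    r"^(?P<proc>.+)\((?P<pid>\d+)\): (?P<events>[A-Z<>+]+) (?P<path>/.*)$"
)


def writes_outside(log_lines, build_dir):
    """Return sorted (process, path) pairs for writes outside build_dir."""
    build_dir = PurePosixPath(build_dir)
    offenders = set()
    for line in log_lines:
        match = FATRACE_LINE.match(line.strip())
        if match is None or "W" not in match.group("events"):
            continue  # not a parsable line, or not a write event
        path = PurePosixPath(match.group("path"))
        if path != build_dir and build_dir not in path.parents:
            offenders.add((match.group("proc"), str(path)))
    return sorted(offenders)
```

The same parsed records could be published as the file access statistics mentioned above.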
package lego-parts-generator: https://github.com/josch/lego-parts-generator
package img2pdf: https://github.com/josch/img2pdf
- a way to map input sources to created binaries
- benefits: copyright, embedded code copies stuff, GPL compliance verification
- can be solved either by running all compilation through wrapper scripts that do the logging or by having "structured build logs"
- structured build logs
- a structured build log would for each "line" know the originating process
- a structured build log from a parallel build could be linearized retroactively, because the process tree is known
- a structured build log could also mention which files a process used and which it produced
- ptrace and LD_PRELOAD do not work well because they break test suites
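To make the structured build log idea concrete, here is a minimal sketch of what a record and a retroactive linearization could look like. The record fields and the traversal policy (all lines of a process first, then its children in order of first appearance) are assumptions for illustration, not an existing format.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class LogRecord:
    seq: int    # arrival order in the interleaved (parallel) log
    pid: int    # process that wrote the line
    ppid: int   # its parent process
    text: str
    inputs: list = field(default_factory=list)   # files the process used
    outputs: list = field(default_factory=list)  # files the process produced


def linearize(records):
    """Reorder an interleaved log so each process' lines are contiguous."""
    by_pid = defaultdict(list)
    parent = {}
    for rec in sorted(records, key=lambda r: r.seq):
        by_pid[rec.pid].append(rec)
        parent.setdefault(rec.pid, rec.ppid)
    children = defaultdict(list)
    roots = []
    for pid in by_pid:  # dict order is the order of first appearance
        if parent[pid] in by_pid:
            children[parent[pid]].append(pid)
        else:
            roots.append(pid)
    ordered = []

    def visit(pid):
        ordered.extend(by_pid[pid])
        for child in children[pid]:
            visit(child)

    for root in roots:
        visit(root)
    return ordered
```

Because every record carries its inputs and outputs, the same data could answer the question of which sources ended up in which binary.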
- check dose-ceve support for who_provides and source packages and poke people for a release or otherwise a git-version upload to experimental
- fix botch-build-order-from-zero but this needs dose-ceve support for who_provides and sources files (only in git right now)
- TODOs from Helmut:
- rename to something that does not include "order", because it is not about ordering at all
- desired API:
- takes a build architecture (implicit?), host architecture, binary package lists (for build and host), source package list, a set of binary (or source?) package names that are to be cross-built
- returns a set of source package names that need to be cross-built in order to obtain the requested set
- ideally, it has close to no false positives
- ideally, it somehow facilitates improving the input data, as the archive currently either does not allow build profiles and :native or is just missing M-A annotations
- allow specifying a custom source package for cross satisfiability analysis instead of defaulting to build-essential
- TODOs from Helmut:
- should be packaged for Debian
- should retry only X times instead of retrying indefinitely until somebody notices and lets a buildd admin set the package to failed
- if a buildd reboots or halts, then the state keeps being "building" even
- in case of reboot, if a buildd instance does another build then the prior build has to be cleaned
- in case of a hung buildd, the admin should be notified after a certain time of inactivity
- I somehow cannot get debugedit to work in the first place so no way to fix its use for reproducible builds
- generate cross.html for more arches, particularly multilib arches (armhf is not multilib) and then have an overview page showing a quick summary of packages failing on all arches and those that only fail on some
- mk-build-deps should use the binary package building capabilities of sbuild and in turn sbuild should offer these as a library
- allow to make use of all sbuild features automatically (build profiles, cross building, resolution of disjunctions, using dpkg for parsing)
- less duplication of dependency parsing code
- extend botch-cross such that cross toolchain translation exists (gcc-for-host) and then sed all gcc, gcc-X.Y, gcc-multilib to gcc-for-host (same for g++, gfortran) and pretend the availability of gcc-for-host:host and also tell it that libc6-dev:host exists already
extend debversioncomp https://gitlab.mister-muffin.de/josch/debversioncomp with tests for libdpkg-perl and libben
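Such tests need a reference ordering to compare the implementations against. The sketch below is a plain Python rendering of the version comparison algorithm described in Debian Policy. It is written for illustration only and is not the code of debversioncomp, libdpkg-perl or libben. All function names are made up.

```python
from functools import cmp_to_key


def _at(s, i):
    return s[i] if i < len(s) else ""


def _isdigit(ch):
    return ch != "" and ch in "0123456789"


def _order(ch):
    # '~' sorts before everything (even the end of the string),
    # letters sort before all other non-digit characters.
    if ch == "" or _isdigit(ch):
        return 0
    if ch == "~":
        return -1
    if ch.isalpha():
        return ord(ch)
    return ord(ch) + 256


def _cmp_part(a, b):
    i = j = 0
    while i < len(a) or j < len(b):
        first_diff = 0
        # compare the non-digit prefixes character by character
        while (_at(a, i) and not _isdigit(_at(a, i))) or (
            _at(b, j) and not _isdigit(_at(b, j))
        ):
            ac, bc = _order(_at(a, i)), _order(_at(b, j))
            if ac != bc:
                return ac - bc
            i, j = i + 1, j + 1
        # compare the following digit runs numerically
        while _at(a, i) == "0":
            i += 1
        while _at(b, j) == "0":
            j += 1
        while _isdigit(_at(a, i)) and _isdigit(_at(b, j)):
            if not first_diff:
                first_diff = ord(a[i]) - ord(b[j])
            i, j = i + 1, j + 1
        if _isdigit(_at(a, i)):
            return 1
        if _isdigit(_at(b, j)):
            return -1
        if first_diff:
            return first_diff
    return 0


def parse(version):
    """Split into (epoch, upstream version, Debian revision)."""
    epoch, sep, rest = version.partition(":")
    if not sep:
        epoch, rest = "0", version
    upstream, sep, revision = rest.rpartition("-")
    if not sep:
        upstream, revision = rest, ""
    return int(epoch), upstream, revision


def compare(a, b):
    """Return -1, 0 or 1 like a classic three-way comparison."""
    (ea, ua, ra), (eb, ub, rb) = parse(a), parse(b)
    if ea != eb:
        return -1 if ea < eb else 1
    for x, y in ((ua, ub), (ra, rb)):
        result = _cmp_part(x, y)
        if result:
            return -1 if result < 0 else 1
    return 0
```

A shared table of version pairs with their expected result could then be run against every implementation, with `compare` (or `sorted(versions, key=cmp_to_key(compare))`) as the reference.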
- some cycles have a source package (src:cups-filters, src:ijs) multiple times in a cycle. This should not happen. Look at the output of botch-stat-html or the FIXME comment in stat-html.py.
- add statistics about usage of build profiles in Debian source packages to bootstrap.debian.net
- binary packages should be reproducible across different build profiles, but currently they cannot be because the Build-Profiles field varies with the enabled profiles. Removing the Build-Profiles field for build profiles other than stage1 and nodoc (which allow changing binary package contents) might not be ideal, because the information that a deb was produced by crossing it can be valuable for debugging problems. Other information that can be valuable to include in a deb but varies between builds is the build time or embedded signatures. Two possible solutions:
- add a third member to the deb after data.tar which contains all the non-reproducible data. Downside: checking for reproducibility requires parsing the deb and is thus less easy than just calculating the hash sum. Upside: embeddable signatures, build time and build profile info (idea of Helmut)
- add a web service shipping the non-reproducible data when given the hash sum of a deb. Or embed this data in the Packages file. Downside: no embedded signatures, requires more than just the deb itself to get to the information. Upside: trivial to check reproducibility
- in bootstrap.d.n enable profiles nocheck and nodoc during native and in addition, during cross enable cross profile for analysis
- cross.html is supposed to show the list of remaining crossbuild satisfiability problems for source packages that have to be crossed. Ideally, that set of source packages would be exactly the set required to crossbuild the coinstallation set of build-essential, but that would require all these source packages to be cross satisfiable, as otherwise a dependency graph cannot be generated. This is a chicken-and-egg problem. As a compromise, cross.html shows the cross satisfiability problems for the strong transitive essential set. This is not optimal because many of the source packages selected by this method never have to be crossed, either because some build dependencies can be dropped by the cross, nodoc or nocheck profile, or because the binary package they build is M-A:foreign and thus available during cross building. The situation escalated with the upload of ffms2 2.21-2 to unstable in May 2015. That version introduced a build dependency on pandoc, which in turn pulls in half the Haskell world and increased the size of the largest SCC by about 400 source packages, or 29% compared to before. Even if ffms2 had to be compiled during the cross phase, it would never cross-build depend on pandoc because that can be made M-A:foreign. So those extra 400 source packages show up in cross.html for no reason. Here is a possible workaround idea from Helmut: when generating the graph which is used to compute the transitive essential set, whenever an installation set is generated, first try to create this set for the cross case and fall back to computing it for the native case if it cannot be computed for the cross case.
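Helmut's workaround boils down to a simple fallback rule, sketched below. The two solver callables are hypothetical stand-ins for whatever computes an installation set (for example a dose3 invocation). They are assumed to return a set of packages, or None if no installation set exists.

```python
def installation_set(source, solve_cross, solve_native):
    """Prefer the cross installation set, fall back to the native one.

    Returns a (mode, packages) pair where mode is "cross", "native"
    or "unsatisfiable".
    """
    cross = solve_cross(source)
    if cross is not None:
        return "cross", cross
    native = solve_native(source)
    if native is not None:
        return "native", native
    return "unsatisfiable", frozenset()
```

Recording the mode alongside the set keeps it visible which parts of the resulting graph still rely on the native fallback.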
potential TODO from Helmut for cross.html: Some packages are considered ok by dose-builddebcheck even though they really are not. A typical case is picking the wrong arch python. It would be interesting to see a list of cases where satisfiable packages use the host arch version of an M-A:allowed package as a fair amount of these should be annotated with :any or :native.
aptitude on the buildds sometimes consumes all memory (which can take a long time with 32GB of RAM) and currently has to be killed with this script: http://anonscm.debian.org/cgit/mirror/dsa-puppet.git/tree/modules/buildd/files/buildd-schroot-aptitude-kill It would be great if sbuild could do the killing instead
- wanna-build does not preprocess Build-Depends in the same way sbuild does before giving them to dose3. Most notably it does not remove alternatives
set up a jenkins job doing diverse double compilation
- compile gcc with all compilers in Debian that can (probably currently only gcc and clang)
- use the resulting compilers to compile gcc again with each gcc compiler package
- the resulting gcc packages must be exactly equal (minus timestamp issues etc)
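The comparison step of such a job can be sketched as follows. The `build` callable is a hypothetical stand-in for "build the gcc source package with this compiler" and is assumed to return the resulting artifact as bytes. Normalizing timestamps and similar noise is assumed to happen inside it.

```python
import hashlib


def ddc_check(bootstrap_compilers, source, build):
    """Build gcc twice per bootstrap compiler and compare the results.

    Stage 1: gcc compiled by each bootstrap compiler.
    Stage 2: gcc compiled again by each stage-1 gcc.
    Returns (all_equal, digests) for the stage-2 artifacts.
    """
    digests = {}
    for name in bootstrap_compilers:
        stage1 = build(name, source)
        stage2 = build(stage1, source)
        digests[name] = hashlib.sha256(stage2).hexdigest()
    return len(set(digests.values())) <= 1, digests
```

If the stage-2 digests differ, the returned mapping shows which bootstrap compiler led to the diverging result.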
- choosing the right stages to compile source packages in rebootstrap:
18:17 < helmut> josch: so now that you have Packages files, let me give you wishlist examples.
18:20 < helmut> josch: the ones to look into are those with a stage (for easier search: all of them call cross_build_setup explicitly)
18:20 < helmut> josch: some of those have strange hacks such as cyrus-sasl2 selecting stages via DEB_BUILD_OPTIONS.
18:22 < helmut> josch: the ones we are interested in on the other hand are those where build profiles work as intended. e.g. systemd, util-linux, openldap, libidn.
18:23 < helmut> josch: and for util-linux we actually need two builds (stage1 and non-stage1), but only the former is being run atm.
18:25 < helmut> for now we can encode the info that util-linux needs a staged build explicitly, but in the long run, I'd like a tool to figure out whether the stage is needed.
- to minimize dose3 runtime in rebootstrap, turn build architecture Packages/Sources from the archive into a reduced universe containing the bd-transitive-essential closure without architecture:all packages. This set has to be updated every time apt-get update is run.
submit a patch for 757760 to document build profiles in policy
allow a way to introduce "private build profiles". For example gcc-N allows building fewer languages via DEB_BUILD_OPTIONS=nolang=something and cyrus-sasl2 allows building less via DEB_BUILD_OPTIONS=no-something. Maybe some prefix on the profile name space could be designated for experimentation? -- HelmutGrohne
- a job finding providers of virtual packages that are never used by anybody
- Write an email about the state of dependency alternatives in build dependencies, with statistics about their use, making the case for using build profiles instead of just reducing dependency disjunctions to the first alternative:
- earlier attempts:
- removing all but the first alternative is unexpected (see mails on -devel):
- it is inconsistent with how runtime dependencies are handled
- it is only applied to direct dependencies but not to transitive ones
- stability of build dependencies is thrashed by virtual packages with multiple providers
- create statistics of packages that do not list a real package as the first alternative of a dependency on a virtual one and are still used for build depends
- even when the first alternative is the real package, nothing makes sure that nothing conflicts with the first alternative deep inside the dependency tree or that the package itself is temporarily uninstallable. In both cases, the second (virtual) alternative is chosen with a random provider.
if it were always possible to choose the real package from a dependency disjunction, then this graph would not have holes at "universe without disjunctions": https://bootstrap.debian.net/history.svg
- build depends alternatives are used for local builds or non-debian purposes so they could be marked with a profile
- solution: forbid dependency alternatives in build dependencies via a lintian check. Use build profile annotations to still get them
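The core of such a check is small. The sketch below is plain Python rather than an actual lintian check, and the function name is made up. It relies on commas separating the clauses of a Build-Depends field and on `|` separating the alternatives within a clause.

```python
def build_dep_alternatives(field_value):
    """Return the clauses of a Build-Depends value that contain alternatives."""
    clauses = [clause.strip() for clause in field_value.split(",")]
    return [clause for clause in clauses if "|" in clause]
```

Run over all Sources entries, the same function would also yield the usage statistics for the email.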
- Write wrappers for dose3 tools which make it easier to select the right Packages/Sources files, for example by using chdist