Differences between revisions 37 and 38
Revision 37 as of 2021-10-29 04:22:47
Size: 7156
Editor: PaulWise
Comment: formatting
Revision 38 as of 2022-02-25 08:40:21
Size: 7353
Editor: PaulWise
Comment: link to the new demo of the buildinfo approach

Intro

In general, Debian Policy allows static linking, but static linking has various downsides.

This page aims to document those downsides, the mitigations we have in place for them, and ideas for improving the situation in Debian around static linking.

Downsides

  • It requires rebuilding every package that statically links a library whenever that library changes.
  • It is harder to track than dynamic linking.
  • It prevents memory sharing between different executables using the same code.
  • It renders some security measures less effective (ASLR for example).
  • To comply with the DFSG and GNU GPL, we need to keep old source around.
  • It makes it possible to ship incomplete libraries, e.g. foo() depends on bar() but bar() is not present at link time.

Upsides

Affected

Various technologies in Debian use or are affected by static linking.

C libraries

C libraries support static linking. The static library files are named *.a and can be unpacked with the ar tool from binutils.
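To illustrate what a *.a file contains, here is a minimal sketch that lists the members of an ar archive without calling the ar tool. It assumes the GNU/System V archive layout used by binutils on Linux (BSD-style "#1/<length>" names are not handled), and the function name list_ar_members is made up for this example.

```python
AR_MAGIC = b"!<arch>\n"
HEADER_SIZE = 60  # name(16) mtime(12) uid(6) gid(6) mode(8) size(10) magic(2)


def list_ar_members(data: bytes) -> list[str]:
    """Return the member names of an ar archive (GNU/System V variant only)."""
    if not data.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    names = []
    long_names = b""  # contents of the GNU "//" long-name table, if present
    offset = len(AR_MAGIC)
    while offset + HEADER_SIZE <= len(data):
        header = data[offset:offset + HEADER_SIZE]
        if header[58:60] != b"`\n":
            raise ValueError(f"bad member header at offset {offset}")
        raw_name = header[0:16].rstrip(b" ")
        size = int(header[48:58].decode("ascii").strip())
        body_start = offset + HEADER_SIZE
        body = data[body_start:body_start + size]
        if raw_name == b"//":
            long_names = body  # table of names too long for the header
        elif raw_name in (b"/", b"/SYM64/"):
            pass  # symbol index, not an object file
        elif raw_name.startswith(b"/"):
            # "/<offset>" points into the long-name table; entries end in "/\n"
            start = int(raw_name[1:].decode("ascii"))
            end = long_names.index(b"/\n", start)
            names.append(long_names[start:end].decode())
        else:
            # Short names are stored with a trailing "/"
            names.append(raw_name.rstrip(b"/").decode())
        # Member data is padded to an even offset
        offset = body_start + size + (size % 2)
    return names
```

Usage would look like list_ar_members(open(path, "rb").read()), where path points at any static library installed by a -dev package.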

Packages can declare that they were built using code from other packages with the Built-Using header, and the Debian archive keeps the old sources around, marking them with the Extra-Source-Only header. Unfortunately, Debian Policy says that Built-Using may *only* be used for the purposes of DFSG/license compliance, so tracking static linking must be done using custom headers.
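As a rough illustration of what tooling has to read, the sketch below parses the value of a Built-Using field into (source, version) pairs. It assumes the usual "source (= version)" comma-separated syntax; the function name parse_built_using is hypothetical, and a custom tracking header is only assumed to reuse the same syntax.

```python
import re

# One entry looks like "grub2 (= 1.99-9)": a source package name and an
# exact version. Only the strict "=" relation is accepted here.
_ENTRY = re.compile(
    r"^\s*(?P<source>[a-z0-9][a-z0-9+.-]+)\s*\(\s*=\s*(?P<version>[^)\s]+)\s*\)\s*$"
)


def parse_built_using(value: str) -> list[tuple[str, str]]:
    """Split a Built-Using field value into (source, version) pairs."""
    entries = []
    for item in value.split(","):
        if not item.strip():
            continue  # tolerate a trailing comma
        match = _ENTRY.match(item)
        if match is None:
            raise ValueError(f"unexpected Built-Using entry: {item!r}")
        entries.append((match["source"], match["version"]))
    return entries
```

For example, parse_built_using("grub2 (= 1.99-9), loadlin (= 1.6e-1)") yields one pair per source package; the package names and versions here are only illustrative.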

Lintian detects binaries that have been statically linked.

Haskell

All Haskell libraries are statically linked into the final binary.

The release team have a transition that tracks Haskell rebuilds.

OCaml

All OCaml libraries are statically linked into the final binary.

The release team have a transition that tracks OCaml rebuilds.

Go

The go tool from golang currently requires all libraries be available in source form and then builds everything into one binary. These source files in the -dev packages are the equivalent of the .a file.

When using gccgo-5 (go -gccgo), the Go runtime library is dynamically linked into the executable; however, everything else is again linked "statically".

Michael Hudson-Doyle (mwhudson) tried building shared Go libraries, and support for dynamic linking (for amd64 only) was in Ubuntu for a while. This plan has now been abandoned: according to mwhudson, even micro releases of the Go compiler break the ABI, which makes dynamic linking far too tedious.

The golang tooling generates Built-Using headers for all (both direct and indirect) dependencies.

Lisp

Lisp libraries are cl-* packages shipping the source code in /usr/share/common-lisp/, similar to Go libraries. The compiler (e.g. sbcl) builds a static binary from all used cl-* packages.

FreePascal

The FreePascal Compiler (fpc) packages in Debian don't seem to use dynamic linking. See also here.

Rust

The default for Rust is static linking but dynamic linking is available with rustc -C prefer-dynamic. The ABI is not stable but this is being worked on.

JavaScript

browserify and other tools merge together multiple JS files for shipping to browsers.

Some browser extensions (webext-* packages) copy their dependencies at build time instead of symlinking, because some browsers (Firefox) do not follow symlinks installed into /usr/share/webext/.

Java/Clojure

Java has "uberjars" which bundle dependencies into the pre-built jar files.

Mitigation

The Debian archive keeps around old sources referenced by the Built-Using header, marking them with the Extra-Source-Only header. This is only to be used for licensing reasons though, not for tracking static linking.

Manual binNMUs can be done for packages that declare a Built-Using header.
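A minimal sketch of how binNMU candidates could be picked out, assuming the Built-Using data and the current source versions have already been collected into dictionaries. The function name and data shapes are hypothetical. Because Built-Using records exact versions, a plain inequality check is used instead of Debian version ordering.

```python
def binnmu_candidates(
    built_using: dict[str, list[tuple[str, str]]],
    current: dict[str, str],
) -> dict[str, list[str]]:
    """Map each binary package to the sources it embeds at a stale version.

    built_using: {binary package: [(source, version recorded at build time)]}
    current:     {source: version currently in the archive}
    """
    stale = {}
    for package, entries in built_using.items():
        outdated = [
            source
            for source, version in entries
            # Sources missing from `current` are skipped, not flagged.
            if source in current and current[source] != version
        ]
        if outdated:
            stale[package] = outdated
    return stale
```

Packages that come back in the result are only candidates: whether a rebuild is actually needed still depends on what changed in the embedded source.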

For safety reasons, binaries should be linked dynamically so that hardening features such as ASLR remain effective. A user should be able to presume that binaries shipped by Debian are safe to use in front-facing settings (e.g. web services, scripts).

More automatic detection of static linking? #698398

Make it easier to add Built-Using?

Change debian-policy & lintian to discourage static linking?

Do browserify from package postinsts?

Something based on searching for statically-linkable files and then mapping those packages to buildinfo files? There is a demo of this approach (https://salsa.debian.org/bremner/builtin-pho/-/blob/master/demos/needs-rebuild.sh) in the builtin-pho project (https://salsa.debian.org/bremner/builtin-pho). This would work for all toolchains but would have the disadvantage of rebuilding too often, since having a statically-linkable file installed at build time doesn't mean it is used at build time. This could be mitigated by using things like TraceCode (https://github.com/nexB/tracecode-toolkit; strace version: https://github.com/nexB/tracecode-toolkit-strace) to trace the build, but that tracing is likely to slow down builds. Alternate tracing systems built into Linux might help reduce that overhead.
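The buildinfo idea can be sketched roughly as follows. This is not the builtin-pho demo; it is an independent illustration that assumes the Installed-Build-Depends field of a .buildinfo file has already been fetched as text, and that a set of packages known to ship statically-linkable files is available. All function and parameter names are hypothetical.

```python
import re

# Matches entries such as "zlib1g-dev (= 1:1.2.11.dfsg-2)", optionally with
# an architecture qualifier like "zlib1g-dev:amd64".
_DEP = re.compile(
    r"([a-z0-9][a-z0-9+.-]*)(?::[a-z0-9-]+)?\s*\(\s*=\s*([^)\s]+)\s*\)"
)


def installed_build_depends(buildinfo_text: str) -> dict[str, str]:
    """Extract {package: version} from the Installed-Build-Depends field."""
    deps = {}
    in_field = False
    for line in buildinfo_text.splitlines():
        if line.startswith("Installed-Build-Depends:"):
            in_field = True
            line = line.split(":", 1)[1]
        elif in_field and not line[:1].isspace():
            break  # a new field (or the end of the paragraph) starts here
        if in_field:
            for name, version in _DEP.findall(line):
                deps[name] = version
    return deps


def changed_static_inputs(buildinfo_text, static_providers, current_versions):
    """List build-time packages that ship static files and have since changed.

    static_providers: set of package names shipping statically-linkable files
    current_versions: {package: version currently in the archive}
    """
    used = installed_build_depends(buildinfo_text)
    return sorted(
        name
        for name, version in used.items()
        if name in static_providers
        and current_versions.get(name, version) != version
    )
```

A non-empty result marks the package as a rebuild candidate. As noted above, this over-approximates: a package being installed at build time does not mean its static files were actually linked in.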

Use annobin to watermark binaries with source code hashes. This would, however, reduce binary reproducibility in situations where source changes don't affect the binary output. An alternative might be to have annobin write the source-to-binary trace data outside packages, to the buildinfo files, or to files referenced by the buildinfo files. Unfortunately, annobin does not have access to the source files, only to the GCC internal representation.

Modify toolchains and build systems using plugins or patches to record source and binary file paths and hashes, including system headers, static libraries and so on. Write the metadata outside of the binary package, to a file referenced by the buildinfo file. Different distros will likely want different file formats, so a socket or FIFO might be an alternative, since a daemon reading from it could transform the data into other file formats.
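To make the shape of such metadata concrete, here is a small sketch that hashes a set of input and output files and emits one JSON record. The JSON layout, field names and function names are assumptions made for this example only; no existing toolchain plugin is known to produce this format.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large binaries need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_trace(inputs, outputs) -> str:
    """Return a JSON record of the files read and written by one build step.

    inputs:  paths consumed (sources, system headers, static libraries, ...)
    outputs: paths produced (object files, executables, ...)
    """
    def describe(paths):
        return [{"path": str(p), "sha256": sha256_of(Path(p))} for p in paths]

    record = {"inputs": describe(inputs), "outputs": describe(outputs)}
    # Sorted keys keep the record stable across runs.
    return json.dumps(record, indent=2, sort_keys=True)
```

The returned string could be written to a file referenced by the buildinfo file, or sent over a socket or FIFO to a daemon that converts it to whatever format a distro prefers.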