Status: draft
Delta upgrade debs
Delta upgrades are widely used by other projects, such as Chrome, Android, FreeBSD, Fedora. One approach exists for dpkg-based systems: debdelta. While debdelta solves the problem, it does so in a sub-optimal way:
- A debdelta contains a script that contains the instructions how to generate a new deb given the old script.
- A debdelta regenerates a deb which is unnecessarily slow.
Proposal
A new format, called "patch debs" (.pdeb) is introduced. A patch deb consists of the original members from the target deb, with the following changes:
- In data.tar, all members are replaced by binary diff files. Files that should be shipped in their entirety are simply a diff against /dev/null.
- The debian-binary member is renamed to debian-delta
Furthermore, all .deb files are assigned a unique ID at build time, stored in the "ID" field of DEBIAN/control.
Unpacking a .pdeb
Unpacking a .pdeb works identically to a normal .deb, with one exception: The line copying the data to .dpkg-new from the data.tar tarball is replaced by code that uses the diff in the data.tar tarball and the installed file to generate the .dpkg-new.
The .dpkg-new is checksummed and must match the checksum(s) specified in the control.tar member of the pdeb. Packages without checksums may not have deltas.
Repository integration v0
A new index file is established alongside Packages and Sources, called Deltas. The Deltas index has the following fields
- Package
- Architecture
- Version
- Old-Version
- ID
- Old-ID
- Size
- Filename
- SHA256 (and/or other hash fields)
When upgrading a package:
- Search a delta where (package, architecture, old-version, old-id) matches the installed package.
- If one found:
- Check that the installed files still match the hashes recorded in the dpkg db
- If files match dpkg db hashes; pick the delta, otherwise full deb
- Otherwise:
- Install the full .deb
Repository layout v2
Alongside Packages files, provide a Deltas bloom filter. The bloom filter of size b uses the concatenation of the old id and the new id as a set of hash functions, taking n-bytes of input as an integer, using it as an offset.
The numbers n and b should be chosen such that the rate of false positives for rejected deltas is about 1%. The Deltas bloom filter is stored in binary form, and not compressed.
For each source package, provide an inline signed Deltas index that contains, for each delta, the following fields:
- Package
- Old-ID
- New-ID
- Size
- SHA256
- Algorithm
the path name is fixed as
<dir same as binary deb>/name_$oldid_$newid_$algorithm.deltadeb
and not included.
Performance evaluation
All results on a ?ThinkPad X230 with a Core i5-3320M and 16 GB of RAM. Generation times are measured with a libdivsufsort-based bsdiff (or rather, a fork) and a more normal bsdiff.
Package |
Size |
Delta |
Generation Time |
Apply Time |
firefox (55.0-1 -> 55.0-2) |
38MB |
2.4MB |
40s/96s |
0.6s |
firefox (54.0-2 -> 55.0-2) |
38MB |
42MB |
- |
- |
gcc-7 (7.2.0-1 -> 7.2.0-3) |
31MB |
3.1MB |
75-90s/- |
0.6s |
Binary diffing
The bsdiff fork is maintained at https://github.com/julian-klode/ddelta.