We want to make diffoscope the universal diff tool by adding features that are currently available only in other diffing software. Diffoscope can already be used as a replacement for, or an improvement on, tools such as Open Build Service's pkg-diff and Whisper Systems' apkdiff. At present, diffoscope does not have an option to hide (or ignore) specific differences according to user requirements. This is a specification for --hide=profiles, which allows diffoscope to hide certain profiles.
Disclaimer for Debian reproducible-builds users: Please note that hiding differences should only be used for analysing (other) reproducibility issues that are otherwise hidden in the noise. The goal of reproducible builds is to have bit-by-bit identical builds; it is not OK to say "these two are identical if you hide $this"…
- Use cases
The goal of the --hide=profiles specification is to allow diffoscope users to hide details which they do not want to see.
Ignore `.buildinfo` files when comparing `.changes` files
When comparing two .changes files, Jam wants to hide all the differences inside the .buildinfo files referenced by the .changes files. This feature is also needed for tests.reproducible-builds.org and should probably be enabled by default.
Metadata generated by gzip
As a user, Jam wants to hide metadata generated by gzip so that he can concentrate on other differences. He knows that the hidden gzip metadata can also cause his software to fail to build reproducibly. Jam would like to have an option --hide-timestamp=gzip-metadata so that he can accomplish his goal.
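As a rough illustration of what hiding gzip metadata could mean in practice, here is a minimal Python sketch. It is not diffoscope code. It assumes the only metadata of interest is the MTIME field, which RFC 1952 places at bytes 4 to 7 of the gzip header, and the helper name `mask_gzip_mtime` is hypothetical.

```python
import gzip


def mask_gzip_mtime(blob: bytes) -> bytes:
    """Return a copy of a gzip stream with the header MTIME field zeroed.

    Hypothetical helper for illustration; only the 4-byte MTIME field
    (header bytes 4..7, per RFC 1952) is masked.
    """
    if len(blob) < 10 or blob[:2] != b"\x1f\x8b":
        raise ValueError("not a gzip stream")
    return blob[:4] + b"\x00\x00\x00\x00" + blob[8:]


# Two archives of the same payload that differ only in the stored timestamp.
payload = b"same content\n"
first = gzip.compress(payload, mtime=1)
second = gzip.compress(payload, mtime=2)

# The raw streams differ, but compare equal once the timestamp is masked.
same_after_masking = mask_gzip_mtime(first) == mask_gzip_mtime(second)
```

Other header fields, such as an embedded original file name, are left untouched by this sketch; a real gzip-metadata profile would have to decide which of them to cover.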
Ignore all differences in control.tar
Joe wants to compare two builds of the same deb for different architectures. He wants to see the differences in data.tar, which reflect the actual differences between the packages. He knows that in this particular case the control.tar differences can be ignored, since he already expects control.tar to differ in Architecture, md5sums, build-ids, etc. He would like to do this by adding the option --hide-section=control.tar.
Related bug #797525
Hiding debug symbols
Diffoscope truncates its output with "Max output size reached", so Jane cannot tell which differences were cut off. She wants to hide the debug symbols present in the current output to see whether anything else is making her package unreproducible. Jane would like to use the option --hide-section=debug-symbols to remove this clutter from the output and check for other issues that prevent her package from building reproducibly.
Example difference Example difference
Ignoring specific fields in .buildinfo files
Albert wants to do a meaningful comparison of .buildinfo files. He wants to ignore the Environment information in the .buildinfo files and thinks diffoscope should provide an option to do this for him. He may want to solve this particular problem by using --hide-section=<buildinfo-field>.
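The effect Albert is after can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a proposal for how diffoscope would implement it: the input is assumed to be unsigned, deb822-style text in which continuation lines start with a space or tab, and the function name `drop_fields` is hypothetical.

```python
def drop_fields(text: str, hidden: set[str]) -> str:
    """Remove the named fields, with their continuation lines, from
    deb822-style text. Field names are matched case-insensitively.
    """
    hidden_lower = {name.lower() for name in hidden}
    kept = []
    skipping = False
    for line in text.splitlines():
        if line[:1] in (" ", "\t"):
            # Continuation line: it shares the fate of the field above it.
            if not skipping:
                kept.append(line)
            continue
        name = line.partition(":")[0].strip().lower()
        skipping = name in hidden_lower
        if not skipping:
            kept.append(line)
    return "\n".join(kept)


# Two made-up .buildinfo fragments that differ only inside Environment.
a = 'Source: foo\nEnvironment:\n DEB_BUILD_OPTIONS="parallel=4"\n LANG="C"\nBuild-Architecture: amd64\n'
b = 'Source: foo\nEnvironment:\n DEB_BUILD_OPTIONS="parallel=8"\n LANG="C"\nBuild-Architecture: amd64\n'

filtered_a = drop_fields(a, {"Environment"})
filtered_b = drop_fields(b, {"Environment"})
```

After filtering, both fragments reduce to the same `Source` and `Build-Architecture` lines, so a comparison of the filtered text would show no difference.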
Hiding timestamps generated by latex
As a diffoscope user, Chris wants to check whether anything other than the timestamps in PDFs generated by LaTeX is making his package unreproducible. Currently, diffoscope shows him the "Max output size reached" banner. Chris wants a way to hide the timestamps generated by LaTeX and see the remaining differences. He may want to try the --hide-timestamp=latex option so that he can see the other differences.
Shirish is analyzing packages. The diff output is huge and contains many mtime-related differences. He wants to ignore those differences and see whether anything else is left. He would like to use --hide-timestamp=mtime with diffoscope so that he can see all the remaining differences.
Related bug #814057
Ignoring build profiles
Helmut is trying to cross build packages reproducibly. Since some packages need additional dependencies under cross builds, he adds the cross build profile. Unfortunately, build profiles are always recorded in the binary package's control file in the Built-For-Profiles header. Beyond the cross build profile, there are more profiles that are supposed to leave the resulting binary packages unchanged (e.g. nocheck or noudeb). Having this hiding mechanism would help validate the correctness of these build profiles.
Jane is reviewing her package upload and using diffoscope to compare the old and new versions of a Debian source package. Since the upstream version was bumped, the directory name within the upstream tarball was changed from foo-0.1/ to foo-0.2/. The fuzzy matching takes care of most of the noise here but there is still some noise in the output. Being able to ignore top-level directory renames would help reduce that to just the changes made to the upstream source code and the debian/ directory.
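One way to picture ignoring a top-level directory rename is to normalise the member paths of each tarball before matching them. The Python sketch below is purely illustrative and makes its own assumptions: paths are given as plain strings with `/` separators, the rename is only ignored when every member shares a single top-level directory, and the name `strip_top_level` is hypothetical.

```python
def strip_top_level(paths):
    """Return member paths with a shared top-level directory removed.

    If the members do not all live under one top-level directory, the
    paths are returned unchanged apart from slash trimming and sorting.
    """
    split = [p.strip("/").split("/") for p in paths if p.strip("/")]
    if len({parts[0] for parts in split}) != 1:
        return sorted("/".join(parts) for parts in split)
    # Drop the first component; the top-level directory entry itself vanishes.
    return sorted("/".join(parts[1:]) for parts in split if len(parts) > 1)


old = ["foo-0.1/", "foo-0.1/README", "foo-0.1/src/main.c"]
new = ["foo-0.2/", "foo-0.2/README", "foo-0.2/src/main.c"]

old_members = strip_top_level(old)
new_members = strip_top_level(new)
```

With the top-level component removed, `old_members` and `new_members` contain the same relative paths, so only content changes in those files would remain to be reported.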