Dpkg Proposal - Automatic storing of dist conffiles (and other fun stuff)
Abstract
This proposal discusses the motivation for, and a proposed implementation of, registering the unmodified "dist" version of conffiles as packages are installed by dpkg. such a feature would make it easy to implement further useful features, such as 3-way merging of conffiles during a package upgrade.
Introduction
thanks largely to debian-policy, debian's/dpkg's handling of conffiles is quite reliable and predictable. however, there is room for improvement. for example, consider the following three situations.
- situation 1
  - local admin installs package foo
  - local admin edits /etc/foo/foo.cfg
  - debian maintainer updates non-conflicting lines in /etc/foo/foo.cfg
  - easily resolvable conflict found during package upgrade
- situation 2
  - local admin installs package foo
  - local admin completely rewrites /etc/foo/foo.cfg
  - debian maintainer updates /etc/foo/foo.cfg
  - hard to resolve conflict found during package upgrade
  - local admin may not easily be able to find out what changed
- situation 3
  - local admin installs package foo
  - local admin makes lots of changes to /etc/foo/foo.cfg
  - local admin forgets what he/she changed, would like a diff.
in all three situations, the admin has no immediate recourse apart from having good backups, using a tool like etckeeper, or keeping a copy of the previously installed package around and examining its contents manually. none of these situations is critical, but they can be highly annoying nonetheless, and there's no reason why dpkg couldn't help alleviate them.
Proposal
dpkg should keep a record of the original "dist" version of the conffiles as it installs the package. for the purpose of this proposal the term "database" or "conffile db" will be used, though it by no means implies the use of external database libraries/software. ideally it should be easy for both software (dpkg) and humans (local admin) to retrieve this information in a simple/useful manner. two specific implementations of this "database" are proposed, though the overall procedure is the same for both:
- conffile db's are stored per-package in a subdirectory of /var/lib/dpkg/conffiles
- when installing/upgrading a package, a new (initially empty) conffile db will be created for this package.
- when unpacking the archive contents, conffiles will be intercepted and placed in the new conffile db before being copied to their final location.
- during conflict detection/resolution the "old conffile db" may be available in the expected location (though clearly not for packages installed before implementation of the feature), for any diffs/merges.
- upon completion of configuration, the old conffile db is removed and the new conffile db is moved to replace it.
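the lifecycle above can be sketched roughly as follows. this is only an illustration in python, not dpkg code: the function names (db_dir, begin_unpack, register_conffile, finish_configure) and the entry_name parameter are made up for this sketch, and how an entry is named inside the db is left to whichever of the two layouts is chosen.

```python
import os
import shutil


def db_dir(admindir, package, new=False):
    """Directory holding a package's conffile db (the "_new" one during an upgrade)."""
    return os.path.join(admindir, "conffiles", package + ("_new" if new else ""))


def begin_unpack(admindir, package):
    """Create the new, initially empty conffile db for this package."""
    new_dir = db_dir(admindir, package, new=True)
    shutil.rmtree(new_dir, ignore_errors=True)  # drop leftovers from an aborted run
    os.makedirs(new_dir)
    return new_dir


def register_conffile(new_dir, entry_name, unpacked_file):
    """Store the pristine dist copy before it is copied to its final location."""
    dest = os.path.join(new_dir, entry_name)
    os.makedirs(os.path.dirname(dest), exist_ok=True)
    shutil.copy2(unpacked_file, dest)
    return dest


def finish_configure(admindir, package):
    """Replace the old conffile db with the new one once configuration is done."""
    old_dir = db_dir(admindir, package)
    shutil.rmtree(old_dir, ignore_errors=True)
    os.rename(db_dir(admindir, package, new=True), old_dir)
```

between begin_unpack and finish_configure both directories exist side by side, which is what makes the old dist version available for diffs/merges during conflict resolution.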
configdb implementation 1: path-based configdb
files are placed in <admindir>/conffiles/<package>[_new]/<path>, where <path> is the standard path to the file as it would be installed on the system.
benefits:
- very intuitive
- no special utilities needed for local admin to manually inspect
- guaranteed uniqueness of files
drawbacks:
- slightly more complicated to code (a mkdir -p is needed)
- possibility of exceeding PATH_MAX for excessively long conffile filenames (though this is not realistic and it would be hard to build a package with such files, since you would have the same problem during the build process)
- invites people to go poking around where they probably shouldn't (<admindir>)
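as an illustration of the path-based layout, the location of a stored conffile could be computed along these lines. the helper names are hypothetical, and the 4096 byte limit is an assumption (the usual PATH_MAX on linux), not something dpkg defines.

```python
import os
import posixpath


def path_based_location(admindir, package, conffile_path, new=False):
    """<admindir>/conffiles/<package>[_new]/<path> for an installed conffile path."""
    subdir = package + ("_new" if new else "")
    # strip the leading "/" so the installed path nests under the package directory
    return posixpath.join(admindir, "conffiles", subdir, conffile_path.lstrip("/"))


def exceeds_path_max(location, limit=4096):
    """True if the stored path (plus its terminating NUL) would not fit in `limit` bytes."""
    return len(os.fsencode(location)) + 1 > limit
```

for example, path_based_location("/var/lib/dpkg", "foo", "/etc/foo/foo.cfg") gives /var/lib/dpkg/conffiles/foo/etc/foo/foo.cfg, which a local admin can inspect with ordinary tools.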
configdb implementation 2: hash-based configdb
files are placed in <admindir>/conffiles/<package>[_new]/<hash>, where <hash> is an md5sum of the pathname of the file's installed location.
benefits:
- flat structure, simpler to code (md5sum functions already exist in dpkg)
- more or less mathematically guaranteed uniqueness of files
- no chance of exceeding PATH_MAX
drawbacks:
- not very useful for local admin (implies the creation/use of a dpkg-conffile utility)
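a minimal sketch of the hash-based layout, again with hypothetical helper names. it assumes the hash is taken over the utf-8 bytes of the installed pathname. the second function hints at why a dpkg-conffile style utility would be needed: a hash can only be mapped back to a path by hashing the package's known conffile paths and comparing.

```python
import hashlib
import posixpath


def path_hash(conffile_path):
    """md5 hex digest of the conffile's installed pathname (not of its contents)."""
    return hashlib.md5(conffile_path.encode("utf-8")).hexdigest()


def hash_based_location(admindir, package, conffile_path, new=False):
    """<admindir>/conffiles/<package>[_new]/<hash> for an installed conffile path."""
    subdir = package + ("_new" if new else "")
    return posixpath.join(admindir, "conffiles", subdir, path_hash(conffile_path))


def conffile_for_hash(known_conffile_paths, digest):
    """Map a stored hash back to a path by hashing each known conffile path."""
    for path in known_conffile_paths:
        if path_hash(path) == digest:
            return path
    return None
```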
Immediate and possible future benefits/features
(automatic) merging
the most notable benefit that you could immediately get from this system is the ability to merge simple changes with practically no effort. since the old, current, and new conffiles are all available, it's simply a matter of a system() call to diff3 and examining the exit status (modulo modifications for prompting of course).
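the diff3 call described above might look roughly like this. it is a sketch in python rather than the C system() call dpkg would use, the function name attempt_merge is made up, and it assumes GNU diff3 is on the PATH and that the conffiles are text in the locale's encoding. GNU diff3 exits with 0 on a clean merge, 1 when there are conflicts and 2 on trouble.

```python
import subprocess


def attempt_merge(installed, old_dist, new_dist):
    """Three-way merge of the installed conffile with the old and new dist versions.

    Returns ("merged", text) on a clean merge and ("conflict", text) when diff3
    left conflict markers in the output; raises RuntimeError if diff3 itself fails.
    """
    proc = subprocess.run(
        ["diff3", "-m", installed, old_dist, new_dist],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    if proc.returncode == 0:
        return "merged", proc.stdout
    if proc.returncode == 1:
        return "conflict", proc.stdout
    raise RuntimeError("diff3 failed: " + proc.stderr.strip())
```

because the merge result goes to stdout and nothing is written in place, the same call can serve as a dry run to find out whether an automatic merge is possible before offering it to the admin.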
olddist->newdist delta and olddist->installed delta
currently the local admin can only see the delta between the installed and the "newdist" conffile. with this implementation, the local admin (and dpkg, during conffile prompting) can also show the delta between the "olddist" conffile and the installed version, as well as the delta between the olddist and newdist versions (i.e. the actual changes).
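to illustrate the two extra deltas, here is a small sketch using python's difflib on file contents held in memory. the function name and labels are made up for the example; dpkg itself would more likely run diff on the stored files.

```python
import difflib


def delta(from_text, to_text, from_label, to_label):
    """Unified diff between two versions of a conffile, as a single string."""
    return "".join(
        difflib.unified_diff(
            from_text.splitlines(keepends=True),
            to_text.splitlines(keepends=True),
            fromfile=from_label,
            tofile=to_label,
        )
    )


olddist = "a=1\nb=2\n"
installed = "a=10\nb=2\n"   # local admin's edits
newdist = "a=1\nb=3\n"      # maintainer's edits

local_changes = delta(olddist, installed, "olddist", "installed")
maintainer_changes = delta(olddist, newdist, "olddist", "newdist")
```

local_changes answers "what did i change?", while maintainer_changes shows what changed between the two packaged versions.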
(future work) ucf-like registration of conffiles
no reason it couldn't be done...
Implementation Status
a proof of concept has been made and published on the dpkg-devel mailing list. both conffile db methods were implemented, though the hash-based approach is used in the current version. the proof of concept includes the conffile database as discussed above, as well as an initial "attempt automatic merge" option.
work-in-progress can be tracked at http://git.debian.org/?p=users/seanius/dpkg.git;a=summary (browsing) and git://git.debian.org/git/users/seanius/dpkg.git (vcs)
unanswered questions
- resolving which database implementation should be used
- for merging, should we attempt the merge first, and then display the option only if we know it's possible?
- for non-automatic merges, could we do something like git mergetool?