Differences between revisions 5 and 6
Revision 5 as of 2015-07-18 15:09:57
Size: 6899
Editor: GuillemJover
Comment: Unify into the new Spec namespace
Revision 6 as of 2015-07-18 15:17:57
Size: 6933
Editor: GuillemJover
Comment: Mark clearly as a draft, and other minor cleanups


Automatic storing of dist conffiles

Status: draft

Abstract

This proposal discusses the motivations and proposed implementation of registering the unmodified "dist" version of conffiles as packages are installed by dpkg. Such a feature would make it easy to implement further and very useful features, such as 3-way merging of conffiles during a package upgrade.

Introduction

Thanks largely to debian-policy, Debian's/dpkg's handling of conffiles is quite reliable and predictable. However, room exists for improvement. For example, consider the following three situations.

  1. local admin installs package foo
    • local admin edits /etc/foo/foo.cfg
    • debian maintainer updates non-conflicting lines in /etc/foo/foo.cfg
    • easily resolvable conflict found during package upgrade
  2. local admin installs package foo
    • local admin completely rewrites /etc/foo/foo.cfg
    • debian maintainer updates /etc/foo/foo.cfg
    • hard to resolve conflict found during package upgrade
    • local admin may not easily be able to find out what changed
  3. local admin installs package foo
    • local admin makes lots of changes to /etc/foo/foo.cfg
    • local admin forgets what he/she changed, would like a diff.

In all three situations, the admin has no immediate recourse apart from having good backups, using a tool like etckeeper, or keeping a copy of the previously installed package around and examining its contents manually. None of these situations is critical, but they can be highly annoying nonetheless, and there's no reason why dpkg couldn't help alleviate them.

Proposal

dpkg should keep a record of the original "dist" version of the conffiles as it installs the package. For the purpose of this proposal the term "database" or "conffile db" will be used, though it by no means implies the use of external database libraries/software. Ideally it should be easy for both software (dpkg) and humans (local admin) to retrieve this information in a simple/useful manner. Two specific implementations of this "database" are proposed, though the overall procedure is the same in both cases:

  • conffile dbs are stored per-package in a subdirectory of /var/lib/dpkg/conffiles
  • when installing/upgrading a package, a new (initially empty) conffile db will be created for this package.
  • when unpacking the archive contents, conffiles will be intercepted and placed in the new conffile db before being copied to their final location.
  • during conflict detection/resolution the "old conffile db" may be available in the expected location (though clearly not for packages installed before implementation of the feature), for any diffs/merges.
  • upon completion of configuration, the old conffile db is removed and the new conffile db is moved to replace it.
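
The lifecycle above can be sketched roughly as follows. This is an illustrative Python sketch, not dpkg's code (dpkg is written in C); the helper names db_dir, begin_new_db and commit_new_db are hypothetical, and the "_new" suffix follows the <package>[_new] layout used by both implementations described below.

```python
import os
import shutil


def db_dir(admindir, package, new=False):
    """Return the conffile db directory for a package (hypothetical helper)."""
    suffix = "_new" if new else ""
    return os.path.join(admindir, "conffiles", package + suffix)


def begin_new_db(admindir, package):
    """Create a fresh, empty '<package>_new' db when an install/upgrade starts."""
    new_dir = db_dir(admindir, package, new=True)
    # Assumption: leftovers from an aborted earlier run are simply discarded.
    shutil.rmtree(new_dir, ignore_errors=True)
    os.makedirs(new_dir)
    return new_dir


def commit_new_db(admindir, package):
    """Once configuration completes, replace the old db with the new one."""
    old_dir = db_dir(admindir, package)
    new_dir = db_dir(admindir, package, new=True)
    shutil.rmtree(old_dir, ignore_errors=True)  # no old db exists on first install
    os.rename(new_dir, old_dir)
    return old_dir
```

Between the two calls, the old db (if any) is still in place, which is what makes it available for diffs and merges during conflict resolution.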

configdb implementation 1: path-based configdb

Files are placed in <admindir>/conffiles/<package>[_new]/<path>, where <path> is the standard path to the file as it would be installed on the system.

Benefits:

  • very intuitive
  • no special utilities needed for local admin to manually inspect
  • guaranteed uniqueness of files

Drawbacks:

  • slightly more complicated to code (a mkdir -p is needed)
  • possibility of exceeding PATH_MAX for excessively long conffile filenames (though this is not realistic and it would be hard to build a package with such files, since you would have the same problem during the build process)
  • invites people to go poking around where they probably shouldn't (<admindir>)
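
A minimal sketch of the path-based layout, assuming Python for illustration only. The function names are hypothetical; the only behaviour taken from the text is the <admindir>/conffiles/<package>[_new]/<path> location and the need for a "mkdir -p" step.

```python
import os
import shutil


def path_based_location(admindir, package, conffile, new=False):
    """Map an installed conffile path to its location in the path-based db."""
    suffix = "_new" if new else ""
    # Strip the leading "/" so the installed path nests under the db directory.
    relative = conffile.lstrip("/")
    return os.path.join(admindir, "conffiles", package + suffix, relative)


def store_dist_copy(source, admindir, package, conffile):
    """Copy the unpacked dist conffile into the new db and return its location."""
    target = path_based_location(admindir, package, conffile, new=True)
    os.makedirs(os.path.dirname(target), exist_ok=True)  # the "mkdir -p" step
    shutil.copyfile(source, target)
    return target
```

With admindir set to /var/lib/dpkg, the dist copy of /etc/foo/foo.cfg from package foo would end up at /var/lib/dpkg/conffiles/foo/etc/foo/foo.cfg, which the local admin can inspect with ordinary tools.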

configdb implementation 2: hash-based configdb

Files are placed in <admindir>/conffiles/<package>[_new]/<hash>, where <hash> is an md5sum of the pathname of the file's installed location.

Benefits:

  • flat structure, simpler to code (md5sum functions already exist in dpkg)
  • uniqueness of files guaranteed for all practical purposes (two different pathnames hashing to the same md5sum is extremely unlikely)
  • no chance of exceeding PATH_MAX

Drawbacks:

  • not very useful for local admin (implies the creation/use of a dpkg-conffile utility)
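
A comparable sketch of the hash-based layout, again in Python for illustration. The function names are hypothetical, and encoding the pathname as UTF-8 before hashing is an assumption. The second helper hints at what a dpkg-conffile style utility would have to do: because a hash cannot be reversed, the mapping back to pathnames has to be rebuilt from the package's known conffile list.

```python
import hashlib
import os


def hash_based_location(admindir, package, conffile, new=False):
    """Map an installed conffile path to its location in the hash-based db."""
    suffix = "_new" if new else ""
    # The hash covers the *pathname*, not the file contents.
    digest = hashlib.md5(conffile.encode("utf-8")).hexdigest()
    return os.path.join(admindir, "conffiles", package + suffix, digest)


def build_reverse_index(conffiles):
    """Return a {hash: pathname} mapping for a package's list of conffiles."""
    return {
        hashlib.md5(path.encode("utf-8")).hexdigest(): path
        for path in conffiles
    }
```

Every entry name is a fixed 32 hexadecimal characters regardless of how long the installed pathname is, which is why PATH_MAX is not a concern here.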

Immediate and possible future benefits/features

(automatic) merging

The most notable benefit that you could immediately get from this system is the ability to merge simple changes with practically no effort. Since the old, current, and new conffiles are all available, a merge amounts to a system() call to diff3 and an examination of its exit status (modulo modifications for prompting of course).
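
As a rough illustration of that call, here is a Python sketch using subprocess in place of system(). It assumes GNU diff3 is installed; its -m option writes the merged file to standard output, and its exit status is 0 on success, 1 when conflicts were found and 2 on trouble. The function names are hypothetical and this is not the proof-of-concept code.

```python
import subprocess


def classify_diff3_status(returncode):
    """Translate a GNU diff3 exit status into a merge outcome."""
    if returncode == 0:
        return "merged"    # clean merge, safe to install the result
    if returncode == 1:
        return "conflict"  # overlapping changes, fall back to prompting
    return "error"         # diff3 itself ran into trouble


def try_merge(installed, olddist, newdist):
    """Attempt a 3-way merge; return (outcome, merged_text).

    Argument order follows diff3's MYFILE OLDFILE YOURFILE convention.
    Raises FileNotFoundError if diff3 is not installed.
    """
    result = subprocess.run(
        ["diff3", "-m", installed, olddist, newdist],
        capture_output=True,
        text=True,
    )
    return classify_diff3_status(result.returncode), result.stdout
```

Only the "merged" outcome would let dpkg proceed without asking; the other two leave the existing conffile prompt as the fallback.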

olddist->newdist delta and olddist->installed delta

Currently the local admin can see the delta between the installed conffile and the "newdist" version. With this implementation, the local admin (and dpkg, during conffile prompting) can also show the delta between the installed version and the "olddist" conffile, as well as the delta between the olddist and newdist versions (i.e. the actual changes).
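
Both deltas can be produced the same way once the olddist copy exists. The sketch below uses Python's difflib purely for illustration (dpkg itself would more likely invoke diff); the delta helper and the labels are hypothetical.

```python
import difflib


def delta(old_text, new_text, old_label, new_label):
    """Return a unified diff between two versions of a conffile as one string."""
    return "".join(
        difflib.unified_diff(
            old_text.splitlines(keepends=True),
            new_text.splitlines(keepends=True),
            fromfile=old_label,
            tofile=new_label,
        )
    )


# Example contents standing in for the three versions of a conffile.
olddist = "port=80\nlog=off\n"
installed = "port=8080\nlog=off\n"  # local admin changed the port
newdist = "port=80\nlog=on\n"       # maintainer changed the logging default

local_changes = delta(olddist, installed, "olddist", "installed")
maintainer_changes = delta(olddist, newdist, "olddist", "newdist")
```

Here local_changes answers "what did I change?" (situation 3 in the introduction), while maintainer_changes shows what actually changed in the package.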

(future work) ucf-like registration of conffiles

No reason it couldn't be done...

Implementation status

A proof of concept has been made and published on the dpkg-devel mailing list. Both conffile db methods were implemented, though the hash-based approach is used in the current version. The proof of concept includes the conffile database as discussed above, as well as an initial "attempt automatic merge" option.

Work-in-progress can be tracked at https://git.debian.org/cgit/users/seanius/dpkg.git (browsing) and git://git.debian.org/git/users/seanius/dpkg.git (vcs).

The current version works, though the treatment of symlink'd conffiles needs to be validated--it is suspected that it may be problematic.

Unanswered questions

  • resolving which database implementation should be used
    • currently I'm leaning towards reverting to the first approach, assuming we can make the provision that dpkg isn't guaranteed to handle up to PATH_MAX length files (which it doesn't anyway afaik) and that errors are handled gracefully (i.e. the conffile is still treated as a conffile, and behaviour degrades to the current treatment)
  • for merging should we attempt the merge first, and then display the option if we know it's possible?
    • easy enough to do, might as well
  • for non-automatic merges could we do something like git mergetool?
    • left "for future work" :)
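
The graceful degradation mentioned under the first question could look roughly like this. It is a Python sketch under the assumption that any failure to record the dist copy (for example an over-long path) should simply leave the conffile without a db entry; store_or_degrade is a hypothetical name.

```python
import os
import shutil


def store_or_degrade(source, target):
    """Try to record the dist copy; return False instead of failing the install."""
    try:
        os.makedirs(os.path.dirname(target), exist_ok=True)
        shutil.copyfile(source, target)
    except OSError:
        # For example ENAMETOOLONG for an over-long path: skip the db entry
        # and let the caller fall back to the current conffile handling.
        return False
    return True
```

A False return means no olddist copy will be available at the next upgrade, so dpkg would behave exactly as it does today for that conffile.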