Revision 1 as of 2016-10-10 22:03:07
how about improving automated upgrades in debian
Major upgrades in Debian are not automated and, while they are usually trivial on basic setups, can be time-consuming with a large number of machines and become harder in non-trivial setups. Users unfamiliar with Debian or the command line may also fail to discover new releases, or fail to upgrade because the upgrade process is too hard to follow.
Automated upgrades are distinct from DebianUpgrade or UnattendedUpgrades in that they are intended to automate, as much as possible, upgrades between major releases of Debian (for example between DebianJessie and DebianStretch).
There are already some tools in this problem space: UnattendedUpgrades aims mostly at supporting minor updates like security upgrades or point releases. There is also a well-documented, tried-and-true DebianUpgrade procedure, which is currently manual.
This is a problem, especially for users that are less used to operating in a terminal. Those users may even disregard major upgrades completely, partly because they are completely unaware of their existence (as they are not notified in any way), but also probably partly because upgrading is more difficult than just letting things go as they are.
The upgrade procedure documented in the release notes is also surprisingly long and error-prone. It leads to many teams implementing their own script-like documentation. For example, here's Koumbit's documentation on the upgrade to jessie. Even the DSA team has its own distinct upgrade procedure. The reason for this, of course, is that the upgrade procedure is fine if you have one or a few servers to upgrade, but completely breaks down if you scale up to tens or hundreds of servers, as the time needed to upgrade one machine then becomes critical.
All this means that we then need to support stable releases for longer (hence the DebianLts project) or end up abandoning users by the wayside. This is a security liability as well, as older releases do not receive timely security updates, if at all. It also means users are less autonomous in managing their own computers, whereas we usually want to empower users so they can do basic maintenance on their machines without consulting an expert.
In my experience, a major upgrade can be largely automated, or at least semi-automated. Streamlining upgrades allowed me to upgrade tens of machines simultaneously, greatly reducing the time needed to upgrade clusters and machines.
An automated upgrade system will be:
The current upgrade process is not streamlined: it needs multiple manual interventions, at specific points in time, to be completed. For example, one needs to edit sources.list, run apt-get upgrade, wait for packages to download, answer some questions, wait some more, answer some questions, wait for the upgrade to complete, then restart the process with dist-upgrade. Then one needs to clean up, and so on. I'm even skipping some of the documented steps here. Those steps take time for no good reason.
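The first of those manual steps, editing sources.list, is easy to script. Here is a minimal sketch, assuming GNU sed (for `-i` and `\b`) and a hypothetical helper name, `switch_release`; it only touches the file it is given and keeps a backup next to it.

```shell
#!/bin/sh
# switch_release OLD NEW FILE  (hypothetical helper, not an existing tool)
# Points an APT sources file from one release codename to the next.
switch_release() {
    old=$1 new=$2 file=$3
    # Keep a copy of the original so the change can be reverted.
    cp -p "$file" "$file.$old.bak"
    # Replace the codename only where it appears as a whole word, so that
    # "jessie", "jessie/updates" and "jessie-updates" are all covered.
    sed -i "s/\\b$old\\b/$new/g" "$file"
}

# Assumed usage, as root:
#   switch_release jessie stretch /etc/apt/sources.list
```

Files under /etc/apt/sources.list.d would need the same treatment, and third-party repositories may not have a suite for the new release at all, so the result should still be reviewed.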
A major hurdle is silencing the upgrade process: packages often show modal dialogs that block the upgrade completely, waiting for user input while other packages could be downloaded or installed in the background. While this is usually a bug that can be fixed in the package, such prompts keep cropping up, and there should be a way to do major upgrades while working around them.
Furthermore those prompts are often very confusing to the user, being highly package-specific. For example, some users have seen prompts about their PAM configuration being changed, which can be a traumatic experience if you have no idea what PAM is. Similarly, users often get prompted about whether "services should be restarted", when they had no idea they even *had* services (e.g. exim) running on their desktop machine.
One way to do this is through the use of preseeds: answers from one machine should be saved in a well known location and could be ported and extended to other machines. That way, the first upgrade in a cluster may be painful, but then a curated list of preseeds can be shared in the cluster, removing a bunch of questions.
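As a rough illustration of sharing preseeds, the sketch below filters a `debconf-get-selections` dump (from the debconf-utils package) down to a curated list of packages. The helper name `filter_preseeds` and the package names in the usage comment are only examples.

```shell
#!/bin/sh
# filter_preseeds PKG...  (hypothetical helper)
# Reads a debconf-get-selections dump on stdin and keeps only the answers
# owned by the listed packages, so a curated subset can be shared.
filter_preseeds() {
    owners=" $* "
    awk -v owners="$owners" '
        /^#/ || NF == 0 { next }             # skip comments and blank lines
        index(owners, " " $1 " ") { print }  # first field is the owning package
    '
}

# Assumed usage on the first machine of a cluster:
#   debconf-get-selections | filter_preseeds exim4-config > cluster.preseed
# and on the other machines, before upgrading:
#   debconf-set-selections < cluster.preseed
```

Filtering by package keeps machine-specific answers (hostnames, passwords) out of the shared file, but the curated list still has to be reviewed by hand.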
Obviously, trivial questions should be removed from the upgrade process, filed as bugs and resolved in point releases, but there should be a way to completely silence those prompts, and that should be the default in major upgrades.
Another part of the problem is to automate things like backing up relevant files. This varies from one system to the other: some people may just create a tarball of /var and /etc, others may already have system backups in place that they trust, others will rely on etckeeper to check in files in /etc. Those could be autoconfigured as well...
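The simplest of those strategies, a tarball, could look like the sketch below. The helper name `backup_dirs` and the archive path in the usage comment are assumptions; sites with real backups or etckeeper would plug in their own command instead.

```shell
#!/bin/sh
# backup_dirs ARCHIVE DIR...  (hypothetical helper)
# Creates a gzip-compressed tarball of the given directories.
backup_dirs() {
    archive=$1
    shift
    # -p records file permissions; tar strips the leading "/" from paths.
    tar -czpf "$archive" "$@"
}

# Assumed usage, as root, before a major upgrade:
#   backup_dirs "/root/pre-upgrade-$(date +%Y%m%d).tar.gz" /etc /var
```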
Sometimes, even though the package maintainer went to great lengths to make the upgrade transparent, some things just need to be managed manually. Some platforms may not be supported anymore, or configurations need to be upgraded by hand. Key examples include:
- the Apache 2.2 to 2.4 upgrade that requires manual intervention to modify config files
- file integrity checkers (e.g. Samhain) that should be stopped before major upgrades
- config management tools (e.g. Puppet) that need to be disabled before major upgrades
Those problems could also reside in third-party or vendor packages not under the control of Debian folks at all. In fact, some things are simply policy decisions that need to be made (e.g. if Samhain should be stopped or not). Some sites may need to lower a firewall to download new packages.
Specific quirks can be fixed with scripts or configuration management. It should be possible to hook into the upgrade process automatically to easily deploy such fixes without having to go through a stable point release, as this may not be accessible to system administrators that are not part of the Debian project. Point releases are also slow to deploy, and one may not want to wait for a point release before completing a deployment.
Obviously, some issues should be reported as bugs, with patches that ultimately get fixed in point releases, but the key idea here is to allow workarounds for known issues in major upgrades and to let admins run their own hooks at critical points in the automated upgrade process.
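One possible shape for such hooks is a directory of executables per stage, in the spirit of run-parts. This is only a sketch: the function name `run_hooks` and the /etc/auto-upgrade/hooks location are assumptions, not an existing Debian convention.

```shell
#!/bin/sh
# run_hooks STAGE [HOOK_ROOT]  (hypothetical helper and directory layout)
# Runs every executable file in HOOK_ROOT/STAGE.d in lexical order and stops
# at the first failure, so a broken pre-upgrade hook aborts the upgrade early.
run_hooks() {
    stage=$1
    root=${2:-/etc/auto-upgrade/hooks}   # assumed location
    for hook in "$root/$stage.d"/*; do
        # Skip non-executable files and the unexpanded glob of an empty dir.
        [ -f "$hook" ] && [ -x "$hook" ] || continue
        echo "running $stage hook: $hook"
        "$hook" "$stage" || return 1
    done
    return 0
}

# Assumed usage around the upgrade itself:
#   run_hooks pre  || exit 1    # e.g. disable Puppet, stop Samhain
#   ... upgrade ...
#   run_hooks post              # e.g. re-enable Puppet, remove packages
```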
Another requirement is reliability: a full upgrade can leave a system in a fairly broken state if interrupted halfway through. Therefore, the upgrade should run under screen if run from an SSH connection (for example).
I have also found it useful to run it under ttyrec or script, to have a clear log of everything that happens.
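A script can check for this itself. The sketch below relies on the environment variables that screen (`STY`), tmux (`TMUX`) and sshd (`SSH_CONNECTION`) export; the function names and the log path are illustrative.

```shell
#!/bin/sh
# in_multiplexer: succeeds when running inside screen or tmux, which export
# STY and TMUX respectively to the processes they start.
in_multiplexer() {
    [ -n "${STY:-}" ] || [ -n "${TMUX:-}" ]
}

# needs_multiplexer: succeeds when the session came in over SSH and is not
# protected by screen or tmux, i.e. when a dropped connection would kill it.
needs_multiplexer() {
    [ -n "${SSH_CONNECTION:-}" ] && ! in_multiplexer
}

# Assumed usage at the top of an upgrade script: re-run itself under screen,
# with script(1) recording the whole session to a log file.
#   needs_multiplexer && exec screen -S upgrade script -c "$0" /root/upgrade.typescript
```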
Here are the known solutions to this problem so far. Feel free to add your own here. The idea would be to either write a script from scratch that implements the specification above, or reuse something that is already there and improve it to comply with the specification.
Home made scripts
Back at Koumbit, I made this pseudo-script list of what an upgrade script should be doing:
- run under screen / ttyrec / etc
- make backups
- run pre-upgrade hooks (puppet, extra configs, e.g. samhain for DSA)
- finish pending upgrades, check consistency (apt-get upgrade, dpkg --audit, etc)
- change sources.list(s)
- download *all* packages
- pre-load preseeds
- apt-get upgrade, handle errors or failures and repeat
- apt-get dist-upgrade, same
- save preseed for future upgrades on other machines
- autoremove/purge orphan packages
- hooks post-upgrade (e.g. puppet, package removal lists)
- clean up old kernels
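The core of that list can be sketched as a dry-run skeleton. This is not the Koumbit procedure itself: the function names are illustrative, and backups, hooks, preseeds and error handling are left out. With `DRY_RUN=1` (the default here) commands are printed instead of executed.

```shell
#!/bin/sh
# Skeleton of the step list above; every function name is illustrative.
DRY_RUN=${DRY_RUN:-1}

# run CMD...: print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# major_upgrade OLD NEW: e.g. major_upgrade jessie stretch
major_upgrade() {
    old=$1 new=$2
    run dpkg --audit                                 # check consistency
    run apt-get -y upgrade                           # finish pending upgrades
    run sed -i "s/\\b$old\\b/$new/g" /etc/apt/sources.list   # change sources
    run apt-get update
    run apt-get -y --download-only dist-upgrade      # download all packages
    run apt-get -y upgrade                           # minimal upgrade first
    run apt-get -y dist-upgrade                      # then the full upgrade
    run apt-get -y --purge autoremove                # drop orphaned packages
}
```

Running `major_upgrade jessie stretch` with the default settings only prints the plan, which makes it easy to review before setting `DRY_RUN=0` as root.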
A key component of running the upgrades non-interactively was this nasty commandline:
env DEBIAN_FRONTEND=noninteractive APT_LISTCHANGES_FRONTEND=mail apt-get upgrade -y -o Dpkg::Options::='--force-confdef' -o Dpkg::Options::='--force-confold'
This ensures that:
- debconf doesn't prompt at all - those should be fed by preseeds, as needed
- apt-listchanges output is sent by email instead of prompting
- apt doesn't prompt at all
- dpkg keeps old configs
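Since the same options are needed for both upgrade and dist-upgrade, they can be wrapped in a small function. The name `apt_quiet` and the `APT_GET` override are assumptions made for this sketch; the options themselves are the ones from the command line above.

```shell
#!/bin/sh
# apt_quiet ACTION...  (hypothetical wrapper around the command line above)
# APT_GET can be overridden, e.g. with "echo apt-get", to preview the call.
apt_quiet() {
    # ${APT_GET:-apt-get} is left unquoted on purpose so that an override
    # such as "echo apt-get" is split into separate words.
    env DEBIAN_FRONTEND=noninteractive APT_LISTCHANGES_FRONTEND=mail \
        ${APT_GET:-apt-get} "$@" -y \
        -o Dpkg::Options::=--force-confdef \
        -o Dpkg::Options::=--force-confold
}

# Assumed usage:
#   apt_quiet upgrade && apt_quiet dist-upgrade
```

With both dpkg options set, dpkg takes the default action for a changed configuration file when there is one, and otherwise keeps the installed version.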
Debconf questions can be enabled for the first few servers, but questions like PAM config changes should probably be hidden from the user. Those preseeds could be baked into the upgrade tool to silence questions that crept into certain packages, and admins could use that system to silence questions that the release team still wants to leave in but that can be automated safely.
The same goes for dpkg configurations: resolution can happen at the end of the upgrade process, in one pass, instead of during the configure phase of every package, which also slows down the configuration of all packages. Since the machine needs to be rebooted with the new kernel, it doesn't hurt to run on the old configs anyway. I have written a small shell script named clean_conflicts that processes all the cruft left by dpkg during upgrades, in one pass. It also supports doing three-way merges of files with sdiff, which is quite useful. Obviously, that is not very intuitive for non-commandline users, but those users rarely modify stuff in /etc, so that tool is more for server-side upgrades.
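To give an idea of what such a one-pass cleanup starts from, here is a minimal sketch that lists the leftovers. It is not the clean_conflicts script mentioned above; the helper name `list_conffile_leftovers` is made up for the example.

```shell
#!/bin/sh
# list_conffile_leftovers [DIR]  (hypothetical helper)
# Prints files that dpkg left next to configuration files:
#   *.dpkg-dist  packaged version, written when the local file was kept
#   *.dpkg-old   local version, written when the packaged file was installed
list_conffile_leftovers() {
    find "${1:-/etc}" -type f \
        \( -name '*.dpkg-dist' -o -name '*.dpkg-old' \) \
        | sort
}

# Assumed review loop: compare each leftover with the live file.
#   list_conffile_leftovers | while read -r f; do
#       diff -u "${f%.dpkg-*}" "$f"
#   done
```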
Obviously, this could be improved and turned into a full script, but I was hesitant to do so because some servers require *two* reboots instead of only one, which made scripting more awkward.
Ubuntu's release upgrades
Ubuntu has been working on this problem for over a decade. In Ubuntu-land, there is a do-release-upgrade command which does many things, including automatically modifying sources.list, removing orphaned packages, running under screen, but also handling proprietary drivers like Nvidia's. It's part of a proposal to make major upgrades more intuitive. do-release-upgrade is part of the ubuntu-release-upgrader package, which depends on update-manager. The upgrades can be run with GNOME, KDE or text frontends, which fulfills the usability requirements as well.
All this code is written in Python, and is pretty Ubuntu-specific. For example, it relies on a meta-release file which specifies which releases of Ubuntu are out there, which are supported and so on.
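To show the kind of information such a file carries, here is a small sketch that lists the supported releases from a meta-release style file. The field names (`Dist`, `Supported`) and the stanza layout are assumptions based on the published Ubuntu file, and the helper name `supported_dists` is made up.

```shell
#!/bin/sh
# supported_dists  (hypothetical helper)
# Reads a meta-release style file on stdin and prints the codename of every
# stanza marked as supported. Field names are assumed, see above.
supported_dists() {
    awk -F': *' '
        $1 == "Dist"                   { dist = $2 }     # remember the codename
        $1 == "Supported" && $2 == "1" { print dist }    # report it if supported
    '
}

# Assumed usage, with a locally saved copy of the file:
#   supported_dists < meta-release
```

A Debian equivalent would need a similar source of truth for which releases exist and which one a given release can upgrade to.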
Oddly enough, that meta-release file also ships a *copy* of DistUpgrade, the Python library that actually performs the upgrade on the fly, called the UpgradeTool. This is presumably to be able to bootstrap new upgrade code on top of previous releases, since when a stable release is first published, we don't yet know how to upgrade to the *next* stable release. This may be worked around by stable updates and the debian-security-support package, although that has its own set of issues as well.
Update-manager *used* to be in Debian, but was removed during the Jessie release cycle. There are some coupling issues between update-manager and ubuntu-release-upgrader: there's at least one build-dependency loop that would need to be resolved at the very least. I have also worked on fixing the dependencies on the package to make them installable on Debian and after a bit of coercing, they can more or less be installed on jessie, but it doesn't seem the python-apt library in jessie has the right API for everything to work correctly.
I have opened the discussion with Ubuntu folks to see if they would be open to collaborating on the project.
Apart from porting, maintenance and implementation issues, what remains unclear regarding the earlier stated requirements is the capacity of the upgrade scripts to manage pre/post hooks. It seems everything is driven by a few config files and there is no easy way to inject things in there.
It seems to me, however, that it would be better to try and improve that tool than reinvent the whole wheel from scratch.
Random comments can be added here, if you are hesitant to edit the proposal directly. Thanks for any feedback! -- TheAnarcat