On Thu, Mar 26, 2020 at 02:00:49PM +0200, Panu Matilainen wrote:
I left it open on purpose (note the "probably" in there) as there would be any number of ways to achieve the rebuild with varying degrees of automation and opt-out opportunities, depending on what is actually desirable for Fedora.
One possibility could be adding a rebuild step to the dnf system-upgrade plugin. Rebuilding the db after distro upgrades is not a bad idea regardless of db format changes (at least BDB performance would gradually degrade unless rebuilt every now and then). That would leave people doing the (unspeakable) distro-sync upgrade to deal with the database format manually, which might be just the right balance of freedom. Or not, I dunno. Other possibilities include planting a one-shot service in the rpm package itself that does the db rebuild on the next reboot and decommissions itself afterwards. Other variations certainly exist.
Suggestions welcome, just as long as you don't suggest rebuilding from rpm %posttrans :)
Right. I realize %posttrans is not a good idea. But *some* mechanism is necessary, because without that the change will mostly be a noop for most users. So I think this needs to be decided somehow.
Note that rpm will nag about the inconsistency between what's on disk and configuration until resolved one way or the other (the message could suggest --rebuilddb as well), so this wouldn't be an invisible thing.
OK. That's good, but I still think we should strive for a fully automatic handling of this. In particular, this message will not be visible with graphical updates.
Quoting from the FESCo ticket: About the various implementation options:
- in dnf system-upgrade: this does not cover normal 'dnf --releasever=33 upgrade' upgrades (as you mentioned earlier), but it also does not cover packagekit upgrades. It'd also create a
And which of these upgrade paths do we actually "support", or maybe the term here is "recommend" to the average user?
Both 'dnf system-upgrade' and gnome-software/packagekit.
This is the single biggest reason I left it so open: I got lost in the "maze of upgrade tools, all alike" years ago. There's not much point for me in devising a fancy scheme if it doesn't match what is expected in Fedora. Hence this conversation (which is good!)
previous-release-blocker(s) and previous-previous-release-blocker(s), since the changes would need to be deployed in F32 and F31. Also note that the last time the upgrade plugin runs code is in the upgrade phase between the two reboots, and at that point the plugin is running pre-upgrade code. This code would then invoke post-upgrade rpm. It's certainly doable, but seems a bit funky.
Right, requiring changes to previous versions is not okay. I seem to be thinking our upgrade tooling had gotten fixed at some point to perform the upgrade on the target distro packaging management stack as it would really need to be, but guess that was just a dream.
Relying on the target distro management stack sounds nice, but is actually problematic: how do you run the next version before you install the next version? Sure, you can install stuff to some temporary location and run the tools from there, but then you are running in a very custom franken-environment. Such a mode of running would face the same issue as the anaconda installer: it would only get tested during the upgrade season, languishing otherwise.
So nowadays we have a much simpler mechanism: reboot to a special system target without most daemons running (to avoid interference during the upgrade), run the update there, reboot into the new environment. The biggest advantage is that this way we reduce the amount of "custom": we're running normal installed dnf + rpm in a normal boot environment, we just stop the boot from progressing all the way to the usual graphical environment.
I think it's fair to say that the number of bugs related to the upgrade mechanism has been greatly reduced compared to previous schemes. We still have various upgrade issues, but they are in the rpms themselves, not in how we install them.
- a one-shot service: this is easier to implement, it just needs to happen in one place. The hard part is making sure that the machine does not get rebooted while the upgrade is happening. This is in particular a problem with VMs and containers. The rebuild should be wrapped with systemd-inhibit and other guards to make it hard to interrupt.
An interrupted database rebuild is harmless, has always been. Just as long as the one-shot service only decommissions itself once successfully completed, there's no damage done, there will always be the next reboot.
OK, then I think this is the way to go. (A libdnf plugin as suggested elsewhere in the thread would work too, but a one-shot service seems much easier to implement and test.)
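To make the "decommission only on success" behaviour concrete, here is a minimal sketch of what the one-shot job could look like. This is only an illustration, not a proposal for the actual implementation: the marker file path and the function name are hypothetical, and wrapping the command with systemd-inhibit is left out. The only real command assumed is `rpm --rebuilddb`.

```python
import subprocess
from pathlib import Path

# Hypothetical marker path: the rpm package would drop this file to request
# a rebuild. Nothing in rpm creates or reads it today.
DEFAULT_MARKER = Path("/var/lib/rpm/.rebuilddb-pending")
DEFAULT_COMMAND = ["rpm", "--rebuilddb"]


def run_one_shot(marker=DEFAULT_MARKER, command=DEFAULT_COMMAND):
    """Run the rebuild if the marker exists; return True when nothing is pending."""
    if not marker.exists():
        return True  # already decommissioned, nothing to do
    result = subprocess.run(command)
    if result.returncode != 0:
        return False  # keep the marker so the next boot tries again
    marker.unlink()  # decommission only after a successful rebuild
    return True
```

An interrupted or failed run leaves the marker in place, so the service simply fires again on the next reboot.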
No matter how it is wrapped, is the upgrade itself atomic? Having the new db built under a temporary file name and then atomically rename(2)d into place would be ideal.
The new database is built in another directory and only if that completes successfully, the old directory is moved out of the way and replaced with the new. So it's as atomic as it can be.
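For readers who want to picture the sequence described above, a rough sketch follows. It is not rpm's actual code: the function name, the `build` callback and the `.new.`/`.old` directory suffixes are all made up for illustration. It also shows why "as atomic as it can be" is the right phrasing, since there is a brief window between the two renames.

```python
import os
import shutil
import tempfile
from pathlib import Path


def rebuild_and_swap(db_dir, build):
    """Build a fresh copy next to db_dir and swap it in only on success."""
    db_dir = Path(db_dir)
    # Build in a sibling directory so both renames stay on one filesystem.
    new_dir = Path(tempfile.mkdtemp(prefix=db_dir.name + ".new.", dir=db_dir.parent))
    old_dir = db_dir.with_name(db_dir.name + ".old")
    try:
        build(db_dir, new_dir)  # hypothetical callback; the live db is untouched here
    except Exception:
        shutil.rmtree(new_dir, ignore_errors=True)  # discard the partial build
        raise
    os.rename(db_dir, old_dir)  # move the old directory out of the way
    os.rename(new_dir, db_dir)  # put the completed rebuild in its place
    shutil.rmtree(old_dir)
```

If `build` fails or is interrupted, the original directory is still in place and only the temporary one is discarded.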
Zbyszek