On Fri, Mar 27, 2020 at 10:34:49AM +0200, Panu Matilainen wrote:
On 3/27/20 9:55 AM, Zbigniew Jędrzejewski-Szmek wrote:
On Fri, Mar 27, 2020 at 09:04:53AM +0200, Panu Matilainen wrote:
On 3/26/20 2:35 PM, Zbigniew Jędrzejewski-Szmek wrote:
On Thu, Mar 26, 2020 at 02:00:49PM +0200, Panu Matilainen wrote:
previous-release-blocker(s) and previous-previous-release-blocker(s), since the changes would need to be deployed in F32 and F31. Also note that the last point at which the upgrade plugin runs code is the upgrade phase between the two reboots, and there the plugin is still running pre-upgrade code. This code would then invoke post-upgrade rpm. It's certainly doable, but seems a bit funky.
Right, requiring changes to previous versions is not okay. I seem to have been thinking that our upgrade tooling had gotten fixed at some point to perform the upgrade on the target distro's package management stack, as it would really need to be, but I guess that was just a dream.
Relying on the target distro management stack sounds nice, but is actually problematic: how do you run the next version before you install the next version? Sure, you can install stuff to some temporary location and run the tools from there, but then you are running in a very custom franken-environment. Such a mode of running would face the same issue as the anaconda installer: it would only get tested during the upgrade season, languishing otherwise.
Mock has this cool bootstrap image thing now. It seems to me we could use that image to run the system upgrades too [*]. And if/when we get koji to use that, it'll solve a number of age-old problems on the build system, AND that image will get heavily tested 24/7, so it wouldn't be any once-in-a-full-moon franken-thing.
[*] Mount the host filesystem from mock and perform a dnf --installroot=... distro-upgrade on that, turning the whole landscape inside out.
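A rough sketch of what the footnote describes, purely for illustration: it only builds the dnf command line and does not run anything. The mount point /mnt/host and the release number are made-up values, and dnf's distro-sync command is used here as a stand-in for the "distro-upgrade" step; how the host filesystem would actually get mounted into the bootstrap image is not covered.

```python
def build_installroot_upgrade(host_root, target_release):
    """Return the argv for a dnf run that operates on host_root instead of '/'.

    host_root is a hypothetical path where the host filesystem is mounted
    inside the bootstrap image; target_release is the release to upgrade to.
    """
    return [
        "dnf",
        f"--installroot={host_root}",      # act on the mounted host, not the image
        f"--releasever={target_release}",  # resolve repos for the target release
        "--assumeyes",
        "distro-sync",                     # stand-in for the "distro-upgrade" step
    ]


# Hypothetical values, only to show the resulting command line.
argv = build_installroot_upgrade("/mnt/host", "32")
print(" ".join(argv))
```

The point of the shape is that the dnf and rpm doing the work come from the image (the new release), while the filesystem being modified is the host's.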
Where would mock be executing from? The same filesystem it is modifying?
Where is the offline upgrade executing from? How's this fundamentally different?
It's not. The point I was trying to make is that IF we are running from the host filesystem, it is easier to run directly from it.
This subject has a long history of different approaches. Things that are more like what you describe than what we're currently using have been used in the past. And at least for Fedora, it seems that the simplicity of the current approach wins over the limitations. For RHEL the best solution may need to be different.
Oh come on. Running from a bootstrap image allows using full native capabilities of rpm/dnf in any new version, without having to consider what the previous versions support. How's that "not much"?
Yes, that is an important hurdle that Fedora generally doesn't encounter at all. Fedora usually waits until the new rpm functionality is released in older versions of Fedora before allowing it to be used in rawhide. I think this should be a viable approach for RHEL too — after all, rpm is very good at keeping backwards compatibility.
Another approach could be to perform the upgrade in two steps: have an rpm+dnf stack compiled for the old version, install it, and then do the upgrade to the real target version. Dunno, that's quickly getting complex.
Let's consider a concrete example that came up recently: grub wants to rewrite something in the bootloader area on disk to help upgrades from very old installations. In the current "offline upgrade" scheme, the upgrade tools are running on the real system, with udev active. They can query and touch hardware, can see all the disks as they are, etc. If we went through mock, it'd be running in an nspawn environment w/o access to hardware.
And still that offline upgrade will be running on the old system's kernel, which will simply *prevent* certain types of actions from being performed in an upgrade, just like using the host system packaging stack *prevents* use of native capabilities in the next version, just because the old version doesn't support them, which is just totally a** backwards. Really.
Note that I'm talking about a high-level idea here. I haven't looked at what a mock bootstrap image looks like, I haven't looked at what offline upgrade looks like. Sure, there would be technical details, perhaps even obstacles, to sort out.
(Something like os-tree's atomic replacement of things, that's of course a completely different story. But so far we're talking about traditional systems.)
So nowadays we have a much simpler mechanism: reboot to a special system target without most daemons running (to avoid interference during the upgrade), run the update there, reboot into the new environment. The biggest advantage is that this way we reduce the amount of "custom": we're running normal installed dnf + rpm in a normal boot environment, we just stop the boot from progressing all the way to the usual graphical environment.
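To make the sequence above concrete, here is a small sketch that lists the phases in order. The command strings are the dnf system-upgrade plugin subcommands as I understand them; the Phase type, the environment descriptions and the release number are only illustrative, and nothing here is executed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    name: str
    environment: str  # where the step runs (descriptive text only)
    command: str      # what gets run there


def offline_upgrade_phases(releasever):
    """List the steps of an offline upgrade in the order they happen."""
    return [
        Phase("download", "normal running system",
              f"dnf system-upgrade download --releasever={releasever}"),
        Phase("reboot", "normal running system",
              "dnf system-upgrade reboot"),
        # Between the two reboots: most daemons are not started, and the
        # transaction is run by the rpm/dnf of the OLD release.
        Phase("upgrade", "special minimal boot target, old rpm/dnf and kernel",
              "dnf system-upgrade upgrade"),
        Phase("finish", "new release, after the second reboot",
              "(normal boot)"),
    ]


for phase in offline_upgrade_phases("32"):
    print(f"{phase.name:9} [{phase.environment}] {phase.command}")
```

Note that the third step is where the actual rpm transaction happens, and it is still the previously installed stack doing it.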
I think it's fair to say that the number of bugs related to the upgrade mechanism has been greatly reduced compared to previous schemes. We still have various upgrade issues, but they are in the rpms themselves, and not in how we install them.
Such a scheme may be feasible in a fast-moving distro like Fedora, where you can always afford to sit out the next six months waiting for the new thing to become available in the rawhide-1 version too, but it's totally infeasible in something like RHEL. RHEL/CentOS 7 to 8 upgrades with such a scheme only happen to "work" because bugs such as the missing rpmlib() dependency on file triggers kinda let things stumble through the cracks.
The premise is that the upgrade is really a normal dnf upgrade, i.e. a normal 'rpm -U' operation under the hood. The differences are: 1: package count, 2: setting up the machine in a mode where the graphical env. and other non-essential daemons are not active. So if requirements in rpms are specified correctly, the upgrade should always go through. If they are specified incorrectly, then the same problems would occur on smaller updates. So the upgrade path is something to test to catch such issues in packaging. (And in general, the fewer scriptlets, the better. This solves the issue even better.)
You're missing the point. Missing capabilities in the older version rpm can and will PREVENT you from doing the update AT ALL.
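A toy model of that failure mode, as a sketch only: a package declares the rpmlib() features it needs, and the rpm doing the install has to provide all of them. The capability sets below are illustrative and not taken from any particular rpm build, and real rpm also compares versions on these dependencies, which is left out here.

```python
def missing_capabilities(package_requires, rpm_provides):
    """Return the rpmlib() capabilities a package needs that this rpm lacks."""
    return sorted(
        cap for cap in package_requires
        if cap.startswith("rpmlib(") and cap not in rpm_provides
    )


# Illustrative capability sets only; not read from any real rpm.
OLD_RPM = {"rpmlib(CompressedFileNames)", "rpmlib(FileDigests)"}
NEW_RPM = OLD_RPM | {"rpmlib(RichDependencies)", "rpmlib(PayloadIsZstd)"}

# A hypothetical package built for the new release.
pkg = {
    "rpmlib(CompressedFileNames)",
    "rpmlib(PayloadIsZstd)",
    "libc.so.6()(64bit)",  # ordinary dependency, ignored by this check
}

print("old rpm is missing:", missing_capabilities(pkg, OLD_RPM))
print("new rpm is missing:", missing_capabilities(pkg, NEW_RPM))
```

With the old stack the list is non-empty, so the package cannot be installed at all; with the new stack it is empty. No amount of correct packaging on the new side changes what the old rpm provides.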
Even in Fedora people have seen occasional glimpses of this; in enterprise distros this is a complete show-stopper.
I want a normal dnf upgrade just as much as you do, it's just that it needs to be run from the new version, not the old. One way or the other.
I see your point.
Zbyszek