On Fri, Mar 27, 2020 at 09:04:53AM +0200, Panu Matilainen wrote:
On 3/26/20 2:35 PM, Zbigniew Jędrzejewski-Szmek wrote:
On Thu, Mar 26, 2020 at 02:00:49PM +0200, Panu Matilainen wrote:
previous-release-blocker(s) and previous-previous-release-blocker(s), since the changes would need to be deployed in F32 and F31. Also note that the last point at which the upgrade plugin runs code is the upgrade phase between the two reboots, and there the plugin is running pre-upgrade code. This code would then invoke the post-upgrade rpm. It's certainly doable, but seems a bit funky.
Right, requiring changes to previous versions is not okay. I seem to have been thinking that our upgrade tooling had been fixed at some point to perform the upgrade on the target distro's package management stack, as it would really need to, but I guess that was just a dream.
Relying on the target distro management stack sounds nice, but is actually problematic: how do you run the next version before you install the next version? Sure, you can install stuff to some temporary location and run the tools from there, but then you are running in a very custom franken-environment. Such a mode of running would face the same issue as the anaconda installer: it would only get tested during the upgrade season, languishing otherwise.
Mock has this cool bootstrap image thing now. It seems to me we could use that image to run the system upgrades too [*]. And if/when we get koji to use that, it'll solve a number of age-old problems in the build system, AND that image will get heavily tested 24/7, so it wouldn't be any once-in-a-blue-moon franken-thing.
[*] Mount the host filesystem from mock and perform a dnf --installroot=... distro-upgrade on that, turning the whole landscape inside out.
Where would mock be executing from? The same filesystem it is modifying? Somehow it seems that this doesn't change much, but just brings in another layer. Or will a complete copy of the system be made in memory to execute the upgrade tools from?
Let's consider a concrete example that came up recently: grub wants to rewrite something in the bootloader area on disk to help upgrades from very old installations. In the current "offline upgrade" scheme, the upgrade tools are running on the real system, with udev active. They can query and touch hardware, can see all the disks as they are, etc. If we went through mock, it'd be running in an nspawn environment w/o access to hardware.
(Something like os-tree's atomic replacement of things, that's of course a completely different story. But so far we're talking about traditional systems.)
So nowadays we have a much simpler mechanism: reboot to a special system target without most daemons running (to avoid interference during the upgrade), run the update there, reboot into the new environment. The biggest advantage is that this way we reduce the amount of "custom": we're running normal installed dnf + rpm in a normal boot environment, we just stop the boot from progressing all the way to the usual graphical environment.
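For reference, the flow described above can be sketched roughly as follows. This is only an illustration: the helper name offline_upgrade_commands is made up here, and it just builds the two user-facing dnf system-upgrade commands without executing anything. The upgrade transaction itself runs in between, in the minimal boot environment (systemd's system-update.target).

```python
from typing import List


def offline_upgrade_commands(target_release: int) -> List[List[str]]:
    """Build (but do not run) the two user-visible steps of an offline upgrade.

    Hypothetical helper for illustration only; it returns argv lists.
    """
    return [
        # Step 1: download the packages for the target release while the
        # system is still running normally.
        ["dnf", "system-upgrade", "download", f"--releasever={target_release}"],
        # Step 2: reboot into the minimal upgrade environment, where the
        # normal installed dnf + rpm run the transaction; the machine then
        # reboots again into the upgraded system.
        ["dnf", "system-upgrade", "reboot"],
    ]


if __name__ == "__main__":
    for cmd in offline_upgrade_commands(32):
        print(" ".join(cmd))
```

The point of the sketch is how little is custom: both steps go through the dnf that is already installed on the system being upgraded.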
I think it's fair to say that the number of bugs related to the upgrade mechanism has been greatly reduced compared to previous schemes. We still have various upgrade issues, but they are in the rpms themselves, and not in how we install them.
Such a scheme may be feasible in a fast-moving distro like Fedora, where you can always afford to sit out the next six months waiting for the new thing to become available in the rawhide-1 version too, but it's totally infeasible in something like RHEL. RHEL/CentOS 7 to 8 upgrades with such a scheme only happen to "work" because bugs such as the missing rpmlib() dependency on file triggers kinda let things stumble through the cracks.
The premise is that the upgrade is really a normal dnf upgrade, i.e. a normal 'rpm -U' operation under the hood. The differences are: 1: package count, 2: setting up the machine in a mode where the graphical env. and other non-essential daemons are not active. So if requirements in rpms are specified correctly, the upgrade should always go through. If they are specified incorrectly, then the same problems would occur on smaller updates. So the upgrade path is something to test to catch such issues in packaging. (And in general, the fewer scriptlets, the better. This solves the issue even better.)
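To make the "requirements specified correctly" point concrete, here is a toy model, not rpm's actual solver. All names in it are hypothetical, including the capability strings rpmlib(OldFeature) and rpmlib(NewFeature) and the helper unmet_requires. It only shows that a declared requirement lets the old tooling refuse the transaction up front, while an undeclared one slips through unnoticed.

```python
from typing import Dict, List, Set

# Toy package metadata: name -> {"requires": {...}, "provides": {...}}.
Packages = Dict[str, Dict[str, Set[str]]]


def unmet_requires(packages: Packages, tool_provides: Set[str]) -> Dict[str, List[str]]:
    """Return, per package, the requirements nothing in the transaction provides.

    tool_provides stands in for the capabilities of the rpm doing the install.
    """
    provided = set(tool_provides)
    for name, meta in packages.items():
        provided.add(name)
        provided |= meta.get("provides", set())

    missing: Dict[str, List[str]] = {}
    for name, meta in packages.items():
        gaps = sorted(meta.get("requires", set()) - provided)
        if gaps:
            missing[name] = gaps
    return missing


# The old rpm only knows the old feature (hypothetical capability names).
old_rpm = {"rpmlib(OldFeature)"}

target_packages: Packages = {
    # Declares its dependency on the new feature: the problem is visible.
    "foo": {"requires": {"rpmlib(NewFeature)"}, "provides": set()},
    # Uses the new feature but does not declare it: nothing is reported.
    "bar": {"requires": set(), "provides": set()},
}

if __name__ == "__main__":
    print(unmet_requires(target_packages, old_rpm))
```

In this model only foo is flagged, even though bar has the same underlying problem; that is the kind of packaging issue testing the upgrade path is meant to catch.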
Zbyszek