Summary----------
Most all Fedora variants (except Cloud) have a GRUB menu entry
containing the word "rescue". This kernel+initramfs pair are never
updated for the life of a Fedora installation. And they quickly become
stale as a Fedora installation ages. This kernel's modules are
eventually deleted, and if selected at boot time, the typical user
experience is a dracut shell.
Basic background-------------
(skip this section if you know how it works)
During a new installation, a single kernel version is installed. e.g.
vmlinuz-5.17.0-0.rc4.96.fc36.x86_64 which is then duplicated as e.g.
vmlinuz-0-rescue-3a86878de5d649a983916543ece7bb7e.
Each of those (identical) kernels has an initramfs file:
initramfs-5.17.0-0.rc4.96.fc36.x86_64.img
initramfs-0-rescue-3a86878de5d649a983916543ece7bb7e.img
The sole difference is the first one is a smaller host-only initramfs,
the second one is a larger no host-only initramfs created with `dracut
-N`. The bigger one just contains a bunch of extra kernel modules and
dracut scripts, ostensibly to make it more likely to boot a system
with some change in hardware that the host-only initramfs doesn't
contain. The size of this rescue initramfs is around 100 MiB, with the
common day to day "host only" initramfs being around 33 MiB. [1]
As the system is updated, additional kernel versions are installed.
dnf.conf contains installonly_limit=3, which results in a maximum of
three kernel versions being installed at a time. Once a fourth kernel
is installed, the first kernel and its modules are removed from
/usr/lib/modules. The rescue kernel+initramfs pair are never updated
or upgraded, even during system upgrades.
Observations------------
This has been discussed by the Workstation working group [2] but since
this functionality is present in all of Fedora, we're moving the
discussion for greater visibility.
There's two separate complaints, if you will: (a) that the
kernel+initramfs pair are never update or upgraded for the life of the
installation; and (b) that even during one release cycle, the user
experience when booting the rescue entry, changes, i.e. when the
matching /usr/lib/modules for the rescue entry are present early on,
you do get a full runtime behavior, you will get to a graphical
environment. But then once the version matched /usr/lib/modules are
removed, you get a completely different behavior when booting the
rescue entry.
An important note from that ticket from Justin Forbes, the Fedora
kernel maintainer: " Remember, the only real purpose of the rescue
kernel is to get your system out of something completely unusable. It
isn't meant to be a full runtime."
Questions------------
* Considering the very narrow purpose of the entry, maybe the current
behavior is adequate?
* Does the rescue entry reliably get users to a dracut prompt, rather
than indefinite hang? I don't know whether it does.
* Is there any way to improve the situation without increasing the
risk that the rescue entry becomes totally non-functional?
* The chosen kernel version needs to be based on one that is known
to boot. Currently we know the kernel+initramfs pair work because it's
the same version used to boot the installation media when doing the
initial provisioning. We don't actually know an updated replacement
"no host-only" initramfs will work until it's tried. Is it possible to
automate this? And is it worth the risk, or even figuring out how to
assess the risk?
* At Flock 2021, Zbyszek proposed "Building Initrd Images from
RPMs" to reduce the complexity of building initramfs, maybe there's a
role for it here? More:
https://www.youtube.com/watch?v=GATg_bqmASc
* What happens if we accept some scope creep, and go for many
improvements that make the extra work worth it?
* What about the unsigned nature of the initramfs? Should we be
creating initramfs's in Fedora infra and signing them?
* Stuff a graphical rescue environment into the initramfs? (This
might be ten leaps too far, but it's intended to encourage thinking
with a vivid imagination.)
[1] both values from a recent Fedora 36 Workstation installation
[2]
https://pagure.io/fedora-workstation/issue/259
--
Chris Murphy