Hiding the grub menu by default on single OS installs

Hans de Goede

Thursday, 31 May 2018

5:23 a.m.

Stephen Gallagher

Thursday, 31 May

5:36 a.m.

On Thu, May 31, 2018 at 6:24 AM Hans de Goede <hdegoede(a)redhat.com> wrote:

...

I think part of the reason is that non-technical people might not know how to recover if a kernel update had a regression leaving their system unbootable. At least with the boot config screen there, it offers them something to try. I would be concerned if we drop this without instituting an alternative way to (perhaps automatically) revert to an older kernel if boot failed to reach some sensible systemd target.

Hans de Goede

5:52 a.m.

Hi, On 31-05-18 12:36, Stephen Gallagher wrote:

> On Thu, May 31, 2018 at 6:24 AM Hans de Goede <hdegoede(a)redhat.com> wrote:
>
> > Hi All,
> >
> > I'm working on improving the Fedora boot experience, with the end goal being a user pressing the on button and then going to the graphical login manager without him seeing any text messages / menus filled with technical jargon.
> >
> > IIRC we used to hide the grub-menu by default on single OS installs, but we seem to have stopped doing that. For new Fedora 29 installs I would like us to start hiding the menu by default on single OS installs again, see:
> >
> > https://fedoraproject.org/wiki/Changes/HiddenGrubMenu
> >
> > The goal of this email is to:
> >
> > 1) Give people an advance warning about the plan to change this so we can discuss this early on
> >
> > 2) See if anyone knows why we stopped doing this. I think we may simply have stopped doing this to simplify the bootconfig code in anaconda and because we did not always identify the single OS case correctly, but I wonder if there were other reasons?
>
> I think part of the reason is that non-technical people might not know how to recover if a kernel update had a regression leaving their system unbootable. At least with the boot config screen there, it offers them something to try.
>
> I would be concerned if we drop this without instituting an alternative way to (perhaps automatically) revert to an older kernel if boot failed to reach some sensible systemd target.

Revert to the older kernel, or show the menu?

I also have working on fastboot support on my TODO list, which means not checking for a keypress in grub *at all*, because that check causes the EFI firmware to scan all USB buses for a keyboard, which can be quite slow. This indeed involves setting a "boot_success" grub environment variable, which grub clears at boot; if it is not set again before the next boot, grub will not fastboot.

The fastboot stuff is more of a Fedora 30 than a Fedora 29 thing, but I guess I could bring the bits which signal a successful boot forward to 29 and use them to decide between showing the menu with our default 5 second timeout and hiding it and waiting 1 sec.

The plan for fastboot was to show the menu, not to auto fallback, as there can be many reasons why the boot has failed.

This will basically get us back the F28 behavior of showing the menu, but only after a failed boot. I think that is a good solution, do you agree?

Regards,

Hans
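Editorial sketch: the "boot_success" handshake described in the message above can be modelled in a few lines of Python. This is only an illustration of the idea as stated in the thread (grub clears the flag on every boot, something later in the boot sets it again on success), not the actual grub or Fedora implementation. The names `GrubEnv`, `grub_stage`, `mark_boot_successful` and `simulate` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GrubEnv:
    """Hypothetical stand-in for the GRUB environment block."""
    boot_success: bool = False


def grub_stage(env: GrubEnv) -> str:
    """Decide what GRUB does on this boot, then clear the flag."""
    previous_boot_ok = env.boot_success
    env.boot_success = False  # GRUB clears the flag on every boot
    return "hidden" if previous_boot_ok else "menu"


def mark_boot_successful(env: GrubEnv) -> None:
    """Run by the booted system once the boot is considered good."""
    env.boot_success = True


def simulate(outcomes):
    """Run one boot per outcome (True = boot succeeded).

    Returns what GRUB did at the start of each boot. Assumes the boot
    before the first simulated one was successful.
    """
    env = GrubEnv(boot_success=True)
    decisions = []
    for ok in outcomes:
        decisions.append(grub_stage(env))
        if ok:
            mark_boot_successful(env)
    return decisions
```

In this model a failed boot only affects the boot that follows it: `simulate([True, False, True, True])` hides the menu twice, shows it once after the failure, then hides it again.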

Stephen Gallagher

6:59 a.m.

On Thu, May 31, 2018 at 6:53 AM Hans de Goede <hdegoede(a)redhat.com> wrote:

> Revert to the older kernel, or show the menu?

Showing the menu provides the user a way to revert to the older kernel, so it's fine with me.

> I also have working on fastboot support on my TODO, which means not checking for a keypress in grub *at all* because that check will cause EFI firmware to scan all USB buses for a keyboard which can be quite slow. This indeed involves setting a "boot_success" grub environment variable, which grub clears at boot and if not re-set the next boot grub will not fastboot.

Interesting. How slow are we talking about? Measured in milliseconds or seconds?

...

If we are hiding it and have no detected keyboard, what's the value of waiting one second anyway? Shouldn't we skip the wait entirely?

> The plan for fastboot was to show the menu, not to auto fallback as there can be many reasons why the boot has failed.

> This will basically get us back the F28 behavior of showing the menu but only after a failed boot, I think that is a good solution, do you agree?

Yeah, that would be fine with me.

Hans de Goede

7:25 a.m.

Hi, On 31-05-18 13:59, Stephen Gallagher wrote:

> Showing the menu provides the user a way to revert to the older kernel, so it's fine with me.

Ok.

...

Up to multiple seconds (depending on the hardware and amount of attached USB devices).

> > The fastboot stuff is more of a Fedora 30 than 29 thing, but I guess I could bring the bits which signal a successful boot forwards to 29 and use that to decide between showing the menu with our default 5 second timeout and hiding it and waiting 1 sec.
>
> If we are hiding it and have no detected keyboard, what's the value of waiting one second anyway? Shouldn't we skip the wait entirely?

For F29 the plan is to just hide it (unless a previous boot failed). Not checking for a keypress at all is part of the full fastboot implementation, which is best left for Fedora 30 I think. Once we get the full fastboot implementation, the 1 second delay will indeed be removed.

So for F29, single OS install we get:

1) grub menu hidden by default with a 1 second timeout to press ESC or F8 to show it
2) grub menu shown with 5 sec timeout after a failed boot

And for F30, single OS install we get:

1) grub menu not shown, 0 second timeout, no way to get to the menu
2) grub menu shown with 5 sec timeout after a failed boot

Originally I was planning on doing the failed-boot detect only for F30, but I agree it makes sense to have it for F29, and this will also give us some field testing of this while we still have a fallback in the form of the 1 sec wait for ESC / F8.

This is all defaults btw and can all be overridden by the user if so desired.

Regards,

Hans
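Editorial sketch: the proposed defaults for single OS installs can be written down as a small lookup table, which makes the difference between the two releases easy to compare. The values come straight from the message above; the names `MenuPolicy`, `POLICY` and `menu_policy` are hypothetical and only serve the illustration.

```python
from typing import NamedTuple, Tuple


class MenuPolicy(NamedTuple):
    menu_visible: bool
    timeout_seconds: int
    reveal_keys: Tuple[str, ...]  # keys that reveal a hidden menu, if any


# Key: (release, previous boot succeeded?). Values as proposed in the thread.
POLICY = {
    ("F29", True): MenuPolicy(False, 1, ("ESC", "F8")),
    ("F29", False): MenuPolicy(True, 5, ()),
    ("F30", True): MenuPolicy(False, 0, ()),
    ("F30", False): MenuPolicy(True, 5, ()),
}


def menu_policy(release: str, previous_boot_ok: bool) -> MenuPolicy:
    """Look up the proposed default menu behaviour for a single OS install."""
    return POLICY[(release, previous_boot_ok)]
```

The only row that differs between the releases is the successful-boot case: F29 keeps a 1 second window with ESC / F8, while the F30 proposal leaves no key-based way in, which is the point the replies below object to.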

Chris Adams

8:08 a.m.

Once upon a time, Hans de Goede <hdegoede(a)redhat.com> said:

> And for F30, single OS install we get:
>
> 1) grub menu not shown, 0 second timeout, no way to get to the menu
> 2) grub menu shown with 5 sec timeout after a failed boot

If I know I want the menu (say I need to boot single-user to fix something), how would I do that in this setup?

--
Chris Adams <linux(a)cmadams.net>

Gerald B. Cox

8:42 a.m.

On Thu, May 31, 2018 at 6:08 AM, Chris Adams <linux(a)cmadams.net> wrote:

> Once upon a time, Hans de Goede <hdegoede(a)redhat.com> said:
> > And for F30, single OS install we get:
> >
> > 1) grub menu not shown, 0 second timeout, no way to get to the menu
> > 2) grub menu shown with 5 sec timeout after a failed boot
>
> If I know I want the menu (say I need to boot single-user to fix something), how would I do that in this setup?

I'm fine with changing the default - I understand that under normal circumstances most people couldn't care less about seeing the screen - but I do strongly agree with the comment above. When things sometimes go south and you need that menu, there needs to be a simple, well-documented way to get it... easily... without having to go on a Google treasure hunt to find the instructions. Otherwise, don't do it.

Michael Cronenworth

9:26 a.m.

On 05/31/2018 08:42 AM, Gerald B. Cox wrote:

...

Replace the grub menu with "Press any key to access GRUB" or something similar at the bottom of the screen to at least be a visual clue.

Panu Matilainen

Friday, 1 June

2:16 a.m.

On 05/31/2018 05:26 PM, Michael Cronenworth wrote:

> On 05/31/2018 08:42 AM, Gerald B. Cox wrote:
> > I'm fine with changing the default - I understand that under normal circumstances most people couldn't care less about seeing the screen - but I do strongly agree with the comment above. When things sometimes go south and you need that menu, there needs to be a simple, well-documented way to get it... easily... without having to go on a Google treasure hunt to find the instructions. Otherwise, don't do it.
>
> Replace the grub menu with "Press any key to access GRUB" or something similar at the bottom of the screen to at least be a visual clue.

Yes! That's a simple and straightforward (and thus reliable) solution to the "issue" at hand. I don't care about the *menu* either, but hidden key combos that need to be hit at some machine-dependent time window are *terrible*.

- Panu -

Louis Lagendijk

Thursday, 31 May

12:11 p.m.

On Thu, 2018-05-31 at 06:42 -0700, Gerald B. Cox wrote:

> On Thu, May 31, 2018 at 6:08 AM, Chris Adams <linux(a)cmadams.net> wrote:
> > Once upon a time, Hans de Goede <hdegoede(a)redhat.com> said:
> > > And for F30, single OS install we get:
> > >
> > > 1) grub menu not shown, 0 second timeout, no way to get to the menu
> > > 2) grub menu shown with 5 sec timeout after a failed boot
> >
> > If I know I want the menu (say I need to boot single-user to fix something), how would I do that in this setup?
>
> I'm fine with changing the default - I understand that under normal circumstances most people couldn't care less about seeing the screen - but I do strongly agree with the comment above. When things sometimes go south and you need that menu, there needs to be a simple, well-documented way to get it... easily... without having to go on a Google treasure hunt to find the instructions. Otherwise, don't do it.

How would this feature interact with things like /.autorelabel? Would the presentation of the grub menu in that case still depend on the previous boot being marked as successful? Is a boot that only does a relabel successful?

And how about differences between upgrades (where, IIRC, the boot loader is not re-installed) and new installations? Could that cause issues?

/Louis

Hans de Goede

10:31 a.m.

Hi, On 31-05-18 15:08, Chris Adams wrote:

...

Hopefully whatever you want to fix will count as a "failed boot" if it requires single user mode. I can and certainly will add a command-line utility to force showing the menu on the next boot, but that assumes a somewhat working system.

I guess the plan is to have a few daemons which are considered critical (if enabled), say sshd and gdm, and if any of them don't start, consider the boot failed. This also means that if you ctrl+alt+del early on, causing the system to reboot without ever starting those, that will also give you the grub menu.

Note that this is exactly why this is a F30 thing, to give us a chance to figure out how exactly to detect a failed boot.

Also I would like to note that Windows has been doing more or less the same since Vista and it does not seem to cause any problems for Windows.

Regards,

Hans
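Editorial sketch: the "critical daemons" idea floated above amounts to a set comparison. The following Python is a minimal illustration under the assumptions stated in the message (only enabled units count, sshd and gdm are the examples given); the function name `boot_failed` and the constant `CRITICAL_UNITS` are hypothetical, and nothing here reflects what was eventually implemented.

```python
# Example units named in the thread; the real list was still undecided.
CRITICAL_UNITS = frozenset({"sshd", "gdm"})


def boot_failed(enabled_units, started_units, critical=CRITICAL_UNITS):
    """Return True if any critical unit that is enabled did not start.

    Critical units that are not enabled on this system are ignored.
    """
    required = set(critical) & set(enabled_units)
    return not (required <= set(started_units))
```

An early ctrl+alt+del corresponds to an empty `started_units`, so `boot_failed({"sshd", "gdm"}, set())` is true, while a headless machine with only sshd enabled is not penalised for gdm being absent. A lost root password, raised later in the thread, would not be caught by a check of this kind.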

Rob Clark

10:41 a.m.

On Thu, May 31, 2018 at 11:31 AM, Hans de Goede <hdegoede(a)redhat.com> wrote:

> Hi,
>
> On 31-05-18 15:08, Chris Adams wrote:
> > Once upon a time, Hans de Goede <hdegoede(a)redhat.com> said:
> > > And for F30, single OS install we get:
> > >
> > > 1) grub menu not shown, 0 second timeout, no way to get to the menu
> > > 2) grub menu shown with 5 sec timeout after a failed boot
> >
> > If I know I want the menu (say I need to boot single-user to fix something), how would I do that in this setup?
>
> Hopefully what ever you want to fix will count as a "failed boot" if it requires single user mode. I can and certainly will add a commandline utility to force showing the menu on the next boot, but that assumes a somewhat working system.

I was going to bring up one sort of niche use-case, of installing a -debug kernel (either while working on the kernel or helping to debug an issue that the kernel developer cannot reproduce), since debug kernels end up a lower priority and wouldn't normally be the default selection.. but a reboot-grubmenu command would totally cover that use-case.

Side note, android's 'reboot' cmd can take an argument, like 'reboot fastboot' or 'reboot recovery'.. that might be one of the few features from android worth copying ;-)

BR,
-R

Gerd Hoffmann

Friday, 1 June

4:27 a.m.

Hi,

> Side note, android's 'reboot' cmd can take an argument, like 'reboot fastboot' or 'reboot recovery'.. that might be one of the few features from android worth copying ;-)

I'm still missing something similar to "lilo -R <cmdline>" in the world of modern boot loaders. This used to set the lilo command line for the next boot. The lilo command line is the boot entry name plus additional kernel parameters.

So you say ... "lilo -R default single && reboot" ... for a single user boot. Or ... "lilo -R testkernel && reboot" ... to boot a different kernel once.

Seems at least for the second use case some out-of-tree grub2 patches are floating around, adding a --once switch to grub-set-default. Some linux systems have it, some don't ...

cheers,
  Gerd

Kevin Fenzi

Sunday, 3 June

4:03 p.m.

On 06/01/2018 02:27 AM, Gerd Hoffmann wrote:

> Hi,
>
> > Side note, android's 'reboot' cmd can take an argument, like 'reboot fastboot' or 'reboot recovery'.. that might be one of the few features from android worth copying ;-)
>
> I'm still missing something similar to "lilo -R <cmdline>" in the world of modern boot loaders. This used to set the lilo command line for the next boot. lilo command line is boot entry name plus additional kernel parameters.
>
> So you say ... "lilo -R default single && reboot" ... for a single user boot. Or ... "lilo -R testkernel && reboot" ... to boot a different kernel once.
>
> Seems at least for the second use case some out-of-tree grub2 patches are floating around, adding a --once switch to grub-set-default. Some linux systems have it, some don't .

grub2-reboot 1 will set it to boot entry 1 (it starts counting from 0) the next time you boot only. It doesn't help with passing args, but it does let you boot a particular kernel entry.

kevin
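Editorial sketch: the two properties mentioned above (entries are counted from 0, and the selection applies to the next boot only) can be shown with a toy model. This Python does not call grub2-reboot; `next_boot_entry` and the entry names are hypothetical and exist only to illustrate the semantics.

```python
def next_boot_entry(entries, default_index=0, one_shot_index=None):
    """Pick the menu entry for the coming boot.

    Returns (entry, remaining_one_shot). A one-shot selection is consumed
    by the boot that uses it, so the second element is always None.
    Indices are zero-based, as in the message above.
    """
    if one_shot_index is not None:
        return entries[one_shot_index], None
    return entries[default_index], None


# Hypothetical menu, newest kernel first.
entries = ["new kernel", "previous kernel", "rescue"]

first, leftover = next_boot_entry(entries, one_shot_index=1)
second, _ = next_boot_entry(entries, one_shot_index=leftover)
```

Here `first` is the second entry in the list, because index 1 counts from 0, and `second` falls back to the default entry since the one-shot choice has been used up.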

Chris Adams

Thursday, 31 May

12:09 p.m.

Once upon a time, Hans de Goede <hdegoede(a)redhat.com> said:

> On 31-05-18 15:08, Chris Adams wrote:
> > Once upon a time, Hans de Goede <hdegoede(a)redhat.com> said:
> > > And for F30, single OS install we get:
> > >
> > > 1) grub menu not shown, 0 second timeout, no way to get to the menu
> > > 2) grub menu shown with 5 sec timeout after a failed boot
> >
> > If I know I want the menu (say I need to boot single-user to fix something), how would I do that in this setup?
>
> Hopefully what ever you want to fix will count as a "failed boot" if it requires single user mode.

Hmm... not really. For example (just off the top of my head): lost root password. Without it, you won't be able to set any "next boot is special" option, so resetting root's password would now require rescue media.

I'd say I'm against this part of your proposal. I understand the reasoning, but it just seems like too much restriction to shave off a small amount of time. You mentioned it "could take several seconds" for EFI to initialize USB, but what is the normal time, for a typical desktop or notebook with just a keyboard and mouse attached?

--
Chris Adams <linux(a)cmadams.net>

Chuck Anderson

3:18 p.m.

On Thu, May 31, 2018 at 12:09:45PM -0500, Chris Adams wrote:

> Once upon a time, Hans de Goede <hdegoede(a)redhat.com> said:
> > On 31-05-18 15:08, Chris Adams wrote:
> > > Once upon a time, Hans de Goede <hdegoede(a)redhat.com> said:
> > > > And for F30, single OS install we get:
> > > >
> > > > 1) grub menu not shown, 0 second timeout, no way to get to the menu
> > > > 2) grub menu shown with 5 sec timeout after a failed boot
> > >
> > > If I know I want the menu (say I need to boot single-user to fix something), how would I do that in this setup?
> >
> > Hopefully what ever you want to fix will count as a "failed boot" if it requires single user mode.
>
> Hmm... not really. For example (just off the top of my head): lost root password. Without it, you won't be able to set any "next boot is special" option, so resetting root's password would now require rescue media.
>
> I'd say I'm against this part of your proposal. I understand the reasoning, but it just seems like too much restriction to shave off a small amount of time. You mentioned it "could take several seconds" for EFI to initialize USB, but what is the normal time, for a typical desktop or notebook with just a keyboard and mouse attached?

I agree. Another use case is that failed graphics initialization may not count as "failed boot".

Chris Murphy

Friday, 1 June

1:54 p.m.

On Thu, May 31, 2018 at 11:09 AM, Chris Adams <linux(a)cmadams.net> wrote:

...

Reminder, starting with Fedora 28, the root user does not have a passphrase set by default - so it's effectively disabled. And that means emergency and rescue targets are kinda useless. One of those is single user mode (I always forget which one).

> I'd say I'm against this part of your proposal. I understand the reasoning, but it just seems like too much restriction to shave off a small amount of time. You mentioned it "could take several seconds" for EFI to initialize USB, but what is the normal time, for a typical desktop or notebook with just a keyboard and mouse attached?

Will anyone even benefit from non-initialization of USB if they don't disable such initialization in the firmware? Based on prior conversations a while ago with GRUB folks, GRUB can (optionally) initialize USB if the firmware doesn't (the feature sometimes called fast boot). But I don't know whether Fedora's GRUB does this initialization. And for sure the feature can't modify the firmware's behavior so... I'm not clear where the savings come from other than there's no wait time for the boot menu.

--
Chris Murphy

Joonas Sarajärvi

Saturday, 2 June

10:20 a.m.

Chris Adams wrote on 31.05.2018 at 20:09:

> Once upon a time, Hans de Goede <hdegoede(a)redhat.com> said:
> > On 31-05-18 15:08, Chris Adams wrote:
> > > Once upon a time, Hans de Goede <hdegoede(a)redhat.com> said:
> > > > And for F30, single OS install we get:
> > > >
> > > > 1) grub menu not shown, 0 second timeout, no way to get to the menu
> > > > 2) grub menu shown with 5 sec timeout after a failed boot
> > >
> > > If I know I want the menu (say I need to boot single-user to fix something), how would I do that in this setup?
> >
> > Hopefully what ever you want to fix will count as a "failed boot" if it requires single user mode.
>
> Hmm... not really. For example (just off the top of my head): lost root password. Without it, you won't be able to set any "next boot is special" option, so resetting root's password would now require rescue media.

This is pretty much my concern, too, about this change. So far, using a few extra boot parameters I have been able to recover at least from these ways of getting locked out from my computer:

- lost root password
- bad PAM configuration
- bad SSSD configuration
- broken networking and thus unable to log in with correctly set up SSH keys

Then there are the cases already mentioned by others, like user interface not showing up due to driver issues.

There are so many ways to lock oneself out from a Fedora system. Will the hypothetical automatic detection for successful boot be able to detect all such cases?

How does one automate an adjustment for this new default, if it turns out that the default is indeed changed to one where it is not possible to access the menu without first convincing the system that the boot is now failing? AFAIK Fedora's grub configuration is set up in a quite unfortunate way so that the initial configuration comes from grub2-mkconfig and then later ones are made by grubby based on some heuristics. It does not exactly invite me to add a third program into the mix that does the updates I want, hopefully not stepping on toes of grubby and not breaking on changes to grub2-mkconfig of new Fedora versions.

Chris Murphy

12:22 p.m.

On Thu, May 31, 2018 at 9:31 AM, Hans de Goede <hdegoede(a)redhat.com> wrote:

> Note that this is exactly why this is a F30 thing, to give us a chance to figure out how exactly to detect a failed boot.
>
> Also I would like to note that Windows has been doing more or less the same since Vista and it does not seem to cause any problems for Windows.

A few things differ, in important ways, between Windows and Fedora. Windows provides, when using shift+restart, an additional dialog to get to things like EFI shell, firmware setup, and Windows boot manager. In fact this same *Windows* reboot menu shows Fedora. Ergo, they are putting boot menu options in Windows. They communicate all of this to the firmware with an NVRAM entry, so yeah it's kind of a UEFI only thing.

Point here is, this behavior in effect standardizes via GUI, across all their supported (UEFI) hardware, how to get to firmware setup, booting off USB sticks or other boot options, rather than doing it via non-standard keyboard shortcuts at the front end - where on quite a bit of hardware now, the keyboard isn't going to work anyway because of fast boot by default.

Windows made most of the user facing behavioral changes in one whack, rather than in stages. I'm concerned less about this particular feature change, than additional changes that end up giving the user the experience of being jerked around. We've got a lot of bootloader related things up in the air right now: the traditional editions depend on grubby, Atomic Host and Silverblue don't use it at all, and behind the scenes I've seen changes to GRUB that suggest we're about to abandon modifying grub.cfg when new kernels are installed and instead using bootloaderspec drop-in scriptlets.

If all of these bootloader domain related things change in sequence, I think people are gonna get really sick of it, and confused. I think this is a case where monolithic change might be better.

I'd rather see some kind of opt-in for Fedora 29, and make it the default (opt-out) for Fedora 30.

--
Chris Murphy

DJ Delorie

Thursday, 31 May

3:23 p.m.

Chris Adams <linux(a)cmadams.net> writes:

> If I know I want the menu (say I need to boot single-user to fix something), how would I do that in this setup?

Ah, that reminds me of the good old days of looking up on the internet which of the many keys on the keyboard gets me into the BIOS setup menu...

Sam Varshavchik

9:58 a.m.

Hans de Goede writes:

> For F29 the plan is to just hide it (unless a previous boot failed)

What are the exact criteria for "previous boot failed", I'm wondering. Even if you reach as far as the GDM screen it's still possible that something is so horked up that you can't log in, and you can't shut down nicely.

I would also suggest that the criteria must include "something was not unmounted cleanly, so if we proceed we will be doing a fsck".

Ian Pilcher

1:06 p.m.

On 05/31/2018 07:25 AM, Hans de Goede wrote:

> So for F29, single OS install we get:
>
> 1) grub menu hidden by default with a 1 second timeout to press ESC or F8 to show it
> 2) grub menu shown with 5 sec timeout after a failed boot

5 seconds seems like an awfully short timeout after a failed boot.

--
========================================================================
Ian Pilcher                                         arequipeno(a)gmail.com
-------- "I grew up before Mark Zuckerberg invented friendship" --------
========================================================================

Nicolas Mailhot

Friday, 29 June

10:46 a.m.

On 2018-05-31 14:25, Hans de Goede wrote:

> Originally I was planning on doing the failed-boot detect only for F30, but I agree it makes sense to have it for F29 and this will also give us some field testing of this while we still have a fallback in the form of the 1 sec wait for ESC / F8.

Do please make sure that:

1. there is a way to demand the next boot will provide the full boot menu with working display and keyboard
2. there is a way to demand all boots provide the full boot menu with working display and keyboard
3. those ways are easily discoverable by laymen (typically, a notice on the default gfx or cli login screen)
4. you check every single bit needed to use them works before declaring a boot successful

As long as everything works, quick boot implementations are awesome, but too many of those forget about failure modes, and expect you to type a magic key during a microsecond window, documented in text that stops being displayed before the display ends its initialization.

--
Nicolas Mailhot

Hans de Goede

Saturday, 30 June

7:32 a.m.

Hi, On 29-06-18 17:46, Nicolas Mailhot wrote:

> On 2018-05-31 14:25, Hans de Goede wrote:
> > Originally I was planning on doing the failed-boot detect only for F30, but I agree it makes sense to have it for F29 and this will also give us some field testing of this while we still have a fallback in the form of the 1 sec wait for ESC / F8.
>
> Do please make sure that:

So there are 2 components involved in fastboot, the firmware and grub. If the firmware sucks, there is nothing we can do (and that already is the case today).

E.g. I've several machines where if I enable the fastboot option to not scan the USB bus, the USB bus will not be scanned once grub makes a text input EFI protocol "read key stroke" call. IOW if I enable that fastboot option today, with grub as is in F28, I cannot navigate the grub menu. I believe that the firmware should delay scanning the USB bus until the first "read key stroke" call in this case, but in practice on some systems it seems to simply not bother to scan the USB bus *ever* if this fastboot option is enabled.

Now let me answer your questions, with the caveat that my answers are only valid assuming sane firmware; if things are already broken with F28 grub, we cannot fix them.

> 1. there is a way to demand the next boot will provide the full boot menu with working display and keyboard

sudo grub2-set-bootflag menu_show_once

will do this (in F29+). The plan is to also change the "Restart" option in the GNOME3 shutdown modal dialog to "Boot Options" when alt is pressed, and then set that flag before rebooting if the user clicks the "Restart/Boot Options" button with alt pressed (similar to how the poweroff icon which gives this menu changes to a pause icon / suspend button when alt is pressed).

This will all be documented in the admin guide and a link to that part of the admin guide will be added to the release-notes.

> 2. there is a way to demand all boots provide the full boot menu with working display and keyboard

This requires running this command *once*:

sudo grub2-editenv - unset menu_auto_hide

This will also be documented in the admin guide.

> 3. those ways are easily discoverable by laymen (typically, a notice on the default gfx or cli login screen)

See above for the plans to make this discoverable, we believe these are advanced options which should not be visible by default.

> 4. you check every single bit needed to use them works before declaring a boot successful

A boot is declared successful if a user logs in (or the user session starts if autologin is enabled) and the user session lasts at least 2 minutes. So even if login works, but then for some reason the session crashes immediately afterwards, that still will NOT count as a boot success.

This means that we may get a few false positive failed boot detects (e.g. reboot/shutdown within 2 minutes), but the side-effects of that are harmless (menu shown for 5 seconds), whereas a false-negative could be troublesome. IOW I agree with you that we need to be careful when we mark a boot successful.

Regards,

Hans
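Editorial sketch: the answers above combine three grub environment variables (menu_show_once, menu_auto_hide, boot_success) with a 2 minute session rule. The Python below is one possible reading of how they fit together, based only on what this message says; it is not the actual grub.cfg logic, and the function names and the dict used in place of the environment block are hypothetical.

```python
MIN_SESSION_SECONDS = 120  # "at least 2 minutes", per the message above


def boot_successful(session_seconds):
    """session_seconds is None if no user session ever started."""
    return session_seconds is not None and session_seconds >= MIN_SESSION_SECONDS


def show_menu(env):
    """Decide whether the menu is shown; env is a dict standing in for
    the GRUB environment block (assumption made for this sketch)."""
    if env.pop("menu_show_once", None):
        return True  # one-shot request, consumed by this boot
    if not env.get("menu_auto_hide"):
        return True  # auto-hide switched off: menu on every boot
    return not env.get("boot_success")  # hide only after a good boot
```

With this reading, a session that crashes after 30 seconds leaves boot_success unset and the menu appears on the next boot, which matches the "false positives are harmless" argument above.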

Nicolas Mailhot

10:12 a.m.

On Saturday, 30 June 2018 at 14:32 +0200, Hans de Goede wrote:

Hi

> > 4. you check every single bit needed to use them works before declaring a boot successful
>
> A boot is declared successful if a user logs in (or the user session starts if autologin is enabled) and the usersession lasts at least 2 minutes. So even if login works, but then for some reason the session crashes immediately afterwards, that still will NOT count as a boot success.

It'd be nice if there was a way to check that grub2-editenv works (some dummy action that is tested on boot). I've lost count of the number of times I had to re-run anaconda on a system just to reinstall the boot stack, because it tends to bork itself on hardware or selinux changes and there is no clear way to reinit it. Regards, -- Nicolas Mailhot
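One possible shape for such a dummy action is a round trip through a scratch environment block: create it, set a variable, and read it back. The sketch below is an assumption-laden illustration, not an existing Fedora tool. The function name, the selftest_dummy variable and the GRUB_EDITENV override are hypothetical, and the test only exercises the tool on a temporary file, so it says nothing about whether the real grubenv is writable.

```sh
# Sketch only. GRUB_EDITENV is a hypothetical override so the check can be
# pointed at a different binary; it defaults to Fedora's grub2-editenv.
# Only a scratch environment block is touched, never the real grubenv.
editenv_selftest() (
    tool="${GRUB_EDITENV:-grub2-editenv}"
    dir="$(mktemp -d)" || exit 1
    env_file="$dir/grubenv"

    # Create a block, write a dummy variable, and read it back.
    "$tool" "$env_file" create &&
        "$tool" "$env_file" set selftest_dummy=1 &&
        "$tool" "$env_file" list | grep -qx 'selftest_dummy=1'
    status=$?

    rm -rf "$dir"
    exit "$status"
)
```

The function body runs in a subshell, so its temporary variables do not leak into the caller, and its exit status is 0 only if all three steps worked.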

Christian Glombek

Tuesday, 10 July

7:42 a.m.

Hello Everyone, Regarding boot success determination: For the current Fedora GSoC project that I am participating in, I wrote greenboot, a generic health check framework for systemd: https://github.com/LorbusChris/greenboot In greenboot, health checks can be defined in the form of scripts and/or systemd units. This could be useful in this case for determining boot success, allowing for more sophisticated checks than just a timer. Maybe it's a little late to get this into F29, as the project is not in the Fedora repos yet (it is being built on copr: lorbus/greenboot) and there are still some improvements to be made, but it'll certainly be ready by the time F30 is coming. WDYT? Thank you Javier for pointing me to this. Regards, Christian Glombek FAS Lorbus On Sat, 30 June 2018 at 17:12, Nicolas Mailhot <nicolas.mailhot(a)laposte.net> wrote:

...

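To make the idea of script-based health checks concrete, here is a minimal sketch of a runner that executes every check script in a directory and reports failure if any of them fails. It is not greenboot's code: the function name and the directory convention are assumptions made for this example only.

```sh
# Sketch only: the directory layout and function name are assumptions made
# for this example and are not taken from greenboot itself.

# run_health_checks DIR: run every *.sh script in DIR with sh.
# Returns 0 only if all of them exit with status 0.
run_health_checks() {
    checks_dir=$1
    failed=0
    for check in "$checks_dir"/*.sh; do
        [ -f "$check" ] || continue   # also skips the unmatched glob
        if ! sh "$check"; then
            echo "health check failed: $check" >&2
            failed=1
        fi
    done
    return "$failed"
}
```

A boot-success decision could then depend on the exit status of such a runner rather than on a timer alone, which is the kind of more sophisticated check the mail suggests.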

Robert Marcano

Thursday, 31 May

8:20 a.m.

On 05/31/2018 06:52 AM, Hans de Goede wrote:

...

... This will basically get us back the F28 behavior of showing the menu but only after a failed boot, I think that is a good solution, do you agree?

What is the definition of a successful boot? I ask because a machine could boot perfectly, and then, when you try to interact with it on the login screen, bugs in the display driver can turn the screen to garbage (I saw this kind of bug a long time ago) or lock up the machine. The user will then be unable to trigger any kind of restart with the menu enabled in order to try an older kernel or boot to rescue mode. I think that, instead of only detecting a successful boot, a machine that wasn't properly shut down should also enable the menu.

Hans de Goede

10:47 a.m.

Hi, On 31-05-18 15:20, Robert Marcano wrote:

...


A broken install may still shut down properly after the user presses the power button and/or tries ctrl+alt+del. But this is an interesting suggestion. I think we should track both separately, so successful shutdown and successful boot, and show the menu if either one is not true. That should make the chance of not being able to get the menu a lot smaller. Regards, Hans

Björn Persson

1:31 p.m.

Hans de Goede wrote:

...


This is starting to sound rather complex, and complex code is prone to bugs. I want my bootloader as simple and straightforward as possible to minimize the risk of problems. I would hate to run into some bug that renders the system unusable, and then find that I can't do anything about it because a separate bug causes Grub to not display the boot menu. Björn Persson

Peter Jones

Friday, 1 June

1:03 p.m.

On Thu, May 31, 2018 at 05:47:36PM +0200, Hans de Goede wrote:

...


In my mind, the mechanism here looks like what I've sketched out below, and I think it encapsulates the above as well as most of what I've seen on this thread already.

The workflow is something like this:

- user updates the OS[0]
- we automatically set the new OS to be booted /once/.
- we have a successful-boot-test.service that depends on [getty.target or graphical.target]. Upon starting, it sets a timer for some relatively long amount of time, like say 5 minutes, and at the end of that time it decides if booting worked and sets some state to let us know.
- we also provide a tool for an admin to set a specific state, since they know best.
- if a user logs in and starts doing stuff before the timer expires, we booted successfully, and we set the new OS to be default and mark it as having succeeded.
- if the machine is rebooted *unexpectedly*[1] without any successful login before the timer expires, we reboot and get the previous OS, and we can detect that it failed during that boot and take whatever appropriate action
- if the timer expires without user activity, or if there's an expected intermediate reboot we need to do, it's indeterminate if it worked or not; we set the one-shot again[5].
- in the case where it's an expected reboot, we re-set the count of how many times we've reached the indeterminate state
- otherwise we add one to the count
- if the count is above some threshold (say 3) in some amount of time (say a day), set a one-shot variable that says to show the menu.
- on server[2] we're going to want some indicator of "is successfully doing its job" instead of login; that's probably a separate feature.
- It probably is worth having the power button be an indicator of how we shut down, and make that be a reason to show the menu, at least in some cases, if you haven't done things like gone into settings and told the power button to do nothing.

And then concerning the actual menu+countdown (or more importantly, when to probe for the keyboard), we don't show the menu or probe for key state unless one of the following is the case:

- a persistent grub environment variable that says /not/ to show the menu is /absent/ or set to false. (i.e. the user or some install class[3] disabled this feature, or if grubenv has been corrupted, or if we're on an architecture that insists on not having nice things[4], etc.)
- a one-shot grub environment variable, that says to show the menu, is set to true. (i.e. user asked for the menu when they rebooted the machine)
- indeterminate boot count is > 1
- the previous boot is not marked as indeterminate or success

[0] I'm being deliberately vague here because I think I mean "updates stuff that runs between (inclusively) the bootloader and [getty.target, graphical.target]" for the traditional OS, and not exactly the same criteria for Atomic, but both can reasonably be captured in one description.
[1] There are cases like if we do an selinux relabel during boot and then reboot the machine, or other situations analogous to that, where the reboot is known to be unrelated to the success or failure of the update.
[2] We could reasonably ship this enabled on workstation+desktop+laptop environments with servers disabled until there's some less wishy-washy description here. Despite what mattdm said above in this thread, I think ultimately we do want it on server, even though we care less about flicker-free booting there - the countdown and probing aren't an insignificant chunk of the boot time, and the time it takes to reboot can come to dominate downtime.
[3] See [2].
[4] As a for-instance, IBM ppc* machines nerf out the block device write() call in their firmware, so we don't have one-shot variables there at all and can't do any of this.
[5] I might be able to be convinced there's a case for local config policy to be injected here, but I think the tool mentioned earlier is probably enough.

Now you all get to tell me all the ways I'm wrong ;)

-- Peter
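The menu conditions and the indeterminate-boot counter from this mail can be modelled in a few lines of shell. This is a sketch under stated assumptions, not the proposed implementation: the function names, argument order and state strings are invented, "re-set the count" is read here as "set it back to zero", and the time window ("say a day") is not modelled at all.

```sh
# Sketch only: variable names, argument order and state strings are made up
# for this example. The time window ("say a day") is not modelled.

# should_show_menu HIDE_MENU SHOW_ONCE INDETERMINATE_COUNT PREVIOUS_BOOT
# Exit status 0 means "show the menu and probe the keyboard".
should_show_menu() {
    hide_menu=$1 show_once=$2 count=$3 previous=$4
    # Persistent "hide" variable absent or false (e.g. corrupted grubenv).
    [ "$hide_menu" = "true" ] || return 0
    # One-shot "show the menu" variable set by the user.
    [ "$show_once" = "true" ] && return 0
    # More than one indeterminate boot.
    [ "$count" -gt 1 ] && return 0
    # Previous boot neither indeterminate nor successful.
    case $previous in
        success|indeterminate) return 1 ;;
        *) return 0 ;;
    esac
}

# next_indeterminate_count COUNT EXPECTED_REBOOT
# Prints the new count: zero after an expected reboot (this sketch's reading
# of "re-set"), otherwise the old count plus one.
next_indeterminate_count() {
    if [ "$2" = "1" ]; then
        echo 0
    else
        echo $(( $1 + 1 ))
    fi
}

# needs_menu_one_shot COUNT: true once the count is above the threshold (3).
needs_menu_one_shot() {
    [ "$1" -gt 3 ]
}
```

Note that the default is to show the menu: every early return is a reason to show it, and only a fully consistent "hide, no request, at most one indeterminate boot, last boot fine" state hides it. That matches the mail's intent that a corrupted or missing grubenv should never leave the user without a menu.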

Kyle Marek

Sunday, 3 June

5:53 p.m.

On 06/01/2018 02:03 PM, Peter Jones wrote:

...


I am also opposed to relying on some boot failure indication to show the menu, because failing storage media could prevent the variable from being set. Depending on the storage failure, it is not unreasonable to boot with "ro init=/bin/sh" on the cmdline to get to some read-only environment and begin recovering data, but that would become cumbersome by F30 if the timeout is set to 0 and the environment is BIOS, where there are no EFI variables to influence GRUB.

Hans de Goede

Monday, 4 June

2:16 a.m.

Hi, Note I've dropped the fedora-devel list (-ETOOMUCHBIKESHED) and added Javier and Jan to the Cc. On 01-06-18 20:03, Peter Jones wrote:

...

Hmm, I see you also refer to Atomic, and there this makes sense, but in the traditional distro model how would we implement this?

We could implement booting a new kernel once, but since an xserver / mesa / gnome update might break things just as easily as a kernel update can, I'm not sure that adding boot-once functionality to the traditional model is really helpful.

Reverting to the old kernel might help in some cases, but we are also going to get false positives. I've a feeling this is going to become really messy. As such I don't think this is a change we can "sell" easily. Some people really don't seem to like the idea of any changes to the grub config / menu at all.

I've a feeling that selling the hidden menu by itself is enough of a hassle without adding in booting a new kernel once to test it. I realize that this is, in a way, meant to lessen the impact of the menu being hidden, but I'm not 100% sold on this.

I would rather just show the menu after a failed boot and have reverting to the old kernel be a conscious choice of the user. I have a number of reasons for this:

1) Don't revert to an older kernel on false-positive failed-boot detects (limit the result of a false-positive failed-boot detect to showing the menu, without any side effects).

2) Updates typically come in batches and the boot failure may well be caused by something else, so we're not necessarily helping the user here; even if the user manages to fix things, he will now be running an older kernel for no good reason.

3) Since reverting to the old kernel may not be enough, we still need to show the menu after a failed boot.

4) Principle of least surprise: we are now making unrequested changes to the user's system and not (really) notifying the user of this. For Atomic I envision that after switching back to the old snapshot / release the UI will show a dialog after login along the lines of: "The new 20190214 release did not work, we've reverted your machine to the 20190207 release" (but then better worded). We could do something similar for the kernel, assuming reverting to the old kernel will allow us to show the dialog, but we again have the whole false-positive thing, so now we end up showing a scary dialog because of a false-positive failed-boot detect.

So all in all I'm not a big fan of the boot-once concept for the traditional Fedora version. I think it makes a lot of sense for Atomic and we should do it there, but not for Fedora.

Another thing to keep in mind is that we don't really have much time to get things in place for F29, so especially for F29 this seems too complex. I would prefer to only add a "GRUB_AUTO_HIDE" option to /etc/default/grub which, when set, will make grub2-mkconfig generate a grub.cfg which hides the menu unless a failed boot is detected, and not make any changes wrt which kernel to boot when.

This also has the added advantage that it avoids me touching the default selection code, which would collide with Javier's BLS work I think.

Regards,

Hans

Hans de Goede

8:15 a.m.

Hi, On 04-06-18 09:16, Hans de Goede wrote:

...

Hi, Note I've dropped the fedora-devel list (-ETOOMUCHBIKESHED) and added Javier and Jan to the Cc.

Ugh, so clearly I failed to remove fedora-devel from the CC. Ah well. I hope this mistake shows that there is nothing nefarious going on here and that Javier, Peter and I are really just working on trying to make the boot experience nicer for Workstation users, while at the same time very thoroughly keeping in mind the rescue / things-broke scenario. Regards, Hans

Sheogorath

Thursday, 31 May

5:41 a.m.

On 05/31/2018 12:23 PM, Hans de Goede wrote:

...

The goal of this email is to: 1) Give people an advance warning about the plan to change this so we can discuss this early on

Actually I'm not a fan of this change. While it was easy to explain to end users that they can boot to an older, usually working version by just selecting the other entry, this makes it more complicated, especially with the latest change to easily allow people to install external repositories for NVIDIA graphics drivers, which are known to cause trouble with the latest kernels. If we want to boot faster, lowering the timeout to 1 second sounds fine. -- Signed Sheogorath

Colin Walters

8:08 a.m.

On Thu, May 31, 2018, at 6:23 AM, Hans de Goede wrote:

...

Seems like this is implicitly saying "Fedora" to mean (classic) "desktop", but we have different editions now. Further, one of those editions, Atomic Host, has fully transactional updates via rpm-ostree that are reflected in the bootloader order today - it's not just the kernel. And we like that feature =) There's also a GSoC project to write a boot health check service that integrates with this: https://pagure.io/fedora-iot/issue/2

Hans de Goede

10:35 a.m.

Hi, On 31-05-18 15:08, Colin Walters wrote:

...

On Thu, May 31, 2018, at 6:23 AM, Hans de Goede wrote: > Hi All, > > I'm working on improving the Fedora boot experience, with the > end goal being a user pressing the on button and then going > to the graphical login manager without him seeing any > text messages / menus filled with technical jargon. Seems like this is implicitly saying "Fedora" to mean (classic) "desktop", but we have different editions now. Further, one of those editions, Atomic Host, has fully transactional updates via rpm-ostree that are reflected in the bootloader order today - it's not just the kernel. And we like that feature =)

I have this on my radar, but I would expect Silverblue to also not want to show the menu by default, so although the menu is used a bit differently, the fundamental problems (hide by default, still allow the user access, pop up automatically on boot failure) apply AFAICT.

...

There's also a GSoC project to write a boot health check service that integrates with this: https://pagure.io/fedora-iot/issue/2

Oh, interesting. Regards, Hans

Matthew Miller

11:53 a.m.

On Thu, May 31, 2018 at 09:08:28AM -0400, Colin Walters wrote:

...

> I'm working on improving the Fedora boot experience, with the > end goal being a user pressing the on button and then going > to the graphical login manager without him seeing any > text messages / menus filled with technical jargon. Seems like this is implictly saying "Fedora" to mean (classic) "desktop", but we have different editions now. Further, one of those editions, Atomic Host, has fully transactional updates via rpm-ostree that are reflected in the bootloader order today - it's not just the kernel. And we like that feature =)

+1 -- I think we would probably also want the menu by default on server. (Although, conversely, it's pretty useless in cloud environments where there isn't an interactive console.) -- Matthew Miller <mattdm(a)fedoraproject.org> Fedora Project Leader

Chris Murphy

Friday, 1 June

1:22 p.m.

On Thu, May 31, 2018 at 10:53 AM, Matthew Miller <mattdm(a)fedoraproject.org> wrote:

...

On Thu, May 31, 2018 at 09:08:28AM -0400, Colin Walters wrote: > > I'm working on improving the Fedora boot experience, with the > > end goal being a user pressing the on button and then going > > to the graphical login manager without him seeing any > > text messages / menus filled with technical jargon. > Seems like this is implictly saying "Fedora" to mean (classic) "desktop", but > we have different editions now. Further, one of those editions, > Atomic Host, has fully transactional updates via rpm-ostree that are reflected > in the bootloader order today - it's not just the kernel. And we like that feature =) +1 -- I think we would probably also want the menu by default on server. (Although, conversely, it's pretty useless in cloud environments where there isn't an interactive console.)

Ironically, server, cloud, and VMs all benefit the most from the feature. VMs are the least likely to run into kernel-related regressions, followed by bare metal servers. Workstation is more likely. And maybe ARM and IoT-related products are even more likely (I'm basing that on the much wider assortment of hardware, less standardization, very active development, and less testing coverage). -- Chris Murphy

Sven Kieske

Tuesday, 5 June

10:04 a.m.

On 01.06.2018 at 20:22, Chris Murphy wrote:

...

Ironically, server, cloud, and VMs all benefit the most from the feature.

I'm just a user/bystander for the most part of this discussion but I feel I have to correct this statement: For the most part, server boot time is determined by firmware stuff, before grub even gets loaded, so reduced startup time is always nice, but when your HP DL380 Gen10 server already needs 5-15 minutes until POST is complete you really do not care about a 5 second grub display. You really _need_ grub menus in production DCs where you still run pet workloads and shit hits the fan (emergency boot into an old kernel, tweaking your already custom kernel cmdline etc.); it happens really rarely, but when it does, you absolutely need it. HTH to clarify what actual datacenter users _do_ care about (well, this might vary, depending on which DC ops person you ask). That said, it's less of a concern because we can of course recreate the old behaviour, but I must admit, I like sane defaults. But as I understand it, this change is limited to the Workstation edition of Fedora, so I don't know why people come up with the topic of servers in the first place. -- Mit freundlichen Grüßen / Regards Sven Kieske Systemadministrator Mittwald CM Service GmbH & Co. KG Königsberger Straße 4-6 32339 Espelkamp T: +495772 293100 F: +495772 293333 https://www.mittwald.de Geschäftsführer: Robert Meyer, Maik Behring St.Nr.: 331/5721/1033, USt-IdNr.: DE814773217 HRA 6640, AG Bad Oeynhausen Komplementärin: Robert Meyer Verwaltungs GmbH HRB 13260, AG Bad Oeynhausen

stan

Thursday, 31 May

9:40 a.m.

On Thu, 31 May 2018 12:23:35 +0200 Hans de Goede <hdegoede(a)redhat.com> wrote:

...

I *like* seeing all the stuff flow by, and I boot into multi-user before starting the graphical user interface. I like seeing what is going on under the hood. Will I still be able to do this? Or will I have to hack the install after it is done? Saying this is an improvement is a value judgement. I agree that many people might consider this an improvement, but not all. What is the rationale for doing it? Imitation of Mac or Windows? Trying to make it easier for users of Mac or Windows to switch to Fedora? What about existing users? Is it just assumed that they want this improvement? It seems clear to me that this change will happen. I'm just trying to get you to consider it from different perspectives, to implement it in such a way that those who don't consider it an improvement have an easy way to revert to prior behavior.

Hans de Goede

10:43 a.m.

Hi, On 31-05-18 16:40, stan wrote:

...

On Thu, 31 May 2018 12:23:35 +0200 Hans de Goede <hdegoede(a)redhat.com> wrote: > Hi All, > > I'm working on improving the Fedora boot experience, with the > end goal being a user pressing the on button and then going > to the graphical login manager without him seeing any > text messages / menus filled with technical jargon. I *like* seeing all the stuff flow by, and I boot into multi-user before starting the graphical user interface. I like seeing what is going on under the hood. Will I still be able to do this? Or will I have to hack the install after it is done?

As the "by default" in the Subject implies, this is about setting a config option. The option is already available today; the plan is only to change its value. It lives in /etc/default/grub, which gets written once during install and then never touched again. TL;DR: Yes you will still be able to do this with a simple 1 time configfile change. Regards, Hans > > Saying this is an improvement is a value judgement. I agree that many > people might consider this an improvement, but not all. > > What is the rationale for doing it? Imitation of Mac or Windows? > Trying to make it easier for users of Mac or Windows to switch to > Fedora? What about existing users? Is it just assumed that they want > this improvement? > > It seems clear to me that this change will happen. I'm just trying to > get you to consider it from different perspectives, to implement it in > such a way that those who don't consider it an improvement have an easy > way to revert to prior behavior. > _______________________________________________ > devel mailing list -- devel(a)lists.fedoraproject.org > To unsubscribe send an email to devel-leave(a)lists.fedoraproject.org > Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html > List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines > List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.o... >

stan

12:36 p.m.

On Thu, 31 May 2018 17:43:13 +0200 Hans de Goede <hdegoede(a)redhat.com> wrote:

...

TL;DR: Yes you will still be able to do this with a simple 1 time configfile change.

Thanks, seems you have all your ducks in a row.

Kyle Marek

11:52 p.m.

With respect, I am opposed to the proposal. In essence, I think this boils down to: function > form. I've been in too many situations where hidden GRUB menus resulted in having to "guess" when the firmware has finally started the bootloader, and what would be a quick 5 second cmdline change turns into several minutes of rebooting due to ESC entering the firmware configuration... I've had similar experiences with F8 in Windows environments. While this may be a "simple 1 time configfile change," having to enter the menu is often an unexpected scenario, so remembering to do this *before* you run into some boot-related issue is in itself an issue, especially when the user has many machines/installations. As pointed out, the plans to have a 0 second timeout in F30, requiring some userspace preparation to enable the menu, are going to result in issues where additional boot media is necessary if the system boots fine but the user cannot actually log into the system. Furthermore, the GRUB menu's "only function" is not just to allow booting older kernels. In fact, I've personally never had to boot an older kernel; I've only ever used the menu for cmdline edits to address all kinds of random issues (including but not limited to graphics issues), or to address things that weren't even "issues" (enabling intel_iommu, for example). I often try several cmdline edits before progress is made on a specific issue. I think going through this effort to shave a few seconds off of the boot time is just going to make everything harder *especially* when the shit hits the fan... I don't think it's worth perpetuating the Windows-ification of Fedora, either. Regards, Kyle

Michael Watters

Friday, 1 June

10:57 a.m.

Well said. Seems like Fedora is slowly turning into Fisher Price My First Linux instead of being a distro that actually respects its users. IME people that run Fedora usually know what they're doing and trying to obfuscate and hide things simply makes the distro *harder* to use. On 06/01/2018 12:52 AM, Kyle Marek wrote:

...

I think going through this effort to shave a few seconds off of the boot time is just going to make everything harder *especially* when the shit hits the fan... I don't think it's worth perpetuating the Windows-ification of Fedora, either. Regards, Kyle

Akarshan Biswas

Thursday, 31 May

10:42 p.m.

100% agreed. However, please add an option to automatically show the grub bootloader menu when something goes wrong with the boot process or the system crashes. This will help users like me a lot. :)

Hans de Goede

Friday, 1 June

3:04 a.m.

Hi All,

First of all I want to thank everyone for their input. I also want to make clear that hiding the menu + not listening for a keypress at all (aka fastboot) is a Fedora 30 thing, quoting myself:

"For F29, single OS Fedora Workstation install we get:
1) grub menu hidden by default with a 1 second timeout to press ESC or F8 to show it
2) grub menu shown with 5 sec timeout after a failed boot

For F30, single OS Fedora Workstation install we get:
1) grub menu not shown, 0 second timeout, no way to get to the menu
2) grub menu shown with 5 sec timeout after a failed boot"

I understand that some people are worried about not being able to get to the grub menu when they need to. I hear you and I share your worries about this. With that said, I want to emphasize that for F29 you will still always be able to get the grub menu by pressing F8 at boot (or ESC on some Asus and Lenovo machines where the firmware has hijacked F8). This assumes your keyboard works in grub at all, but if it doesn't then nothing changes compared to F28. And we will also show the menu, as we did in F28, when the previous boot has failed or the system was not shut down cleanly.

There has been some discussion about what defines a successful boot. I've been thinking a bit about this and my plan is to set the boot_success flag (which grub itself will clear each boot) from a systemd timer which is part of the user's GNOME systemd user session and runs after 2 minutes. So we will check that the user successfully logged in and that his GNOME 3 session has lasted at least 2 minutes. This means that the user will be able to get the grub menu by simply rebooting from the gdm screen rather than logging in, or, if gdm does not work, by just shutting down the machine, either with a short press of the power button and letting systemd do its thing, or with a forced power-off.

Last but not least, several people have mentioned that this all needs to be documented properly. I completely agree and I plan to write docs about all of this, but I need to do the code first because of the various freezes and because it is easier to document things once they are finished. Note I hereby _promise_ that I will write some proper documentation on this once the code is done.

Regards, Hans
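To make the plan above easier to follow, here is a small Python model of it. It is a sketch under assumptions, not the real grub or systemd code: the dictionary standing in for grubenv, the "1"/"0" encoding of `boot_success`, the treatment of a missing flag as a failed boot, and all function names are made up for illustration. Only the timeouts (1 second, 0 seconds, 5 seconds) and the 2 minute session threshold come from the message.

```python
from typing import Dict, Tuple

SESSION_SUCCESS_SECONDS = 120  # "runs after 2 minutes"


def menu_policy(release: int, previous_boot_ok: bool) -> Tuple[bool, int]:
    """Return (menu_visible, timeout_seconds) for a single OS Workstation install."""
    if release not in (29, 30):
        raise ValueError("only the F29 and F30 plans are modelled here")
    if not previous_boot_ok:
        return (True, 5)       # menu shown with 5 sec timeout after a failed boot
    if release == 29:
        return (False, 1)      # hidden, 1 second window to press ESC or F8
    return (False, 0)          # F30: hidden, no keypress window at all


def grub_boot(env: Dict[str, str], release: int) -> Tuple[bool, int]:
    """Model of one pass through grub: read the flag, then clear it."""
    previous_boot_ok = env.get("boot_success") == "1"  # assumption: missing flag counts as failure
    env["boot_success"] = "0"                          # grub itself clears the flag each boot
    return menu_policy(release, previous_boot_ok)


def session_timer_fired(env: Dict[str, str], session_seconds: int) -> None:
    """Model of the user-session timer: mark success once the session has lasted long enough."""
    if session_seconds >= SESSION_SUCCESS_SECONDS:
        env["boot_success"] = "1"
```

In this model, rebooting from the gdm screen or powering off within the first 2 minutes leaves the flag cleared, so the next call to `grub_boot` returns the visible menu with the 5 second timeout.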

Tomas Kovar

4:54 a.m.

Hi all, I have two suggestions:

- On UEFI systems, would it be possible to use an EFI variable to force the grub menu? That way, it would be possible to enter the menu from the UEFI boot loader or shell, even if the system itself is in a non-working state or on a read-only device.
- This one is on the polish side of things: on UEFI systems, when GRUB isn't going to display anything, it should not set the text mode or clear the screen. Currently, when UEFI runs the bootloader, it does so with a graphic framebuffer. GRUB then switches to text mode; when quiet, it does nothing but display the blinking cursor at the mid-bottom of the screen, and then the kernel takes over and switches back to graphic mode. The user gets two ugly flashes as the modes change. Windows doesn't set or clear the framebuffer; it displays its progress indicator on top of whatever was left there by the firmware (mostly the computer manufacturer's logo).

Regards, Tomas

Hans de Goede

5:53 a.m.

Hi, On 01-06-18 11:54, Tomas Kovar wrote:

...

That is a good idea; I've added looking into this to my TODO list.

...

- this one is on the polish side of things: on UEFI system, when GRUB isn't going to display anything, it should not set the text mode or clear the screen. Currently, when UEFI runs the bootloader, it does it with graphic framebuffer. GRUB then switches to text mode, when quiet it does nothing just displays the blinking cursor at the mid-bottom of the screen and then the kernel takes over and switches back to graphic mode. The user gets two ugly flashes as the modes change. Windows doesn't set or clear the framebuffer, it displays its progress indicator on top of whatever was left by firmware there (mostly computer manufacturer logo).

Yes that is the boot experience which we eventually want to accomplish. Regards, Hans

Dominik 'Rathann' Mierzejewski

Wednesday, 6 June

2:04 a.m.

On Friday, 01 June 2018 at 12:53, Hans de Goede wrote:

...

On 01-06-18 11:54, Tomas Kovar wrote: > Hi all, > > I have two suggestions: > > - on UEFI systems, would it be possible to use an EFI variable to force grub menu? That way, it would be possible to enter the menu from UEFI boot loader or shell, even if the system itself is in non-working state or on read-only device. That is a good idea I've added looking into this to my TODO list.

What about UEFI systems where there's no way to get into EFI shell because firmware is deliberately broken by the vendor to prevent booting anything apart from Windows? I managed to get Fedora booting with SecureBoot enabled on my system only by manually setting the name of its boot entry to "Windows Boot Manager". There's no option to boot anything else because any BootNext/BootOrder options set with efibootmgr get ignored. I'm sure my machine is not an exception in this regard, so you cannot rely on users being able to do anything before GRUB comes up. Disabling the GRUB menu and shortening timeouts is a bad idea in my opinion. You don't boot so often these days and when you need to access the GRUB menu, you usually do because something is broken and you don't want to poke around the internet to find out how to do that in Fedora because you might even have no internet access. There needs to be a clear hint on the screen how to access the menu visible long enough for the user to read it. Regards, Dominik -- Fedora https://getfedora.org | RPMFusion http://rpmfusion.org There should be a science of discontent. People need hard times and oppression to develop psychic muscles. -- from "Collected Sayings of Muad'Dib" by the Princess Irulan

Kyle Marek

5:53 a.m.

On 06/06/2018 03:04 AM, Dominik 'Rathann' Mierzejewski wrote:

...

On Friday, 01 June 2018 at 12:53, Hans de Goede wrote: > On 01-06-18 11:54, Tomas Kovar wrote: >> Hi all, >> >> I have two suggestions: >> >> - on UEFI systems, would it be possible to use an EFI variable to force grub menu? That way, it would be possible to enter the menu from UEFI boot loader or shell, even if the system itself is in non-working state or on read-only device. > That is a good idea I've added looking into this to my TODO list. What about UEFI systems where there's no way to get into EFI shell because firmware is deliberately broken by the vendor to prevent booting anything apart from Windows? I managed to get Fedora booting with SecureBoot enabled on my system only by manually setting the name of its boot entry to "Windows Boot Manager". There's no option to boot anything else because any BootNext/BootOrder options set with efibootmgr get ignored. I'm sure my machine is not an exception in this regard, so you cannot rely on users being able to do anything before GRUB comes up. Disabling the GRUB menu and shortening timeouts is a bad idea in my opinion. You don't boot so often these days and when you need to access the GRUB menu, you usually do because something is broken and you don't want to poke around the internet to find out how to do that in Fedora because you might even have no internet access. There needs to be a clear hint on the screen how to access the menu visible long enough for the user to read it.

I especially think that if you're going to show *anything*, it might as well be the menu itself.

Jason L Tibbitts III

Friday, 1 June

noon

...

>>>> "TK" == Tomas Kovar <tomas(a)kovar.sk> writes:

TK> - this one is on the polish side of things: [don't keep bouncing to text mode] I might also add that as part of this, we'd also need to get rid of the very early message about EFI secure boot being enabled. Then we'd be left only with the random kernel message spew that some machines have just before X starts up. (For me it's usually something complaining about ACPI tables or somesuch.) But I'm sure Hans has already thought of all of that. - J<

Vít Ondruch

Monday, 4 June

10:24 a.m.

On 1.6.2018 at 19:00, Jason L Tibbitts III wrote:

...

>>>>> "TK" == Tomas Kovar <tomas(a)kovar.sk> writes: TK> - this one is on the polish side of things: [don't keep bouncing to text mode] I might also add that as part of this, we'd also need to get rid of the very early message about EFI secure boot being enabled. Then we'd be left only with the random kernel message spew that some machines have just before X starts up. (For me it's usually something complaining about ACPI tables or somesuch.)

Good point, I see such messages + messages about high temperature and throttling CPU every boot. V.

...

But I'm sure Hans has already thought of all of that. - J<

Chris Adams

Friday, 1 June

8:26 a.m.

Once upon a time, Hans de Goede <hdegoede(a)redhat.com> said:

...

For F30, single OS Fedora Workstation install we get: 1) grub menu not shown, 0 second timeout, no way to get to the menu

What I haven't seen answered is this: what do we really gain from this? Your initial message said that the EFI firmware scanning USB for a keyboard "can be quite slow", but that's not explained. For the typical use cases, how long are we actually talking about? -- Chris Adams <linux(a)cmadams.net>

Hans de Goede

8:51 a.m.

Hi, On 01-06-18 15:26, Chris Adams wrote:

...

Once upon a time, Hans de Goede <hdegoede(a)redhat.com> said: > For F30, single OS Fedora Workstation install install we get: > > 1) grub menu not shown, 0 second timeout, no way to get to the menu What I haven't seen answered is this: what do we really gain from this? Your initial message said that the EFI firmware scanning USB for a keyboard "can be quite slow", but that's not explained. For the typical use cases, how long are we actually talking about?

It varies but it can easily be a couple of seconds. Regards, Hans

Zdenek Kabelac

9:22 a.m.

On 1.6.2018 at 15:51, Hans de Goede wrote:

...

Hi, On 01-06-18 15:26, Chris Adams wrote: > Once upon a time, Hans de Goede <hdegoede(a)redhat.com> said: >> For F30, single OS Fedora Workstation install install we get: >> >> 1) grub menu not shown, 0 second timeout, no way to get to the menu > > What I haven't seen answered is this: what do we really gain from this? > Your initial message said that the EFI firmware scanning USB for a > keyboard "can be quite slow", but that's not explained. For the typical > use cases, how long are we actually talking about? It varies but it can easily be a couple of seconds.

It sounds like every Fedora user is doing nothing else than rebooting, and 2 seconds is going to be a killer feature... Personally I reboot once, maybe twice, a week (and only because I need to track recent kernels). Is the 2 second 'speedup' really worth all the trouble with rescue? Regards, Zdenek

Ken Coar

9:36 a.m.

On 06/01/2018 04:04 AM, Hans de Goede wrote:

...

For F30, single OS Fedora Workstation install we get: 1) grub menu not shown, 0 second timeout, no way to get to the menu

"no way to get to the menu": this scares me and I would not like to see it implemented. What is the impetus for this change? What outside requests are satisfied by it? Or is this from internal developer opinions and discussions with no input from end users? (*Not* meant as an insult!) -- #ken B-|} Ken, Baron Coar RHCA, RHCVA, Sanagendamgagwedweinini Red Hat IT Infrastructure

Andrew Lutomirski

1:38 p.m.

...

On Jun 1, 2018, at 1:04 AM, Hans de Goede <hdegoede(a)redhat.com> wrote: Hi All, First of all I want to thank everyone for their input. I also want to make clear that the hide the menu + not listening for a keypress at all (aka fastboot) is a Fedora 30 thing, quoting myself: "For F29, single OS Fedora Workstation install we get: 1) grub menu hidden by default with a 1 second timeout to press ESC or F8 to show it

As discussed, this isn’t so great. Can we at least let users hold down a key rather than having to press it at the correct magic time?

...

2) grub menu shown with 5 sec timeout after a failed boot For F30, single OS Fedora Workstation install we get: 1) grub menu not shown, 0 second timeout, no way to get to the menu 2) grub menu shown with 5 sec timeout after a failed boot"

I think this is a severe regression. There are multiple use cases that you’re breaking: 1. Nothing failed per se, but I want to test a boot option. I shouldn’t need to reconfigure grub. 2. The system booted successfully but is unusable (due to a graphical glitch caused by a kernel regression, a lost driver due to a dracut issue, or maybe some filesystem issue causing login to fail or the session post-login to be unusable). It would be fixable by booting an older kernel or entering an appropriate recovery mode, but if the menu is entirely gone, then it can’t. 3. The boot failed outright and the “failed boot” logic is busted. I think this is asking for far more trouble than the benefit is worth. I’m not on FESCo, but if I were, I would definitely vote -1. Please at least do the bare minimum and teach grub to notice that some key is held down and show the menu in response.

Wells, Roger K.

1:48 p.m.

New subject: EXTERNAL: Re: Hiding the grub menu by default on single OS installs

On 06/01/2018 02:39 PM, Andrew Lutomirski wrote:

...

> On Jun 1, 2018, at 1:04 AM, Hans de Goede <hdegoede(a)redhat.com> wrote: > > Hi All, > > First of all I want to thank everyone for their input. > > I also want to make clear that the hide the menu + > not listening for a keypress at all (aka fastboot) is a > Fedora 30 thing, quoting myself: > > "For F29, single OS Fedora Workstation install we get: > > 1) grub menu hidden by default with a 1 second timeout to press ESC > or F8 to show it As discussed, this isn’t so great. Can we at least let users hold down a key rather than having to press it at the correct magic time? > 2) grub menu shown with 5 sec timeout after a failed boot > > For F30, single OS Fedora Workstation install install we get: > > 1) grub menu not shown, 0 second timeout, no way to get to the menu > 2) grub menu shown with 5 sec timeout after a failed boot" > I think this is a severe regression. There are multiple use cases that you’re breaking: 1. Nothing failed per se, but I want to test a boot option. I shouldn’t need to reconfigure grub. 2. The system booted successfully but is unusable (due to a graphical glitch caused by a kernel regression, a lost driver due to a dracut issue, or maybe some filesystem issue causing login to fail or the session post-login to be unusable). It would be fixable by booting an older kernel or entering an appropriate recovery mode, but if the menu is entirely gone, then it can’t. 3. The boot failed outright and the “failed boot” logic is busted. I think this is asking for far more trouble than the benefit is worth. I’m not on FESCo, but if I were, I would definitely vote -1. Please at least do the bare minimum and teach grub to notice that some key is held down and show the menu in response.

I have to agree here. Personally, I would keep the menu as is.

...


-- Roger Wells, P.E. leidos 221 Third St Newport, RI 02840 401-847-4210 (voice) 401-849-1585 (fax) roger.k.wells(a)leidos.com

Hans de Goede

Sunday, 3 June

1:05 p.m.

Hi, On 01-06-18 20:38, Andrew Lutomirski wrote:

...

Because detecting modifiers with UEFI is iffy and with serial consoles is outright impossible.

...

> 2) grub menu shown with 5 sec timeout after a failed boot > > For F30, single OS Fedora Workstation install install we get: > > 1) grub menu not shown, 0 second timeout, no way to get to the menu > 2) grub menu shown with 5 sec timeout after a failed boot" > I think this is a severe regression. There are multiple use cases that you’re breaking: 1. Nothing failed per se, but I want to test a boot option. I shouldn’t need to reconfigure grub.

There will be a command-line tool to request that grub show the menu on the next boot. Elsewhere in the thread someone mentioned that Windows now shows its boot menu when doing shift + reboot; I think it would be nice to do something similar in GNOME.

...

2. The system booted successfully but is unusable (due to a graphical glitch caused by a kernel regression, a lost driver due to a dracut issue, or maybe some filesystem issue causing login to fail or the session post-login to be unusable). It would be fixable by booting an older kernel or entering an appropriate recovery mode, but if the menu is entirely gone, then it can’t. 3. The boot failed outright and the “failed boot” logic is busted.

2 and 3 really are the same. As mentioned in the part of my mail which was snipped from the reply you are replying to, the plan is to start a systemd timer as part of the user session which considers the boot successful if the user session stays alive for 2 minutes. So if you reboot or force a power-off within 2 minutes, you will get the boot menu on the next boot.

...

I think this is asking for far more trouble than the benefit is worth. I’m not on FESCo, but if I were, I would definitely vote -1. Please at least do the bare minimum and teach grub to notice that some key is held down and show the menu in response.

See above why modifiers cannot work. Regards, Hans

Andrew Lutomirski

3:31 p.m.

On Sun, Jun 3, 2018 at 11:05 AM, Hans de Goede <hdegoede(a)redhat.com> wrote:

...

Hi, On 01-06-18 20:38, Andrew Lutomirski wrote: >> >> On Jun 1, 2018, at 1:04 AM, Hans de Goede <hdegoede(a)redhat.com> wrote: >> >> Hi All, >> >> First of all I want to thank everyone for their input. >> >> I also want to make clear that the hide the menu + >> not listening for a keypress at all (aka fastboot) is a >> Fedora 30 thing, quoting myself: >> >> "For F29, single OS Fedora Workstation install we get: >> >> 1) grub menu hidden by default with a 1 second timeout to press ESC >> or F8 to show it > > > As discussed, this isn’t so great. Can we at least let users hold down > a key rather than having to press it at the correct magic time? Because detecting modifiers with UEFI is iffy and with serial consoles is outright impossible.

I think that, if we have a serial console, we should have a minimum 250ms delay or so such that, if I hold down a key, I get a grub menu. (Does UEFI buffer keystrokes? It might be sufficient to just check *once* for buffered keystrokes.) On systems where I have a serial console, I want to be able to rescue the system, full stop. On UEFI, we should at least try, I think. And checking once for buffered keystrokes would be really nice too.
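The suggestion above (check once for an already buffered keystroke, plus a short minimum window when a serial console is present) could look roughly like this. It is a sketch only, written in Python for readability: `poll_key` is a hypothetical non-blocking callback that returns a key or `None`, and nothing here reflects grub's actual input API or whether UEFI firmware really buffers keystrokes, which the message leaves as an open question.

```python
import time
from typing import Callable, Optional


def key_requested_menu(
    poll_key: Callable[[], Optional[str]],
    has_serial_console: bool,
    serial_window_s: float = 0.25,   # "a minimum 250ms delay or so"
) -> bool:
    """Return True if the user appears to be asking for the boot menu."""
    # Single check for a keystroke that was buffered before we got here.
    if poll_key() is not None:
        return True
    if not has_serial_console:
        return False                 # no extra delay on the fast path
    # With a serial console, keep polling for a short, bounded window.
    deadline = time.monotonic() + serial_window_s
    while time.monotonic() < deadline:
        if poll_key() is not None:
            return True
        time.sleep(0.01)
    return False
```

The design point is that the common case costs one poll and no waiting, while systems with a serial console pay at most the configured window.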

DJ Delorie

Friday, 1 June

2:07 p.m.

Hans de Goede <hdegoede(a)redhat.com> writes:

...

1) . . ., no way to get to the menu

I think this steps over a line we should not cross. There's a huge difference between HIDING grub's functionality, and essentially DISABLING it. While I'm opposed to hiding the grub menu in general, as long as there's some obvious way to access it, it's only a small annoyance.

But I boot rarely, and when I do, it's usually because something has gone horribly wrong and I need as much control over the boot process as possible to get the system running again. Making it difficult for me to even find the tools I need only makes a bad day worse. And the benefit of a few seconds of boot time is no benefit at all for me.

And don't say "well you can change it if you want to" if my use case represents a significant portion of Fedora users. Do we even know how many users will end up changing it? Or would prefer it available? Vs how many users really need that extra 1-2 seconds of boot time reduction?

And don't say "it will show a menu when it thinks you need it" because that's just plain hubris. I can pretty much guarantee that its idea of when *I* need it, does not match *my* idea.

Perhaps boot time is a concern for some Fedora users, like laptops (why aren't they just sleeping?) or VMs/containers (kickstart can change the defaults anyway), but for others it's an impediment (servers, desktops). Let's not go so far to please one group of users that we aggravate (or even alienate) another.

John Florian

Saturday, 2 June

9:53 a.m.

On 06/01/2018 03:07 PM, DJ Delorie wrote:

...

Hans de Goede <hdegoede(a)redhat.com> writes:
> 1) . . ., no way to get to the menu

I think this steps over a line we should not cross. There's a huge difference between HIDING grub's functionality, and essentially DISABLING it. While I'm opposed to hiding the grub menu in general, as long as there's some obvious way to access it, it's only a small annoyance.

But I boot rarely, and when I do, it's usually because something has gone horribly wrong and I need as much control over the boot process as possible to get the system running again. Making it difficult for me to even find the tools I need only makes a bad day worse. And the benefit of a few seconds of boot time is no benefit at all for me.

And don't say "well you can change it if you want to" if my use case represents a significant portion of Fedora users. Do we even know how many users will end up changing it? Or would prefer it available? Vs how many users really need that extra 1-2 seconds of boot time reduction?

And don't say "it will show a menu when it thinks you need it" because that's just plain hubris. I can pretty much guarantee that its idea of when *I* need it, does not match *my* idea.

Perhaps boot time is a concern for some Fedora users, like laptops (why aren't they just sleeping?) or VMs/containers (kickstart can change the defaults anyway), but for others it's an impediment (servers, desktops). Let's not go so far to please one group of users that we aggravate (or even alienate) another.

Well said. I really wish Fedora had a way of polling its user base democratically for such polarizing changes BEFORE imposing them. Such feedback ideally would be built into and deployed as part of the OS and not require users to routinely go check if their opinion is needed. Rather some client software would do this periodically and then request the user participate. -- John Florian

David Sommerseth

Friday, 1 June

9:29 a.m.

On 31/05/18 12:23, Hans de Goede wrote:

...

Making the boot process less "magical" by not presenting "text messages / menus filled with technical jargon" sounds like a nice user experience - when everything works. But we need to account for all the times things do not work too well.

Diving into a menu by pressing a button or key within a reasonable time window is challenging for many non-techs. So this would be a worse user experience if they struggle to enter this "hidden menu".

I'd rather consider a different approach. Rather have a closer look at this "technical jargon" being presented. Just a quick example from one of my Fedora VMs:

  'Fedora (4.15.8-300.fc27.x86_64) 27 (Cloud Edition)'

Why not just say: "Fedora 27"

Or even just "Fedora", as it's not possible to boot into, say, "Fedora 26" unless lots of tweaks at the install/upgrade time have been done.

Then have a sub-menu called "Recovery options", where you can list older kernels specifying kernel versions - but make that simpler too. Instead of "4.15.8-300.fc27.x86_64" just say: "4.15.8-300"

So the menu could look something like

-----------------------------------------------------
Fedora
Recovery options
 |`- Fedora (older kernel, 4.13.0-103)
 |`- Fedora (older kernel, 4.14.5-300)
 |`- Fedora (older kernel, 4.15.3-304)
 \-- System recover mode (expert)
-----------------------------------------------------

Just my 2cents.

--
kind regards,

David Sommerseth
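The label shortening suggested above can be sketched in a few lines of Python. This is a hedged illustration only, not how grub2 or grubby generate menu entries: the helper names `short_kernel` and `recovery_label` are hypothetical, and the regular expression assumes Fedora-style release strings of the form shown in the message (`4.15.8-300.fc27.x86_64`).

```python
import re

# Assumed shape of a Fedora kernel release string:
# <version>-<build>.fc<N>.<arch>, e.g. "4.15.8-300.fc27.x86_64".
KERNEL_RELEASE = re.compile(
    r"(?P<version>\d+\.\d+\.\d+)-(?P<build>\d+)\.fc\d+\.\w+"
)


def short_kernel(release: str) -> str:
    """Drop the dist tag and architecture, keeping only version-build."""
    match = KERNEL_RELEASE.fullmatch(release)
    if match is None:
        # Unknown format: leave the string untouched rather than guess.
        return release
    return f"{match['version']}-{match['build']}"


def recovery_label(release: str) -> str:
    """Build a sub-menu entry in the style proposed in the message."""
    return f"Fedora (older kernel, {short_kernel(release)})"
```

With these helpers, `short_kernel("4.15.8-300.fc27.x86_64")` gives `"4.15.8-300"`, and strings that do not match the assumed pattern are passed through unchanged.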

Ken Coar

9:41 a.m.

On 06/01/2018 10:29 AM, David Sommerseth wrote:

...

So the menu could look something like

-----------------------------------------------------
Fedora
Recovery options
 |`- Fedora (older kernel, 4.13.0-103)
 |`- Fedora (older kernel, 4.14.5-300)
 |`- Fedora (older kernel, 4.15.3-304)
 \-- System recover mode (expert)
-----------------------------------------------------

%s/Fedora/Fedora %version/ and I like it better. However, I think this is trending away from the 'don't show/allow grub menu with single kernel' patch discussion> <grin/> -- #ken B-|} Ken, Baron Coar RHCA, RHCVA, Sanagendamgagwedweinini Red Hat IT Infrastructure

Neal Gompa

Sunday, 3 June

1:37 p.m.

On Fri, Jun 1, 2018 at 10:42 AM Ken Coar <kcoar(a)redhat.com> wrote:

...

On 06/01/2018 10:29 AM, David Sommerseth wrote:
>
> So the menu could look something like
>
> -----------------------------------------------------
> Fedora
> Recovery options
>  |`- Fedora (older kernel, 4.13.0-103)
>  |`- Fedora (older kernel, 4.14.5-300)
>  |`- Fedora (older kernel, 4.15.3-304)
>  \-- System recover mode (expert)
> -----------------------------------------------------

%s/Fedora/Fedora %version/ and I like it better. However, I think this is trending away from the 'don't show/allow grub menu with single kernel' patch discussion> <grin/>

This is what we do in Mageia, and I would not be opposed to this.

However, I don't want it to be _hard_ to actually get to the menu.

When we transitioned from GRUB Legacy to GRUB 2, we lost a bunch of things:
* Styled GRUB boot menus that don't look like garbage
* Timeout to auto-boot without showing the full menu
* Nested menus to move more "advanced" and "less used" items out

Most of these things exist in other distributions, just not Fedora. For example, it's a staple in both Mageia and openSUSE. Ubuntu does most of these things too. Fedora is a weird outlier in that we've been lazy with the presentation of our boot process for a while now.

--
真実はいつも一つ！/ Always, there's only one truth!

Tomasz Torcz

1:57 p.m.

On Sun, Jun 03, 2018 at 02:37:30PM -0400, Neal Gompa wrote:

...

On Fri, Jun 1, 2018 at 10:42 AM Ken Coar <kcoar(a)redhat.com> wrote:
> >
> > -----------------------------------------------------
> > Fedora
> > Recovery options
> >  |`- Fedora (older kernel, 4.13.0-103)
> >  |`- Fedora (older kernel, 4.14.5-300)
> >  |`- Fedora (older kernel, 4.15.3-304)
> >  \-- System recover mode (expert)
> > -----------------------------------------------------
>
> patch discussion> <grin/>

This is what we do in Mageia, and I would not be opposed to this.

However, I don't want it to be _hard_ to actually get to the menu.

When we transitioned from GRUB Legacy to GRUB 2, we lost a bunch of things:
* Styled GRUB boot menus that don't look like garbage
* Timeout to auto-boot without showing the full menu
* Nested menus to move more "advanced" and "less used" items out

Actually, we have two grub2 themes packaged – breeze and starfield. I'm using starfield and I have nice graphical GRUB menu. And I have menu like proposed in this thread: Fedora which boots latest kernel and "Advanced" submenu with older kernels. -- Tomasz Torcz Morality must always be based on practicality. xmpp: zdzichubg(a)chrome.pl -- Baron Vladimir Harkonnen

Neal Gompa

2:01 p.m.

On Sun, Jun 3, 2018 at 2:58 PM Tomasz Torcz <tomek(a)pipebreaker.pl> wrote:

...

On Sun, Jun 03, 2018 at 02:37:30PM -0400, Neal Gompa wrote:
> On Fri, Jun 1, 2018 at 10:42 AM Ken Coar <kcoar(a)redhat.com> wrote:
> > >
> > > -----------------------------------------------------
> > > Fedora
> > > Recovery options
> > >  |`- Fedora (older kernel, 4.13.0-103)
> > >  |`- Fedora (older kernel, 4.14.5-300)
> > >  |`- Fedora (older kernel, 4.15.3-304)
> > >  \-- System recover mode (expert)
> > > -----------------------------------------------------
> >
> > patch discussion> <grin/>
>
> This is what we do in Mageia, and I would not be opposed to this.
>
> However, I don't want it to be _hard_ to actually get to the menu.
>
> When we transitioned from GRUB Legacy to GRUB 2, we lost a bunch of things:
> * Styled GRUB boot menus that don't look like garbage
> * Timeout to auto-boot without showing the full menu
> * Nested menus to move more "advanced" and "less used" items out

Actually, we have two grub2 themes packaged – breeze and starfield. I'm using starfield and I have nice graphical GRUB menu. And I have menu like proposed in this thread: Fedora which boots latest kernel and "Advanced" submenu with older kernels.

Starfield was removed in Fedora 27 (Cf. rhbz#1519051). Breeze does exist and it's nice. The menu is possible to set up, but it's not that way by default. IIRC, grubby was never adapted to handle menus well, which is why we had to drop support for auto-generating the snapshot menu when the SUSE boot to snapshot patches were added to grub2 in Fedora 27. -- 真実はいつも一つ！/ Always, there's only one truth!

Michael Watters

Friday, 1 June

10:53 a.m.

What about users that don't use a graphical login manager? Personally I *like* seeing boot messages so that I know what is going on.

Having the menu available is also quite useful for booting into rescue mode or selecting a different kernel.

On 05/31/2018 06:23 AM, Hans de Goede wrote:

...

Hi All,

I'm working on improving the Fedora boot experience, with the end goal being a user pressing the on button and then going to the graphical login manager without him seeing any text messages / menus filled with technical jargon.

IIRC we used to hide the grub-menu by default on single OS installs, but we seemed to have stopped doing that, for new Fedora 29 installs I would like us to start hiding the menu by default on single OS installs again, see:
https://fedoraproject.org/wiki/Changes/HiddenGrubMenu

The goal of this email is to:

1) Give people an advance warning about the plan to change this so we can discuss this early on

2) See if anyone knows why we stopped doing this, I think we may simply have stopped doing this to simplify the bootconfig code in anaconda and because we did not always identify the single OS case correctly, but I wonder if there were other reasons?

Regards,

Hans

Rex Dieter

11:10 a.m.

Michael Watters wrote:

...

Note, this is all about defaults. If *you* like something, you can always modify the configuration to bring it back. -- Rex

John Florian

Saturday, 2 June

10 a.m.

On 06/01/2018 12:10 PM, Rex Dieter wrote:

...

Michael Watters wrote:
> What about users that don't use a graphical login manager? Personally I
> *like* seeing boot messages so that I know what is going on.
>
> Having the menu available is also quite useful for booting into rescue
> mode or selecting a different kernel.

Note, this is all about defaults. If *you* like something, you can always modify the configuration to bring it back.

That still overlooks my chief concern here... what if you need the classic behavior during an install attempt? Are we going to be forced to produce custom live images just so we can see what the hell is going wrong? I love that I can make Fedora my Fedora, but this sounds like a proposal to drive me away the first time I'm bitten by it.

Also, I'm entirely unclear on the scope. Is this affecting Fedora Workstation (GNOME) only or all spins? The proposal says Workstation but it doesn't explicitly exempt the others, unless I missed something. If it's confined to the Workstation/GNOME install, I'm unaffected because that spin already has removed so much choice that I'll never consider it again.

--
John Florian

Chris Murphy

11:41 a.m.

On Sat, Jun 2, 2018 at 9:00 AM, John Florian <jflorian(a)doubledog.org> wrote:

...

Also, I'm entirely unclear on the scope. Is this affecting Fedora Workstation (GNOME) only or all spins? The proposal says Workstation but it doesn't explicitly exempt the others, unless I missed something. If it's confined to the Workstation/GNOME install, I'm unaffected because that spin already has removed so much choice that I'll never consider it again.

It only affects Workstation, and work is being done on the installer side to make sure it only affects Workstation. -- Chris Murphy

Daniel P. Berrangé

Friday, 1 June

11:09 a.m.

On Thu, May 31, 2018 at 12:23:35PM +0200, Hans de Goede wrote:

...

I vaguely recall we lost the hidden menu feature during the grub1 -> grub2 transition, but it has been so long I can't be sure.

Regards,
Daniel
--
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


