Hi Everyone,
I started an update of my F29 system today. Everything seemed to be going fine until the system locked up. I waited for a few minutes and then, hesitantly, I power cycled the computer. Sure enough, the boot didn't start and instead dropped into the grub shell. :/
I managed to manually boot Fedora. I reviewed the dnf history to see which packages were getting upgraded. I ended up redoing the entire operation, rewrote the grub2 config for good measure and rebooted: the same problem occurred again. Drat!
Once the desktop was running again, I decided to only reinstall the kernel packages that were upgraded. That's when I saw this:
Running transaction
  Preparing : 1/1
  Reinstalling : kernel-core-5.0.10-200.fc29.x86_64 1/2
  Running scriptlet: kernel-core-5.0.10-200.fc29.x86_64 1/2
  Running scriptlet: kernel-core-5.0.10-200.fc29.x86_64 2/2
grubby fatal error: unable to find a suitable template
grubby: doing this would leave no kernel entries. Not writing out new config.
I must have missed those errors when I redid the update earlier.
What I've found online so far doesn't reference F29 so I'm not sure what I can do to fix the problem.
I did find the following two broken symlinks:
[ranbir@master ~]$ ls -l /etc/grub*
lrwxrwxrwx. 1 root root 22 Oct 4 2018 /etc/grub2.cfg -> ../boot/grub2/grub.cfg
lrwxrwxrwx. 1 root root 31 Oct 4 2018 /etc/grub2-efi.cfg -> ../boot/efi/EFI/fedora/grub.cfg
I don't know if they're supposed to be broken or not.
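A quick way to tell a dangling symlink from one that only looks broken in a colored ls listing is to test it directly. The following is a minimal sketch; the function name check_link is made up for illustration. Running it from a root shell may matter here, because if /boot/efi is mounted so that ordinary users cannot traverse it, a link into it can appear broken to a non-root user even when the target exists.

```shell
# Classify a path: not a symlink, a symlink whose target exists,
# or a dangling symlink. (check_link is a hypothetical helper name.)
check_link() {
    local path=$1
    if [ ! -L "$path" ]; then
        echo "$path: not a symlink"
    elif [ -e "$path" ]; then
        # -e follows the link, so this branch means the target is reachable
        echo "$path: ok -> $(readlink -f "$path")"
    else
        echo "$path: BROKEN -> $(readlink "$path")"
    fi
}
```

From a root shell, this could be called as: for f in /etc/grub2.cfg /etc/grub2-efi.cfg; do check_link "$f"; done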
Does anyone have any suggestions on how I can fix my now broken grub config?
On Sun, May 5, 2019 at 3:21 PM Ranbir m3freak@thesandhufamily.ca wrote:
Running transaction
  Preparing : 1/1
  Reinstalling : kernel-core-5.0.10-200.fc29.x86_64 1/2
  Running scriptlet: kernel-core-5.0.10-200.fc29.x86_64 1/2
  Running scriptlet: kernel-core-5.0.10-200.fc29.x86_64 2/2
grubby fatal error: unable to find a suitable template
grubby: doing this would leave no kernel entries. Not writing out new config.
Grubby is either not finding the grub.cfg or it can't read it.
I did find the following two broken symlinks:
[ranbir@master ~]$ ls -l /etc/grub*
lrwxrwxrwx. 1 root root 22 Oct 4 2018 /etc/grub2.cfg -> ../boot/grub2/grub.cfg
lrwxrwxrwx. 1 root root 31 Oct 4 2018 /etc/grub2-efi.cfg -> ../boot/efi/EFI/fedora/grub.cfg
I don't know if they're supposed to be broken or not.
They should not be broken. It implies a problem with either /boot or /boot/efi depending on the type of firmware you have. What do you get for:
# cat /etc/fstab
# blkid
# efibootmgr -v
On Sun, 2019-05-05 at 22:16 -0600, Chris Murphy wrote:
They should not be broken. It implies a problem with either /boot or /boot/efi depending on the type of firmware you have. What do you get for:
# cat /etc/fstab
# blkid
# efibootmgr -v
Here you go:
https://paste.ofcode.org/Tds5vBCXswJ2sHUu64S9G4
On Mon, May 6, 2019 at 7:55 AM Ranbir m3freak@thesandhufamily.ca wrote:
On Sun, 2019-05-05 at 22:16 -0600, Chris Murphy wrote:
They should not be broken. It implies a problem with either /boot or /boot/efi depending on the type of firmware you have. What do you get for:
# cat /etc/fstab
# blkid
# efibootmgr -v
Here you go:
I see two minor anomalies:
/boot is XFS, should not be a problem. NVRAM contains dup entries, Boot0001 and Boot0008.
Boot0001 is pointing to the new naming for shim, and it's also the default and current boot. Therefore, Boot0008 can be deleted. It's not causing the problem, so you can also just leave it alone.
I don't think /boot on XFS is related either, it should work fine.
What's the URL you get for this command?
$ sudo cat /boot/efi/EFI/fedora/grub.cfg | fpaste
And also this:
$ mount | grep vfat
A faster way to get where I'm going, but will erase any evidence of why you're stuck, is to just create a new grub.cfg and then reinstall the kernel you want.
$ sudo grub2-mkconfig -o /boot/efi/EFI/fedora/grub.cfg
Also, question, have you recently run 'grub2-install' on this computer? Don't do it. I'm just wondering if you have done it, it could be related.
On Mon, 2019-05-06 at 17:00 -0600, Chris Murphy wrote:
I see two minor anomalies:
/boot is XFS, should not be a problem. NVRAM contains dup entries, Boot0001 and Boot0008.
Boot0001 is pointing to the new naming for shim, and it's also the default and current boot. Therefore, Boot0008 can be deleted. It's not causing the problem, so you can also just leave it alone.
I had a hell of a time getting Windows 10 installed and then Fedora working. Originally I was doing the Fedora install first, but eventually I realized I was setting up the dual boot incorrectly. I wiped the drives and started again with Windows 10 first.
I have no idea how the dupe entries got there except maybe I didn't wipe the drives correctly.
What's the URL you get for this command?
$ sudo cat /boot/efi/EFI/fedora/grub.cfg | fpaste
Empty. That is, I get back nothing.
And also this:
$ mount | grep vfat
https://paste.ofcode.org/JbTd7BHQ9aNAwtprsLBEee
A faster way to get where I'm going, but will erase any evidence of why you're stuck, is to just create a new grub.cfg and then reinstall the kernel you want.
$ sudo grub2-mkconfig -o /boot/efi/EFI/fedora/grub.cfg
I do want to just fix it, but I want to know what's going on more. :)
Also, question, have you recently run 'grub2-install' on this computer? Don't do it. I'm just wondering if you have done it, it could be related.
No, I didn't do that. All I've done is the release upgrade from F28 to F29 (no errors, no left over rpms) and then my first F29 update, which caused the grub problem.
On 5/7/19 7:10 AM, Ranbir wrote:
On Mon, 2019-05-06 at 17:00 -0600, Chris Murphy wrote:
NVRAM contains dup entries, Boot0001 and Boot0008.
Boot0001 is pointing to the new naming for shim, and it's also the default and current boot. Therefore, Boot0008 can be deleted. It's not causing the problem, so you can also just leave it alone.
I had a hell of a time getting Windows 10 installed and then Fedora working. Originally I was doing the Fedora install first, but eventually I realized I was setting up the dual boot incorrectly. I wiped the drives and started again with Windows 10 first.
I have no idea how the dupe entries got there except maybe I didn't wipe the drives correctly.
Wiping the drives doesn't affect UEFI boot entries. They are stored in the internal flash memory.
And always install Windows first. It is usually not very friendly about sharing with other operating systems.
On Tue, May 7, 2019 at 8:11 AM Ranbir m3freak@thesandhufamily.ca wrote:
On Mon, 2019-05-06 at 17:00 -0600, Chris Murphy wrote:
I see two minor anomalies:
/boot is XFS, should not be a problem. NVRAM contains dup entries, Boot0001 and Boot0008.
Boot0001 is pointing to the new naming for shim, and it's also the default and current boot. Therefore, Boot0008 can be deleted. It's not causing the problem, so you can also just leave it alone.
I had a hell of a time getting Windows 10 installed and then Fedora working. Originally I was doing the Fedora install first, but eventually I realized I was setting up the dual boot incorrectly. I wiped the drives and started again with Windows 10 first.
I have no idea how the dupe entries got there except maybe I didn't wipe the drives correctly.
What's the URL you get for this command?
$ sudo cat /boot/efi/EFI/fedora/grub.cfg | fpaste
Empty. That is, I get back nothing.
That is the likely reason why grubby spits back an error message.
And also this:
$ mount | grep vfat
OK the EFI system partition is present and mounted rw.
Is the grub.cfg even present? Maybe it's zero length?
# ls -l /boot/efi/EFI/fedora/grub.cfg
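The same question (missing versus zero length) can also be answered with shell file tests rather than by reading ls output. This is only a sketch, and cfg_state is a made-up helper name; it needs a root shell if the ESP is not readable by ordinary users.

```shell
# Print "missing", "empty" or "ok" for the given file.
# (cfg_state is a hypothetical helper name.)
cfg_state() {
    if [ ! -e "$1" ]; then
        echo missing
    elif [ ! -s "$1" ]; then
        # -s is true only for files with a size greater than zero
        echo empty
    else
        echo ok
    fi
}
```

For example: cfg_state /boot/efi/EFI/fedora/grub.cfg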
A faster way to get where I'm going, but will erase any evidence of why you're stuck, is to just create a new grub.cfg and then reinstall the kernel you want.
$ sudo grub2-mkconfig -o /boot/efi/EFI/fedora/grub.cfg
I do want to just fix it, but I want to know what's going on more. :)
Whether it's missing or zero length is curious, but once we know which, it must be fixed, so you can go ahead and run the grub2-mkconfig command now.
Also, question, have you recently run 'grub2-install' on this computer? Don't do it. I'm just wondering if you have done it, it could be related.
No, I didn't do that. All I've done is the release upgrade from F28 to F29 (no errors, no left over rpms) and then my first F29 update, which caused the grub problem.
Weird. I'm not sure what happened to the grub.cfg.
You could dig through the journal, find the upgrade portion of the log, and search for grub. You can use 'sudo journalctl --list-boots' to help narrow down which boot is the offline boot in which the upgrade was performed. All of those boots have negative offsets, so you use them like 'sudo journalctl -b -5'. Then you could grep your suspects for grub and see if maybe something crashed or spat out an error during the upgrade.
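The journal search described above could be sketched roughly as follows. The function name grub_lines_for_boot is invented for illustration, and the offset -5 is just the example value from the message; the right offset has to be read from the output of journalctl --list-boots.

```shell
# Print journal lines from one boot that mention grub (this also matches
# grubby and GRUB2, since the match is case-insensitive).
# $1 is a boot offset as listed by `journalctl --list-boots`, e.g. -5.
# (grub_lines_for_boot is a hypothetical helper name.)
grub_lines_for_boot() {
    journalctl --no-pager -b "$1" | grep -i 'grub'
}
```

Typical use from a root shell would be to run journalctl --list-boots first, then grub_lines_for_boot -5 for each suspect boot.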
On 5/7/19 10:42 AM, Chris Murphy wrote:
Whether it's missing or zero length is curious, but once we know which, it must be fixed, so you can go ahead and run the grub2-mkconfig command now.
It must be empty because otherwise the "cat" command would have given an error. A very surprising situation though.
On Tue, May 7, 2019 at 11:53 AM Samuel Sieb samuel@sieb.net wrote:
On 5/7/19 10:42 AM, Chris Murphy wrote:
Whether it's missing or zero length is curious, but once we know which, it must be fixed, so you can go ahead and run the grub2-mkconfig command now.
It must be empty because otherwise the "cat" command would have given an error. A very surprising situation though.
I've seen zero length grub.cfg before, but it was on XFS, and it was because of an unclean umount, and GRUB doesn't support journal replay, so it could only read stale file system metadata. After coaxing normal boot, kernel XFS code did journal replay and the grub.cfg was no longer zero length.
That isn't what's going on here, FAT doesn't have a journal. It might not be a bad idea to unmount /boot/efi and do an fsck on the ESP using -av flags. On mount, the vfat driver sets the dirty bit on the file system; and /etc/fstab as created by anaconda calls for running fsck if the dirty bit is present. So...I doubt the fsck will find anything that it can fix, but ... wheee!
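For reference, the unmount-and-fsck step might look something like the sketch below. It assumes the ESP is mounted at /boot/efi and listed in /etc/fstab, that findmnt (util-linux) and fsck.vfat (dosfstools) are installed, and that it is run from a root shell. The function name check_esp and the FSCK variable are made up for illustration.

```shell
# Check the FAT file system on the ESP with the -av flags mentioned above.
# Assumptions: run as root; the ESP has an /etc/fstab entry so that
# `mount <mountpoint>` can remount it. (check_esp and FSCK are
# hypothetical names.)
FSCK=${FSCK:-fsck.vfat}

check_esp() {
    local esp_mount dev status
    esp_mount=${1:-/boot/efi}
    # Look up the block device behind the mount point before unmounting
    dev=$(findmnt -n -o SOURCE "$esp_mount") || return 1
    umount "$esp_mount" || return 1
    "$FSCK" -av "$dev"
    status=$?
    # Remount regardless of the fsck result, then report that result
    mount "$esp_mount"
    return $status
}
```

Calling check_esp with no argument uses /boot/efi; a different mount point can be passed as the first argument.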
On Tue, 2019-05-07 at 11:42 -0600, Chris Murphy wrote:
Is the grub.cfg even present? Maybe it's zero length?
# ls -l /boot/efi/EFI/fedora/grub.cfg
Zero length
Weird. I'm not sure what happened to the grub.cfg.
My desktop did lockup during the update. I couldn't switch to a tty or even ssh into the box. I had to power cycle it.
You can use 'sudo journalctl --list-boots' to help narrow down which boot is the offline boot in which the upgrade was performed. All of those boots have negative offsets, so you use them like 'sudo journalctl -b -5'. Then you could grep your suspects for grub and see if maybe something crashed or spat out an error during the upgrade.
I found the boots that were for the upgrade and found no grub errors. :/
On Mon, 2019-05-06 at 17:00 -0600, Chris Murphy wrote:
A faster way to get where I'm going, but will erase any evidence of why you're stuck, is to just create a new grub.cfg and then reinstall the kernel you want.
$ sudo grub2-mkconfig -o /boot/efi/EFI/fedora/grub.cfg
I did the above and now grub is OK: rebooting brings up the grub boot menu. However, the /etc/grub* symlinks are still broken, though there appear to be no detrimental effects: I've done another kernel upgrade, which went off without a hitch.
On 05/05/2019 at 23:20, Ranbir wrote:
Hi Everyone,
I started an update of my F29 system today. Everything seemed to be going fine until the system locked up. I waited for a few minutes and then, hesitantly, I power cycled the computer. Sure enough, the boot didn't start and instead dropped into the grub shell. :/
I have had a similar problem ever since kernel updates started using grubby instead of grub2-mkconfig: grubby systematically chooses the wrong partition as the / partition, and I have to run grub2-mkconfig manually if I want to boot my machine after a kernel update.
Is there a way to tell dnf (or whatever...) to use grub2-mkconfig instead of grubby when there is a kernel update?
Thank you
On Mon, May 6, 2019 at 8:25 AM François Patte francois.patte@mi.parisdescartes.fr wrote:
On 05/05/2019 at 23:20, Ranbir wrote:
Hi Everyone,
I started an update of my F29 system today. Everything seemed to be going fine until the system locked up. I waited for a few minutes and then, hesitantly, I power cycled the computer. Sure enough, the boot didn't start and instead dropped into the grub shell. :/
I have had a similar problem ever since kernel updates started using grubby instead of grub2-mkconfig: grubby systematically chooses the wrong partition as the / partition, and I have to run grub2-mkconfig manually if I want to boot my machine after a kernel update.
Is there a way to tell dnf (or whatever...) to use grub2-mkconfig instead of grubby when there is a kernel update?
No, it's not dnf that does this, it's the kernel post-install scripts. And I think it runs grubby twice with different arguments, once for kernel and once for initramfs. So you'd need to replace grubby with a wrapper script of your own design to ignore the first grubby invocation, and upon second invocation to instead just call grub2-mkconfig and output to the EFI system partition.
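A very rough sketch of such a wrapper is below. Everything in it is an assumption for illustration: the names grubby_wrapper, STATE_FILE, GRUB_CFG and MKCONFIG are made up, the arguments grubby receives are simply ignored, and the "skip the first call, act on the second" logic relies on the kernel scripts really calling grubby exactly twice per update, which I have not verified. If the call count differs, the marker file gets out of step.

```shell
# Hypothetical stand-in for grubby: ignore the first invocation, and on
# the second one regenerate the whole config with grub2-mkconfig.
# All names and paths here are assumptions, not part of any package.
STATE_FILE=${STATE_FILE:-/run/grubby-wrapper.seen}
GRUB_CFG=${GRUB_CFG:-/boot/efi/EFI/fedora/grub.cfg}
MKCONFIG=${MKCONFIG:-grub2-mkconfig}

grubby_wrapper() {
    if [ ! -e "$STATE_FILE" ]; then
        # First invocation: leave a marker and do nothing else
        : > "$STATE_FILE"
        return 0
    fi
    # Second invocation: clear the marker and write a fresh grub.cfg
    rm -f "$STATE_FILE"
    "$MKCONFIG" -o "$GRUB_CFG"
}
```

A script installed in place of grubby would end with a call such as grubby_wrapper "$@". Keeping the original grubby binary under another name would make it easy to back out.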
Alternatively, upgrade to Fedora 30, where grubby is no longer used, and chances are you won't have this problem anymore. The upstream grubby package is replaced there by a grubby wrapper script that exists strictly for compatibility, mainly for users; it's not the real deal and isn't normally used anymore. Instead, kernel post-install scripts write out per-kernel, bootloaderspec-compatible snippets into /boot/loader/entries.
The only failure I'm aware of for grubby on Fedora 29 and older is when /boot is on btrfs *and* the top level subvolume is mounted somewhere at the time grubby runs. If you make sure the top level of the Btrfs volume isn't mounted when doing kernel updates, the problem is avoided.