Did you do a lazy unmount? Any other umount will fail if there are
open files, and lazy does not actually unmount it until whatever has
the files open closes them, which generally means it never gets
unmounted.
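
One way to see what still has files open under /boot before trying a plain umount is to walk the /proc/<pid>/fd symlinks, which is similar in spirit to what fuser shows. Below is a minimal Python sketch, assuming a Linux-style /proc layout. The function name open_files_under and its proc_root parameter are made up for illustration, and it only looks at open file descriptors (not working directories or memory maps), so treat it as a rough check rather than a replacement for fuser. It needs to run as root to see other users' processes.

```python
import os


def open_files_under(mount_point, proc_root="/proc"):
    """Return {pid: [paths]} for open file descriptors under mount_point.

    Hypothetical helper: only /proc/<pid>/fd is inspected, so a process
    whose cwd or mmap keeps the filesystem busy is not reported.
    """
    mount_point = mount_point.rstrip("/") or "/"
    # The trailing slash stops /boot from matching /bootstrap.
    prefix = mount_point if mount_point == "/" else mount_point + "/"
    holders = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "sys"
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or permission denied
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor closed while we were looking
            if target == mount_point or target.startswith(prefix):
                holders.setdefault(int(entry), []).append(target)
    return holders
```

Calling open_files_under("/boot") and printing the result gives the PIDs that would have to close their files (or be stopped) before a non-lazy umount can go through.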
On Fri, May 8, 2020 at 5:53 PM John Mellor <john.mellor(a)gmail.com> wrote:
>
> I like the idea of just removing the journal tag, but I don't think that
> I can modify the /boot filesystem. Doing a umount works, but tune2fs
> claims that an e2fsck is required. Running e2fsck says that the
> filesystem is still mounted, even though it is not. Doing a fuser
> /dev/sda1 shows a large number of /proc/fd entries using it, even though
> it successfully umounted.
>
> So, something is referencing the filesystem in a very bad way, and not
> as a mounted fs.
>
> So, I'm still stuck. How do you successfully modify /boot to not be
> journalled?
>
>
> On 2020-05-08 3:25 p.m., Roger Heflin wrote:
> > You have to hit the timing right, i.e. install the kernel package and
> > reboot as quickly as possible (automated, or very efficient).
> >
> > And if the update is more than just the kernel, that may slow down the
> > process enough that the immediate reboot won't be quick enough.
> >
> > I have seen it 3-5 times, and that is over a huge number of machines.
> > In those cases the machines were booted and failed multiple times before
> > someone booted a livecd and fsck'ed and/or mounted /boot, and grub
> > found the files after that.
> >
> > On Fri, May 8, 2020 at 2:00 PM Mauricio Tavares <raubvogel(a)gmail.com> wrote:
> >> On Fri, May 8, 2020 at 12:12 PM Roger Heflin <rogerheflin(a)gmail.com> wrote:
> >>> A sync will flush the writes to the journal, where the data is safe. It
> >>> will not force a replay of the journal.
> >>>
> >>> Nothing except removing the journal from the ext4 filesystem will fix it.
> >>>
> >>> This is not a Fedora bug; it is a long-standing
> >>> kernel/grub/filesystem interaction bug (everyone who uses a journaled
> >>> filesystem for /boot has this bug).
> >>>
> >>> See tune2fs: something like -O ^has_journal will turn off the
> >>> journal. It has to be done with the filesystem unmounted, and you should
> >>> verify that your fstab entry will remount it.
> >>>
> >>> Check /proc/mounts: an entry with data=XXX (probably ordered) says you
> >>> have a journal. After the umount, the tune2fs above, and the remount,
> >>> the data=ordered will be gone.
> >>>
> >> Interesting. I have a box which has been running for years with
> >>
> >> /dev/sdb1 /boot ext4 rw,seclabel,noatime,barrier=1,stripe=32,data=ordered,discard 0 0
> >>
> >> and so far it has never borked on me.
> >>
> >>> On Fri, May 8, 2020 at 10:53 AM John Mellor <john.mellor(a)gmail.com> wrote:
> >>>> Interesting! This machine does reboot in about 5 secs and the other
> >>>> machines take longer, so it makes sense. My /boot is mounted just like
> >>>> /home and / as follows:
> >>>>
> >>>> /dev/sda1 on /boot type ext4 (rw,relatime,seclabel)
> >>>>
> >>>> I assume that a simple sync would flush the journal. It's pretty easy to
> >>>> do a sync;sync if updating using the CLI, but not possible when using
> >>>> the GUI. Is this a Fedora bug where the journal is not correctly
> >>>> flushed on the reboot? Should I modify that mount entry or do a chattr
> >>>> change to work around the bug?
> >>>>
> >>>>
> >>>> On 2020-05-08 11:11 a.m., Roger Heflin wrote:
> >>>>> What you are describing does not exactly match what I have previously
> >>>>> seen, but there is a known feature with using a journaling filesystem
> >>>>> (ext4 with a journal, or xfs) for /boot: if only the journal has been
> >>>>> updated and it has not yet been replayed into the main filesystem, then
> >>>>> grub will not be able to find the new or updated files (grub's
> >>>>> filesystem code is simple and does not process the journal, so if
> >>>>> critical updates are still in the journal then those updates (changed
> >>>>> files, new files) cannot be seen). To hit this, one generally has to do
> >>>>> the update and almost immediately reboot (though within a few minutes in
> >>>>> some cases; note that syncing does not replay the journal). The fix is
> >>>>> to boot up with a kernel that it can still find and/or a livecd and
> >>>>> mount /boot so that the journal gets replayed, or fsck /boot so that the
> >>>>> journal gets replayed.
> >>>>>
> >>>>> Long term, the solution is to move /boot to a non-journaled fs (ext
> >>>>> without a journal) or, after each update, to umount/mount /boot (before
> >>>>> the reboot). If /boot is not a separate filesystem then you cannot
> >>>>> umount/mount it to get the journal to replay. There is a second method
> >>>>> to force a journal replay, but reports say that one often "hangs" when
> >>>>> /boot is not separate, so it is not a reliable solution. There were some
> >>>>> detailed posts on this several years ago, with reliable commenters
> >>>>> confirming the behavior. I have also personally seen the issue a
> >>>>> number of times, and mounting /boot and/or fscking it corrects it
> >>>>> (replays the journal).
> >>>>>
> >>>>> On Fri, May 8, 2020 at 8:52 AM John Mellor <john.mellor(a)gmail.com> wrote:
> >>>>>> I have one completely stock workstation F32 machine where kernel updates
> >>>>>> almost always cause a multiple-reboot panic problem. This problem also
> >>>>>> occurred on F31, but not on releases before that. I'm stumped and need
> >>>>>> some help in figuring it out.
> >>>>>>
> >>>>>> The symptoms vary in the number of reboots and the type of tertiary
> >>>>>> error, but are otherwise pretty similar. It does not matter whether I
> >>>>>> use the Gnome update app or the CLI dnf method. After a number of
> >>>>>> reboots, the upgrade succeeds and Fedora behaves normally again. I
> >>>>>> think that this only happens whenever the kernel is upgraded.
> >>>>>>
> >>>>>> What I observe is that the machine is rebooted and on reboot, grub (I
> >>>>>> think) halts with a 32-bit relocation error. This sequence may
> >>>>>> happen twice. It's an i7 with plenty of memory and an SSD boot disk, so
> >>>>>> the 32-bit thing is confusing. To get around this error, I powercycle
> >>>>>> the box and get into the next stage of the problem. On the 2nd or 3rd
> >>>>>> reboot, I usually see a halt with an access outside of the kernel space,
> >>>>>> although with the update this morning, I had a kernel panic instead.
> >>>>>> After cold-booting again, the update is installed, and after one last
> >>>>>> reboot I'm up on the new updates.
> >>>>>>
> >>>>>> After that, the machine behaves normally until the next kernel update.
> >>>>>> I assume that there is some incorrectly-asynchronous operation in grub
> >>>>>> related to the update entry, but I can find no grub logs to dig into
> >>>>>> this problem. I have several other machines that do not see this
> >>>>>> problem. I dug around in the Fedora bugs, but not knowing what to look
> >>>>>> for, I'm basically blind. It's a pretty serious bug, especially if the
> >>>>>> machine is remote. Does anyone have a way out of this?
> >>>>>>
> >>>>>> --
> >>>>>>
> >>>>>> John Mellor
> >
> _______________________________________________
> users mailing list -- users(a)lists.fedoraproject.org