Not sure how to title this issue but I'd appreciate advice. A laptop running F34 crashed last night and won't start properly since. The only errors I can see and find in the logs indicate some unknown issue mounting the /home filesystem. The system has /boot and an LVM partition with / and /home. / and /boot mount successfully but the startup drops to emergency mode. After I enter the root password, I can run "vgchange -a y; mount /home" and /home is immediately mounted successfully, no problem. I can then issue ^D and the boot seems to complete. However, the network is not started and no gettys are running on other PTYs.
It seems apparent to me that there is no problem with the LVM partition or the /home filesystem. So I don't understand why startup is failing nor how to discover the true cause.
BTW, and not likely related, but if I try to boot from the latest kernel (5.14.11), the screen goes very dim after the mode is changed, making it very hard to see what is going on, and the keys to brighten the display seem inoperative. Booting from kernel 5.13.19 doesn't have that effect though the same startup problem happens.
Off to get a COVID booster in the morning so further investigation will continue after I return.
On 10/20/21 10:01 PM, Dave Close wrote:
It seems apparent to me that there is no problem with the LVM partition or the /home filesystem. So I don't understand why startup is failing nor how to discover the true cause.
I'm not familiar with LVM, but I'm sure there's an equivalent to fsck for it. You might want to boot from a LiveUSB and run it while the partition isn't mounted to make sure there aren't any problems there.
On Oct 21, 2021, at 00:51, Joe Zeff joe@zeff.us wrote:
I'm not familiar with LVM, but I'm sure there's an equivalent to fsck for it. You might want to boot from a LiveUSB and run it while the partition isn't mounted to make sure there aren't any problems there.
LVM is not a file system, just a logical volume manager (hence the initials), which provides logical volumes upon which a file system is written. So you’d run the traditional fsck program once the volumes have been assembled.
-- Jonathan Billings
On 10/21/21 5:15 PM, Jonathan Billings wrote:
LVM is not a file system, just a logical volume manager (hence the initials), which provides logical volumes upon which a file system is written. So you’d run the traditional fsck program once the volumes have been assembled.
Yes, I know what LVM is, although I don't use it. In any case, you can't run fsck when the volume is mounted, and as that LVM contains root, the best bet is to boot from a LiveUSB and then run fsck.
Since it is home, I would edit /etc/fstab and change "defaults" to "defaults,nofail". That will let the system boot even if/when home is missing. Then you can look at what is going on with home with the system fully booted and all tools available.
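For anyone following along, the fstab change can be previewed before touching the real file. This is only a sketch: the helper name add_nofail and the device names in the usage note are made up for illustration, and it prints the result instead of writing anything. Note that awk re-joins the fields of the changed line with single spaces.

```shell
# Sketch only: print an fstab with "nofail" appended to the options of one
# mount point. The function name is hypothetical; nothing is modified in place.
# Usage: add_nofail FSTAB MOUNTPOINT
add_nofail() {
    awk -v mp="$2" '
        # Pass comments and short/blank lines through unchanged.
        /^[[:space:]]*#/ || NF < 4 { print; next }
        # Field 2 is the mount point, field 4 the comma-separated options.
        $2 == mp && index("," $4 ",", ",nofail,") == 0 { $4 = $4 ",nofail" }
        { print }
    ' "$1"
}
```

For example, "add_nofail /etc/fstab /home > /tmp/fstab.new" followed by "diff /etc/fstab /tmp/fstab.new" shows the one changed line before anything is copied back.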
Rule #1: avoid emergency mode and get the system on the network.
If it then boots without emergency mode but without network, run these 2 commands:
cat /proc/cmdline
systemctl get-default
There are ways to prevent a VG/LV from being turned on during boot up that could cause this sort of issue.
systemctl status home.mount
should tell you the error it thinks it got.
Also, before you do anything else, run "lvs" and see what the state of the home LV is.
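The checks above can be gathered into one report to paste back to the list. A rough sketch, assuming a POSIX shell: run_step and boot_report are names invented here, and the commands are only the ones mentioned in this message.

```shell
# Sketch: run one command under a header line, and note a missing command
# or a non-zero exit status instead of stopping.
run_step() {
    printf '== %s ==\n' "$*"
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1 || printf '(exit status %s)\n' "$?"
    else
        printf '(%s not found)\n' "$1"
    fi
}

# Collect the four checks in one go (run as root so lvs can see the VG).
boot_report() {
    run_step cat /proc/cmdline
    run_step systemctl get-default
    run_step systemctl --no-pager status home.mount
    run_step lvs
}
```

Running "boot_report > /tmp/boot-report.txt" from the emergency shell, before activating anything by hand, should capture the state that the boot actually left behind.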
On Wed, Oct 20, 2021 at 11:02 PM Dave Close dave@compata.com wrote:
Off to get a COVID booster in the morning so further investigation will continue after I return.
--
Dave Close, Compata, Irvine CA +1 714 434 7359
dave@compata.com dhclose@alumni.caltech.edu
"Quantum computing is a marvelous way to show the non-intuitive nature of quantum mechanics." -Gordon Moore
_______________________________________________
users mailing list -- users@lists.fedoraproject.org
To unsubscribe send an email to users-leave@lists.fedoraproject.org
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/users@lists.fedoraproject.org
Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure
I asked:
Not sure how to title this issue but I'd appreciate advice. A laptop running F34 crashed last night and won't start properly since. The only errors I can see and find in the logs indicate some unknown issue mounting the /home filesystem. The system has /boot and an LVM partition with / and /home. / and /boot mount successfully but the startup drops to emergency mode. After I enter the root password, I can run "vgchange -a y; mount /home" and /home is immediately mounted successfully, no problem. I can then issue ^D and the boot seems to complete. However, the network is not started and no gettys are running on other PTYs.
It seems apparent to me that there is no problem with the LVM partition or the /home filesystem. So I don't understand why startup is failing nor how to discover the true cause.
Roger Heflin answered:
Since it is home, I would edit /etc/fstab and change "defaults" to "defaults,nofail". That will let the system boot even if/when home is missing. Then you can look at what is going on with home with the system fully booted and all tools available.
Done, and that helps a lot. Thanks.
systemctl status home.mount
should tell you the error it thinks it got.
The error is "dependency". The trick seems to be discovering what that dependency is. I've found a few minor problems and I think I've fixed them but /home still doesn't mount during startup.
The strangest thing I've found is that the files /etc/dbus-1/system.d/com.redhat.NewPrinterNotification.conf and /etc/dbus-1/system.d/com.redhat.PrinterDriversInstaller.conf were both empty. Without a network, I typed in what I see on another machine.
Currently, the only seemingly serious error I see is that zram0 swap isn't starting. The swap LV is properly configured, so it doesn't seem that this should be a /home dependency.
I've currently reached a point where the network starts so my next task will be to verify recently updated RPMs. Other ideas welcome.
run "systemd-analyze critical-chain home.mount" and it will show you the requirements.
I would suspect something going wrong with the activation of the home lv.
On boot up, run "lvs" and post that info. The Attr column will show if it is activated or not.
And if you find a dependency that is not working, run "systemctl status" against it, and that should show you what error it got.
Is home its own lv or on the vg with root?
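To make the Attr check concrete: in lvs output the fifth character of the Attr field is the state, and "a" means active. A small sketch that lists the LVs that are not active; the function name is hypothetical and so are any VG/LV names you feed it.

```shell
# Sketch: read "lv_name vg_name lv_attr" lines on stdin and print vg/lv for
# every LV whose fifth Attr character (the state) is not "a" (active).
inactive_lvs() {
    awk 'NF >= 3 && substr($3, 5, 1) != "a" { print $2 "/" $1 }'
}

# Intended use on the affected machine (needs root):
#   lvs --noheadings -o lv_name,vg_name,lv_attr | inactive_lvs
```

If the home LV shows up in that list right after a failed boot, the mount failure is an activation problem and not a filesystem problem.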
(top-posted due to the length of this thread)
Any way you cut this, even if you get the problem fixed, you can no longer trust that this machine is sane. You have suffered some kind of critical corruption, and who knows if you've corrected it or whether there is more undiscovered damage or loss. The kernel modules had serious issues for most of the F34 life that caused unclean shutdowns to regularly occur, mostly in the i915 and systemd bits. Is this system properly updated? Laptops have a hard life, especially when the drive is bumped on the desk.
The best solution at this point is to thoroughly check your hardware for failing components, fix them, reinstall and recover what you need from backups. Maybe moving up to a less complex storage system with built-in volume and raid management and dynamic error detection/correction (like btrfs or zfs) would also be a better move at this point.
--
John Mellor
On Fri, 2021-10-22 at 09:09 -0400, John Mellor wrote:
Maybe moving up to a less complex storage system with built-in volume and raid management and dynamic error detection/correction (like btrfs or zfs) would also be a better move at this point.
I've always queried the point of using LVM by default. I'd suggest that *MOST* users only have one hard drive, especially if it's a laptop. So multi-drive-spanning systems aren't needed.
If you do want multi-drive-spanning storage, then you probably have the knowledge to set up a system using an alternative filing system. If you don't, you really should, because you're in for a world of fun and games if it goes belly up.
Yes, I'm aware LVM isn't *all* about multidisk, but it's one of its prime features.
LVM is also used to make separate LV's such that critical filesystems can have their own space and be protected against another filesystem filling up (if you only had a single filesystem).
There are reasons to use it, especially if you don't want filling up a /data only filesystem to impact the OS.
On Fri, Oct 22, 2021 at 8:38 AM Tim via users users@lists.fedoraproject.org wrote:
On Fri, 2021-10-22 at 09:09 -0400, John Mellor wrote:
Maybe moving up to a less complex storage system with built-in volume and raid management and dynamic error detection/correction (like btrfs or zfs) would also be a better move at this point.
I've always queried the point of using LVM by default. I'd suggest that *MOST* users only have one hard drive, especially if it's a laptop. So multi-drive-spanning systems aren't needed.
If you do want multi-drive-spanning storage, then you probably have the knowledge to set up a system using an alternative filing system. If you don't, you really should, because you're in for a world of fun and games if it goes belly up.
Yes, I'm aware LVM isn't *all* about multidisk, but it's one of its prime features.
--
uname -rsvp Linux 3.10.0-1160.42.2.el7.x86_64 #1 SMP Tue Sep 7 14:49:57 UTC 2021 x86_64
Boilerplate: All unexpected mail to my mailbox is automatically deleted. I will only get to see the messages that are posted to the mailing list.
On Oct 22, 2021, at 13:16, Roger Heflin rogerheflin@gmail.com wrote:
LVM is also used to make separate LV's such that critical filesystems can have their own space and be protected against another filesystem filling up (if you only had a single filesystem).
There are reasons to use it, especially if you don't want filling up a /data only filesystem to impact the OS.
While you can do this with separate partitions with file systems on them, the advantage of LVM is that you can resize the volumes and not have to repartition. Root volume running out of space? Grow it and take the space from your Home volume.
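As a rough illustration of that kind of resize, here is a dry-run sketch that only prints the commands it would use. The function name, the VG/LV names and the size are invented for the example. It assumes a filesystem that can shrink (ext4 can, XFS cannot), and shrinking a filesystem is the risky half, so have a backup first.

```shell
# Dry-run sketch: print (do not run) the commands to move SIZE from one LV
# to another in the same VG. All names here are placeholders.
# Usage: plan_move_space VG FROM_LV TO_LV SIZE
plan_move_space() {
    # Shrink the donor LV and its filesystem together.
    printf 'lvreduce --resizefs -L -%s %s/%s\n' "$4" "$1" "$2"
    # Grow the receiving LV and its filesystem together.
    printf 'lvextend --resizefs -L +%s %s/%s\n' "$4" "$1" "$3"
}
```

For instance, "plan_move_space fedora home root 10G" prints the lvreduce line for fedora/home followed by the lvextend line for fedora/root, which can be reviewed before running either by hand.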
I’ve also used the fact that LVM allows you to seamlessly transfer data to a new disk, live, when moving to new hardware.
Eventually, I hope btrfs makes LVM obsolete.
— Jonathan Billings
Roger Heflin rogerheflin@gmail.com
LVM is also used to make separate LV's such that critical filesystems can have their own space and be protected against another filesystem filling up (if you only had a single filesystem).
Jonathan Billings:
While you can do this with separate partitions with file systems on them, the advantage of LVM is that you can resize the volumes and not have to repartition. Root volume running out of space? Grow it and take the space from your Home volume.
I’ve also used the fact that LVM allows you to seamlessly transfer data to a new disk, live, when moving to new hardware.
Other filing systems can do all of that, too (three simple large partitions for boot, system root, and home does it very well). But, I'm certain that most people will not be able to manage changing any of that after the fact.
*Most* users will not be of the super technical mindset, and most wouldn't need to be, either. The ordinary person will (these days) have a ridiculously large hard drive, and not be a programmer. Computer literacy is not a prerequisite for using a computer any more, as any tech support person will attest to.
Nor even is computer literacy a prerequisite for being a tech support person, either: I had to get my ISP to change their faulty router, that was an exercise in stupidity. You can't phone them, you had to do it over the internet, in a little chat window through a website. So, beforehand, I swapped the failed one for a still working one, then spent an hour trying to tell them, no I can't put the failed one back in and continue chatting to you to test the failed one. I even pointed out that if I put the faulty one in-circuit I would not be able to talk to them any more, but they just couldn't see it. I did try offering to speak to them over the phone, but they couldn't or wouldn't do that.
On Sat, 2021-10-23 at 14:19 +1030, Tim via users wrote:
Nor even is computer literacy a prerequisite for being a tech support person, either: I had to get my ISP to change their faulty router, that was an exercise in stupidity. You can't phone them, you had to do it over the internet, in a little chat window through a website. So, beforehand, I swapped the failed one for a still working one, then spent an hour trying to tell them, no I can't put the failed one back in and continue chatting to you to test the failed one. I even pointed out that if I put the faulty one in-circuit I would not be able to talk to them any more, but they just couldn't see it. I did try offering to speak to them over the phone, but they couldn't or wouldn't do that.
I've had similar experiences, and ended up switching to a different ISP a couple of years ago. My current ISP provides Internet, nothing else (no TV packages, no mobile packages etc.) and do that extremely well. They also have 5-star technical support and if you need to call them you get someone who knows what they're doing and doesn't talk down to you as soon as they realise you have technical knowledge yourself.
Vote with your wallet.
poc
On 23/10/2021 18:04, Patrick O'Callaghan wrote:
I've had similar experiences, and ended up switching to a different ISP a couple of years ago. My current ISP provides Internet, nothing else (no TV packages, no mobile packages etc.) and do that extremely well. They also have 5-star technical support and if you need to call them you get someone who knows what they're doing and doesn't talk down to you as soon as they realise you have technical knowledge yourself.
Vote with your wallet.
I don't know how they respond to the local Taiwanese, but I can say that when I have a network problem the English-speaking support I get from Chunghwa Telecom is super. After days of poor performance, I told them there was an issue with one of their IPv6 edge routers causing unusually high latency, and they understood what I was telling them. They asked for my analysis and fixed the problem within 24hrs.
-- On Facebook it is called Vaguebooking.
On 2021-10-22 1:14 p.m., Roger Heflin wrote:
LVM is also used to make separate LV's such that critical filesystems can have their own space and be protected against another filesystem filling up (if you only had a single filesystem).
. . .
Yup, but btrfs and zfs also do the same thing, except more elegantly. One thing that btrfs does NOT do, on Fedora only at this time, is fs encryption, which is super useful on a laptop. I'm unsure why Fedora is still using the clunky old encryption layer mechanism instead of the builtin one.
--
John Mellor
On Fri, Oct 22, 2021 at 03:02:29PM -0400, John Mellor wrote:
Yup, but btrfs and zfs also do the same thing, except more elegantly. One thing that btrfs does NOT do only on Fedora at this time is fs encryption, which is super useful on a laptop. I'm unsure why Fedora is still using the clunky old encryption layer mechanism instead of the builtin one.
Btrfs doesn't have a working encryption feature. In the meantime, people use luks/dm-crypt. I'm not sure whether you consider that the clunky old encryption or the built-in one.
John Mellor wrote:
Anyway you cut this, even if you get the problem fixed, you can no longer trust that this machine is sane. You have suffered some kind of critical corruption, and who knows if you've corrected it or whether there is more undiscovered damage or loss. ...
Yep, I've come to the same conclusion. This laptop initially had a small SSD. I've added another larger one but the partitioning isn't optimum anyway. I suspect that /root ran out of space overnight and led to all this trouble.
Roger Heflin wrote:
run "systemd-analyze critical-chain home.mount" and it will show you the requirements. And if you find a dependency that is not working, run "systemctl status" against it, and that should show you what error it got.
# systemd-analyze critical-chain home.mount
home.mount @2min 29.727s
`-local-fs-pre.target @3.739s
  `-lvm2-monitor.service @955ms +1.176s
    `-dm-event.socket @912ms
      `-system.slice
        `--.slice
with the "lvm2" line in red. But status on that service does not show an error.
On boot up, run "lvs" and post that info. The Attr column will show if it is activated or not.
Not activated on boot.
Thanks for the help and advice. But I think John is right, it's time to start over on this laptop.
There are only 2-3 ways to get it not activated on boot.
This is a misconfig of some sort, not random breakage (unless it is lvmetad).
Check the output of "cat /proc/cmdline". Also disable/mask lvm2-lvmetad if your Fedora version has it; it causes weird LVM issues (i.e. random failures to find/enable VGs).
Did "vgchange -ay" activate it and/or did you have to add the vgname on the line?
There are some volume_list options in the /etc/lvm/lvm.conf file that people sometimes use that will cause odd behavior.
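One way to look for such settings is to list the uncommented volume_list and auto_activation_volume_list lines in lvm.conf. A minimal sketch: the function name is made up, and it only matches simple one-line assignments.

```shell
# Sketch: show uncommented volume_list / auto_activation_volume_list
# assignments in an lvm.conf-style file (default: /etc/lvm/lvm.conf).
# Prints nothing and returns non-zero when no such line is set.
lvm_activation_lists() {
    grep -nE '^[[:space:]]*(auto_activation_)?volume_list[[:space:]]*=' \
        "${1:-/etc/lvm/lvm.conf}"
}
```

An empty result means neither list is set in that file, so activation is not being restricted from there and the cause has to be looked for elsewhere.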
I won't touch btrfs/zfs, I have had too many first person horror stories in the last 3 years to trust either with my data.
On Fri, 22 Oct 2021 at 18:11, Roger Heflin rogerheflin@gmail.com wrote:
There are only 2-3 ways to get it not activated on boot.
This is a misconfig of some sort, not random breakage (unless it is lvmetad).
Check the output of "cat /proc/cmdline". Also disable/mask lvm2-lvmetad if your Fedora version has it; it causes weird LVM issues (i.e. random failures to find/enable VGs).
Did "vgchange -ay" activate it and/or did you have to add the vgname on the line?
There are some volume_list options in the /etc/lvm/lvm.conf file that people sometimes use that will cause odd behavior.
I won't touch btrfs/zfs, I have had too many first person horror stories in the last 3 years to trust either with my data.
The risks associated with using btrfs/zfs depend on your use case, but so does your expertise in configuring and maintaining a filesystem.
Many IT groups and users just reinstall the OS and fetch data from cloud storage when user workstations have filesystem problems.
A lot has happened to many Linux desktop users in the last 3 years. Technologies moved on, and COVID-19 has many people wanting the flexibility of a laptop. Ubuntu 20.04 offered zfs as an experimental installer option, and Fedora 33 made btrfs the default for its desktop editions. Distros need good reasons for changing the filesystem. SSDs have replaced platters, people are using clouds for offsite storage, and ever-larger data sets make bitrot more of a concern. File compression with btrfs is claimed to significantly extend the life of SSDs (but copy-on-write increases wear). Laptops replacing desktops increases the importance of encryption and makes a UPS redundant if your external storage is in some cloud.