I'm having issues with a VM.
The VM was originally created under VMware and has worked fine for a while. Today when I booted it up instead of seeing the usual MATE login screen I get a login prompt:
f34-01-vm:
no matter what I enter as the login, root or pgaltieri, it never asks for a password and immediately says login incorrect. While it's booting I see several [FAILED] messages, e.g. [FAILED] to start CUPS Scheduler.
I booted the system again and this time it dropped into emergency mode. In emergency mode I see the following messages in dmesg:
BTRFS info (device sda2): flagging fs with big metadata feature
BTRFS info (device sda2): disk space caching is enabled
BTRFS info (device sda2): has skinny extents
BTRFS info (device sda2): start tree-log replay
BTRFS info (device sda2): parent transid verify failed on 61849600 wanted 145639 found 145637
BTRFS info (device sda2): parent transid verify failed on 61849600 wanted 145639 found 145637
BTRFS: error (device sda2) in btrfs_replay_log:2423 errno=-5 IO failure (Failed to recover log tree)
BTRFS error (device sda2) open_ctree failed
I ran btrfs check in emergency mode and it came up with a lot of errors.
How do I recover the partition(s) so I can boot the system, or at least mount them?
Also in emergency mode:
vi /run/initramfs/rdsosreport.txt
results in:
/usr/bin/vi: line 23: /usr/libexec/vi: No such file or directory
/usr/bin/vi is a script:
if test -f /usr/bin/vim; then
    exec /usr/bin/vim "$@"
fi
exec /usr/libexec/vi "$@"
neither /usr/bin/vim nor /usr/libexec/vi exist.
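The wrapper's fallback logic can be sketched as a small shell function (paths are parameterized here so the behavior is visible anywhere; the real script hard-codes /usr/bin/vim and /usr/libexec/vi, and in the emergency initramfs neither exists, which is why it dies at line 23):

```shell
# Sketch of the /usr/bin/vi wrapper: prefer the first editor if its
# binary exists, otherwise fall back to the second.
pick_editor() {
    preferred=$1
    fallback=$2
    if test -f "$preferred"; then
        echo "$preferred"
    else
        echo "$fallback"
    fi
}

# vim is absent, so the wrapper ends up exec'ing the fallback path:
pick_editor /definitely/not/vim /usr/libexec/vi   # prints /usr/libexec/vi
```

Until a real editor is available, the report can still be read with something that does exist in the emergency shell, e.g. cat /run/initramfs/rdsosreport.txt, or less if present.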
======================================================================================
I tried booting the VM under VirtualBox with the same result.
I converted the image:
qemu-img convert -O qcow ../VMware/VMs/f34-01-vm/f34-01-vm.vmdk f34-01-vm.qcow2
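One detail worth noting about that command: -O qcow selects the old qcow version 1 format regardless of the output file's name; the modern format's name is qcow2. A sketch using the same paths as above (qemu-img info then confirms what was actually produced):

```shell
# Same source/destination as the original command; -p just shows progress.
qemu-img convert -p -O qcow2 ../VMware/VMs/f34-01-vm/f34-01-vm.vmdk f34-01-vm.qcow2

# Should report "file format: qcow2":
qemu-img info f34-01-vm.qcow2 | grep 'file format'
```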
which worked without errors. I then ran virt-manager to try to boot the image. This fails with the following error:
Unable to complete install: 'internal error: process exited while connecting to monitor: 2022-03-18T19:13:15.196710Z qemu-system-x86_64: -blockdev {"driver":"file","filename":"/run/media/pgaltieri/SDNVIRTLAB02/VirtualMachines/KVM/f34-01-vm.qcow2","node-name":"libvirt-1-storage","auto-read-only":true,"discard":"unmap"}: Could not open '/run/media/pgaltieri/SDNVIRTLAB02/VirtualMachines/KVM/f34-01-vm.qcow2': Permission denied'
Traceback (most recent call last):
  File "/usr/share/virt-manager/virtManager/asyncjob.py", line 65, in cb_wrapper
    callback(asyncjob, *args, **kwargs)
  File "/usr/share/virt-manager/virtManager/createvm.py", line 2001, in _do_async_install
    installer.start_install(guest, meter=meter)
  File "/usr/share/virt-manager/virtinst/install/installer.py", line 701, in start_install
    domain = self._create_guest(
  File "/usr/share/virt-manager/virtinst/install/installer.py", line 649, in _create_guest
    domain = self.conn.createXML(install_xml or final_xml, 0)
  File "/usr/lib64/python3.9/site-packages/libvirt.py", line 4366, in createXML
    raise libvirtError('virDomainCreateXML() failed')
libvirt.libvirtError: internal error: process exited while connecting to monitor: 2022-03-18T19:13:15.196710Z qemu-system-x86_64: -blockdev {"driver":"file","filename":"/run/media/pgaltieri/SDNVIRTLAB02/VirtualMachines/KVM/f34-01-vm.qcow2","node-name":"libvirt-1-storage","auto-read-only":true,"discard":"unmap"}: Could not open '/run/media/pgaltieri/SDNVIRTLAB02/VirtualMachines/KVM/f34-01-vm.qcow2': Permission denied
I added my user id to both the qemu and libvirt entries in /etc/group, then logged out and back in, and I get the same error. I also get SELinux alerts:
The first alert:
You need to change the label on 'f34-01-vm.qcow2':
# semanage fcontext -a -t virt_image_t '/run/media/pgaltieri/SDNVIRTLAB02/VirtualMachines/KVM/f34-01-vm.qcow2'
# restorecon -v '/run/media/pgaltieri/SDNVIRTLAB02/VirtualMachines/KVM/f34-01-vm.qcow2'
subsequent alerts tell me to run:
# /sbin/restorecon -v /run/media/pgaltieri/SDNVIRTLAB02/VirtualMachines/KVM/f34-01-vm.qcow2
I have run these commands, especially the restorecon, several times and I still get the alerts.
One thing: the semanage command as shown fails with:
ValueError: File spec /run/media/pgaltieri/SDNVIRTLAB02/VirtualMachines/KVM/f34-01-vm.qcow2 conflicts with equivalency rule '/run /var/run'; Try adding '/var/run/media/pgaltieri/SDNVIRTLAB02/VirtualMachines/KVM/f34-01-vm.qcow2' instead
If I add the /var then it works.
Here is the context of the file:
-rwxrwxrwx. 1 pgaltieri pgaltieri system_u:object_r:fusefs_t:s0 15041695744 Mar 18 11:46 f34-01-vm.qcow2*
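That fusefs_t type is the real problem: the image lives on a FUSE-mounted drive (e.g. exfat or NTFS under /run/media), and FUSE filesystems don't store per-file SELinux labels, so restorecon can't actually change the context, which is why the alert keeps coming back no matter how often the commands are run. Two common workarounds, sketched with the paths from the listing above (the target directory is libvirt's default pool and may differ on your system; the device/mountpoint in the second option are hypothetical):

```shell
# Option 1: copy the image onto a native filesystem that supports
# SELinux labels, e.g. libvirt's default storage pool:
sudo cp --sparse=always \
    /run/media/pgaltieri/SDNVIRTLAB02/VirtualMachines/KVM/f34-01-vm.qcow2 \
    /var/lib/libvirt/images/
sudo restorecon -v /var/lib/libvirt/images/f34-01-vm.qcow2

# Option 2: remount the external drive with a context= option so every
# file on it appears labeled for virt use:
# sudo mount -o context=system_u:object_r:virt_image_t:s0 /dev/sdb1 /mnt/vms
```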
So how the heck do I boot the image and get it running?
Paolo
On Fri, 18 Mar 2022 at 19:47, Paolo Galtieri pgaltieri@gmail.com wrote:
I'm having issues with a VM.
It would be useful to mention the host OS. From the name, I guess your VM is running Fedora 34.
The VM was originally created under VMware and has worked fine for a while. Today when I booted it up instead of seeing the usual MATE login screen I get a login prompt:
f34-01-vm:
no matter what I enter as the login, root or pgaltieri, it never asks for a password and immediately says login incorrect. While it's booting I see several [FAILED] messages, e.g. [FAILED] to start CUPS Scheduler.
I booted the system again and this time it dropped into emergency mode. In emergency mode I see the following messages in dmesg:
BTRFS info (device sda2): flagging fs with big metadata feature
BTRFS info (device sda2): disk space caching is enabled
BTRFS info (device sda2): has skinny extents
BTRFS info (device sda2): start tree-log replay
BTRFS info (device sda2): parent transid verify failed on 61849600 wanted 145639 found 145637
BTRFS info (device sda2): parent transid verify failed on 61849600 wanted 145639 found 145637
BTRFS: error (device sda2) in btrfs_replay_log:2423 errno=-5 IO failure (Failed to recover log tree)
BTRFS error (device sda2) open_ctree failed
I ran btrfs check in emergency mode and it came up with a lot of errors.
How do I recover the partition(s) so I can boot the system, or at least mount them?
The underlying problem could be the physical disk that holds the VM's virtual disk file, or a corrupt btrfs. Avoid doing anything that would write to the virtual disk. Make a backup copy of the virtual disk. If the physical drive is OK, use a separate VM to mount the Fedora 34 virtual disk for repair attempts.
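The backup advice deserves emphasis: take the copy before any repair tool touches the disk, since a failed repair attempt can make a bad situation unrecoverable. A minimal sketch of a cautious helper (the function name is mine, not from the thread):

```shell
# Copy a disk image to <name>.bak, preserving sparseness, and refuse to
# clobber an existing backup. Returns nonzero if the source is missing
# or a backup is already there.
backup_image() {
    src=$1
    bak=$1.bak
    [ -f "$src" ] || return 1
    [ -e "$bak" ] && return 1
    cp --sparse=always "$src" "$bak"
}

# e.g. backup_image f34-01-vm.vmdk   (hypothetical path)
```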
Try the btrfs wiki FAQ entry "How do I recover from a parent transid verify failed error?": https://btrfs.wiki.kernel.org/index.php/FAQ
At one time VirtualBox had issues with btrfs. You should check for similar reports for VMware and btrfs.
The host OS is also F34.
On 3/20/22 08:14, George N. White III wrote:
On Fri, Mar 18, 2022 at 4:47 PM Paolo Galtieri pgaltieri@gmail.com wrote:
I'm having issues with a VM.
The VM was originally created under VMware and has worked fine for a while. Today when I booted it up instead of seeing the usual MATE login screen I get a login prompt:
f34-01-vm:
no matter what I enter as the login, root or pgaltieri, it never asks for a password and immediately says login incorrect. While it's booting I see several [FAILED] messages, e.g. [FAILED] to start CUPS Scheduler.
I booted the system again and this time it dropped into emergency mode. In emergency mode I see the following messages in dmesg:
BTRFS info (device sda2): flagging fs with big metadata feature
BTRFS info (device sda2): disk space caching is enabled
BTRFS info (device sda2): has skinny extents
BTRFS info (device sda2): start tree-log replay
BTRFS info (device sda2): parent transid verify failed on 61849600 wanted 145639 found 145637
BTRFS info (device sda2): parent transid verify failed on 61849600 wanted 145639 found 145637
BTRFS: error (device sda2) in btrfs_replay_log:2423 errno=-5 IO failure (Failed to recover log tree)
BTRFS error (device sda2) open_ctree failed
That's not good. The tree-log is used during fsync as an optimization to avoid having to do full file system metadata updates. Since the tree-log exists, we know this file system was undergoing some fsync write operations which were then interrupted. Either the VM or the host crashed, or one of them was forced to shut down, or there's a bug that otherwise prevented the guest operations from completing. Further, the parent transid verification failure messages indicate some out-of-order writes, as if the virtual drive+controller+cache is occasionally ignoring flush/FUA requests.
I regularly use a qemu-kvm VM with cache mode "unsafe". The VM can crash all day long and at most I lose ~30s of the most recent writes, depending on the fsync policy of the application doing the writes; otherwise the file system mounts normally following the crash. However, if the host crashes while the guest is writing, that file system can be irreparably damaged. This is expected. So you might want to check the cache policy being used, and make sure that the guest VM is really shutting down properly before rebooting/shutting down the host.
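On the host, the cache policy for a libvirt-managed guest shows up in the disk <driver> element of the domain XML; a quick check (the domain name f34-01-vm is taken from the post and may differ):

```shell
# No cache= attribute means libvirt's default policy; cache='unsafe',
# 'none', 'writeback', etc. appear here when set explicitly.
virsh dumpxml f34-01-vm | grep -i "driver name"
```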
I ran btrfs check in emergency mode and it came up with a lot of errors.
How do I recover the partition(s) so I can boot the system, or at least mount them?
I'd start with mount -o ro,nologreplay,rescue=usebackuproot
Followed by mount -o ro,nologreplay,rescue=all
The second one is a bit of a heavy hammer, but it's safe insofar as it mounts the fs read-only and makes no changes. It also disables csum checking, so any corrupt files still get copied out, without any corruption warnings. You can check man 5 btrfs to read a bit more about the other options and vary the selection. This is pretty much a recovery operation, i.e. get the important data out.
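Spelled out with a device and mountpoint (both assumptions; inside the guest's emergency shell the root device was sda2, but from a separate rescue VM the device name may differ):

```shell
mkdir -p /mnt/rescue

# First attempt: read-only, skip log replay, fall back to a backup tree root:
mount -o ro,nologreplay,rescue=usebackuproot /dev/sda2 /mnt/rescue

# Heavier variant if that fails (also skips csum and other checks):
# mount -o ro,nologreplay,rescue=all /dev/sda2 /mnt/rescue

# Then copy anything important off to a healthy filesystem:
# cp -a /mnt/rescue/home /path/to/backup/
```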
The repair sequence for this particular set of errors:
btrfs rescue zero-log
btrfs check --repair --init-extent-tree
btrfs check --repair
I have somewhat low confidence that it can be repaired rather than made worse. So you should start out with the earlier mount commands to get anything important out of the fs first. If those don't work and there's important information to get out, you need to use btrfs restore.
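btrfs restore works against the unmounted (and unmountable) filesystem and writes recovered files to a different location; a minimal invocation (device and destination directory are assumptions):

```shell
# -v lists each file as it is recovered; the destination must live on a
# separate, healthy filesystem.
mkdir -p /mnt/recovery
btrfs restore -v /dev/sda2 /mnt/recovery
```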