Search results for "parent transid verify failed" - users

Re: BTRFS partition corrupted after deleting files in /home

by Sreyan Chakravarty

On Mon, Jan 4, 2021 at 10:14 PM Chris Murphy <lists(a)colorremedies.com> wrote:
> transid errors like this indicate out of order writes due to drive
> firmware not honoring file system write ordering and then getting a
> badly timed crash/powerfail/shutdown.

First of all thanks for your quick response.

So would I be correct assuming that the problem is in my firmware ? Or is it too early to say anything like that ?

Is my firmware so outdated that it can't handle BTRFS ?

> You report that the file system went read only while using it. This
> suggests a dropped write and the file system went read-only to limit
> the damage. Ideally we'd get the log, if it made it to disk, to see
> what lead up to this so we can determine what the problem is and get
> it fixed. What I can tell you is this is not user error but that's not
> much comfort.

Well it doesn't provide comfort but at least I can say that it wasn't me who messed up my filesystem.

> Yeah that's bad. I think it's fixable. We need to get a metadata dump
> of the file system to see if fsck will fix it.
>
> btrfs-image -c9 -t4 /dev/sdXY /mnt/path/to/file
>
> That will include filenames but not any data. If you need to mask
> filenames, add -ss option to the above. (-s won't help here). And the
> path to file if you're on a live USB stick can just be something like
> ~/sreyan-btrfs.img and then put it up on the google drive.

I don't think there is any hope for my data, as I can't even create the meta-data image:

# btrfs-image -c9 -t4 /dev/mapper/dm_crypt /run/media/liveuser/Backup\ Plus/btrfs_meta.img
parent transid verify failed on 55640064 wanted 44146 found 44438
parent transid verify failed on 55640064 wanted 44146 found 44438
parent transid verify failed on 55640064 wanted 44146 found 44438
Ignoring transid failure
parent transid verify failed on 55902208 wanted 44170 found 44438
Ignoring transid failure
parent transid verify failed on 56410112 wanted 44170 found 44439
Ignoring transid failure
parent transid verify failed on 58621952 wanted 44170 found 44439
Ignoring transid failure
ERROR: child eb corrupted: parent bytenr=178081497088 item=246 parent level=1 child level=2
ERROR: cannot go to next leaf -5
ERROR: create failed: -5

What do I do now ?

> I'm on irc.freenode.net as cmurf that's usually the easier way to get
> help, on #fedora channel.

Do I need to have a bouncer ? I am in India, and I believe you are in the US, so when you are active, I am usually sleeping.

> Also, have you ever done a balance on this file system? (That is not a
> suggestion that you should or shouldn't have. Just a yes or no
> question to try and piece together some other data points.)

No never did anything like that.

--
Regards,
Sreyan Chakravarty

3 years, 3 months

Re: BTRFS partition corrupted after deleting files in /home

by Chris Murphy

On Mon, Jan 4, 2021 at 11:32 AM Sreyan Chakravarty <sreyan32(a)gmail.com> wrote:
>
> On Mon, Jan 4, 2021 at 10:14 PM Chris Murphy <lists(a)colorremedies.com> wrote:
> > transid errors like this indicate out of order writes due to drive
> > firmware not honoring file system write ordering and then getting a
> > badly timed crash/powerfail/shutdown.
>
> First of all thanks for your quick response.
>
> So would I be correct assuming that the problem is in my firmware ? Or
> is it too early to say anything like that ?

Too early. The usual case of transid errors is drive firmware bugs *and* ill timed shutdown. Since you don't have an ill timed shutdown, it's less likely this is a drive firmware bug, but can't be ruled out. i.e. I'm proposing there might be a software bug here and we just need to figure it out.

Bad memory usually shows up as bit flips and doesn't result in damage like this - but it has to be considered whether a bitflip can affect code. It can also be a kernel bug - the storage stack has many layers, not just Btrfs and dm-crypt. But no one wants to go blaming other people's work without understanding the problem.

> Is my firmware so outdated that it can't handle BTRFS ?

No. It's a bit complicated. Buggy drive firmware is common. But normally it doesn't matter mainly due to good luck. More than one thing has to go wrong to cause a problem like (a) firmware bug exists (b) firmware bug is triggered (c) crash/powerfail. If one of those is not true, then it's not a problem.

There is also the transient hardware defect problem that can act like a bug but it's just rotting the metadata or data. It's not obvious but it is possible to piece together what's happened when we have enough information.

> # btrfs-image -c9 -t4 /dev/mapper/dm_crypt /run/media/liveuser/Backup\
> Plus/btrfs_meta.img
>
> parent transid verify failed on 55640064 wanted 44146 found 44438
> parent transid verify failed on 55640064 wanted 44146 found 44438
> parent transid verify failed on 55640064 wanted 44146 found 44438
> Ignoring transid failure
> parent transid verify failed on 55902208 wanted 44170 found 44438
> Ignoring transid failure
> parent transid verify failed on 56410112 wanted 44170 found 44439
> Ignoring transid failure
> parent transid verify failed on 58621952 wanted 44170 found 44439
> Ignoring transid failure
> ERROR: child eb corrupted: parent bytenr=178081497088 item=246 parent
> level=1 child level=2
> ERROR: cannot go to next leaf -5
> ERROR: create failed: -5
>
> What do I do now ?

Rats. Can you retry by adding -w option? In the meantime I'll report back to upstream and see what they recommend next.

> > I'm on irc.freenode.net as cmurf that's usually the easier way to get
> > help, on #fedora channel.
>
> Do I need to have a bouncer ? I am in India, and I believe you are in
> the US, so when you are active, I am usually sleeping.

An alternative is matrix. We have a matrix-irc bridge in #fedora and pretty soon I think the plan is to switch mainly to matrix. So if you know about matrix then you can join #fedora - but I don't know how to explain it very well since I don't use matrix yet. I think it keeps the history for you, unlike IRC (I use a bouncer so I will see your messages later). I keep weird hours so it might overlap at some point.

--
Chris Murphy

3 years, 3 months

Re: BTRFS partition corrupted after deleting files in /home

by Sreyan Chakravarty

On Tue, Jan 5, 2021 at 5:27 AM Chris Murphy <lists(a)colorremedies.com> wrote:
>
> New new plan, ngompa built it for us in Fedora copr.
>
> sudo dnf install
> https://download.copr.fedorainfracloud.org/results/ngompa/btrfsprogs-robu...
>
> That will replace your btrfs-progs with josef's special repo with the
> better -w. Later, you can revert back to the original:
>
> sudo dnf install
> https://kojipkgs.fedoraproject.org//packages/btrfs-progs/5.9/1.fc33/x86_6...
>
> But there's no urgency to revert, it's the same thing just with this
> btrfs-image enhancement.

It worked. Installing the COPR build worked without any problems.

But I did get this on the console:

parent transid verify failed on 55640064 wanted 44146 found 44438
parent transid verify failed on 55640064 wanted 44146 found 44438
parent transid verify failed on 55640064 wanted 44146 found 44438
Ignoring transid failure

You can find the metadata image over here:
https://drive.google.com/file/d/1MIwwtKvt8zQxrMomhvtBZXx_0au2-Pw6/view?us...

The file is over 300 MB, and compressed with GZIP. Let me know if you are unable to download it.

No file names are obfuscated, I did not use the -s option, so hopefully that should help you.

Let me know what's next.

PS: I have messaged you on IRC #fedora.

--
Regards,
Sreyan Chakravarty

3 years, 3 months

Re: maybe OT

by George N. White III

On Fri, 18 Mar 2022 at 19:47, Paolo Galtieri <pgaltieri(a)gmail.com> wrote:
> I'm having issues with a VM.

It would be useful to mention the host OS. From the name, I guess your VM is running Fedora 34.

> The VM was originally created under VMware and has worked fine for a
> while. Today when I booted it up instead of seeing the usual MATE login
> screen I get a login prompt:
>
> f34-01-vm:
>
> no matter what I enter, root or pgaltieri as login it never asks for
> password and immediately says login incorrect. While it's booting I see
> several [FAILED]... messages, e.g. [FAILED] to start CUPS Scheduler
>
> I booted the system again and this time it dropped into emergency mode.
> In emergency mode I see the following messages in dmesg:
>
> BTRFS info (device sda2): flagging fs with big metadata feature
> BTRFS info (device sda2): disk space caching is enabled
> BTRFS info (device sda2): has skinny extents
> BTRFS info (device sda2): start tree-log replay
> BTRFS info (device sda2): parent transid verify failed on 61849600
> wanted 145639 found 145637
> BTRFS info (device sda2): parent transid verify failed on 61849600
> wanted 145639 found 145637
> BTRFS: error (device sda2) in btrfs_replay_log:2423 errno=-5 IO failure
> (Failed to recover log tree)
> BTRFS error (device sda2) open_ctree failed
>
> I ran btrfs check in emergency mode and it came up with a lot of errors.
>
> How do I recover the partition(s) so I can boot the system, or at least
> mount them?

The underlying problem could be the physical disk that holds the VM's virtual disk file, or a corrupt btrfs. Avoid doing anything that would write to the virtual disk. Make a backup copy of the virtual disk. If the physical drive is OK, use a separate VM to mount the Fedora 34 virtual disk for repair attempts.

Try: https://btrfs.wiki.kernel.org/index.php/FAQ
How do I recover from a parent transid verify failed error?

At one time VirtualBox had issues with btrfs. You should check for similar reports for VMWare and btrfs.

--
George N. White III

2 years, 1 month

Re: maybe OT

by Paolo Galtieri

The host OS is also F34.

On 3/20/22 08:14, George N. White III wrote:
> It would be useful to mention the host OS. From the name, I guess your
> VM is running Fedora 34.

2 years, 1 month

Re: BTRFS partition corrupted after deleting files in /home

by Chris Murphy

On Mon, Jan 4, 2021 at 6:59 AM Sreyan Chakravarty <sreyan32(a)gmail.com> wrote:
>
> On Mon, Jan 4, 2021 at 1:16 AM Chris Murphy <lists(a)colorremedies.com> wrote:
> >
> > Try to mount normally, then:
>
> I am unable to mount normally :
>
> # mount -t btrfs /dev/mapper/dm_crypt /mnt/
> mount: /mnt: wrong fs type, bad option, bad superblock on
> /dev/mapper/dm_crypt, missing codepage or helper program, or other
> error.
>
> > dmesg
>
> This is what I get in dmesg:
>
> [29867.234062] BTRFS info (device dm-4): disk space caching is enabled
> [29867.234067] BTRFS info (device dm-4): has skinny extents
> [29867.317955] BTRFS error (device dm-4): parent transid verify failed
> on 55640064 wanted 44146 found 44438
> [29867.326701] BTRFS error (device dm-4): parent transid verify failed
> on 55640064 wanted 44146 found 44438
> [29867.326727] BTRFS warning (device dm-4): failed to read root (objectid=9): -5
> [29867.333668] BTRFS error (device dm-4): open_ctree failed

transid errors like this indicate out of order writes due to drive firmware not honoring file system write ordering and then getting a badly timed crash/powerfail/shutdown.

However... You report that the file system went read only while using it. This suggests a dropped write and the file system went read-only to limit the damage. Ideally we'd get the log, if it made it to disk, to see what lead up to this so we can determine what the problem is and get it fixed. What I can tell you is this is not user error but that's not much comfort.

> > btrfs check --readonly
>
> A lot of errors, could not even upload to pastebin.
>
> This is in my Google Drive:
> https://drive.google.com/file/d/1dpW7aftB3FuD8i1J7d4nRrzZHaGF4vuN/view?us...

Yeah that's bad. I think it's fixable. We need to get a metadata dump of the file system to see if fsck will fix it.

btrfs-image -c9 -t4 /dev/sdXY /mnt/path/to/file

That will include filenames but not any data. If you need to mask filenames, add -ss option to the above. (-s won't help here). And the path to file if you're on a live USB stick can just be something like ~/sreyan-btrfs.img and then put it up on the google drive.

By the way I'm on irc.freenode.net as cmurf that's usually the easier way to get help, on #fedora channel.

Also, have you ever done a balance on this file system? (That is not a suggestion that you should or shouldn't have. Just a yes or no question to try and piece together some other data points.)

--
Chris Murphy
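
The "parent transid verify failed" lines quoted in this thread all follow one pattern: a block address, the generation that was wanted, and the generation that was found. As a rough aid for comparing reports, the following Python sketch pulls those three numbers out of pasted dmesg or btrfs-image output and reports the gap between them. The names `TransidFailure` and `parse_transid_failures` are made up for this illustration and are not part of btrfs-progs; reading "wanted" as the generation the parent expects and "found" as the one read from the block is an interpretation, not something the tool prints.

```python
import re
from typing import List, NamedTuple

# Pattern of the messages quoted in this thread, e.g.
#   parent transid verify failed on 55640064 wanted 44146 found 44438
# \s+ is used between words so that mail-wrapped lines still match.
TRANSID_RE = re.compile(
    r"parent\s+transid\s+verify\s+failed\s+on\s+(\d+)\s+wanted\s+(\d+)\s+found\s+(\d+)"
)


class TransidFailure(NamedTuple):
    bytenr: int  # block address printed after "on"
    wanted: int  # generation printed after "wanted"
    found: int   # generation printed after "found"

    @property
    def delta(self) -> int:
        """Positive when the block is newer than wanted, negative when older."""
        return self.found - self.wanted


def parse_transid_failures(log_text: str) -> List[TransidFailure]:
    """Return each distinct failure once, in order of first appearance."""
    seen = set()
    failures = []
    for match in TRANSID_RE.finditer(log_text):
        failure = TransidFailure(*(int(group) for group in match.groups()))
        if failure not in seen:
            seen.add(failure)
            failures.append(failure)
    return failures


if __name__ == "__main__":
    sample = (
        "parent transid verify failed on 55640064 wanted 44146 found 44438\n"
        "parent transid verify failed on 55640064 wanted 44146 found 44438\n"
        "parent transid verify failed on 55902208 wanted 44170 found 44438\n"
    )
    for failure in parse_transid_failures(sample):
        print(failure.bytenr, failure.wanted, failure.found, failure.delta)
```

Repeated lines for the same block are collapsed, so the output shows how many distinct blocks are affected rather than how many times the message was printed.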

3 years, 3 months

Re: maybe OT

by Chris Murphy

On Fri, Mar 18, 2022 at 4:47 PM Paolo Galtieri <pgaltieri(a)gmail.com> wrote:
>
> I'm having issues with a VM.
>
> The VM was originally created under VMware and has worked fine for a
> while. Today when I booted it up instead of seeing the usual MATE login
> screen I get a login prompt:
>
> f34-01-vm:
>
> no matter what I enter, root or pgaltieri as login it never asks for
> password and immediately says login incorrect. While it's booting I see
> several [FAILED]... messages, e.g. [FAILED] to start CUPS Scheduler
>
> I booted the system again and this time it dropped into emergency mode.
> In emergency mode I see the following messages in dmesg:
>
> BTRFS info (device sda2): flagging fs with big metadata feature
> BTRFS info (device sda2): disk space caching is enabled
> BTRFS info (device sda2): has skinny extents
> BTRFS info (device sda2): start tree-log replay
> BTRFS info (device sda2): parent transid verify failed on 61849600
> wanted 145639 found 145637
> BTRFS info (device sda2): parent transid verify failed on 61849600
> wanted 145639 found 145637
> BTRFS: error (device sda2) in btrfs_replay_log:2423 errno=-5 IO failure
> (Failed to recover log tree)
> BTRFS error (device sda2) open_ctree failed

That's not good. The tree-log is used during fsync as an optimization to avoid having to do full file system metadata updates. Since the tree-log exists, we know this file system was undergoing some fsync write operations which were then interrupted. Either the VM or host crashed, or one of them was forced to shutdown, or there's a bug that otherwise prevented the guest operations from completing.

Further, the parent transid verification failure messages indicate some out of order writes, as if the virtual drive+controller+cache is occasionally ignoring flush/FUA requests.

I regularly use qemu-kvm VM with cache mode "unsafe". The VM can crash all day long and at most I lose ~30s of the most recent writes, depending on the fsync policy of the application doing the writes. But the file system mounts normally otherwise following the crash. However if the host crashes while the guest is writing, that file system can be irreparably damaged. This is expected. So you might want to check the cache policy being used, make sure that the guest VM is really shutting down properly before rebooting/shutting down the host.

> I ran btrfs check in emergency mode and it came up with a lot of errors.
>
> How do I recover the partition(s) so I can boot the system, or at least
> mount them?

I'd start with

mount -o ro,nologreplay,rescue=usebackuproot

Followed by

mount -o ro,nologreplay,rescue=all

The second one is a bit of a heavy hammer but it's safe insofar as it's mounting the fs read only and making no changes. It is also disabling csum checking so any corrupt files still get copied out, and without any corruption warnings. You can check man 5 btrfs to read a bit more about the other options and vary the selection. This is pretty much a recovery operation, i.e. get the important data out.

The repair commands for this particular set of errors:

btrfs rescue zero-log
btrfs check --repair --init-extent-tree
btrfs check --repair

I have somewhat low confidence that it can be repaired rather than make things worse. So you should start out with the earlier mount commands to get anything important out of the fs first. IF those don't work and there's important information to get out, you need to use btrfs restore.

--
Chris Murphy
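
The order of operations described above (read-only rescue mounts first, btrfs restore only if those fail, repair last of all) can be sketched as a small Python helper that builds the commands without running anything. This is only an illustration under stated assumptions: the function names are invented for this example, the device, mount point and destination are placeholders, and the repair commands are deliberately left out because the message itself rates them as risky.

```python
import shlex
from typing import List

# Mount option sets quoted from the message above, mildest first.
RESCUE_OPTION_SETS = [
    "ro,nologreplay,rescue=usebackuproot",
    "ro,nologreplay,rescue=all",
]


def readonly_rescue_commands(device: str, mountpoint: str) -> List[List[str]]:
    """Build the read-only mount attempts, in the order they should be tried."""
    return [
        ["mount", "-o", options, device, mountpoint]
        for options in RESCUE_OPTION_SETS
    ]


def restore_command(device: str, destination: str) -> List[str]:
    """Build the fallback for when no mount works: btrfs restore <device> <path>."""
    return ["btrfs", "restore", device, destination]


def render_plan(device: str, mountpoint: str, destination: str) -> str:
    """Render the whole plan as shell-quoted lines for review; nothing is executed."""
    commands = readonly_rescue_commands(device, mountpoint)
    commands.append(restore_command(device, destination))
    return "\n".join(shlex.join(command) for command in commands)


if __name__ == "__main__":
    # /dev/sdX2, /mnt and /srv/rescue are placeholders, not values from the thread.
    print(render_plan("/dev/sdX2", "/mnt", "/srv/rescue"))
```

Printing the plan instead of executing it keeps the helper harmless; each line is meant to be run by hand, stopping at the first mount that succeeds and copying data out before anything else is attempted.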

2 years, 1 month

Re: BTRFS partition corrupted after deleting files in /home

by Sreyan Chakravarty

On Mon, Jan 4, 2021 at 1:16 AM Chris Murphy <lists(a)colorremedies.com> wrote:
>
> Try to mount normally, then:

I am unable to mount normally :

# mount -t btrfs /dev/mapper/dm_crypt /mnt/
mount: /mnt: wrong fs type, bad option, bad superblock on /dev/mapper/dm_crypt, missing codepage or helper program, or other error.

> dmesg

This is what I get in dmesg:

[29867.234062] BTRFS info (device dm-4): disk space caching is enabled
[29867.234067] BTRFS info (device dm-4): has skinny extents
[29867.317955] BTRFS error (device dm-4): parent transid verify failed on 55640064 wanted 44146 found 44438
[29867.326701] BTRFS error (device dm-4): parent transid verify failed on 55640064 wanted 44146 found 44438
[29867.326727] BTRFS warning (device dm-4): failed to read root (objectid=9): -5
[29867.333668] BTRFS error (device dm-4): open_ctree failed

> btrfs check --readonly

A lot of errors, could not even upload to pastebin.

This is in my Google Drive:
https://drive.google.com/file/d/1dpW7aftB3FuD8i1J7d4nRrzZHaGF4vuN/view?us...

Let me know if you are not able to download. It's compressed via gzip.

> mount -o ro,usebackuproot

# mount -o ro,usebackuproot /dev/mapper/dm_crypt /mnt/
mount: /mnt: wrong fs type, bad option, bad superblock on /dev/mapper/dm_crypt, missing codepage or helper program, or other error.

Something is horribly wrong.

--
Regards,
Sreyan Chakravarty

3 years, 3 months

maybe OT

by Paolo Galtieri

I'm having issues with a VM.

The VM was originally created under VMware and has worked fine for a while. Today when I booted it up instead of seeing the usual MATE login screen I get a login prompt:

f34-01-vm:

no matter what I enter, root or pgaltieri as login it never asks for password and immediately says login incorrect. While it's booting I see several [FAILED]... messages, e.g. [FAILED] to start CUPS Scheduler

I booted the system again and this time it dropped into emergency mode. In emergency mode I see the following messages in dmesg:

BTRFS info (device sda2): flagging fs with big metadata feature
BTRFS info (device sda2): disk space caching is enabled
BTRFS info (device sda2): has skinny extents
BTRFS info (device sda2): start tree-log replay
BTRFS info (device sda2): parent transid verify failed on 61849600 wanted 145639 found 145637
BTRFS info (device sda2): parent transid verify failed on 61849600 wanted 145639 found 145637
BTRFS: error (device sda2) in btrfs_replay_log:2423 errno=-5 IO failure (Failed to recover log tree)
BTRFS error (device sda2) open_ctree failed

I ran btrfs check in emergency mode and it came up with a lot of errors.

How do I recover the partition(s) so I can boot the system, or at least mount them?

Also in emergency mode:

vi /run/initramfs/rdsosreport.txt

results in:

/usr/bin/vi: line 23: /usr/libexec/vi: No such file or directory

/usr/bin/vi is a script:

if test -f /usr/bin/vim
then
  exec /usr/bin/vim "$@"
fi
exec /usr/libexec/vi "$@"

neither /usr/bin/vim nor /usr/libexec/vi exist.

======================================================================================

I tried booting the vm under VirtualBox with the same result.

I converted the image:

qemu-img convert -O qcow ../VMware/VMs/f34-01-vm/f34-01-vm.vmdk f34-01-vm.qcow2

which worked without errors. I then ran virt-manager to try to boot the image. This fails with this error:

Unable to complete install: 'internal error: process exited while connecting to monitor: 2022-03-18T19:13:15.196710Z qemu-system-x86_64: -blockdev {"driver":"file","filename":"/run/media/pgaltieri/SDNVIRTLAB02/VirtualMachines/KVM/f34-01-vm.qcow2","node-name":"libvirt-1-storage","auto-read-only":true,"discard":"unmap"}: Could not open '/run/media/pgaltieri/SDNVIRTLAB02/VirtualMachines/KVM/f34-01-vm.qcow2': Permission denied'

Traceback (most recent call last):
  File "/usr/share/virt-manager/virtManager/asyncjob.py", line 65, in cb_wrapper
    callback(asyncjob, *args, **kwargs)
  File "/usr/share/virt-manager/virtManager/createvm.py", line 2001, in _do_async_install
    installer.start_install(guest, meter=meter)
  File "/usr/share/virt-manager/virtinst/install/installer.py", line 701, in start_install
    domain = self._create_guest(
  File "/usr/share/virt-manager/virtinst/install/installer.py", line 649, in _create_guest
    domain = self.conn.createXML(install_xml or final_xml, 0)
  File "/usr/lib64/python3.9/site-packages/libvirt.py", line 4366, in createXML
    raise libvirtError('virDomainCreateXML() failed')
libvirt.libvirtError: internal error: process exited while connecting to monitor: 2022-03-18T19:13:15.196710Z qemu-system-x86_64: -blockdev {"driver":"file","filename":"/run/media/pgaltieri/SDNVIRTLAB02/VirtualMachines/KVM/f34-01-vm.qcow2","node-name":"libvirt-1-storage","auto-read-only":true,"discard":"unmap"}: Could not open '/run/media/pgaltieri/SDNVIRTLAB02/VirtualMachines/KVM/f34-01-vm.qcow2': Permission denied

I added my user id to both the qemu and libvirt entries in /etc/group and logged out and logged back in and I get the same error.

I also get SELinux alerts:

The first alert:

You need to change the label on f34-01-vm.qcow2'
# semanage fcontext -a -t virt_image_t '/run/media/pgaltieri/SDNVIRTLAB02/VirtualMachines/KVM/f34-01-vm.qcow2'
# restorecon -v '/run/media/pgaltieri/SDNVIRTLAB02/VirtualMachines/KVM/f34-01-vm.qcow2'

subsequent alerts tell me to run:

# /sbin/restorecon -v /run/media/pgaltieri/SDNVIRTLAB02/VirtualMachines/KVM/f34-01-vm.qcow2

I have run these commands, especially the restorecon, several times and I still get the alerts.

One thing: the semanage command as shown fails with:

ValueError: File spec /run/media/pgaltieri/SDNVIRTLAB02/VirtualMachines/KVM/f34-01-vm.qcow2 conflicts with equivalency rule '/run /var/run'; Try adding '/var/run/media/pgaltieri/SDNVIRTLAB02/VirtualMachines/KVM/f34-01-vm.qcow2' instead

If I add the /var then it works.

Here is the context of the file:

-rwxrwxrwx. 1 pgaltieri pgaltieri system_u:object_r:fusefs_t:s0 15041695744 Mar 18 11:46 f34-01-vm.qcow2*

So how the heck do I boot the image and get it running?

Paolo
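
The semanage error above names an equivalency rule, '/run /var/run', and suggests the same path with /run replaced by /var/run. A minimal Python sketch of that prefix rewrite follows. The function name `apply_equivalence` and the rule table are made up for this illustration; the code only manipulates strings and does not call semanage or read the SELinux policy.

```python
from typing import Sequence, Tuple

# The single rule quoted in the error message: '/run /var/run'.
DEFAULT_RULES: Tuple[Tuple[str, str], ...] = (("/run", "/var/run"),)


def apply_equivalence(
    path: str, rules: Sequence[Tuple[str, str]] = DEFAULT_RULES
) -> str:
    """Rewrite path with the first rule whose source is a whole-component prefix.

    '/run/media/x' becomes '/var/run/media/x', while '/runner/x' is left alone
    because '/run' is not a complete leading path component there.
    """
    for source, target in rules:
        if path == source or path.startswith(source + "/"):
            return target + path[len(source):]
    return path


if __name__ == "__main__":
    original = "/run/media/pgaltieri/SDNVIRTLAB02/VirtualMachines/KVM/f34-01-vm.qcow2"
    print(apply_equivalence(original))
```

The rewritten path is the one to pass as the file spec to semanage fcontext, matching the "If I add the /var then it works" observation in the message.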

2 years, 1 month

Fw: mount fails for btrfs filesystem, need help please.

by George R Goffe

Hi,

I started getting i/o error messages accessing this filesystem so I rebooted the system. This might have been the wrong thing to do. This subsequent boot went to maintenance mode due to the filesystem's path being in /etc/fstab.

I need some help with this please.

Here is what mount says:

mount /dev/sda6 /opt.

kernel: BTRFS info (device sda6): flagging fs with big metadata feature
kernel: BTRFS info (device sda6): disk space caching is enabled
kernel: BTRFS info (device sda6): has skinny extents
kernel: BTRFS error (device sda6): parent transid verify failed on 148312850432 wanted 73476 found 73484
kernel: BTRFS error (device sda6): failed to read block groups: -5
kernel: BTRFS error (device sda6): open_ctree failed

1 year, 8 months
