On Mon, Jul 6, 2020 at 9:52 AM Stephen John Smoogen <smooge(a)gmail.com> wrote:
> On Mon, 6 Jul 2020 at 01:19, Chris Murphy <lists(a)colorremedies.com> wrote:
> >
> > On Fri, Jul 3, 2020 at 8:40 PM Eric Sandeen <sandeen(a)redhat.com> wrote:
> > >
> > > On 7/3/20 1:41 PM, Chris Murphy wrote:
> > > > SSDs can fail in weird ways. Some spew garbage as they're failing,
> > > > some go read-only. I've seen both. I don't have stats on how common
> > > > it is for an SSD to go read-only as it fails, but once it happens
> > > > you cannot fsck it. It won't accept writes. If it won't mount, your
> > > > only chance to recover data is some kind of offline scrape tool.
> > > > And Btrfs does have a very very good scrape tool, in terms of its
> > > > success rate - UX is scary. But that can and will improve.
> > >
> > > Ok, you and Josef have both recommended the btrfs restore ("scrape")
> > > tool as a next recovery step after fsck fails, and I figured we should
> > > check that out, to see if that alleviates the concerns about
> > > recoverability of user data in the face of corruption.
> > >
> > > I also realized that mkfs of an image isn't representative of an SSD
> > > system typical of Fedora laptops, so I added "-m single" to mkfs,
> > > because this will be the mkfs.btrfs default on SSDs (right?). Based
> > > on Josef's description of fsck's algorithm of throwing away any
> > > block with a bad CRC this seemed worth testing.
> > >
> > > I also turned fuzzing /down/ to hitting 2048 bytes out of the 1G
> > > image, or a bit less than 1% of the filesystem blocks, at random.
> > > This is 1/4 the fuzzing rate from the original test.
> > >
> > > So: -m single, fuzz 2048 bytes of 1G image, run btrfsck --repair,
> > > mount, mount w/ recovery, and then restore ("scrape") if all that
> > > fails, see what we get.
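
(For concreteness, that procedure corresponds roughly to the sketch
below. The image path, the shuf-based byte fuzzer, and the mount
point are illustrative stand-ins, not the exact commands used in the
test; "usebackuproot" is the newer name for the old "recovery" mount
option.)

  # 1G image with single metadata, the SSD default under test
  truncate -s 1G /tmp/test.img
  mkfs.btrfs -f -m single /tmp/test.img

  # Fuzz: overwrite 2048 single bytes at random offsets
  for i in $(seq 2048); do
      dd if=/dev/urandom of=/tmp/test.img bs=1 count=1 conv=notrunc \
         seek=$(shuf -i 0-1073741823 -n 1) 2>/dev/null
  done

  # Then, in order: repair, mount, mount w/ recovery, scrape
  btrfsck --repair /tmp/test.img
  mount -o loop /tmp/test.img /mnt
  mount -o loop,ro,usebackuproot /tmp/test.img /mnt
  btrfs restore /tmp/test.img /tmp/rescued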
> >
> > What's the probability of this kind of corruption occurring in the
> > real world? If the probability is so low it can't practically be
> > computed, how do we assess the risk? And if we can't assess risk,
> > what's the basis of concern?
>
> Aren't most disk failure tests "huh, it somehow happened at least once
> and I think this explains all these other failures too"? I know that
> with giant clusters you can do more testing, but you also have a lot
> of things like:
>
> What is the chance that a disk will die over time? 100%.
> What is the chance that a disk died from this particular scenario?
> 0.00000<maybe put a digit here> %
>
> Reword the question slightly differently: what is the chance this
> disk died from that scenario? 100%.
Yes. Also, in fuzzing there is the concept of "when to stop fuzzing":
it's a rabbit hole, and at some point you have to come up for air and
work on other things. But you raise a good and subtle point, which is
that ext4 has a very good fsck built up over decades; it succeeds
today because of past failures. It's no different with Btrfs.
But there is also a bias. ext4 needs fsck to succeed in the worst
cases in order to mount the file system. Btrfs doesn't need that.
Often it can still mount read-only without any extra mount option,
and it can optionally be made more tolerant of errors while still
mounting read-only. This is a significant difference in recovery
strategy. An fsck is something of a risk because it writes changes to
the file system, and those changes are irreversible. Btrfs takes a
different view: increase the chance of recovery without needing a
risky repair as the first step. Once your important data is out, then
try the repair. There's a good chance it works, though maybe not as
good as ext4's.
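
As a rough sketch of that recovery order (the device and paths here
are placeholders, and again "usebackuproot" is the newer name for the
old "recovery" mount option):

  # 1) Try a plain read-only mount to get data out
  mount -o ro /dev/sdX /mnt

  # 2) If that fails, try again from a backup tree root
  mount -o ro,usebackuproot /dev/sdX /mnt

  # 3) If it still won't mount, scrape files offline
  btrfs restore /dev/sdX /path/to/rescue

  # 4) Only once the data is out, attempt the risky repair
  btrfs check --repair /dev/sdX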
--
Chris Murphy