On 7/2/20 4:38 PM, Eric Sandeen wrote:
On 7/1/20 12:50 PM, Chris Murphy wrote:
> Integrity checking is highly valued by some and less by others.
> Considering that we know hardware isn't 100% reliable, and doesn't
> always report its own failures as expected, and hence why most file
> systems now at least checksum metadata, it's not persuasive to me that
> the data should be left unchecked, and corruption ought to be handled
> by user space somehow.
There's a flip side to this coin - in my experience, if the right btrfs
metadata blocks experience this disk corruption, there can be
a complete inability to recover the btrfs filesystem from that error -
i.e. it won't mount, and btrfsck --repair won't get it to a mountable
state.
So if we're saying disk corruption happens often enough that data
checksumming is critical, then it happens often enough that metadata
recovery is at least as critical.
I've been trying to quantify this and have not come up with a particularly
compelling test scenario, because it involves purposefully (though at random)
corrupting enough blocks on a filesystem image that a critical block gets
hit, so it looks synthetic. But the net result is frequently a filesystem
where btrfsck and/or mount fails, and at first blush this type of failure
happens much more often than on other filesystems.
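For illustration only, the kind of random-corruption test described above might be sketched roughly as follows. This is not the harness actually used; the helper name, the 4096-byte block size, and the choice to overwrite whole blocks with random bytes are all assumptions made for the sketch. It should only ever be pointed at a scratch copy of a filesystem image.

```python
import os
import random


def corrupt_random_blocks(image_path, n_blocks, block_size=4096, seed=None):
    """Overwrite n_blocks randomly chosen blocks of a file with random bytes.

    Hypothetical helper for illustration. Returns the sorted list of block
    indices that were overwritten. Use on a scratch copy only.
    """
    rng = random.Random(seed)
    total_blocks = os.path.getsize(image_path) // block_size
    if n_blocks > total_blocks:
        raise ValueError("image has fewer blocks than requested")

    # Pick distinct victim blocks so each one is damaged exactly once.
    victims = sorted(rng.sample(range(total_blocks), n_blocks))

    with open(image_path, "r+b") as img:
        for index in victims:
            img.seek(index * block_size)
            img.write(bytes(rng.getrandbits(8) for _ in range(block_size)))
    return victims
```

The damaged copy would then be handed to mount or btrfsck to see whether it survives. Whether a critical metadata block gets hit depends entirely on which indices are drawn, which is why a test like this looks synthetic.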
I think Josef has alluded to this situation as well. To me, that's a big
concern. Not trying to be a wet blanket here but I think this needs to be
carefully investigated and evaluated to understand what impact it may have
on Fedora btrfs users and their ability to recover their data in the face
of metadata corruption, because it looks to me like a definite btrfs weak
point.
Yeah this is what I've said many times over the last 3 weeks. Btrfs is more
vulnerable to metadata corruption.
Now there are things that we can do to mitigate this. I have one patch up to
handle one of the main cases (a corrupt global tree). The next patch set will
keep entire metadata trees around for longer, as long as we have space to
handle it. These two things will drastically improve the situation, but of
course if I'm being evil we can still end up in a bad spot. These patches are
not hard or controversial; they'll likely land in 5.9, which will be what F33
ships with (if I'm doing my math right).
And this sort of ignores the other side of the coin. fsfuzzer isn't just
corrupting metadata, it's corrupting data. Btrfs is the only file system that's
going to notice that and let the user know.
Checksumming is great because it lets the user know things are going wrong
before they go catastrophically wrong. However, just because we know something
went wrong doesn't mean we can do anything about it; it just means that the user
now knows that they need to restore from backups and find a new drive. These
features do not mean you are absolved of good practices. If you care about
data, you need to have it in multiple places. End of story. Btrfs is just
going to let you know in advance that things are going wrong.
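To make the "detect, not repair" point concrete, here is a minimal sketch of per-block data checksumming. It is not btrfs's on-disk format: the 4096-byte block size and the function names are assumptions for the example, and Python's zlib.crc32 stands in for the crc32c that btrfs uses by default.

```python
import zlib

BLOCK_SIZE = 4096  # assumed block size for the sketch


def checksum_blocks(data, block_size=BLOCK_SIZE):
    """Return one CRC32 per block of data."""
    return [zlib.crc32(data[off:off + block_size])
            for off in range(0, len(data), block_size)]


def find_bad_blocks(data, checksums, block_size=BLOCK_SIZE):
    """Return indices of blocks whose current CRC32 differs from the stored one."""
    current = checksum_blocks(data, block_size)
    return [i for i, (old, new) in enumerate(zip(checksums, current)) if old != new]


original = bytes(range(256)) * 64          # 16 KiB, i.e. four blocks
stored = checksum_blocks(original)         # recorded at write time

damaged = bytearray(original)
damaged[2 * BLOCK_SIZE + 10] ^= 0x01       # flip one bit inside block 2

print(find_bad_blocks(bytes(damaged), stored))  # [2]
```

The sketch can say which block is bad, but nothing in it can produce the good contents again. That still takes a second copy of the data.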
We're talking about this issue like it's reasonable that xfs and ext4 are going
to allow the user to get back a bunch of data they don't know is ok or not.
We're also talking about it like the user should be able to carry on his happy
merry way. In these cases the drive is dying and needs to be shredded, and a
new install needs to happen and a restore from backups needs to happen. Is the
btrfs failure much less user friendly? No doubt about it. Is it any comfort at
all when a user shows up and we say "where are your backups?" and they say
"backups?" No. But if we're going to talk about this like ext4 and xfs are much
better because they give you the _appearance_ that your data is fine, that's a
"Well what if it was just /usr." Sure, then you got lucky and you could copy
things off. But what if it wasn't? That's the measure that's being applied
to btrfs here. Is it likely that random corruption is going to be so bad that you
end up with an unmountable file system? It's about as likely that the random
corruption is on your dissertation or your family photographs. The difference
is that btrfs will tell you that your dissertation or your family photographs
are now bad, whereas ext4 and xfs will not.
These are tradeoffs, no doubt. Every file system choice is a series of
tradeoffs. We're arguing/optimizing for the narrowest use case. Arguments can be
made either way, but in the end is it important enough to not move ahead with