On 6/27/20 9:57 AM, Peter Robinson wrote:
> I've been very clear from the outset that Facebook's fault tolerance is much
> higher than the average Fedora user. The only reason I've agreed to assist in
> answering questions and support this proposal is because I have multi-year data
> that shows our failure rates are the same as we see on every other file
> system, which is basically the failure rate of the disks themselves.
>
> And I specifically point out the hardware that we use that most closely reflects
> the drives that an average Fedora user is going to have. We of course have a
> very wide variety of hardware. In fact the very first thing we deployed on were
> these expensive hardware RAID setups. Btrfs found bugs in that firmware that
> was silently corrupting data. These corruptions had been corrupting AI test
> data for years under XFS, and Btrfs found it in a matter of days because of our
> checksumming.
>
> We use all sorts of hardware, and have all sorts of similar stories like this.
> I agree that the hardware is going to be muuuuuch more varied with Fedora users,
> and that Facebook has muuuuch higher fault tolerance. But higher production
> failures inside FB means more engineering time spent dealing with those
> failures, which translates to lost productivity. If btrfs was causing us to run
> around fixing it all the time then we wouldn't deploy it. The fact is that it's
> not, it's perfectly stable from our perspective. Thanks,

Thanks for the details. Do you have any data/information/opinions on
non-x86 architectures such as aarch64/armv7/ppc64le, all of which have
supported desktops too?

I can't speak to ppc* at all, and I'm not sure how much I can talk about our arm
stuff, but it was tested and used in production on arm a few years ago. But
obviously the bulk of our workload is x86. Thanks,
Josef