Re: Fedora 33 System-Wide Change proposal: Make btrfs the default file system for desktop variants

Saturday, 27 June 2020

On Sat, 2020-06-27 at 12:42 -0600, Chris Murphy wrote:
...
 On Sat, Jun 27, 2020 at 8:01 AM Konstantin Kharlamov <hi-angel(a)yandex.ru> wrote:
 > I see no one mentioned yet: BTRFS is slow on HDDs. It trivially comes from
 > BTRFS being COW. So if you changed a bit in a file, BTRFS will copy a block
 > (or maybe a number of them, not sure this detail matters) to another place,
 > and now your data got fragmented. SSDs may not care, HDDs on the other hand
 > do.

 It's faster on some workloads, slower on others. There are
 optimizations to help make up for COW: inline extents for small files,
 and random writes that commit together (i.e. in the same 30s window)
 will be written as sequential writes. It is true btrfs does not have
 nearly as many locality optimizations as ext4 and xfs, but at least
 xfs developers have recently proposed removing those HDD optimizations
 in favor of optimizations that are more relevant to today's hardware
 and workloads.

 > Another reason worth mentioning: BTRFS per se is slow. If you look at
 > benchmarks on Phoronix comparing BTRFS with others, BTRFS is rarely even on
 > par with them.

 It wins some. It loses others.

This sounds very wrong. It misleads readers into thinking BTRFS is on par with
other FSes. If you head over to the Phoronix article you linked below and count
how many times BTRFS won, was on par, or lost, you'll see the ratio is not even
close to being in BTRFS's favor.

To save you the effort, it is:

type      | win | on par | lose
NVMe:     | 3   | 4      | 8
SATA SSD: | 0   | 5      | 10
USB SSD:  | 0   | 1      | 4

FYI, in this calculation I sided with BTRFS a few times: when it was ahead of
only some of the other FSes, I still counted the result as either "winning" or
"on par". (I don't know why "USB SSD" has so many tests missing.)
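
To put the tally in proportion, here is a minimal Python sketch that turns the counts above into shares. The numbers are exactly the ones from my table; the names `tally` and `shares` are only illustrative.

```python
# Counts copied from the table above: (win, on par, lose) per storage type.
tally = {
    "NVMe": (3, 4, 8),
    "SATA SSD": (0, 5, 10),
    "USB SSD": (0, 1, 4),
}


def shares(win, par, lose):
    """Return each outcome as a fraction of the tests counted for one row."""
    total = win + par + lose
    return win / total, par / total, lose / total


for name, counts in tally.items():
    w, p, l = shares(*counts)
    print(f"{name:9s} win {w:6.1%}  on par {p:6.1%}  lose {l:6.1%}")

# Column sums across all three storage types.
overall = tuple(sum(column) for column in zip(*tally.values()))
print("overall (win, on par, lose):", overall)
```

Summed over all three rows this gives 3 wins, 10 "on par" and 22 losses out of 35 counted results.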

...
  Head over to the xfs list and enjoy the
 benchmark commentary from people who actually understand benchmarking.
 A recurring theme is that a benchmark is only as relevant to the
 degree it actually mimics the workload you care about. And most
 benchmark tools don't do that very well.

 Here's a benchmark that's apples to apples because I'm merely timing
 the time to compile the exact same thing each time, twice.

https://docs.google.com/spreadsheets/d/1b-y2WVrQK4ijo1TS5aRe0QROSf8CU3ckT...

What point are you trying to make here? If you're implying that the "application
startup time" the article measured is a more "synthetic test" than the kernel
compilation time you are measuring, then this sounds odd, because people start
apps far more often than they compile the kernel. In fact, the compilation
process itself includes starting up apps.

...
 They're all in the same ballpark, except there's a write time hit for
 the one with zstd:1 on this particular setup (and the compression hit
 isn't consistent across all hardware or setups, it's case by case -
 and hence the proposal option for compression indicates applying it
 selectively to locations we know there's a benefit across the board).
 But also you can tell there's no read time (decompression) hit from
 this same data set.

It is nice to see, although I'm pretty surprised they all have the same
performance, except the one with compression. Could it be because all the files
got cached in RAM? If you ran the test by doing `git clone` and then running the
build, then I'm pretty sure they did. I don't know exactly how it works when
files are cached, but I wouldn't be surprised if a number of filesystem-specific
paths were skipped in this case.
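
One way to rule caching in or out is to time the same command once with the page cache dropped and once warm. Below is a minimal Python sketch of that idea, not a description of how your spreadsheet was produced. It assumes Linux and root privileges for the cold run (writing to `/proc/sys/vm/drop_caches`); the helper names `drop_page_cache` and `time_command` are made up for illustration.

```python
import os
import subprocess
import time


def drop_page_cache():
    """Flush dirty pages, then ask the kernel to drop clean caches (needs root)."""
    os.sync()
    with open("/proc/sys/vm/drop_caches", "w") as f:
        f.write("3\n")  # 3 = page cache plus dentries and inodes


def time_command(cmd, cold=False):
    """Run cmd once and return wall-clock seconds.

    With cold=True the caches are dropped first, so reads have to hit the disk.
    """
    if cold:
        drop_page_cache()
    start = time.perf_counter()
    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start
```

For example, `time_command(["make"], cold=True)` followed by `time_command(["make"])` on a freshly cleaned tree would show how much of the build time depends on the data already being in RAM (`make` here is only a placeholder for whatever build command was used).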

...
 Meanwhile, this is somewhere between embarrassing and comedy:
 https://www.phoronix.com/scan.php?page=article&item=linux-50-filesyst...

 Hmmm, 21 seconds to launch GNOME Terminal with an NVMe and you aren't
 curious about what went wrong? Because obviously something is wrong.
 The measurement is wrong or the method is wrong or something in the
 setup is fouling things up. How do you get a fast result with SSD but
 then such a slow result with NVMe?

 It makes no sense, but meh, we'll just publish that shit anyway! LOLZ!
 And that is how you light your credibility on fire, because you just
 don't give a crap about it.

You misread it: the NVMe startup time is 1.03 s. The 21.01 s result is for the
SATA 3.0 SSD. No need to swear.

That is not to say it isn't odd compared to the other results, but we can only
guess why.

...
 On my 9 year old laptop with a mere Samsung 840 EVO, barely under  1
 second for GNOME Terminal to launch, following a reboot and login so
 this is not the result of caching. On my much newer HP Spectre with
 NVMe, under 0.5s to launch.

 My methodology and metrology? I'm using the "one mississippi" method
 from finger click of the actual app icon to the time I see a cursor in
 the launched app.  Not rocket science.

Good for you. But you're trying to make a decision for everyone else, so you
need to take into account that not everyone has NVMe or an SSD. The HDDs that
many people still use are much slower. This means your "1 second vs 0.5 second"
can easily turn into "10 seconds vs 5 seconds" (and not necessarily linearly).

...
 > As a matter of fact, I have two Archlinux laptops on BTRFS with compression,
 > both only have HDD. I've been using BTRFS there for 3-4 years I think, maybe
 > more. I made use of BTRFS because I was hoping that using ZSTD would result
 > in less IO. Well, now my overall experience is that it is not rare that
 > systems start to lag terribly, then I execute `grep "" /proc/pressure/*`, and
 > see someone is hogging IO. Then I pop up `iotop -a` and see among various
 > processes a `[btrfs-cleaner]` and `[btrfs-transacti]`. It may be because of
 > the defrag option, I'm not sure…

 There are many btrfs threads. Those actually make it more performant.
 If you look at their total cpu time though, e.g. ps aux, you'll see
 it's really small compared to most anything else you might think is
 idle.

 root         366  0.0  0.0      0     0 ?        S    Jun25   1:22 [btrfs-transacti]
 root         500  0.0  0.0      0     0 ?        S    Jun25   1:45 [irq/135-iwlwifi]
 dbus         538  0.0  0.0 271548  6968 ?        S    Jun25   1:13 dbus-broker --log 4 --controller 9 --machine-id ce3f1eade82d42bd891a8c15714b13cf --max-bytes 536870912 --m
 root        1328  0.0  0.1 1273476 10116 ?       Sl   Jun25   3:00 /opt/teamviewer/tv_bin/teamviewerd -d

 There is in fact a WTF moment as a result of this partial listing and
 it's not btrfs.

 BTW this is 2 days of uptime.

You misread me: I wasn't talking about CPU time, I was talking about IO.
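
For reference, IO stalls show up in `/proc/pressure/io`, not in the CPU time column of `ps`. Here is a minimal Python sketch that parses that file's `some`/`full` lines. The sample text is made up for illustration and is not output from my laptops; the function names are hypothetical too.

```python
def parse_psi(text):
    """Parse the contents of a /proc/pressure/* file into nested dicts.

    Each line looks like: "some avg10=0.00 avg60=0.00 avg300=0.00 total=0".
    """
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        kind, pairs = fields[0], fields[1:]
        result[kind] = {
            key: float(value)
            for key, value in (pair.split("=", 1) for pair in pairs)
        }
    return result


def read_io_pressure(path="/proc/pressure/io"):
    """Read and parse the live IO pressure file (Linux with PSI enabled)."""
    with open(path) as f:
        return parse_psi(f.read())


# Made-up sample in the PSI format, used only to demonstrate the parser.
sample = (
    "some avg10=12.50 avg60=8.10 avg300=3.02 total=123456\n"
    "full avg10=9.75 avg60=6.00 avg300=2.20 total=100000\n"
)
psi = parse_psi(sample)
print("full avg10:", psi["full"]["avg10"])
```

A high `full avg10` value means all non-idle tasks were stalled on IO for that share of the last 10 seconds, which is the kind of lag I was describing, regardless of how little CPU time the btrfs threads have accumulated.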
