On Sat, 2020-06-27 at 12:42 -0600, Chris Murphy wrote:
On Sat, Jun 27, 2020 at 8:01 AM Konstantin Kharlamov
> I see no one mentioned yet: BTRFS is slow on HDDs. It trivially comes from
> being COW. So if you changed a bit in a file, BTRFS will copy a block (or
> a number of them, not sure this detail matters) to another place, and now
> data got fragmented. SSDs may not care, HDDs on the other hand do.
It's faster on some workloads, slower on others. There are
optimizations to help make up for COW: inline extents for small files,
and random writes that commit together (i.e. in the same 30s window)
will be written as sequential writes. It is true btrfs does not have
nearly as many locality optimizations as ext4 and xfs, but at least
xfs developers have recently proposed removing those HDD optimizations
in favor of optimizations that are more relevant to today's hardware
> Another reason worth mentioning: BTRFS per se is slow. If you look at benchmarks
> on Phoronix comparing BTRFS with others, BTRFS is rarely even on par with
It wins some. It loses others.
This sounds very wrong. This deludes readers into thinking BTRFS is on par with
other FSes. If you head over to the Phoronix article you linked below and try to
count how many times BTRFS was winning/on par/lost, you'll see the ratio is not
even close on the BTRFS side.
To save you the effort, it is:
type     | win | on par | lose
NVMe     |  3  |   4    |   8
SATA SSD |  0  |   5    |  10
USB SSD  |  0  |   1    |   4
FYI, in this calculation I took the BTRFS side a few times, and counted it as
either "winning" or "on par". It was when it had a head against only
other FSes. (Idk why "USB SSD" has many tests missing)
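For anyone who wants to check the ratio, here is a minimal sketch that turns the tally above into shares. The numbers are copied from my table; the names `results` and `shares` are just mine for illustration.

```python
# Tally copied from the table above: (win, on par, lose) per device type.
results = {
    "NVMe": (3, 4, 8),
    "SATA SSD": (0, 5, 10),
    "USB SSD": (0, 1, 4),
}


def shares(counts):
    """Return each count as a fraction of the total number of tests."""
    total = sum(counts)
    return tuple(c / total for c in counts)


# Column-wise sums across all device types: (win, on par, lose).
totals = tuple(sum(col) for col in zip(*results.values()))

for name, counts in list(results.items()) + [("all", totals)]:
    win, par, lose = shares(counts)
    print(f"{name:9} win {win:4.0%}  on par {par:4.0%}  lose {lose:4.0%}")
```

Summed over all three device types that is 3 wins, 10 on par and 22 losses out of 35 tests.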
Head over to the xfs list and enjoy the
benchmark commentary from people who actually understand benchmarking.
A recurring theme is that a benchmark is only as relevant to the
degree it actually mimics the workload you care about. And most
benchmark tools don't do that very well.
Here's a benchmark that's apples to apples because I'm merely timing
the time to compile the exact same thing each time, twice.
What point are you trying to make here? If you're implying that the "application
startup time" the article measured is more of a "synthetic test" than the
compilation time you are measuring, then this sounds odd. Because people start apps
up more often than they compile the kernel. In fact, the compilation process includes
starting up apps.
They're all in the same ballpark, except there's a write time hit for
the one with zstd:1 on this particular setup (and the compression hit
isn't consistent across all hardware or setups, it's case by case -
and hence the proposal option for compression indicates applying it
selectively to locations we know there's a benefit across the board).
But also you can tell there's no read time (decompression) hit from
this same data set.
It is nice to see, although I'm pretty surprised they all have the same
performance, except the one with compression. Could it be because all files got
cached in RAM? If you ran the test by doing `git clone` and then running the build,
then I'm pretty sure it did. I don't know how it works when files are cached,
but I wouldn't be surprised if a number of filesystem-specific paths would be
skipped in this case.
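To rule the cache out, one could time the build once with a cold cache and once warm. Below is a rough sketch, assuming Linux and root (writing to /proc/sys/vm/drop_caches needs it). The helper names `drop_page_cache` and `timed_run` are made up by me, and I don't know how the numbers above were actually collected.

```python
import subprocess
import time

# Linux-only kernel knob; writing to it requires root.
DROP_CACHES = "/proc/sys/vm/drop_caches"


def drop_page_cache():
    """Flush dirty pages, then ask the kernel to drop page cache, dentries and inodes."""
    subprocess.run(["sync"], check=True)
    with open(DROP_CACHES, "w") as f:
        f.write("3\n")


def timed_run(cmd, cold=False):
    """Run cmd and return wall-clock seconds; with cold=True, drop caches first."""
    if cold:
        drop_page_cache()
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start
```

Something like `timed_run(["make", "-j4"], cold=True)` followed by `timed_run(["make", "-j4"])` after a clean would then show how much of the result is the page cache (the `make` invocation is only a placeholder for whatever build command was used).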
Meanwhile, this is somewhere between embarrassing and comedy:
Hmmm, 21 seconds to launch GNOME Terminal with an NVMe and you aren't
curious about what went wrong? Because obviously something is wrong.
The measurement is wrong or the method is wrong or something in the
setup is fouling things up. How do you get a fast result with SSD but
then such a slow result with NVMe?
It makes no sense, but meh, we'll just publish that shit anyway! LOLZ!
And that is how you light your credibility on fire, because you just
don't give a crap about it.
You misread it, the NVMe startup time is 1.03sec. The 21.01sec. time is SATA
3.0 SSD. No need to swear.
Not to say it is not odd compared to other results, but we can only guess.
On my 9 year old laptop with a mere Samsung 840 EVO, barely under 1
second for GNOME Terminal to launch, following a reboot and login so
this is not the result of caching. On my much newer HP Spectre with
NVMe, under 0.5s to launch.
My methodology and metrology? I'm using the "one mississippi" method
from finger click of the actual app icon to the time I see a cursor in
the launched app. Not rocket science.
Good for you. But you're trying to make a decision for everyone else, so you
need to take into account that not everyone has an NVMe or SSD. The HDDs that many
people are still using are much slower. This means your "1 second vs 0.5 second" can
easily turn into "5 seconds vs 10 seconds" (and not necessarily linearly).
> As a matter of fact, I have two Archlinux laptops on BTRFS with
> both only have HDD. I've been using for 3-4 years BTRFS there I think, maybe
> more. I made use of BTRFS because I was hoping that using ZSTD would result in
> less IO. Well, now my overall experience is that it is not rare that the systems
> start to lag terribly, then I execute `grep "" /proc/pressure/*`, and
> someone is hogging IO. Then I pop up `iotop -a` and see among various
> a `[btrfs-cleaner]` and `[btrfs-transacti]`. It may be because of defrag
> I'm not sure…
There are many btrfs threads. Those actually make it more performant.
If you look at their total cpu time though, e.g. ps aux, you'll see
it's really small compared to most anything else you might think is
root 366 0.0 0.0 0 0 ? S Jun25 1:22
root 500 0.0 0.0 0 0 ? S Jun25 1:45
dbus 538 0.0 0.0 271548 6968 ? S Jun25 1:13
dbus-broker --log 4 --controller 9 --machine-id
ce3f1eade82d42bd891a8c15714b13cf --max-bytes 536870912 --m
root 1328 0.0 0.1 1273476 10116 ? Sl Jun25 3:00
There is in fact a WTF moment as a result of this partial listing and
it's not btrfs.
BTW this is 2 days of uptime.
You misread me, I wasn't talking about CPU time, I was talking about IO.
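To be concrete about what I look at: the `grep "" /proc/pressure/*` check I mentioned reads the kernel's pressure stall information, where the io file reports how much time tasks spent stalled on IO. A small sketch of reading it programmatically, assuming a kernel with PSI enabled; the function names are mine.

```python
from pathlib import Path


def parse_psi(text):
    """Parse the contents of a /proc/pressure/* file into nested dicts.

    Each line looks like:
        some avg10=0.00 avg60=0.00 avg300=0.00 total=0
    """
    result = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        kind, *fields = line.split()
        # Every field is key=value; store the values as floats.
        result[kind] = {k: float(v) for k, v in (f.split("=", 1) for f in fields)}
    return result


def read_io_pressure(path="/proc/pressure/io"):
    """Read and parse IO pressure; the file only exists when PSI is enabled."""
    return parse_psi(Path(path).read_text())
```

A high `avg10` under `some` or `full` in the io file means tasks were recently waiting on IO, no matter how little CPU time the btrfs threads show in `ps aux`.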