On 4 Oct 2019, at 19:21, Dusty Mabe
<dusty(a)dustymabe.com> wrote:
On 10/4/19 1:07 PM, Paolo Valente wrote:
>> On 4 Oct 2019, at 17:32, Dusty Mabe <dusty(a)dustymabe.com> wrote:
>> On 10/4/19 11:06 AM, Paolo Valente wrote:
>>> Hi, I'm Paolo, the main developer of the BFQ I/O scheduler.
>> Hi Paolo!
>>> The switch to the BFQ I/O scheduler by Fedora paves the way to up to
>>> a ~10X throughput boost, and up to a ~400X latency reduction. This
>>> performance improvement concerns I/O workloads generated by multiple
>>> containers that share common storage devices. More generally, it also
>>> concerns workloads generated by multiple groups, VMs or
>>> entities of any kind.
>>> The reason behind these seemingly impressive numbers is that all
>>> other solutions for controlling I/O severely underutilize the speed
>>> of storage devices (usually to between 10% and 20% of their speed).
>>> So why have you probably never been warned about such an
>>> impressive waste of resources? Because it is extremely difficult to
>>> guarantee bandwidths and latencies on a loaded drive. So the most
>>> common solution for avoiding starvation, or very high latencies, has
>>> always been to keep storage devices underutilized. When an
>>> underutilized device is hit by the I/O of some container/group/VM,
>>> it is likely to serve this I/O very quickly, because it is unlikely
>>> to be already busy serving other I/O. If the I/O demand grows, then
>>> one simply adds more drives, so as to keep utilization low. And when
>>> this stops scaling, one buys faster drives.
>>> More clever solutions do exist. They are based on I/O throttling.
>>> But, depending on the workload, these solutions may happen to
>>> forcibly lower utilization to about the same values reached with the
>>> above solution.
>>> In contrast, BFQ is smart enough to keep drives highly utilized with
>>> every workload. So, using, e.g., only one drive, BFQ satisfies an
>>> I/O demand that would require from 5 to 10 drives with the other
>>> solutions.
>>> If you want to take advantage of this performance boost in Fedora
>>> CoreOS, I'm willing to help in every step.
>> It looks like the original request for this was made to Fedora
>> and applied to F31+.
>> Fedora CoreOS uses the same systemd as Fedora, so unless we explicitly decide
>> against it we'll be using what Fedora does. I don't see any reason to
>> deviate.
>> Paolo, does that match your understanding?
> Yep. The issue I wanted to address with this topic is that maybe few
> people know about the 10X throughput they can get, with BFQ, for
> container workloads. And now that they know it, maybe they still
> don't know how to enable this boost (fortunately, it is extremely
> easy). So I'm saying mainly "hey, here I am to help!" :)
Just checking this one point: On Fedora and Fedora CoreOS (assuming we don't change
any defaults) users won't need to do anything to "enable this boost".
Thank you very much for this useful question; it helped me realize
that I didn't explain the main problem at all, sorry.
The answer to your question is yes and no. The 'yes' is because BFQ
is already there, as you rightly point out.
The 'no' is the tricky part. The main issue is that BFQ cannot make a
disk reach a higher throughput than that requested by the workload.
Let me give a simple example. If the only process doing I/O on a disk
does a read of 1 MB every second, then the maximum possible throughput
that can be reached by the disk is 1 MB/s. A bad solution for
controlling I/O may cause throughput to fall below 1 MB/s, but no
solution can go above 1 MB/s, simply because no more than that is
requested.
So, if a user/sysadmin keeps disk bandwidths underutilized, because
this is their long-standing practice for guaranteeing bandwidth and
latency, and because they are not aware of what they can now do with
BFQ, then nothing changes for them, even if their I/O is now
scheduled by BFQ.
I hope my concern is clearer now.
The goal of this topic is to spread the word, and offer help.
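As a concrete sketch of the single-reader example above, here is a toy workload, one 1 MiB read per second, whose demand caps the achievable throughput at roughly 1 MiB/s no matter which scheduler is in use (the temporary file is a placeholder, not something from this thread):

```shell
#!/bin/sh
# Toy version of the example above: one process reading 1 MiB per
# second. No I/O scheduler can make the disk exceed this demand.
FILE=$(mktemp)                                  # placeholder data file
dd if=/dev/zero of="$FILE" bs=1M count=1 2>/dev/null

SECONDS_RUN=3
for i in $(seq "$SECONDS_RUN"); do
    dd if="$FILE" of=/dev/null bs=1M count=1 2>/dev/null
    sleep 1
done

# Demand over the run: SECONDS_RUN MiB in about SECONDS_RUN seconds,
# i.e. an upper bound of ~1 MiB/s on throughput.
echo "$SECONDS_RUN MiB read in ~$SECONDS_RUN s"
rm -f "$FILE"
```

In practice the reads here will mostly hit the page cache; the point is only that the demand, not the device, is the bottleneck.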
So if you boot up Fedora (or Fedora CoreOS) in F31+ you'll get it
by default. No action is needed.
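For anyone who wants to double-check which scheduler is active on a given disk, the kernel's sysfs interface can be inspected. A minimal sketch (the device name `sda` and the udev rule are illustrative assumptions, not part of this thread):

```shell
#!/bin/sh
# The kernel lists the available schedulers for a disk in
# /sys/block/<dev>/queue/scheduler, marking the active one with
# brackets, e.g.: "mq-deadline kyber [bfq] none".

# Extract the bracketed (active) scheduler name from such a line.
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\(.*\)\].*/\1/p'
}

# On a real system (device name sda is an assumption):
#   cat /sys/block/sda/queue/scheduler           # list schedulers
#   echo bfq > /sys/block/sda/queue/scheduler    # switch now (as root)
# A udev rule can make the choice persistent across boots, e.g.:
#   ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/scheduler}="bfq"

active_scheduler "mq-deadline kyber [bfq] none"
```

On F31+ the bracketed entry should already read `bfq` without any manual step.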