BTRFS: The Good, The Bad and The Ugly

Wed Jul 13 21:51:05 UTC 2011

On 07/13/2011 11:14 PM, Josef Bacik wrote:
> On Wed, Jul 13, 2011 at 4:53 PM, Manuel Escudero <Jmlevick at gmail.com> wrote:
>> Today I'll be switching from BTRFS to Ext4 again because of the troubles
>> I've been having with
>> the New Linux Filesystem. As BTRFS is going to be the Default in F16 I
>> wanted the developers to
>> know what kind of troubles I've been experiencing with this FS in F15 so
>> they can take a look
>> at them in order to have a better F16 release:
>> The Good:
>> Since BTRFS arrived on my computer (everything on the HDD is formatted with
>> BTRFS excluding "/boot")
>> I've seen a performance improvement in the data transfer part from and to
>> the computer (copying files seems to
>> be faster than before) But that's all about the good things I noticed...
>> The Bad:
>> BTRFS has reduced the system's overall performance; at this point, sometimes it
>> is OK, sometimes it is
>> VERY BAD, I've noticed "Performance Peaks" in F15 with BTRFS and the Boot
>> times are not nice: I mean,
>> they are not the slowest ones, but they're not as good as Before in F14 with
>> Ext4 instead of BTRFS.
>> The performance Running/Launching apps has been affected too and now the PC
>> freezes sometimes (that never
>> happened in F14 unless I forced it a lot with 4 VM's to suck the 4GB of RAM
>> I have); And Now it freezes
>> very often when it wants without a lot of effort.
>> The Ugly:
>> Running VM's when having their virtual HDD's stored in a BTRFS partition is
>> DEATH!
>> They're very slow, sometimes they open, sometimes they do not, usually they
>> freeze, You can't
>> work with them. Same thing about Gnome Shell working over a BTRFS partition:
>> it is really slow,
>> sometimes it reacts but most of the time it is pretty unresponsive.
>> Reading in the Web, I found that some users think that the BTRFS poor
>> performance is caused by some
>> special kind of fragmentation it suffers, others think it's because of its
>> Copy-on-Write attributes and some
>> others blame other stuff, God Knows! the only thing I know is that BTRFS is
>> not ready for being
>> used in normal production machines (as I thought) and it needs to be fixed
>> before the release of F16, because its
>> performance is really far from good...
>> Other Stuff I noticed is that with Kernel 2.6.38.8-35 the system seems to
>> work better than with the previous one,
>> just a little, but it is some kind of improvement.
>> Here you have all the info I found on the net about BTRFS Performance
>> issues noticed by users:
>> https://bugzilla.redhat.com/show_bug.cgi?id=689127
>> http://arosenfeld.wordpress.com/2010/12/27/back-to-ext4-from-btrfs/
>> http://www.vyatta4people.org/btrfs-is-a-bad-choice-when-running-kvm/
>> http://lkml.org/lkml/2010/7/13/475
>> http://blog.patshead.com/2011/03/btrfs---six-months-later.html
>> I only have a question:
>> Why Any Kind of VM is Sooo Slow when being stored on a BTRFS
>> partition? Any Way to Solve this? or at least have a BTRFS performance
>> improvement?
> 
> Yeah VMs are a particular problem with Btrfs.  There are a ton of
> reasons for this, for example by default we use fsync.  Fsync _sucks_
> for btrfs currently, and it has historically not been a well optimized
> piece of code.  I'm working on fixing this, but it requires VFS level
> changes that are currently sitting in Al's queue.  I suspect they will
> go into 3.1 and so we can move ahead with our work, but for now, it
> sucks.  If you use cache=none you get better performance, but still
> not that great.  And this is all because of one major thing:
> 
> Btrfs has threads for _everything_.  This works out fantastically when
> you have big chunks of reads or writes you want done.  This _sucks_
> when you are doing little piddly io's.  The reason for all of this is
> because we don't want you to get bottlenecked on us
> calculating/verifying checksums, so we farm all IO and endio out to
> different threads, which as I said works out great if you are trying
> to cram gigs of data down your drive's throat.
> 
> But with VMs you are doing small scattered IO's, so the IO comes down,
> we prepare it, and farm it off to a thread and wait for that thread to
> wake up and submit the io.  Then the io is completed and that is
> farmed off to another thread and we wait on that.  This switching
> around and waiting for things to wake up is hugely painful when all
> you want to do is write a few bytes.  If you were to do
> 
> dd if=/dev/zero of=/mnt/btrfs/file bs=4k count=100 oflag=direct
> 
> on a btrfs fs and then do it on an ext4 fs, you would see about a 20%
> difference between the 2.  But if you do say bs=20M, the gap closes
> quite a bit.
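
The comparison above can be scripted so that the small-block and large-block runs are repeated on both filesystems. This is only a rough sketch: the `/mnt/ext4/file` path and the `count=5` used for the 20M run are assumptions chosen for illustration (the mail gives only `bs=4k count=100` and the btrfs path), and the helper names `dd_command` and `time_dd` are hypothetical.

```python
import os
import subprocess
import time


def dd_command(target, block_size, count):
    """Build the dd invocation from the mail, parameterised by block size."""
    return [
        "dd", "if=/dev/zero", f"of={target}",
        f"bs={block_size}", f"count={count}", "oflag=direct",
    ]


def time_dd(target, block_size, count):
    """Run dd once and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(dd_command(target, block_size, count),
                   check=True, capture_output=True)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Mount points: /mnt/btrfs is from the mail, /mnt/ext4 is an assumption.
    targets = {"btrfs": "/mnt/btrfs/file", "ext4": "/mnt/ext4/file"}
    # 4k x 100 is from the mail; the count for the 20M run is an assumption.
    for block_size, count in [("4k", 100), ("20M", 5)]:
        for name, path in targets.items():
            if not os.path.isdir(os.path.dirname(path)):
                print(f"{name}: {os.path.dirname(path)} not present, skipped")
                continue
            elapsed = time_dd(path, block_size, count)
            print(f"{name} bs={block_size} count={count}: {elapsed:.3f}s")
```

A single run is noisy, so repeating each measurement a few times and comparing the medians would give a fairer picture of the gap.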
> 
> I fixed part of this problem for O_DIRECT (which is cache=none with
> qemu): if the IOs are small we don't send them off to a thread but
> submit them within our thread's context, which is what got us within 20% of
> ext4 as opposed to 50%.  The other half is doing the completion in the
> submitter's context, which is going to take some extra work.  I'm
> fixing this in the fsync case as well, but as I said we need a VFS
> patch to do it properly so that will be a little later coming.  After
> that I can do the endio part of it and hopefully get us within
> spitting distance of ext4.
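
The cost of handing each small request to another thread, compared with doing it in the submitter's own context, can be illustrated with a toy userspace model. This is not Btrfs code and says nothing about the real numbers inside the kernel; the function names `submit_inline` and `submit_via_worker` are hypothetical, and the "IO" is just a trivial Python callable.

```python
import queue
import threading
import time


def submit_inline(jobs):
    """Run every small job directly in the caller's context."""
    return [job() for job in jobs]


def submit_via_worker(jobs):
    """Hand each job to a worker thread and wait for it, one at a time."""
    requests = queue.Queue()

    def worker():
        while True:
            item = requests.get()
            if item is None:          # sentinel: no more work
                break
            job, done, out = item
            out.append(job())
            done.set()                # wake the waiting submitter

    thread = threading.Thread(target=worker)
    thread.start()
    results = []
    for job in jobs:
        done = threading.Event()
        out = []
        requests.put((job, done, out))
        done.wait()                   # submitter sleeps until the worker finishes
        results.append(out[0])
    requests.put(None)
    thread.join()
    return results


if __name__ == "__main__":
    # Many tiny jobs stand in for small scattered IOs.
    small_jobs = [lambda i=i: i * 2 for i in range(2000)]
    for submit in (submit_inline, submit_via_worker):
        start = time.perf_counter()
        submit(small_jobs)
        print(f"{submit.__name__}: {time.perf_counter() - start:.4f}s")
```

Both functions return the same results; only the path each job takes differs, so any timing difference comes from the queueing and wake-ups rather than from the work itself.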
> 
> So there's my long ass explanation of why VMs on Btrfs suck.  I'm
> sorry, I'm aware of the problem and I'm trying to fix it, but it's a
> slow going process.

If this is the current state of btrfs, then it's not ready
as a default fs for F16. So please postpone it at least to F17.

-- 
  Levente                               "Si vis pacem para bellum!"