Taking the BTRFS plunge

Chris Murphy lists at colorremedies.com
Wed Jul 23 18:57:01 UTC 2014


On Jul 17, 2014, at 4:08 PM, Lists <lists at benjamindsmith.com> wrote:

> As a ZFS on Linux user, I noticed that there are btrfs packages for Fedora 20. Can anybody here comment on their stability? Are you adventurous enough to use btrfs on root ? Has it saved your data? Have you lost data because of it?
> 
> Context: I'm planning on moving /home on my laptop to ZFS/BTRFS soon.

I think the top two things to keep in mind with Btrfs are: a) don't underestimate the need for backups; b) don't use 'btrfs check --repair' (a.k.a. btrfsck --repair) unless you've posted the specifics of your problem to the linux-btrfs@ list first. It's still somewhat dangerous; as in, it can make things worse.

If a filesystem fails to mount, always try 'mount -o recovery' first. Then try 'mount -o ro,recovery', which, if successful, will at least let you update your backup before proceeding. After that, possibly btrfs-zero-log, at the expense of losing roughly the last 30 seconds of data. It is fine to run 'btrfs check' without --repair; it only reports problems, and its output is useful to include when posting to the linux-btrfs@ list.
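Roughly, the escalation order looks like this (a sketch only; the device and mount point are placeholders, and on newer btrfs-progs the standalone btrfs-zero-log tool has moved under 'btrfs rescue zero-log'):

    # try a normal read-write recovery mount first
    mount -o recovery /dev/sdXn /mnt

    # if that fails, try read-only recovery and refresh your backup
    mount -o ro,recovery /dev/sdXn /mnt

    # last resort: throw away the log tree (~30 seconds of recent writes)
    btrfs-zero-log /dev/sdXn

    # read-only check; safe to run, and its output helps the list diagnose
    btrfs check /dev/sdXn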

I've found btrfs to be quite stable in single-disk configurations. After most crashes I've had, btrfs either has no problem or repairs itself on a normal mount. In a small number of cases it wouldn't mount and I had to use 'mount -o recovery', which fixed the problem. But there are still many edge cases not yet fixed, and more still being found, and there are different repair methods depending on the problem, so things can get chaotic very quickly.

Multiple-device Btrfs volumes are much simpler in some respects than LVM or mdraid, but error notification is still weak. There's also confusion about how to properly report filesystem usage with df, and the behavior you get when mounting a filesystem degraded isn't always obvious either. Two recent problems I stumbled on:

1. systemd hangs forever (no timeout) waiting for a missing root fs UUID when booting btrfs with a missing device. This happens because of problems between the btrfs kernel code and udev: the volume UUID isn't made available when devices are missing, so systemd won't even attempt to mount the root fs. OK, so you use rd.break=pre-mount to get to a dracut shell before the hang, and then you have to know to run 'mount -o subvol=root,degraded /dev/sdaX /sysroot' and then 'exit' twice. That's pretty icky UX-wise.

2. Just yesterday I lost a test system on btrfs. I'd successfully mounted a single device degraded (simulating a failed second device) and successfully converted it from the raid1 profile to the single profile. However, I'd forgotten to run 'btrfs device delete missing' to get rid of the second device. Upon reboot, I wasn't permitted to mount the single device normally *or* degraded. I could mount it ro,degraded, but since it's read-only I can't use 'btrfs device delete', nor can I create a read-only subvolume to use with btrfs send/receive. So yes, I can mount it ro and get data off of it with cp or rsync, but it's basically not fixable and I'll have to blow it away.
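For what it's worth, the sequence that should have avoided problem 2 looks roughly like this (a sketch; device names are placeholders, and some btrfs-progs versions want -f on the balance to allow reducing metadata redundancy):

    # mount the surviving device read-write, degraded
    mount -o degraded /dev/sdXn /mnt

    # convert data and metadata from raid1 to the single profile
    btrfs balance start -dconvert=single -mconvert=single /mnt

    # the step I forgot: drop the missing device from the volume
    btrfs device delete missing /mnt

    # confirm the volume now lists only one device
    btrfs filesystem show /mnt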

*shrug*

So I'd say if you want something quite stable, use better hard drives and XFS, optionally with LVM so you can use pvmove, plus thinp snapshots, which don't suffer the performance or setup troubles that conventional LVM snapshots do.
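If you go that route, the thin pool setup is roughly this (a sketch; the volume group name, LV names, and sizes are made up, so adjust to your layout):

    # carve a thin pool out of the volume group
    lvcreate --type thin-pool -L 100G -n pool vg0

    # create a thin volume for /home and put XFS on it
    lvcreate --type thin -V 80G --thinpool pool -n home vg0
    mkfs.xfs /dev/vg0/home

    # thin snapshots are cheap and don't need a preallocated size
    lvcreate -s -n home_snap vg0/home

    # pvmove still works to migrate extents off a disk you want to retire
    pvmove /dev/sdb1 /dev/sdc1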


Chris Murphy

