On 26.03.2015 at 18:07, Chris Murphy wrote:
Could someone check out this bug, and see if it needs upstream
attention? It's currently set to kernel; but I have no idea if it
would be libvirt or qemu upstreams to make aware of this.
The gist is that on Fedora 21 and 22, with virtio-blk + cache=none +
qcow2 on Btrfs, the guest OS (regardless of the file system it uses)
starts to experience many I/O errors. If the qcow2 is on XFS, or if
cache=writeback or writethrough, the problem doesn't happen.
Regression testing shows the problem does not happen on Fedora 20's
versions of libvirt and qemu, even with newer kernels. So maybe it's
not a kernel problem, or maybe it's an interaction between the kernel
and libvirt or qemu, hence the inquiry. Upstream Btrfs is aware of this
bug and is looking into it.
Could you try using new qemu with old libvirt and vice versa? This way
it should be possible to isolate the component that triggers the change
in behaviour.
To be honest, it sounds much like a problem with the btrfs driver and
O_DIRECT to me. But if changing libvirt and qemu versions is enough to
trigger it, we need to have a look at them - even if it's just to
support the btrfs investigation.
The fallout of the bug is that gnome-boxes experiences problems when
qcow2 is on Btrfs, since they're currently using cache=none by default
(and there's no way to change this in the GUI).
blk_update_request: I/O error, dev vda, sector XXXXXXXX when qcow2 is on Btrfs
https://bugzilla.redhat.com/show_bug.cgi?id=1204569
It might be helpful to get a trace of all write requests made by qemu to
the image file.
Can you please run qemu under strace while you reproduce? (You'll need
-f because I/O is done in worker threads; also, restricting the trace to
pwrite and pwritev should help to reduce the noise level.)
The other option would be using qemu's own tracing, but strace should be
more relevant at this point.
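
Once a trace exists, the write calls can be picked out of it with a small script. The following is only a rough sketch, not a finished tool. It assumes the log was written with something like "strace -f -e trace=pwrite64,pwritev -o qemu-writes.log -p <qemu pid>" (on x86_64, strace reports pwrite as pwrite64), so that each line starts with the thread id. The log file name and the helper names parse_writes and failed_writes are made up for illustration. Calls that strace splits into "<unfinished ...>" and "<... resumed>" lines are simply skipped.

```python
import re

# One complete pwrite64/pwritev call per line, in the format produced by
# `strace -f -o FILE` (thread id first). The file offset is the last
# argument of both calls; the optional errno name follows a -1 result.
CALL_RE = re.compile(
    r"^(?P<tid>\d+)\s+(?P<call>pwrite64|pwritev)\((?P<fd>\d+),"
    r".*,\s*(?P<offset>\d+)\)\s*=\s*(?P<ret>-?\d+)(?:\s+(?P<errno>E[A-Z]+))?"
)


def parse_writes(lines):
    """Yield one dict per complete write call found in the strace output.

    Lines that do not match (other syscalls, signals, unfinished or
    resumed calls) are ignored.
    """
    for line in lines:
        m = CALL_RE.match(line)
        if m is None:
            continue
        yield {
            "tid": int(m.group("tid")),
            "call": m.group("call"),
            "fd": int(m.group("fd")),
            "offset": int(m.group("offset")),
            "ret": int(m.group("ret")),
            "errno": m.group("errno"),  # None when the call succeeded
        }


def failed_writes(lines):
    """Return only the write calls that returned an error (ret < 0)."""
    return [w for w in parse_writes(lines) if w["ret"] < 0]


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python writes.py qemu-writes.log
    with open(sys.argv[1], errors="replace") as log:
        for w in failed_writes(log):
            print(w["tid"], w["call"], "fd", w["fd"],
                  "offset", w["offset"], w["errno"])
```

Listing the failed calls with their offsets should make it easier to match them against the sector numbers in the guest's I/O error messages, though the mapping from image file offsets to guest sectors still goes through the qcow2 metadata.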
Kevin