I've hit this bug a few times on f20 with a fairly boring config:
One f20 DomU (16GB), one el6 DomU (2GB), i7 desktop w/ 32GB RAM (12GB
Dom0). The only thing not completely bog-standard is that my 'physical'
interface is on a vlan tag on each bridge, but Dom0 network continues to
hum along at the time, so I don't suppose that's a factor.
The symptom is that the DomU network just goes away. It appears to be
load-related (I was doing video encoding over NFS). dmesg says:
[76819.472975] vif vif-2-0 vif2.0: txreq.offset: 8ee, size: 3858, end: 6144
[76819.473012] vif vif-2-0 vif2.0: fatal error; disabling device
[76819.482474] brbfc: port 2(vif2.0) entered disabled state
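The numbers in that first line look consistent with a transmit request
that runs past the end of a page. A minimal sketch of the arithmetic,
assuming the offset is printed in hex, that "end" is simply offset plus
size, and that pages are 4 KiB (the names PAGE_SIZE and txreq_end are
mine, not the driver's):

```python
# Sketch only: reproduces the arithmetic in the dmesg line above.
# Assumption: 4 KiB pages, offset logged in hex, size logged in decimal.
PAGE_SIZE = 4096


def txreq_end(offset_hex: str, size: int) -> int:
    """Return offset + size for a logged txreq, with offset given in hex."""
    return int(offset_hex, 16) + size


end = txreq_end("8ee", 3858)
# 0x8ee is 2286, so the request would end at byte 6144, past a 4096-byte page.
print(end, end > PAGE_SIZE)
```

If those assumptions hold, the request spans 2048 bytes more than one
page, which would explain why the backend treats it as fatal.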
A workaround is to xl save the DomU to a checkpoint file (I have to use
-c and then destroy it), then xl restore it, and things continue
happily. I wasn't able to figure out a way to tell Xen to just restart
the network device (it still appears to be attached and up after Xen
decides it has failed).
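The workaround sequence can be sketched as a small script. This is only
an illustration of the steps described above: the domain name and
checkpoint path are hypothetical, it defaults to a dry run that just
prints the commands, and it assumes the saved image carries the domain
config so that xl restore needs no separate config file.

```python
import subprocess
from typing import List


def workaround_commands(domain: str, checkpoint: str) -> List[List[str]]:
    """Build the xl commands for the save / destroy / restore cycle."""
    return [
        ["xl", "save", "-c", domain, checkpoint],  # -c: leave the domain running
        ["xl", "destroy", domain],                 # drop the instance with the dead vif
        ["xl", "restore", checkpoint],             # bring it back from the checkpoint
    ]


def run_workaround(domain: str, checkpoint: str, dry_run: bool = True) -> None:
    """Print the commands, or run them in order and stop on the first failure."""
    for cmd in workaround_commands(domain, checkpoint):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


# Hypothetical domain name and path, dry run only.
run_workaround("f20-domU", "/var/tmp/f20-domU.chk")
```

Anything the guest writes to disk between the save and the destroy is
not reflected in the checkpoint, so the gap between those two steps
should be kept as short as possible.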
I'll be applying the 3-line kernel patch here; do we stand any chance of
getting something like this cherry-picked into the Fedora kernel? It's
not upstream as of 3.18:
I can advocate on xen-devel if needed.
Bill McGonigle, Owner
BFC Computing, LLC
Email, IM, VOIP: bill(a)bfccomputing.com
Social networks: bill_mcgonigle/bill.mcgonigle