f20 netfront failures
by Bill McGonigle
I've hit the bug below a few times on f20 with a fairly boring config:
http://lists.xen.org/archives/html/xen-devel/2014-12/msg01045.html
Currently running:
xen-4.3.3-5.fc20.x86_64
3.17.3-200.fc20.x86_64
One f20 DomU (16GB), one el6 DomU (2GB), i7 desktop w/ 32GB RAM (12GB
Dom0). The only thing not completely bog-standard is that my 'physical'
interface is on a vlan tag on each bridge, but the Dom0 network continues
to hum along when this happens, so I don't suppose that's a factor.
The symptom is that the DomU network just goes away. It appears to be
load-related (I was doing video encoding over NFS). dmesg says:
[76819.472975] vif vif-2-0 vif2.0: txreq.offset: 8ee, size: 3858, end: 6144
[76819.473012] vif vif-2-0 vif2.0: fatal error; disabling device
[76819.482474] brbfc: port 2(vif2.0) entered disabled state
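For anyone reading the numbers in that first line, here is a rough sketch of the arithmetic. It assumes the offset is printed in hex and that the page size is 4 KiB (both are my assumptions, not something the log states); on that reading the request runs past the end of a page, which is presumably what the backend objects to.

```python
import re

PAGE_SIZE = 4096  # assumption: 4 KiB pages, as on x86

LINE = "[76819.472975] vif vif-2-0 vif2.0: txreq.offset: 8ee, size: 3858, end: 6144"


def parse_txreq(line):
    """Pull offset, size and end out of a 'txreq.offset' error line."""
    m = re.search(r"txreq\.offset: ([0-9a-f]+), size: (\d+), end: (\d+)", line)
    if m is None:
        raise ValueError("not a txreq error line")
    offset = int(m.group(1), 16)  # assumption: offset is hexadecimal
    size = int(m.group(2))
    end = int(m.group(3))
    return offset, size, end


offset, size, end = parse_txreq(LINE)
# True when the request would extend beyond a single page
crosses_page = offset + size > PAGE_SIZE
print(offset, size, end, crosses_page)
```

With the values from the log, offset plus size equals the reported end, and that end is larger than one page.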
A workaround is to xl save the DomU to a checkpoint file (you have to use
-c and then destroy the domain), then xl restore it, and things continue
happily. I wasn't able to figure out a way to tell Xen to just restart the
network device (it still appears to be attached and up after Xen decides
it has failed).
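The save/destroy/restore sequence can be sketched as below. This only builds and prints the xl command lines rather than running them; the domain name and checkpoint path are made-up placeholders, and the ordering is just my reading of the workaround described above. Note that with -c the guest keeps running between the save and the destroy.

```python
import shlex


def workaround_commands(domain, checkpoint):
    """Return the xl invocations for the save/destroy/restore workaround."""
    return [
        # -c leaves the domain running after the checkpoint is written
        ["xl", "save", "-c", domain, checkpoint],
        # tear down the domain whose vif was disabled
        ["xl", "destroy", domain],
        # bring the guest back from the checkpoint
        ["xl", "restore", checkpoint],
    ]


# Hypothetical names, for illustration only
for argv in workaround_commands("f20-guest", "/var/tmp/f20-guest.chk"):
    print(shlex.join(argv))
```

To actually run the steps, each list could be passed to subprocess.run(argv, check=True) as root in Dom0.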
I'll be applying the 3-line kernel patch here; do we stand any chance of
getting something like this cherry-picked into the Fedora kernel? It's
not upstream as of 3.18:
http://lxr.free-electrons.com/source/drivers/net/xen-netfront.c#L628
I can advocate on xen-devel if needed.
-Bill
--
Bill McGonigle, Owner
BFC Computing, LLC
http://bfccomputing.com/
Telephone: +1.855.SW.LIBRE
Email, IM, VOIP: bill(a)bfccomputing.com
VCard: http://bfccomputing.com/vcard/bill.vcf
Social networks: bill_mcgonigle/bill.mcgonigle
Dom0 crashes with 3.17.8-300+
by Bill McGonigle
Hi, all,
My desktop is f21/Xen 4.4, and the Dom0 crashes if its kernel is 3.17.8-300 or later. 3.17.7-300 is stable. The kernels that crash as Dom0 are stable when booted on bare metal.
I've tried 3.18 from -testing and a locally rebuilt Xen 4.4, and I get the same crash.
Serial output here:
http://fpaste.org/173093/
I'm looking for ideas about what could be going on and/or how to figure out what goes wrong once Dom0 starts.
(XEN) Command line: placeholder dom0_mem=12288M,max:12288M loglvl=all guest_loglvl=all com1=115200,8n1 console=com1,vga
[ 0.000000] Command line: placeholder root=/dev/mapper/luks-ba790367-2232-475a-ae19-82bbf7f7ccc5 ro earlyprintk=xen ipv6.disable=1 selinux=0 elevator=deadline rootflags=data=journal,relatime
The last lines are:
(XEN) PCI add device 0000:04:00.0
(XEN) Domain 0 crashed: rebooting machine in 5 seconds.
Device 04:00.0 is the USB controller, but I suspect the crash comes from whatever happens next.
The bare-metal boot log around that point is:
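In case it helps anyone comparing several serial captures, here is a small sketch that reports the last PCI device Xen added before the crash message. The function name is made up, and the sample text is just the two lines quoted above.

```python
import re

PCI_ADD = re.compile(
    r"\(XEN\) PCI add device ([0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])"
)


def last_pci_device_before_crash(log_text):
    """Return the last device added before 'Domain 0 crashed', or None."""
    last = None
    for line in log_text.splitlines():
        m = PCI_ADD.search(line)
        if m:
            last = m.group(1)
        elif "Domain 0 crashed" in line:
            return last
    return None  # no crash message found in this log


SAMPLE = """(XEN) PCI add device 0000:04:00.0
(XEN) Domain 0 crashed: rebooting machine in 5 seconds."""

device = last_pci_device_before_crash(SAMPLE)
print(device)
```

This only says where the hypervisor log stops; it does not show that the device itself is at fault.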
[ 0.217464] pci 0000:04:00.0: [1b21:1142] type 00 class 0x0c0330
[ 0.217488] pci 0000:04:00.0: reg 0x10: [mem 0xf7800000-0xf7807fff 64bit]
[ 0.217611] pci 0000:04:00.0: PME# supported from D3cold
[ 0.217635] pci 0000:04:00.0: System wakeup disabled by ACPI
[ 0.219323] pci 0000:00:1c.6: PCI bridge to [bus 04]
[ 0.219395] pci 0000:00:1c.6: bridge window [mem 0xf7800000-0xf78fffff]
[ 0.219416] acpi PNP0A08:00: Disabling ASPM (FADT indicates it is unsupported)
[ 0.220032] ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 10 *11 12 14 15)
[ 0.220266] ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 10 11 12 14 15) *0, disabled.
[ 0.220573] ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 10 *11 12 14 15)
[ 0.220803] ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 *10 11 12 14 15)
[ 0.221032] ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 *5 6 10 11 12 14 15)
[ 0.221264] ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 6 10 11 12 14 15) *0, disabled.
[ 0.221569] ACPI: PCI Interrupt Link [LNKG] (IRQs *3 4 5 6 10 11 12 14 15)
[ 0.221799] ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 6 *10 11 12 14 15)
[ 0.222099] ACPI: Enabled 5 GPEs in block 00 to 3F
[ 0.222252] vgaarb: setting as boot device: PCI:0000:00:02.0
[ 0.222321] vgaarb: device added: PCI:0000:00:02.0,decodes=io+mem,owns=io+mem,locks=none
but I don't know enough about Xen boot to know whether that's even useful information.
Thanks,
-Bill