Hi,
I've posted this same problem on the fedora-xen list, and the fedora forums. Sorry to anybody who is getting duplicates.
Additional log info is available at
http://forums.fedoraforum.org/showthread.php?p=1149972&posted=1#post1149972
It is also formatted a lot better there and may be easier to follow.
------------------------------------------------------------------
I have two machines running fresh installs of f8 with Xen. Kernel and all software versions are the same on both. Specifically:

[root@machineA boot]# uname -a
Linux machineA 2.6.21.7-5.fc8xen #1 SMP Thu Aug 7 12:44:22 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux
[root@machineA boot]# virsh version
Compiled against library: libvir 0.4.4
Using library: libvir 0.4.4
Using API: Xen 3.0.1
Running hypervisor: Xen 3.1.0
------------------------------------------------------------------
And:
------------------------------------------------------------------
[root@machineB ~]# uname -a
Linux machineB 2.6.21.7-5.fc8xen #1 SMP Thu Aug 7 12:44:22 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux
[root@machineB ~]# virsh version
Compiled against library: libvir 0.4.4
Using library: libvir 0.4.4
Using API: Xen 3.0.1
Running hypervisor: Xen 3.1.0
------------------------------------------------------------------
MachineA has two AMD Opteron 275s. MachineB has four Intel(R) Xeon(TM) CPU 2.80GHz processors.
Both machines are as up to date as possible.
I can boot or create x86_64 f10 guests on MachineA with no trouble whatsoever.
MachineB will not boot/create x86_64 f10 guests.
The configuration files are created in the same manner, but as soon as Xen tries to unpause the newly created domain, it crashes pretty much instantly.
------------------------------------------------------------------
/var/log/xen/xend.log relevant output:
[2009-01-16 14:45:32 4120] DEBUG (DevController:150) Waiting for devices vtpm.
[2009-01-16 14:45:32 4120] INFO (XendDomain:1130) Domain f10testB (21) unpaused.
[2009-01-16 14:45:32 4120] WARNING (XendDomainInfo:1203) Domain has crashed: name=f10testB id=21.
[2009-01-16 14:45:32 4120] DEBUG (XendDomainInfo:1802) XendDomainInfo.destroy: domid=21
[2009-01-16 14:45:32 4120] DEBUG (XendDomainInfo:1821) XendDomainInfo.destroyDomain(21)
------------------------------------------------------------------
I've also tried moving a functional guest from MachineA to MachineB to boot it there, with the same results. Guest will not boot on MachineB.
f8 64bit guests boot on MachineB with no problems. f10 32bit guests boot on MachineB with no problems.
Only 64bit f10 guests seem to be borked.
Mark on the fedora-xen list suggested running xenctx on the crashed domain. Output is as follows:
------------------------------------------------------------------
xenctx output:
/usr/lib64/xen/bin/xenctx -s System.map-2.6.27.5-117.fc10.x86_64 46
rip: ffffffff8100b8a2 set_page_prot+0x6d
rsp: ffffffff81573f08
rax: ffffffea   rbx: 000016e1   rcx: 00000055   rdx: 00000000
rsi: 800000014ffc6061   rdi: ffffffff816e1000   rbp: ffffffff81573f68
 r8: 0000000f    r9: ffffffff817eb450   r10: ffffffff817eb650   r11: 00000010
r12: ffffffff816e1000   r13: 800000014ffc6061   r14: 8000000000000161   r15: 00000016
 cs: 0000e033    ds: 00000000    fs: 00000000    gs: 00000000

Stack:
 0000000000000055 0000000000000010 ffffffff8100b8a2 000000010000e030
 0000000000010082 ffffffff81573f48 000000000000e02b ffffffff8100b89e
 0000000000000200 ffffffff816e4000 0000000000000800 0000000000002c00
 ffffffff81573ff8 ffffffff815a3c60 0000000000002c00 0000000000000000

Code:
7b 4a 1d 00 4c 89 e7 4c 89 ee 31 d2 e8 22 d9 ff ff 85 c0 74 04 <0f> 0b eb fe 5b 41 5c 41 5d 41 5e

Call Trace:
  [<ffffffff8100b8a2>] set_page_prot+0x6d <--
  [<ffffffff8100b8a2>] set_page_prot+0x6d
  [<ffffffff8100b89e>] set_page_prot+0x69
  [<ffffffff815a3c60>] xen_start_kernel+0x5dd
------------------------------------------------------------------
I also finally figured out you can look at the Xen dmesg, which includes the following line:
(XEN) traps.c:405:d44 Unhandled invalid opcode fault/trap [#6] in domain 46 on VCPU 0 [ec=0000]
The domain does install, so the following bug does not seem to be the cause of the current issues:
http://fedoraproject.org/wiki/Bugs/F10Common#Installing_Fedora_10_DomU_on_Fedora_8_Dom0_Fails
Any information / help / insight as to why this is happening would be very much appreciated. The machines are pretty similar, and since the guests are paravirtualized it does not really make sense for the processors to be the cause of the problem.
Thanks, jon
(Jeremy/Ian - here's some more info on the bug reported here:
http://lists.xensource.com/archives/html/xen-devel/2009-01/msg00176.html )
Hi Jon/Phill,
Thanks for all the info.
Here's the important bits:
1) Host kernel is 2.6.21.7-5.fc8xen, that means the hypervisor is xen-3.1.4
2) The guest kernel is 2.6.27.5-117.fc10.x86_64
3) Phill points out the faulting instruction is UD2. That just means the guest kernel is hitting a BUG() assertion. See /asm-x86/bug.h:
#define BUG()                   \
do {                            \
    asm volatile("ud2");        \
    for (;;) ;                  \
} while (0)
4) The backtrace shows the fault happens in set_page_prot()
5) Jon's dmesg contains:
(XEN) mm.c:1362:d46 Bad L1 flags 800000
That means the guest is faulting here:
static void set_page_prot(void *addr, pgprot_t prot)
{
    ....
    if (HYPERVISOR_update_va_mapping((unsigned long)addr, pte, 0))
        BUG();
}
because the PTE update is failing in the HV here:
static int mod_l1_entry(l1_pgentry_t *pl1e, l1_pgentry_t nl1e,
                        unsigned long gl1mfn)
{
    ...
    if ( unlikely(l1e_get_flags(nl1e) & L1_DISALLOW_MASK) )
    {
        MEM_LOG("Bad L1 flags %x",
                l1e_get_flags(nl1e) & L1_DISALLOW_MASK);
        return 0;
    }
    ...
}
the PTE flags are 800000 which corresponds to:
#define _PAGE_NX_BIT (1U<<23)
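For readers wondering how a 64-bit PTE with NX in bit 63 ends up reported as flag bit 23: a minimal sketch, assuming the hypervisor folds the top 12 bits of the PTE into bits 12..23 of a 24-bit flags word (this mirrors Xen's x86_64 get_pte_flags() as best I recall it; the helper name fold_pte_flags is made up for illustration). The sample PTE is the rsi value from the xenctx dump.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, assumed to behave like Xen's x86_64
 * get_pte_flags(): PTE bits 52..63 land in flag bits 12..23,
 * PTE bits 0..11 stay where they are.
 */
static inline uint32_t fold_pte_flags(uint64_t pte)
{
    return (uint32_t)(((pte >> 40) & ~0xFFFULL) | (pte & 0xFFFULL));
}

/* Worked example using values that appear in the thread. */
void example_fold(void)
{
    /* NX is bit 63 of the PTE, which folds down to bit 23. */
    assert(fold_pte_flags(1ULL << 63) == (1u << 23));
    /* rsi from the xenctx dump: NX plus low flag bits 0x061. */
    assert(fold_pte_flags(0x800000014ffc6061ULL) == 0x800061u);
}
```

Under that assumption, the 800000 in the "Bad L1 flags" message is exactly the NX bit of the PTE the guest tried to install.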
Jon/Phill - can one of you two file a bug (bugzilla.redhat.com) with all this info?
Thanks, Mark.
Previous posts, for reference:
http://www.redhat.com/archives/fedora-xen/2009-January/thread.html#00022
http://www.redhat.com/archives/fedora-virt/2009-January/thread.html#00013
On Tue, 2009-01-20 at 10:27 +0000, Mark McLoughlin wrote:
    if ( unlikely(l1e_get_flags(nl1e) & L1_DISALLOW_MASK) )
    {
        MEM_LOG("Bad L1 flags %x",
                l1e_get_flags(nl1e) & L1_DISALLOW_MASK);
        return 0;
    }
    ...
}
the PTE flags are 800000 which corresponds to:
#define _PAGE_NX_BIT (1U<<23)
At least in xen-unstable (and I think for much longer) L1_DISALLOW_MASK contains _PAGE_NX_BIT dynamically depending on the processor capabilities.
#define _PAGE_NX (cpu_has_nx ? _PAGE_NX_BIT : 0)
...
/*
 * Disallow unused flag bits plus PAT/PSE, PCD, PWT and GLOBAL.
 * Permit the NX bit if the hardware supports it.
 */
#define BASE_DISALLOW_MASK (0xFFFFF198U & ~_PAGE_NX)

#define L1_DISALLOW_MASK (BASE_DISALLOW_MASK | _PAGE_GNTTAB)
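The mask arithmetic above can be sketched in plain C. This is only an illustration under stated assumptions: it uses BASE_DISALLOW_MASK as quoted, leaves out _PAGE_GNTTAB because its value is not given in this thread, and takes 0x800061 as the flags word for the failing PTE (NX in bit 23 plus low bits 0x061). The function names are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_NX_BIT (1u << 23)  /* _PAGE_NX_BIT as quoted in the thread */

/* BASE_DISALLOW_MASK from the quoted source; _PAGE_GNTTAB is omitted. */
static uint32_t base_disallow_mask(bool cpu_has_nx)
{
    uint32_t page_nx = cpu_has_nx ? PAGE_NX_BIT : 0;
    return 0xFFFFF198u & ~page_nx;
}

/* Returns the offending bits (what MEM_LOG would print), 0 if none. */
static uint32_t bad_l1_flags(uint32_t flags, bool cpu_has_nx)
{
    return flags & base_disallow_mask(cpu_has_nx);
}

/* Worked example: the same flags on a host without and with NX. */
void example_mask(void)
{
    assert(bad_l1_flags(0x800061u, false) == 0x800000u); /* rejected */
    assert(bad_l1_flags(0x800061u, true) == 0u);         /* accepted */
}
```

On a host without NX the result is 0x800000, matching the "Bad L1 flags 800000" message; on a host with NX the same flags pass the check.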
Does the hardware support NX? What does /proc/cpuinfo in dom0 think?
The guest kernel should be setting up __supported_pte_mask appropriately to match the hardware and hence shouldn't be using NX if it isn't available. There's a command line option to force NX off; can you try noexec=off on the guest command line?
My guess would be that the guest is getting a wrong EFER from somewhere...
Ian.
Hi Ian,
Indeed nx is on one and not the other! However, that doesn't help...
Broken CPU:
/proc/cpuinfo:
model name : Intel(R) Xeon(TM) CPU 3.00GHz
flags : fpu tsc msr pae mce cx8 apic mtrr mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall lm constant_tsc up pni monitor ds_cpl cid cx16 xtpr

Good CPU:
model name : AMD Athlon(tm) 64 X2 Dual Core Processor 3600+
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow rep_good pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy 3dnowprefetch
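Spotting a single flag in those long lines by eye is error-prone. A small sketch of a whole-token check over a cpuinfo flags string follows; has_cpu_flag is a hypothetical helper written for this note, not part of any tool mentioned in the thread. It matches complete space-separated tokens, so "mmx" does not match "mmxext".

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True if `flag` appears as a whole whitespace-separated token. */
static bool has_cpu_flag(const char *flags, const char *flag)
{
    size_t n = strlen(flag);
    const char *p = flags;

    if (n == 0)
        return false;
    while ((p = strstr(p, flag)) != NULL) {
        /* Token must start at the string start or after whitespace. */
        bool starts = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
        /* Token must end at the string end or before whitespace. */
        bool ends = p[n] == '\0' || p[n] == ' ' ||
                    p[n] == '\t' || p[n] == '\n';
        if (starts && ends)
            return true;
        p += 1;  /* keep scanning past this partial match */
    }
    return false;
}

/* Worked example with excerpts of the two flag lines above. */
void example_flags(void)
{
    assert(!has_cpu_flag("sse sse2 ss ht tm syscall lm constant_tsc", "nx"));
    assert(has_cpu_flag("sse sse2 ht syscall nx mmxext fxsr_opt", "nx"));
}
```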
So, ran it again with noexec=off:
[ Minimal BASH-like line editing is supported. ESC at any time cancels. ENTER at any time accepts your changes. ]
kernel /vmlinuz-2.6.27.9-159.fc10.x86_64 ro root=LABEL=/ selinux=0 noipv6 nomodeset noexec=off
Results:
[ root@office64 xen ]# /usr/lib64/xen/bin/xenctx -s System.map-2.6.27.9-159.fc10.x86_64 119
rip: ffffffff8100b8a2 set_page_prot+0x6d
rsp: ffffffff81575f08
rax: ffffffea   rbx: 000016e4   rcx: 00000055   rdx: 00000000
rsi: 800000014a293061   rdi: ffffffff816e4000   rbp: ffffffff81575f68
 r8: 0000000f    r9: ffffffff817ee350   r10: ffffffff817ee550   r11: 00000010
r12: ffffffff816e4000   r13: 800000014a293061   r14: 8000000000000161   r15: 00002c00
 cs: 0000e033    ds: 00000000    fs: 00000000    gs: 00000000

Stack:
 0000000000000055 0000000000000010 ffffffff8100b8a2 000000010000e030
 0000000000010082 ffffffff81575f48 000000000000e02b ffffffff8100b89e
 0000000000000200 ffffffff816e7000 0000000000000800 0000000000000016
 ffffffff81575ff8 ffffffff815a5c60 0000000000002c00 0000000000000000

Code:
df 54 1d 00 4c 89 e7 4c 89 ee 31 d2 e8 22 d9 ff ff 85 c0 74 04 <0f> 0b eb fe 5b 41 5c 41 5d 41 5e

Call Trace:
  [<ffffffff8100b8a2>] set_page_prot+0x6d <--
  [<ffffffff8100b8a2>] set_page_prot+0x6d
  [<ffffffff8100b89e>] set_page_prot+0x69
  [<ffffffff815a5c60>] xen_start_kernel+0x5dd
Battling with bugzilla trying to get a new account. It doesn't like me :-(
Might have to leave it up to Jon to do the bugzilla thing.
Cheers Phill.
Hi,
I can confirm this. With NX enabled in BIOS the domU boots fine (tested with 2.6.28).
If I disable NX in BIOS the domU will crash.
- Valtteri Kiviniemi
On Tue, 2009-01-27 at 19:42 +0200, Valtteri Kiviniemi wrote:
Hi,
I can confirm this. With NX enabled in BIOS the domU boots fine (tested with 2.6.28).
If I disable NX in BIOS the domU will crash.
Thanks for confirming. Did you get a chance to try the patch I sent?
Ian.
Thank you again Mark, Ian, and Phil.
As Phil pointed out, my working host has the NX feature, the broken host does not.
I also could not create a bugzilla account, but was able to use a co-worker's credentials to create this ticket: https://bugzilla.redhat.com/show_bug.cgi?id=480880
Ian, I didn't have any luck with the noexec=off.
xm dmesg output is the same as posted earlier.
(XEN) mm.c:1362:d52 Bad L1 flags 800000
(XEN) traps.c:405:d52 Unhandled invalid opcode fault/trap [#6] in domain 52 on VCPU 0 [ec=0000]
(XEN) domain_crash_sync called from entry.S
(XEN) Domain 52 (vcpu#0) crashed on cpu#3:
...

xenctx output:
/usr/lib64/xen/bin/xenctx -s System.map-2.6.27.5-117.fc10.x86_64 52
rip: ffffffff8100b8a2 set_page_prot+0x6d
rsp: ffffffff81573f08
rax: ffffffea   rbx: 000016e1   rcx: 00000054   rdx: 00000000
rsi: 80000000a7be7061   rdi: ffffffff816e1000   rbp: ffffffff81573f68
 r8: 0000000f    r9: ffffffff817eb450   r10: ffffffff817eb650   r11: 00000010
r12: ffffffff816e1000   r13: 80000000a7be7061   r14: 8000000000000161   r15: 00000016
 cs: 0000e033    ds: 00000000    fs: 00000000    gs: 00000000

Stack:
 0000000000000054 0000000000000010 ffffffff8100b8a2 000000010000e030
 0000000000010082 ffffffff81573f48 000000000000e02b ffffffff8100b89e
 0000000000000200 ffffffff816e4000 0000000000000800 0000000000002c00
 ffffffff81573ff8 ffffffff815a3c60 0000000000002c00 0000000000000000

Code:
7b 4a 1d 00 4c 89 e7 4c 89 ee 31 d2 e8 22 d9 ff ff 85 c0 74 04 <0f> 0b eb fe 5b 41 5c 41 5d 41 5e

Call Trace:
  [<ffffffff8100b8a2>] set_page_prot+0x6d <--
  [<ffffffff8100b8a2>] set_page_prot+0x6d
  [<ffffffff8100b89e>] set_page_prot+0x69
  [<ffffffff815a3c60>] xen_start_kernel+0x5dd
Thanks, jon
On Wed, 2009-01-21 at 12:22 +0900, Jon Swanson wrote:
Thank you again Mark, Ian, and Phil.
As Phil pointed out, my working host has the NX feature, the broken host does not.
I also could not create a bugzilla account, but was able to use a co-worker's credentials to create this ticket: https://bugzilla.redhat.com/show_bug.cgi?id=480880
Ian, I didn't have any luck with the noexec=off.
It turns out that nonx_setup() and check_efer() both run quite a while after all the set_page_prot calls in xen_setup_kernel() which include _PAGE_NX via PAGE_KERNEL_RO.
On 32 bit __supported_pte_mask starts off without NX in it and it gets added later if the system supports it. This is safe but means that the pages frobbed by the early Xen setup won't have NX set when they could (unless they all get frobbed again later?)
On 64 bit __supported_pte_mask contains NX at start of day and it is taken away later on if the system turns out not to support it.
Native seems to mainly use _KERNPG_TABLE which does not include NX, can you try this patch? (lots of printks because I don't have any non-NX hardware to test properly).
diff -r ec792b22009f arch/x86/mm/init_64.c
--- a/arch/x86/mm/init_64.c	Fri Jan 23 15:27:45 2009 +0000
+++ b/arch/x86/mm/init_64.c	Fri Jan 23 15:58:03 2009 +0000
@@ -103,12 +103,15 @@
  */
 static int __init nonx_setup(char *str)
 {
+	printk(KERN_CRIT "noexec_setup %s\n", str);
 	if (!str)
 		return -EINVAL;
 	if (!strncmp(str, "on", 2)) {
+		printk(KERN_CRIT "noexec_setup: enabling NX\n");
 		__supported_pte_mask |= _PAGE_NX;
 		do_not_nx = 0;
 	} else if (!strncmp(str, "off", 3)) {
+		printk(KERN_CRIT "noexec_setup: disabling NX\n");
 		do_not_nx = 1;
 		__supported_pte_mask &= ~_PAGE_NX;
 	}
@@ -121,8 +124,13 @@
 	unsigned long efer;
 
 	rdmsrl(MSR_EFER, efer);
-	if (!(efer & EFER_NX) || do_not_nx)
+	if (!(efer & EFER_NX) || do_not_nx) {
+		printk(KERN_CRIT "check_efer: disabling NX\n");
 		__supported_pte_mask &= ~_PAGE_NX;
+	} else
+		printk(KERN_CRIT "check_efer: leaving NX alone. supported_pte_mask %s the NX bit\n",
+		       __supported_pte_mask & _PAGE_NX ? "includes" : "excludes");
+
 }
 
 int force_personality32;
diff -r ec792b22009f arch/x86/xen/enlighten.c
--- a/arch/x86/xen/enlighten.c	Fri Jan 23 15:27:45 2009 +0000
+++ b/arch/x86/xen/enlighten.c	Fri Jan 23 15:58:03 2009 +0000
@@ -54,6 +54,9 @@
 #include "xen-ops.h"
 #include "mmu.h"
 #include "multicalls.h"
+
+#define _KERNPG_TABLE_RO __pgprot(_KERNPG_TABLE & ~_PAGE_RW)
+//#define _KERNPG_TABLE_RO (_KERNPG_TABLE)
 
 EXPORT_SYMBOL_GPL(hypercall_page);
 
@@ -1476,6 +1479,15 @@
 {
 	unsigned long pfn = __pa(addr) >> PAGE_SHIFT;
 	pte_t pte = pfn_pte(pfn, prot);
+	static int once = 5;
+
+	if (once > 0 && pte_val(pte) & _PAGE_NX) {
+		once--;
+		printk(KERN_CRIT "set_page_prot to %#lx (incl NX) supported_pte_mask %#lx %s the NX bit\n",
+		       pgprot_val(prot), __supported_pte_mask, __supported_pte_mask & _PAGE_NX ? "includes" : "excludes");
+		printk(KERN_CRIT "pte is %#lx\n", pte_val(pte));
+		WARN_ON(1);
+	}
 
 	if (HYPERVISOR_update_va_mapping((unsigned long)addr, pte, 0))
 		BUG();
@@ -1522,9 +1534,9 @@
 	}
 
 	for (pteidx = 0; pteidx < ident_pte; pteidx += PTRS_PER_PTE)
-		set_page_prot(&level1_ident_pgt[pteidx], PAGE_KERNEL_RO);
+		set_page_prot(&level1_ident_pgt[pteidx], _KERNPG_TABLE_RO);
 
-	set_page_prot(pmd, PAGE_KERNEL_RO);
+	set_page_prot(pmd, _KERNPG_TABLE_RO);
 }
 
 static __init void xen_ident_map_ISA(void)
@@ -1601,12 +1613,12 @@
 	xen_map_identity_early(level2_ident_pgt, max_pfn);
 
 	/* Make pagetable pieces RO */
-	set_page_prot(init_level4_pgt, PAGE_KERNEL_RO);
-	set_page_prot(level3_ident_pgt, PAGE_KERNEL_RO);
-	set_page_prot(level3_kernel_pgt, PAGE_KERNEL_RO);
-	set_page_prot(level3_user_vsyscall, PAGE_KERNEL_RO);
-	set_page_prot(level2_kernel_pgt, PAGE_KERNEL_RO);
-	set_page_prot(level2_fixmap_pgt, PAGE_KERNEL_RO);
+	set_page_prot(init_level4_pgt, _KERNPG_TABLE_RO);
+	set_page_prot(level3_ident_pgt, _KERNPG_TABLE_RO);
+	set_page_prot(level3_kernel_pgt, _KERNPG_TABLE_RO);
+	set_page_prot(level3_user_vsyscall, _KERNPG_TABLE_RO);
+	set_page_prot(level2_kernel_pgt, _KERNPG_TABLE_RO);
+	set_page_prot(level2_fixmap_pgt, _KERNPG_TABLE_RO);
 
 	/* Pin down new L4 */
 	pin_pagetable_pfn(MMUEXT_PIN_L4_TABLE,
@@ -1670,9 +1682,9 @@
 	set_pgd(&swapper_pg_dir[KERNEL_PGD_BOUNDARY],
 		__pgd(__pa(level2_kernel_pgt) | _PAGE_PRESENT));
 
-	set_page_prot(level2_kernel_pgt, PAGE_KERNEL_RO);
-	set_page_prot(swapper_pg_dir, PAGE_KERNEL_RO);
-	set_page_prot(empty_zero_page, PAGE_KERNEL_RO);
+	set_page_prot(level2_kernel_pgt, _KERNPG_TABLE_RO);
+	set_page_prot(swapper_pg_dir, _KERNPG_TABLE_RO);
+	set_page_prot(empty_zero_page, _KERNPG_TABLE_RO);
 
 	pin_pagetable_pfn(MMUEXT_UNPIN_TABLE,
 			  PFN_DOWN(__pa(pgd)));
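What the PAGE_KERNEL_RO to _KERNPG_TABLE_RO swap changes at the bit level can be sketched as follows. The bit positions are the standard x86 page-table bits; the composition of PAGE_KERNEL_RO (present, accessed, dirty, global, NX) is an assumption about 2.6.27-era x86_64 headers rather than something stated in this thread, and the helper names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Standard x86 page-table bits. */
#define PAGE_PRESENT   0x001ULL
#define PAGE_RW        0x002ULL
#define PAGE_ACCESSED  0x020ULL
#define PAGE_DIRTY     0x040ULL
#define PAGE_GLOBAL    0x100ULL
#define PAGE_NX        (1ULL << 63)

/* _KERNPG_TABLE & ~_PAGE_RW, as defined by the patch. */
static uint64_t kernpg_table_ro(void)
{
    uint64_t kernpg_table = PAGE_PRESENT | PAGE_RW |
                            PAGE_ACCESSED | PAGE_DIRTY;
    return kernpg_table & ~PAGE_RW;
}

/* PAGE_KERNEL_RO, ASSUMED layout: kernel data, read-only, global, NX. */
static uint64_t page_kernel_ro(void)
{
    return PAGE_PRESENT | PAGE_ACCESSED | PAGE_DIRTY |
           PAGE_GLOBAL | PAGE_NX;
}

/* Worked example: only the second value carries the NX bit. */
void example_prot(void)
{
    assert((kernpg_table_ro() & PAGE_NX) == 0);
    assert((page_kernel_ro() & PAGE_NX) != 0);
}
```

Under these assumptions page_kernel_ro() evaluates to 0x8000000000000161, which is the value seen in r14 in the xenctx dumps, while kernpg_table_ro() is 0x061 and never asks the hypervisor for NX.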
Hey Ian,
Probably incompetence on my part, but I am unable to get the patch to apply. What specific version of the kernel are you doing this on?
I'm getting the following error:
------------------------------------------------------------------------
~/rpmbuild/BUILD> patch --dry-run -p0 < ~/f10xenNoNX.patch.txt
patching file a/arch/x86/mm/init_64.c
patch: **** malformed patch at line 19: @@ -121,8 +124,13 @@
------------------------------------------------------------------------

I've tried with the following kernel versions, and duplicated your directory structure:
kernel-2.6.27.5-117.fc10.src.rpm
kernel-2.6.27.9-159.fc10.src.rpm

I've also tried just patching one of the files:
------------------------------------------------------------------------
~/rpmbuild/BUILD> patch --dry-run --verbose a/arch/x86/mm/init_64.c ~/f10init_64.c.patch
Hmm... Looks like a unified diff to me...
The text leading up to this was:
--------------------------
|diff -r ec792b22009f arch/x86/mm/init_64.c
|--- a/arch/x86/mm/init_64.c Fri Jan 23 15:27:45 2009 +0000
|+++ b/arch/x86/mm/init_64.c Fri Jan 23 15:58:03 2009 +0000
--------------------------
Patching file a/arch/x86/mm/init_64.c using Plan A...
patch: **** malformed patch at line 19: @@ -121,8 +124,13 @@
------------------------------------------------------------------------

I opened up the file and was about to do it manually, but it seems radically different, so I stopped.
------------------------------------------------------------------------
~/rpmbuild/BUILD> sed -n '120,125p' a/arch/x86/mm/init_64.c

	pud = pud_page + pud_index(vaddr);
	if (pud_none(*pud)) {
		pmd = (pmd_t *) spp_getpage();
		pud_populate(&init_mm, pud, pmd);
		if (pmd != pmd_offset(pud, 0)) {
------------------------------------------------------------------------
I'm sorry if I'm just being an idiot, but any insight you can share would be greatly appreciated.
Thanks, jon
-----Original Message-----
From: Ian Campbell [mailto:Ian.Campbell@citrix.com]
Sent: Saturday, January 24, 2009 12:59 AM
To: Jon Swanson
Cc: Mark McLoughlin; fedora-virt@redhat.com; virtualization@webwombat.com.au; Jeremy Fitzhardinge
Subject: RE: [fedora-virt] f10 x86_64 xen VM guests fail to boot on f8 host (guest setting NX bit in L1 PTE?)