Thank you again Mark, Ian, and Phil.
As Phil pointed out, my working host has the NX feature, the broken host does not.
I also could not create a bugzilla account, but was able to use a co-worker's credentials to create this ticket: https://bugzilla.redhat.com/show_bug.cgi?id=480880
Ian, I didn't have any luck with the noexec=off.
xm dmesg output is the same as posted earlier.
(XEN) mm.c:1362:d52 Bad L1 flags 800000
(XEN) traps.c:405:d52 Unhandled invalid opcode fault/trap [#6] in domain 52 on VCPU 0 [ec=0000]
(XEN) domain_crash_sync called from entry.S
(XEN) Domain 52 (vcpu#0) crashed on cpu#3:
...
xenctx output:
/usr/lib64/xen/bin/xenctx -s System.map-2.6.27.5-117.fc10.x86_64 52
rip: ffffffff8100b8a2 set_page_prot+0x6d
rsp: ffffffff81573f08
rax: ffffffea  rbx: 000016e1  rcx: 00000054  rdx: 00000000
rsi: 80000000a7be7061  rdi: ffffffff816e1000  rbp: ffffffff81573f68
r8: 0000000f  r9: ffffffff817eb450  r10: ffffffff817eb650  r11: 00000010
r12: ffffffff816e1000  r13: 80000000a7be7061  r14: 8000000000000161  r15: 00000016
cs: 0000e033  ds: 00000000  fs: 00000000  gs: 00000000
Stack:
 0000000000000054 0000000000000010 ffffffff8100b8a2 000000010000e030
 0000000000010082 ffffffff81573f48 000000000000e02b ffffffff8100b89e
 0000000000000200 ffffffff816e4000 0000000000000800 0000000000002c00
 ffffffff81573ff8 ffffffff815a3c60 0000000000002c00 0000000000000000
Code: 7b 4a 1d 00 4c 89 e7 4c 89 ee 31 d2 e8 22 d9 ff ff 85 c0 74 04 <0f> 0b eb fe 5b 41 5c 41 5d 41 5e
Call Trace:
 [<ffffffff8100b8a2>] set_page_prot+0x6d <--
 [<ffffffff8100b8a2>] set_page_prot+0x6d
 [<ffffffff8100b89e>] set_page_prot+0x69
 [<ffffffff815a3c60>] xen_start_kernel+0x5dd
Thanks, jon
-----Original Message-----
From: Mark McLoughlin [mailto:markmc@redhat.com]
Sent: Tuesday, January 20, 2009 7:27 PM
To: Jon Swanson
Cc: fedora-virt@redhat.com; virtualization@webwombat.com.au; Jeremy Fitzhardinge; Ian Campbell
Subject: Re: [fedora-virt] f10 x86_64 xen VM guests fail to boot on f8 host (guest setting NX bit in L1 PTE?)
(Jeremy/Ian - here's some more info on the bug reported here:
http://lists.xensource.com/archives/html/xen-devel/2009-01/msg00176.html )
Hi Jon/Phill,
Thanks for all the info.
Here are the important bits:
1) Host kernel is 2.6.21.7-5.fc8xen, which means the hypervisor is xen-3.1.4
2) The guest kernel is 2.6.27.5-117.fc10.x86_64
3) Phill points out the faulting instruction is UD2. That just means the guest kernel is hitting a BUG() assertion. See /asm-x86/bug.h:
#define BUG() \
do { \
    asm volatile("ud2"); \
    for (;;) ; \
} while (0)
4) The backtrace shows the fault happens in set_page_prot()
5) Jon's dmesg contains:
(XEN) mm.c:1362:d46 Bad L1 flags 800000
That means the guest is faulting here:
static void set_page_prot(void *addr, pgprot_t prot)
{
    ....
    if (HYPERVISOR_update_va_mapping((unsigned long)addr, pte, 0))
        BUG();
}
because the PTE update is failing in the HV here:
static int mod_l1_entry(l1_pgentry_t *pl1e, l1_pgentry_t nl1e,
                        unsigned long gl1mfn)
{
    ...
    if ( unlikely(l1e_get_flags(nl1e) & L1_DISALLOW_MASK) )
    {
        MEM_LOG("Bad L1 flags %x",
                l1e_get_flags(nl1e) & L1_DISALLOW_MASK);
        return 0;
    }
    ...
}
the PTE flags are 800000 which corresponds to:
#define _PAGE_NX_BIT (1U<<23)
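
To make the link between the logged value 800000 and the NX bit concrete, here is a minimal sketch in C. It rests on two assumptions that are not stated above: that the hypervisor folds the top 12 bits of a 64-bit PTE down into bits 12..23 of the flags word (so hardware bit 63 shows up as bit 23), and that rsi/r13 in the guest register dump (80000000a7be7061) still hold the PTE passed to the hypercall. The names fold_pte_flags, pte_has_nx and PAGE_NX_BIT_FOLDED are made up for illustration and are not Xen or Linux identifiers.

```c
#include <stdint.h>

/* Same value as _PAGE_NX_BIT quoted above: NX as seen in the flags word. */
#define PAGE_NX_BIT_FOLDED (1U << 23)

/*
 * ASSUMPTION: models how a 64-bit PTE is compressed into a 24-bit flags
 * word. The low 12 bits are kept as they are and the top 12 bits
 * (63..52) are moved down to bits 23..12.
 */
static unsigned int fold_pte_flags(uint64_t pte)
{
    return ((unsigned int)(pte >> 40) & ~0xFFFu) |
           ((unsigned int)pte & 0xFFFu);
}

/* Returns non-zero if the PTE has the no-execute bit (bit 63) set. */
static int pte_has_nx(uint64_t pte)
{
    return (fold_pte_flags(pte) & PAGE_NX_BIT_FOLDED) != 0;
}
```

Under those assumptions, fold_pte_flags(0x80000000a7be7061) gives 0x800061, and masking that with PAGE_NX_BIT_FOLDED leaves 0x800000, which matches the "Bad L1 flags 800000" line if NX is the only disallowed bit set in that entry.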
Jon/Phill - can one of you two file a bug (bugzilla.redhat.com) with all this info?
Thanks, Mark.
Previous posts, for reference:
http://www.redhat.com/archives/fedora-xen/2009-January/thread.html#00022
http://www.redhat.com/archives/fedora-virt/2009-January/thread.html#00013