Thanks for keeping the vanilla kernel alive.

Boris.

--- On Wed, 8/26/09, Boris Derzhavets <bderzhavets@yahoo.com> wrote:

From: Boris Derzhavets <bderzhavets@yahoo.com>
Subject: Re: [fedora-virt] Re: [Fedora-xen] Dom0 kernels
To: fedora-xen@redhat.com, fedora-virt@redhat.com, "M A Young" <m.a.young@durham.ac.uk>
Date: Wednesday, August 26, 2009, 9:58 AM

I've done one more clean install with the following repo file :-

[root@ServerXen341 yum.repos.d]# cat fedora-myoung-dom0.repo
[myoung-dom0]
name=myoung's repository of Fedora based dom0 kernels - $basearch
baseurl=http://fedorapeople.org/~myoung/dom0/$basearch/
enabled=1
gpgcheck=0

[myoung-dom0-source]
name=myoung's repository of Fedora based dom0 kernels - Source
baseurl=http://fedorapeople.org/~myoung/dom0/src/
enabled=1
gpgcheck=0
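For what it's worth, yum expands the $basearch variable in baseurl at run time, so the same repo file works on any architecture. A minimal Python sketch of that substitution (an illustration only, not yum's actual code):

```python
# Illustration only: expand yum's $basearch variable in a .repo file
# (this is not yum's actual implementation).
import configparser

REPO_TEXT = """
[myoung-dom0]
name=myoung's repository of Fedora based dom0 kernels - $basearch
baseurl=http://fedorapeople.org/~myoung/dom0/$basearch/
enabled=1
gpgcheck=0
"""

def expand_repo(text, basearch="x86_64"):
    # .repo files use INI syntax; disable '%' interpolation so literal
    # values pass through untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {
        section: {key: value.replace("$basearch", basearch)
                  for key, value in parser[section].items()}
        for section in parser.sections()
    }

repos = expand_repo(REPO_TEXT)
print(repos["myoung-dom0"]["baseurl"])
# -> http://fedorapeople.org/~myoung/dom0/x86_64/
```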

  Then I tried to boot the kernel built as vanilla. The console dropped into a stack trace
and hung.
   However, the kernel loads fine and works under Xen. The dmesg report
under Xen stays the same (stack trace entries).

Boris.

--- On Wed, 8/26/09, Boris Derzhavets <bderzhavets@yahoo.com> wrote:

From: Boris Derzhavets <bderzhavets@yahoo.com>
Subject: Re: [fedora-virt] Re: [Fedora-xen] Dom0 kernels
To: fedora-xen@redhat.com, fedora-virt@redhat.com, "M A Young" <m.a.young@durham.ac.uk>
Date: Wednesday, August 26, 2009, 7:26 AM

The upstream issues look DomU-related.
Kernel 2.6.31-0.1.2.58.rc7.git1.xendom0.fc11.x86_64 is stable at runtime.

Boris.

--- On Wed, 8/26/09, Boris Derzhavets <bderzhavets@yahoo.com> wrote:

From: Boris Derzhavets <bderzhavets@yahoo.com>
Subject: Re: [fedora-virt] Re: [Fedora-xen] Dom0 kernels
To: fedora-xen@redhat.com, fedora-virt@redhat.com, "M A Young" <m.a.young@durham.ac.uk>
Cc: xen-devel@xensource.com
Date: Wednesday, August 26, 2009, 3:58 AM

I am not sure how 2.6.31-0.1.2.58.rc7.git1.xendom0.fc11.x86_64 was built.
There are fairly recent, ongoing issues with 2.6.31-rc7 upstream :-

http://lkml.org/lkml/2009/8/25/347
http://patchwork.kernel.org/patch/43791/


I was able to load a Xen guest under 2.6.31-0.1.2.58.rc7.git1.xendom0.fc11.x86_64,
followed by a kernel error with sky2 (?) and loss of the VNC connection to Xen Host 3.4.1 on top of F11. I'll do some more testing.

Boris.


--- On Wed, 8/26/09, Boris Derzhavets <bderzhavets@yahoo.com> wrote:

From: Boris Derzhavets <bderzhavets@yahoo.com>
Subject: Re: [fedora-virt] Re: [Fedora-xen] Dom0 kernels
To: fedora-xen@redhat.com, fedora-virt@redhat.com, "M A Young" <m.a.young@durham.ac.uk>
Cc: xen-devel@xensource.com
Date: Wednesday, August 26, 2009, 2:50 AM

With the rpm upgraded I can load Dom0. However, the dmesg report contains :-

Initializing cgroup subsys cpuset
Initializing cgroup subsys cpu
Linux version 2.6.31-0.1.2.58.rc7.git1.xendom0.fc11.x86_64 (root@ServerXenSRC) (gcc version 4.4.0 20090506 (Red Hat 4.4.0-4) (GCC) ) #1 SMP Wed Aug 26 08:14:58 MSD 2009
Command line: root=/dev/mapper/vg_serverxensrc-LogVol00  ro console=tty0
KERNEL supported cpus:
  Intel GenuineIntel
  AMD AuthenticAMD
  Centaur CentaurHauls
BIOS-provided physical RAM map:
 Xen: 0000000000000000 - 000000000009ec00 (usable)
 Xen: 000000000009ec00 - 0000000000100000 (reserved)
 Xen: 0000000000100000 - 00000000cff80000 (usable)
 Xen: 00000000cff80000 - 00000000cff8e000 (ACPI data)
 Xen: 00000000cff8e000 - 00000000cffe0000 (ACPI NVS)
 Xen: 00000000cffe0000 - 00000000d0000000 (reserved)
 Xen: 00000000fee00000 - 00000000fee01000 (reserved)
 Xen: 00000000ffe00000 - 0000000100000000 (reserved)
 Xen: 0000000100000000 - 00000001f1a6b000 (usable)
DMI 2.4 present.
AMI BIOS detected: BIOS may corrupt low RAM, working around it.
e820 update range: 0000000000000000 - 0000000000010000 (usable) ==> (reserved)
last_pfn = 0x1f1a6b max_arch_pfn = 0x400000000
last_pfn = 0xcff80 max_arch_pfn = 0x400000000
initial memory mapped : 0 - 20000000
init_memory_mapping: 0000000000000000-00000000cff80000
 0000000000 - 00cff80000 page 4k
kernel direct mapping tables up to cff80000 @ 100000-785000
init_memory_mapping: 0000000100000000-00000001f1a6b000
 0100000000 - 01f1a6b000 page 4k

. . . . . . .
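As a sanity check, the usable ranges in the Xen-provided e820 map above can be summed to see how much RAM dom0 actually gets (ranges copied from the log; quick Python arithmetic):

```python
# Sum the "usable" ranges from the Xen-provided e820 map in the dmesg
# output above; the remaining ranges are reserved/ACPI regions.
usable_ranges = [
    (0x0000000000000000, 0x000000000009ec00),
    (0x0000000000100000, 0x00000000cff80000),
    (0x0000000100000000, 0x00000001f1a6b000),
]

usable_bytes = sum(end - start for start, end in usable_ranges)
print(f"usable: {usable_bytes / 2**30:.2f} GiB")  # about 7.02 GiB
```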

======================================================
[ INFO: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected ]
2.6.31-0.1.2.58.rc7.git1.xendom0.fc11.x86_64 #1
------------------------------------------------------
khubd/28 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
 (&retval->lock){......}, at: [<ffffffff8112a240>] dma_pool_alloc+0x45/0x321

and this task is already holding:
 (&ehci->lock){-.....}, at: [<ffffffff813d4654>] ehci_urb_enqueue+0xb4/0xd5c
which would create a new lock dependency:
 (&ehci->lock){-.....} -> (&retval->lock){......}

but this new dependency connects a HARDIRQ-irq-safe lock:
 (&ehci->lock){-.....}
... which became HARDIRQ-irq-safe at:
  [<ffffffff8109908b>] __lock_acquire+0x254/0xc0e
  [<ffffffff81099b33>] lock_acquire+0xee/0x12e
  [<ffffffff8150b987>] _spin_lock+0x45/0x8e
  [<ffffffff813d325c>] ehci_irq+0x41/0x441
  [<ffffffff813b7e5f>] usb_hcd_irq+0x59/0xcc
  [<ffffffff810ca810>] handle_IRQ_event+0x62/0x148
  [<ffffffff810ccda3>] handle_level_irq+0x90/0xf9
  [<ffffffff81017078>] handle_irq+0x9a/0xba
  [<ffffffff8130a0c6>] xen_evtchn_do_upcall+0x10c/0x1bd
  [<ffffffff8101527e>] xen_do_hypervisor_callback+0x1e/0x30
  [<ffffffffffffffff>] 0xffffffffffffffff

to a HARDIRQ-irq-unsafe lock:
 (purge_lock){+.+...}
... which became HARDIRQ-irq-unsafe at:
...  [<ffffffff810990ff>] __lock_acquire+0x2c8/0xc0e
  [<ffffffff81099b33>] lock_acquire+0xee/0x12e
  [<ffffffff8150b987>] _spin_lock+0x45/0x8e
  [<ffffffff811233a4>] __purge_vmap_area_lazy+0x63/0x198
  [<ffffffff81124c74>] vm_unmap_aliases+0x18f/0x1b2
  [<ffffffff8100eeb3>] xen_alloc_ptpage+0x5a/0xa0
  [<ffffffff8100ef97>] xen_alloc_pte+0x26/0x3c
  [<ffffffff81118381>] __pte_alloc_kernel+0x6f/0xdd
  [<ffffffff811241a1>] vmap_page_range_noflush+0x1c5/0x315
  [<ffffffff81124332>] map_vm_area+0x41/0x6b
  [<ffffffff8112448b>] __vmalloc_area_node+0x12f/0x167
  [<ffffffff81124553>] __vmalloc_node+0x90/0xb5
  [<ffffffff811243c8>] __vmalloc_area_node+0x6c/0x167
  [<ffffffff81124553>] __vmalloc_node+0x90/0xb5
  [<ffffffff811247ca>] __vmalloc+0x28/0x3e
  [<ffffffff818504c9>] alloc_large_system_hash+0x12f/0x1fb
  [<ffffffff81852b4e>] vfs_caches_init+0xb8/0x140
  [<ffffffff8182b061>] start_kernel+0x3ef/0x44c
  [<ffffffff8182a2d0>] x86_64_start_reservations+0xbb/0xd6
  [<ffffffff8182e6d1>] xen_start_kernel+0x5ab/0x5b2
  [<ffffffffffffffff>] 0xffffffffffffffff

other info that might help us debug this:

2 locks held by khubd/28:
 #0:  (usb_address0_mutex){+.+...}, at: [<ffffffff813b2e82>] hub_port_init+0x8c/0x7ee
 #1:  (&ehci->lock){-.....}, at: [<ffffffff813d4654>] ehci_urb_enqueue+0xb4/0xd5c

the HARDIRQ-irq-safe lock's dependencies:
-> (&ehci->lock){-.....} ops: 0 {
   IN-HARDIRQ-W at:
                        [<ffffffff8109908b>] __lock_acquire+0x254/0xc0e
                        [<ffffffff81099b33>] lock_acquire+0xee/0x12e
                        [<ffffffff8150b987>] _spin_lock+0x45/0x8e
                        [<ffffffff813d325c>] ehci_irq+0x41/0x441
                        [<ffffffff813b7e5f>] usb_hcd_irq+0x59/0xcc
                        [<ffffffff810ca810>] handle_IRQ_event+0x62/0x148
                        [<ffffffff810ccda3>] handle_level_irq+0x90/0xf9

.  .  .  .  .  .
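The trace above is lockdep's irq-safety inversion check firing: ehci->lock is taken in hard-IRQ context (ehci_irq), yet while it is held the dma_pool_alloc path can reach purge_lock, which is elsewhere taken with IRQs enabled (the vm_unmap_aliases path). A toy model of the rule lockdep enforces, collapsed to a direct ehci->lock -> purge_lock edge (illustrative Python only, nothing like the kernel's real checker in kernel/lockdep.c):

```python
# Toy model of lockdep's HARDIRQ-safe -> HARDIRQ-unsafe rule
# (illustration only; the real report reaches purge_lock through the
# DMA pool's retval->lock, collapsed here to a direct edge).
hardirq_safe = set()    # locks ever acquired in hard-IRQ context
hardirq_unsafe = set()  # locks ever acquired with hard IRQs enabled
deps = set()            # observed (held_lock -> acquired_lock) edges

def acquire(lock, held=(), in_hardirq=False):
    """Record an acquisition and flag unsafe orderings."""
    if in_hardirq:
        hardirq_safe.add(lock)
    else:
        hardirq_unsafe.add(lock)
    warnings = []
    for h in held:
        deps.add((h, lock))
        # A HARDIRQ-safe lock must never wait on a HARDIRQ-unsafe one:
        # an IRQ arriving while the unsafe lock is held would deadlock.
        if h in hardirq_safe and lock in hardirq_unsafe:
            warnings.append(f"HARDIRQ-safe -> HARDIRQ-unsafe: {h} -> {lock}")
    return warnings

# Replay the report: purge_lock taken with IRQs on during boot,
# ehci->lock taken from the USB IRQ handler, then the bad edge in khubd.
acquire("purge_lock")                   # __purge_vmap_area_lazy path
acquire("ehci->lock", in_hardirq=True)  # ehci_irq via xen_evtchn_do_upcall
print(acquire("purge_lock", held=("ehci->lock",)))  # dma_pool_alloc path
```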

Jeremy's current version is rc6. I believe this issue has been noticed on xen-devel.
Boris.




