Hi Guys,
I was wondering if anyone is going through the same dmraid 'challenges' as me and has figured out a solution for booting the recent (-25xx) kernels, since I now seem to be unable to (again).
First some historical background: I got a Dell XPS600 system about six months ago. It comes with all kinds of goodies, including 2x 250 GB Serial ATA disks in 'hardware-assisted' RAID0 mode, specifically on an nVidia SLI motherboard with its "nVidia Corporation CK804" Serial ATA controller/RAID.
Before Fedora Core 4.9x it was impossible to run Linux on this machine since the RAID set was not supported, but in the Fedora Core 4.9x releases (FC5 test 1 and 2) it worked just fine. No sweat: insert the CD, sometimes fiddle with the configs a little bit, and I was up and running! FC5 also seemed to work great out of the box (the 2054 kernel), though sometimes I hit glitches where it wouldn't boot because of the 'dm-striped device sizes must be a multiple of chunk-size' bug in 2.6.16 kernels (well, some call it a feature :-)). For a while I was forced to run older kernels, until I figured out how to fix it through some magical dmsetup-and-friends commands.
Then came the FC5 updates, and the updated kernels didn't work on my dmraid set. It turned out some of my problems were caused by the init script in the initrd images, which wasn't including the 'dmraid setup' commands; ergo no root disks, no Linux :-)
And for a while now, all the way up to the 2.6.17 24xx kernels, I've been able to update my kernels by gunzipping / cpio'ing the initrd, editing the init script, and packaging it back up into a new initrd.img file.
However, since a week or so ago (maybe two? I'm not sure), this trick doesn't work anymore. The good news is that the dmraid setup commands are now automatically included in the init script again; however, despite their presence, at boot I am now surprised by the absence of any error reports, until 'could not mount the root device' comes up and tells me that the attempt was an utter failure.
I've diffed the init scripts from my working, booting kernel (which is 2.6.17-1.2364, for some obscure historical reason unknown to me) and the one in 2517, and there were no differences, so everything should work, right? :-)
Has anyone been having the same problems and been able to figure out what is now making booting impossible? I'd be positively delighted to learn how to fix it again :-)
Kindest Regards,
-- Chris Chabot
After some experimenting I 'hit gold' on how to (sort of) make it boot the latest kernel again:
1) gunzip, cpio the 2364 initrd
2) ditto for the 2527 initrd (different directory)
3) copy over the *.ko files from the 2527 directory into the 2364 directory
4) cpio, gzip the resulting old nash/insmod/modprobe/init + new kernel modules into a new 2527 initrd
and presto .. it boots!
The only annoyance left (well, apart from having to jump through those hoops) is that it now shouts at me at boot:
=============================================
[ INFO: possible recursive locking detected ]
---------------------------------------------
init/1 is trying to acquire lock:
 (&md->io_lock){----}, at: [<ffffffff880d9654>] dm_request+0x25/0x130 [dm_mod]

but task is already holding lock:
 (&md->io_lock){----}, at: [<ffffffff880d9654>] dm_request+0x25/0x130 [dm_mod]

other info that might help us debug this:
1 lock held by init/1:
 #0: (&md->io_lock){----}, at: [<ffffffff880d9654>] dm_request+0x25/0x130 [dm_mod]

stack backtrace:

Call Trace:
 [<ffffffff8026e73d>] show_trace+0xae/0x319
 [<ffffffff8026e9bd>] dump_stack+0x15/0x17
 [<ffffffff802a7f00>] __lock_acquire+0x135/0xa5f
 [<ffffffff802a8dcd>] lock_acquire+0x4b/0x69
 [<ffffffff802a58f9>] down_read+0x3e/0x4a
 [<ffffffff880d9654>] :dm_mod:dm_request+0x25/0x130
 [<ffffffff8021cf45>] generic_make_request+0x21a/0x235
 [<ffffffff880d8402>] :dm_mod:__map_bio+0xca/0x104
 [<ffffffff880d8e48>] :dm_mod:__split_bio+0x16a/0x36b
 [<ffffffff880d974c>] :dm_mod:dm_request+0x11d/0x130
 [<ffffffff8021cf45>] generic_make_request+0x21a/0x235
 [<ffffffff80235eb7>] submit_bio+0xcc/0xd5
 [<ffffffff8021b381>] submit_bh+0x100/0x124
 [<ffffffff802e1a3c>] block_read_full_page+0x283/0x2a1
 [<ffffffff802e40df>] blkdev_readpage+0x13/0x15
 [<ffffffff8021358d>] __do_page_cache_readahead+0x17b/0x1fc
 [<ffffffff80234e37>] blockable_page_cache_readahead+0x5f/0xc1
 [<ffffffff80214784>] page_cache_readahead+0x146/0x1bb
 [<ffffffff8020c2d6>] do_generic_mapping_read+0x157/0x4b4
 [<ffffffff8020c78e>] __generic_file_aio_read+0x15b/0x1b1
 [<ffffffff802c852e>] generic_file_read+0xc6/0xe0
 [<ffffffff8020b5fb>] vfs_read+0xcc/0x172
 [<ffffffff802121ae>] sys_read+0x47/0x6f
 [<ffffffff802603ce>] system_call+0x7e/0x83
DWARF2 unwinder stuck at system_call+0x7e/0x83
Leftover inexact backtrace:
Hi.
Chris Chabot chabotc@xs4all.nl wrote:
- gunzip, cpio 2364 initrd
- ditto for 2527 initrd (different directory)
- copy over *.ko files from 2527 directory into 2364 directory
- cpio, gzip resulting old nash/insmod/modprobe/init+new kernel
modules into a new 2527 initrd
and presto .. it boots!
That's pretty amazing, given that the kernel modules do not belong to the kernel you are booting.
On Sat, 2006-08-05 at 16:53 +0200, Ralf Ertzinger wrote:
That's pretty amazing, given that the kernel modules do not belong to the kernel you are booting.
No, actually: as you can read in point #3, the new kernel's modules are exactly what it does have :-) Otherwise I'm sure a lot of fire and smoke would be the only end result.
On Sat, 05-08-2006 at 16:53 +0200, Ralf Ertzinger wrote:
- cpio, gzip resulting old nash/insmod/modprobe/init+new kernel
modules into a new 2527 initrd
That's pretty amazing, given that the kernel modules do not belong to the kernel you are booting.
Read him again: the modules are from the new kernel.
Lam
I've added all my findings to the ongoing bug report at: https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=186842
(started 2006-03-26)
which also has all my RAID metadata etc. attached as well :-)
dragoran wrote:
have you filed a bug report?
Chris Chabot wrote:
After some experimenting i 'hit gold' on how to kind of make it boot the latest kernel again:
- gunzip, cpio 2364 initrd
- ditto for 2527 initrd (different directory)
- copy over *.ko files from 2527 directory into 2364 directory
- cpio, gzip resulting old nash/insmod/modprobe/init+new kernel modules
into a new 2527 initrd
and presto .. it boots!
Good work! So it is a userspace problem!
Is there a way to find out the version of nash in the initrd? Since Chris has compared the scripts in the initrds (AFAIK), it's probably due to a nash change?
Regards,
Hans