system fails to boot
yanmin_zhang at linux.intel.com
Mon Nov 17 08:19:55 UTC 2008
On Fri, 2008-11-14 at 14:29 +0800, Zhang, Yanmin wrote:
> On Fri, 2008-11-14 at 09:18 +0300, Alexey Dobriyan wrote:
> > On Fri, Nov 14, 2008 at 01:16:21PM +0800, Zhang, Yanmin wrote:
> > > Jens,
> > >
> > > We ran into a system boot failure with kernel 2.6.28-rc. We found it on several
> > > machines, including a T61 notebook, a Nehalem machine, and an HPC NX6325 notebook.
> > > All the machines run FedoraCore 8 or FedoraCore 9. With kernels prior to 2.6.28-rc,
> > > the system boots without failure.
> > >
> > > I debugged it and located the root cause. Please see
> > > http://bugzilla.kernel.org/show_bug.cgi?id=11899
> > > https://bugzilla.redhat.com/show_bug.cgi?id=471517
> > >
> > > As a matter of fact, there are 2 bugs.
> > >
> > > 2) With root=LABEL=/, the system can never boot. The initrd init reports that
> > > switchroot fails. Here is an execution branch of nash when booting:
> > > (1) nash reads /sys/block/sda/dev; assume the major is 8 (on my desktop);
> > > (2) nash queries /proc/devices with the major number and finds the line "8 sd";
> > > (3) nash uses 'sd' to search its own probe table to find the device (DISK) type for the device
> > > and adds it to its own list;
> > > (4) Later on, it probes all devices in its list to get filesystem labels.
> > > scsi always registers "8 sd".
> > > When the major is 259, nash fails to find the device (DISK) type. I enabled CONFIG_DEBUG_BLOCK_EXT_DEVT=y
> > > when compiling the kernel, so 259 is picked for device /dev/sda1, which causes nash to fail
> > > to find the device (DISK) type.
> > > To fix issue 2), I created a patch for nash and another patch for the kernel.
> > > http://bugzilla.kernel.org/attachment.cgi?id=18859
> > > http://bugzilla.kernel.org/attachment.cgi?id=18837
As for issue 2) with root=LABEL=/, I double-checked the nash code. It is really beyond what I imagined. I'm not
a nash expert. The kernel may allocate device numbers under the extended block major 259 (BLOCK_EXT_MAJOR) for any type of disk
(cciss/ataraid/sd/ide/floppy/md ...), while nash assumes each MAJOR number is used by exactly one of them.
On the other hand, nash probes scsi/ide/usb serially as long as the type is DEV_TYPE_DISK. I won't say
the nash code is imperfect, but nash is still growing.
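
To make the mismatch concrete, here is a minimal Python sketch of the major-number lookup described
in the quoted steps. It is not nash's code: the sample /proc/devices text, the PROBE_TABLE contents,
the "blkext" driver name shown for major 259, and both function names are assumptions made for
illustration only.

```python
# Hypothetical sketch of a "major -> driver name -> device type" lookup.
# None of these names come from nash; they only illustrate the idea.

# Assumed sample of /proc/devices; the "blkext" entry for 259 is an assumption.
SAMPLE_PROC_DEVICES = """\
Character devices:
  1 mem
  4 tty

Block devices:
  8 sd
 65 sd
259 blkext
"""

# Hypothetical probe table: one driver name maps to one device type.
PROBE_TABLE = {"sd": "DISK", "ide": "DISK", "cciss": "DISK", "md": "DISK"}


def block_driver_for_major(proc_devices_text, major):
    """Return the block driver name registered for `major`, or None."""
    in_block = False
    for line in proc_devices_text.splitlines():
        stripped = line.strip()
        if stripped == "Block devices:":
            in_block = True
            continue
        if stripped.endswith("devices:"):
            # Any other section header (e.g. "Character devices:").
            in_block = False
            continue
        if not in_block or not stripped:
            continue
        number, name = stripped.split(None, 1)
        if int(number) == major:
            return name
    return None


def device_type(proc_devices_text, dev_attr):
    """Map the contents of a sysfs 'dev' file ("MAJOR:MINOR") to a type."""
    major = int(dev_attr.strip().split(":")[0])
    driver = block_driver_for_major(proc_devices_text, major)
    # A driver name missing from the table yields None, i.e. no type found.
    return PROBE_TABLE.get(driver)
```

With major 8 the lookup reaches "sd" and returns a DISK type. With major 259 the driver name is
shared by every disk type, so a table keyed on one name per major has nothing to match.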
You maintain nash. What's your opinion?