Disc failure on a software RAID system has killed everything
Bryn M. Reeves
bmr at redhat.com
Wed May 19 11:13:12 UTC 2010
On 05/19/2010 12:04 PM, Gary Stainburn wrote:
> Hi folks,
>
> I've got a PC with 5x500GB HDDs running software RAID. On drive 0 and 1 I
> had RAID0 for the boot partition and then on all 5 drives I had RAID 5 for
> everything else.
Why use RAID0 for the boot partition? That means that a failure of
either drive will make the system unbootable (since half the file system
needed for booting is on the dead device).
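To make that concrete, here is a minimal sketch of round-robin striping. It is a toy model, not md's actual on-disk layout: the 4-byte chunk size, the sample "image" bytes and the function names stripe() and read_back() are all made up for illustration.

```python
def stripe(data, n_disks=2, chunk=4):
    """Toy RAID0: deal fixed-size chunks round-robin across the disks."""
    disks = [[] for _ in range(n_disks)]
    for i in range(0, len(data), chunk):
        disks[(i // chunk) % n_disks].append(data[i:i + chunk])
    return disks


def read_back(disks, failed=None):
    """Reassemble the data; chunks on the failed disk come back as '?' filler."""
    n = len(disks)
    total_chunks = sum(len(d) for d in disks)
    out = []
    for idx in range(total_chunks):
        disk, pos = idx % n, idx // n
        piece = disks[disk][pos]
        out.append(b"?" * len(piece) if disk == failed else piece)
    return b"".join(out)


# Hypothetical stand-in for a kernel or initrd image stored on /boot.
image = b"vmlinuz-and-initrd-contents!"
disks = stripe(image)

print(read_back(disks))            # both members present: data intact
print(read_back(disks, failed=1))  # one member gone: every other chunk is lost
```

With two members, every second chunk of every file larger than one chunk lives on the other disk, so nothing of any size survives the loss of either drive.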
I tend to use RAID1 for boot devices, RAID5 for general storage (where I
either don't care too much about write performance or expect a low level
of write I/O). I would only use RAID0 for transient data that can be
easily regenerated after a hardware failure (or combine it with RAID1 if
you need to combine redundancy with better write performance but make
sure you understand the way that different stackings behave[1]).
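By contrast, RAID5 can lose any single member and still reconstruct the data, because each stripe carries an XOR parity block. A rough sketch of the idea follows, assuming a five-disk stripe with four data blocks and one parity block. The block contents and the names xor_blocks() and rebuild() are hypothetical, and real md RAID5 also rotates the parity block between members, which this ignores.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of a list of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(stripe, lost):
    """Recover the block at index `lost` by XOR-ing all surviving blocks."""
    survivors = [block for i, block in enumerate(stripe) if i != lost]
    return xor_blocks(survivors)


# One stripe across five disks: four data blocks plus their parity.
data_blocks = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = xor_blocks(data_blocks)
stripe = data_blocks + [parity]

# Whichever single disk dies, its block can be recomputed from the other four.
for lost in range(len(stripe)):
    assert rebuild(stripe, lost) == stripe[lost]
print("any single lost block can be rebuilt")
```

The same property is why a five-disk RAID5 with one dead member should still be readable in degraded mode, while a second failure would lose data.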
> One of the first two drives has died causing the PC to hang, and then when I
> rebooted it couldn't get past GRUB. I have found out which drive it is and
> disconnected it. It then got past GRUB and loaded the kernel, which then panicked
> and hung.
If the boot file system really was stored on a RAID0 device, one side of
which is now absent, then it is possible that we are loading a broken
vmlinuz or initrd (initramfs) image from the surviving disk.
> I have tried booting using a FC11 install DVD (I believe the dead PC is either
> FC9 or FC10) and going into rescue mode but it says that there are no Linux
> partitions and doesn't go any further.
You might need to activate (assemble) the arrays manually from the rescue
shell, for example with mdadm --assemble --scan, depending on how they
were configured.
> Going into fdisk for each drive (with the dead one still disconnected) shows
> the partition tables.
>
> In theory, this system should still be bootable. Can anyone suggest things to
> try to get it working again?
I'm not sure I agree, if /boot really was on a RAID0: with one member
gone, half of the boot file system is missing.
Regards,
Bryn.