RAID & HDD failure recovery

Laurence Vanek lvanek at charter.net
Thu Nov 16 01:15:49 UTC 2006


I thought I knew how to do this.  Thought I was prepared.

My FC6 box has a simple RAID1 setup with two ATA HDDs.  Three partitions on 
each drive (/boot, /, swap).  Three RAID devices defined (i.e. hda1 & 
hdc1 for md0, hda2 & hdc2 for md1, hda3 & hdc3 for md2).  Works great, 
can boot off either drive with the other powered down.
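
For reference, a minimal sketch of how the state of such arrays could be checked before and after a drive swap.  It assumes the usual /proc/mdstat layout, where each array's status line ends in a map like [UU] (both members up) or [_U] (one member missing).  The function name degraded_arrays is made up for illustration.

```shell
#!/bin/sh
# Sketch only: reads /proc/mdstat-style text on stdin and prints the name
# of every md array whose member map contains "_" (a missing member).
degraded_arrays() {
    awk '
        /^md[0-9]+ :/       { name = $1 }     # remember the current array name
        /\[[U_]*_[U_]*\]/   { print name }    # status map with a missing member
    '
}
```

Usage would be something like "degraded_arrays < /proc/mdstat", which prints nothing when every array has all of its members.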

A week ago smartd began reporting disk read failures on hda, and the count 
seemed to increase by the day.  Checked hda with smartctl & the HDD vendor's 
test software & sure enough the drive was failing.

I removed the hda partitions from the arrays (marked as failed, then removed, 
with mdadm).  Shut down & replaced hda with a new identical drive.  Plan was 
to boot, then use:

sfdisk -d /dev/hdc | sfdisk /dev/hda

to copy the partition table from the remaining good drive to the newly 
installed drive.  Then add the new drive's partitions back into the arrays.
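
Spelled out, the whole plan looks roughly like the sketch below.  It assumes the layout described above (hda1/hdc1 in md0, hda2/hdc2 in md1, hda3/hdc3 in md2) and uses mdadm's --fail, --remove and --add options.  The function names and the RUN switch are made up for illustration: by default the script only prints the commands, and it runs them (as root) only when RUN=1 is set.

```shell
#!/bin/sh
# Sketch only.  Prints each command unless RUN=1 is set.
# Assumed layout: hda1/hdc1 -> md0, hda2/hdc2 -> md1, hda3/hdc3 -> md2.

run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

# Before shutdown: mark each hda member as failed, then remove it.
fail_and_remove() {
    for n in 0 1 2; do
        part=/dev/hda$((n + 1))
        run mdadm /dev/md$n --fail "$part"
        run mdadm /dev/md$n --remove "$part"
    done
}

# After the swap: copy the partition table from hdc to the new hda.
copy_partition_table() {
    if [ "${RUN:-0}" = "1" ]; then
        sfdisk -d /dev/hdc | sfdisk /dev/hda
    else
        echo "sfdisk -d /dev/hdc | sfdisk /dev/hda"
    fi
}

# Finally: add the new partitions back so each array can resync.
re_add() {
    for n in 0 1 2; do
        run mdadm /dev/md$n --add /dev/hda$((n + 1))
    done
}
```

Sourcing the file and calling fail_and_remove, copy_partition_table and re_add in turn prints the commands for each stage, so they can be checked against the real device names before anything is run.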

Surprise!  Boot hangs, can't find partitions on hda (of course not).  
Drops me to a simple shell.

I do not understand why I was not able to boot using the remaining good 
drive (hdc).  I had done so previously during RAID testing.  The machine acts 
like it doesn't see hdc.

What steps am I missing here?  What should I have done ("best practice")?

Thanks in advance for any advice.




