Lonni J Friedman wrote:
Is it always the same disk that gets marked offline? Perhaps the
disk is actually bad?
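One quick way to test that theory, assuming smartmontools is installed and the suspect disk is /dev/sdd (substitute whichever device keeps dropping):

```shell
# SMART overall health verdict for the suspect disk (requires root).
smartctl -H /dev/sdd

# Look for the attributes that usually precede a failing drive.
smartctl -a /dev/sdd | grep -i -e reallocated -e pending -e offline

# The kernel log often shows the I/O errors that got the disk kicked.
dmesg | grep -i sdd
```

If the reallocated or pending sector counts are nonzero, or dmesg shows read/write errors on that device, suspect the hardware rather than the mdadm configuration.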
On Sun, Nov 30, 2008 at 4:20 PM, Eitan Tsur <eitan.tsur(a)gmail.com> wrote:
> Even if I remove that it still happens.
>
> On Sun, Nov 30, 2008 at 3:53 PM, Lonni J Friedman <netllama(a)gmail.com>
> wrote:
>
>> If you only have 3 disks, then you can't have:
>> spares=1
>>
>>
>>
>> On Sun, Nov 30, 2008 at 3:11 PM, Eitan Tsur <eitan.tsur(a)gmail.com> wrote:
>>
>>>> # mdadm.conf written out by anaconda
>>>> DEVICE partitions
>>>> MAILADDR root@localhost
>>>>
>>>> ARRAY /dev/md0 level=raid5 num-devices=3 spares=1
>>>> UUID=0c21bf19:83747f05:70a4872d:90643876
>>>>
>>> If I replace "DEVICE partitions" with "DEVICE /dev/sdb1 /dev/sdc1
>>> /dev/sdd1", the drives are no longer allocated as spares; however,
>>> the array still seems to rebuild on every boot.
>>>
>>> I don't remember the specifics of what was in /proc/mdstat at the
>>> time, but the array is currently being rebuilt. I'll reboot after it
>>> completes and send you a copy. Basically, the dropped drive gets
>>> allocated as a spare, which I have to manually stop and re-add to the
>>> original array (mdadm --stop, then mdadm --add) after every boot.
>>> Give me an hour or two and I'll get you the output of mdstat.
>>>
>>> On Sun, Nov 30, 2008 at 2:41 PM, Lonni J Friedman <netllama(a)gmail.com>
>>> wrote:
>>>
>>>> On Sun, Nov 30, 2008 at 2:38 PM, Eitan Tsur <eitan.tsur(a)gmail.com>
>>>> wrote:
>>>>
>>>>> I just recently installed a 3-disk RAID5 array in a server of mine
>>>>> running FC9. Upon reboot, one of the drives drops out and is
>>>>> allocated as a spare. I suspect there is some sort of issue where
>>>>> DBUS re-arranges the drive-to-device maps between boots, but I am
>>>>> not sure... It's just kind of annoying to have to stop and re-add a
>>>>> drive every boot, and wait the couple of hours for the array to
>>>>> rebuild the 3rd disk. Any thoughts? Has anyone else encountered
>>>>> such an issue before? What should I be looking for? I'm new to
>>>>> the world of RAID, so any information you can give may be helpful.
>>>>>
>>>> What's in /etc/mdadm.conf, /proc/mdstat and dmesg when this fails ?
>>>>
Check and make sure that the UUID you're specifying in your
/etc/mdadm.conf file is correct. You can verify the UUID numbers by
typing "ls -l /dev/disk/by-uuid".
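Note that the UUID on the ARRAY line in mdadm.conf is the md array UUID, which mdadm can report directly; a sketch, assuming the array is /dev/md0 as in the config above:

```shell
# Print an ARRAY line (including the UUID) for every md superblock
# mdadm can find; this is the UUID that mdadm.conf matches on.
mdadm --examine --scan

# Or query the assembled array directly.
mdadm --detail /dev/md0 | grep UUID

# Compare against what the config file currently claims.
grep UUID /etc/mdadm.conf
```

If they differ, replace the stale ARRAY line in /etc/mdadm.conf with the one printed by `mdadm --examine --scan`.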
Next, verify that the UUID in the copy of /etc/mdadm.conf stored inside
the initrd image is also correct; you'll have to extract the initrd
with cpio. I don't remember the full procedure, but you should be able
to find it pretty easily with your favorite search engine.
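The extraction is roughly the following; a sketch, assuming an FC9-style gzip-compressed cpio initrd (the exact image name varies per installed kernel):

```shell
# Unpack the initrd into a scratch directory (requires root to read /boot).
mkdir /tmp/initrd
cd /tmp/initrd
zcat /boot/initrd-$(uname -r).img | cpio -idmv

# Compare the embedded copy of mdadm.conf with the one on the root fs.
diff etc/mdadm.conf /etc/mdadm.conf
```

If the embedded copy is stale, fix /etc/mdadm.conf first and then regenerate the initrd (e.g. with mkinitrd on FC9) so the boot-time copy matches; otherwise the array will keep being assembled from the old configuration at every boot.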
Jeff