Raid 5 on Fedora 4 working with SATA ?

Terry Barnaby terry1 at beam.ltd.uk
Thu Feb 2 10:13:37 UTC 2006


Terry Barnaby wrote:
> Gilboa Davara wrote:
> 
>> On Wed, 2006-02-01 at 12:01 +0000, Terry Barnaby wrote:
>>
>>> Gilboa Davara wrote:
>>>
>>>> By default, software RAID1/5/6 supports on-line drive
>>>> kill/remove/rebuild/etc.
>>>> However, it seems that the MD driver is unaware of the dead drive.
>>>>
>>>> What does /proc/mdstat say?
>>>>
>>>> Gilboa
>>>>
>>>>
>>>
>>> After removing the SATA cable on /dev/sdd, if I access a file there is
>>> a long delay and then the program returns with no error but no data.
>>> For example: "cat /data/test-file" will delay and then exit with
>>> status of "0" but no file contents are displayed.
>>>
>>> The kernel is 2.6.14-1.1656_FC4smp. I get the following kernel messages:
>>>
>>> Feb  1 11:51:37 library kernel: ata2: command 0x35 timeout, stat 0x0 host_stat 0x61
>>> Feb  1 11:51:38 library sshd(pam_unix)[13027]: session opened for user root by root(uid=0)
>>> Feb  1 11:52:07 library kernel: ata2: command 0x25 timeout, stat 0x0 host_stat 0x61
>>> Feb  1 11:53:07 library last message repeated 2 times
>>> Feb  1 11:54:37 library last message repeated 3 times
>>> Feb  1 11:55:01 library crond(pam_unix)[13091]: session opened for user root by (uid=0)
>>> Feb  1 11:55:01 library crond(pam_unix)[13091]: session closed for user root
>>> Feb  1 11:55:07 library kernel: ata2: command 0x25 timeout, stat 0x0 host_stat 0x61
>>>
>>> /proc/mdstat has:
>>> Personalities : [raid1] [raid5]
>>> md1 : active raid1 sdc1[0]
>>>       20482752 blocks [2/1] [U_]
>>>
>>> md2 : active raid5 sdd3[3] sdc3[2] sdb3[1] sda3[0]
>>>       873196800 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
>>>
>>> md0 : active raid1 sdb1[1] sda1[0]
>>>       20482752 blocks [2/2] [UU]
>>>
>>> unused devices: <none>
>>>
>>> The output of "mdadm -Q --detail /dev/md2" is:
>>> /dev/md2:
>>>         Version : 00.90.02
>>>   Creation Time : Tue Jan 31 14:14:07 2006
>>>      Raid Level : raid5
>>>      Array Size : 873196800 (832.75 GiB 894.15 GB)
>>>     Device Size : 291065600 (277.58 GiB 298.05 GB)
>>>    Raid Devices : 4
>>>   Total Devices : 4
>>> Preferred Minor : 2
>>>     Persistence : Superblock is persistent
>>>
>>>     Update Time : Wed Feb  1 11:51:07 2006
>>>           State : active
>>>  Active Devices : 4
>>> Working Devices : 4
>>>  Failed Devices : 0
>>>   Spare Devices : 0
>>>
>>>          Layout : left-symmetric
>>>      Chunk Size : 64K
>>>
>>>            UUID : 56bd5037:9d9b9018:eb8f01d6:94155776
>>>          Events : 0.230
>>>
>>>     Number   Major   Minor   RaidDevice State
>>>        0       8        3        0      active sync   /dev/sda3
>>>        1       8       19        1      active sync   /dev/sdb3
>>>        2       8       35        2      active sync   /dev/sdc3
>>>        3       8       51        3      active sync   /dev/sdd3
>>>
>>> Terry
>>
>>
>>
>> Very weird.
>> I have a number of IDE, SATA and SCSI RAID5 setups and I have never
>> seen such a problem.
>> What happens if you try to access the RAID5 array?
>> (hdparm -tT /dev/md2)
>>
>> Gilboa
>>
> 
> I haven't tried "hdparm -tT /dev/md2", but if I access a file there is a
> long delay and then the program returns with no error but no data.
> For example: "cat /data/test-file" will delay and then exit with
> status of "0" but no file contents are displayed.
> 
> This is VERY VERY BAD !
> 
> I really think this must be a bug, possibly in the SATA driver in Fedora
> Core 4's 2.6.14-1.1656_FC4smp kernel. I have a spare SCSI system with 3
> SCSI disks; I will set that up and see how it handles the situation ...
> 
> Terry
> 
I have just set up a SCSI RAID array and tried unplugging a drive.
All works as expected here, i.e. there are error messages from the RAID
system and an email to root, and the system continues running fine.

So it looks like a bug in the SATA driver ....
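
For anyone wanting to script the check rather than eyeball /proc/mdstat,
here is a minimal Python sketch that flags arrays reporting fewer active
members than expected. It only assumes the "[expected/active]" field seen
in the output quoted above; degraded_arrays and SAMPLE are made-up names
for illustration, not part of mdadm or the kernel. Note that on the SATA
box it would flag md1 only, since md2 still claimed [4/4] with sdd
unplugged, which is exactly the problem.

```python
import re


def degraded_arrays(mdstat_text):
    """Return {array: (expected, active)} for arrays missing members.

    Hypothetical helper: it relies only on the "[n/m]" field that follows
    the block count in /proc/mdstat, as in the output quoted in this thread.
    """
    result = {}
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            # Start of a new array stanza, e.g. "md1 : active raid1 sdc1[0]".
            current = header.group(1)
            continue
        # Status lines carry "[expected/active]", e.g. "[2/1] [U_]".
        counts = re.search(r"\[(\d+)/(\d+)\]", line)
        if current and counts:
            expected, active = int(counts.group(1)), int(counts.group(2))
            if active < expected:
                result[current] = (expected, active)
            current = None
    return result


# The /proc/mdstat contents quoted earlier in this thread.
SAMPLE = """\
Personalities : [raid1] [raid5]
md1 : active raid1 sdc1[0]
      20482752 blocks [2/1] [U_]

md2 : active raid5 sdd3[3] sdc3[2] sdb3[1] sda3[0]
      873196800 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]

md0 : active raid1 sdb1[1] sda1[0]
      20482752 blocks [2/2] [UU]

unused devices: <none>
"""

if __name__ == "__main__":
    print(degraded_arrays(SAMPLE))
```

On a live system the text would come from reading /proc/mdstat instead of
SAMPLE. A check like this can only report what the MD driver itself
believes, so it would not have caught the unplugged SATA drive here.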

Terry



