Kernel bug or disk failure

Sam Varshavchik mrsam at courier-mta.com
Thu Jul 10 12:44:26 UTC 2008


Every other week or so, I get a disk kicked out of my RAID, with this:

Jul  6 04:05:38 commodore kernel: (scsi1:A:0:0): scsi1: device overrun (status 10) on 0:0:0
Jul  6 04:05:38 commodore kernel: Unexpected busfree in DT Data-in phase, 1 SCBs aborted, PRGMCNT == 0x22f
Jul  6 04:05:38 commodore kernel: >>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
Jul  6 04:05:38 commodore kernel: scsi1: Dumping Card State at program address 0x22d Mode 0x22
Jul  6 04:05:38 commodore kernel: Card was paused

… followed by a rather dry dump of the HBA's registers. The driver is aic79xx.

This does not look like a disk error to me. I re-add the drive to the 
array, and it rebuilds with no downtime. SMART shows 0 entries in the defect 
list on this drive, and over the disk's lifetime 0 uncorrectable reads and 1 
uncorrectable write -- but this kernel barf has already happened 4-5 times 
now, and it's getting rather annoying.
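
To put a firmer number on "4-5 times", a rough sketch like the one below could tally the two signature lines from the excerpt above per day. It assumes the kernel messages land in a classic syslog-format file such as /var/log/messages (an assumption; the log location is not stated here), and the names count_incidents and count_in_file are made up for illustration.

```python
import re
from collections import Counter

# Substrings copied from the log excerpt above.
SIGNATURES = ("device overrun", "Unexpected busfree")

# Classic syslog layout: "Mon DD HH:MM:SS host kernel: message"
LINE_RE = re.compile(
    r"^(\w{3})\s+(\d+)\s+\d\d:\d\d:\d\d\s+\S+\s+kernel:\s+(.*)$"
)


def count_incidents(lines):
    """Return a Counter mapping 'Mon D' to the number of matching kernel lines."""
    per_day = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and any(sig in match.group(3) for sig in SIGNATURES):
            per_day[f"{match.group(1)} {int(match.group(2))}"] += 1
    return per_day


def count_in_file(path="/var/log/messages"):
    """Hypothetical helper: the default path is an assumption, adjust as needed."""
    with open(path, errors="replace") as handle:
        return count_incidents(handle)
```

Calling count_in_file() with a readable log file returns the per-day counts; a day with one bus-free event would normally show both signatures, so the count is of matching lines, not of distinct incidents.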
