problems with mptscsih / FC5

Grant Ozolins grant.ozolins at firebox.com
Wed Nov 29 10:55:48 UTC 2006


Hi Nigel,

Thanks for your reply!

The SCSI device in question does normally work properly: smartctl -H 
reports the drive is in good health, and smartctl -a shows fairly 
normal-looking output. 

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:    1606774        0         4         0          0      33811.545           4
write:         0        0         0         0          0       5737.408           0


The one exception is the 4 total uncorrected read errors. It's been a 
while since I last checked the health of this drive with smartctl, so 
I'm not sure whether those 4 errors relate to the recent SCSI weirdness. 
I'm going to run a SMART long background self-test and see what that 
turns up.
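
To make it easier to compare these counters before and after the long 
test, here is a rough Python sketch that pulls the "Total uncorrected 
errors" column out of the error counter log. It assumes each read/write 
row sits on a single line with the seven columns shown above (as 
smartctl -a prints them, before any mail wrapping). The function and 
field names are my own and not part of smartmontools.

```python
import re

# Sample rows copied from the error counter log above (unwrapped).
SAMPLE_LOG = """\
read:    1606774        0         4         0          0      33811.545           4
write:         0        0         0         0          0       5737.408           0
"""

# Column names are illustrative labels for the seven columns in the log header.
FIELDS = (
    "ecc_fast", "ecc_delayed", "rereads_rewrites",
    "total_corrected", "algorithm_invocations",
    "gigabytes_processed", "total_uncorrected",
)

ROW = re.compile(r"^(read|write|verify):\s+(.*)$")


def parse_error_counters(text):
    """Return {operation: {field: value}} for each counter row found in text."""
    counters = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if not match:
            continue
        values = match.group(2).split()
        if len(values) != len(FIELDS):
            continue  # skip rows that do not have the expected seven columns
        row = {}
        for name, raw in zip(FIELDS, values):
            # Only the gigabytes column is fractional; the rest are counts.
            row[name] = float(raw) if name == "gigabytes_processed" else int(raw)
        counters[match.group(1)] = row
    return counters


def uncorrected_totals(text):
    """Return {operation: total uncorrected errors} for each parsed row."""
    return {op: row["total_uncorrected"]
            for op, row in parse_error_counters(text).items()}


if __name__ == "__main__":
    print(uncorrected_totals(SAMPLE_LOG))
```

Running the same function on the output captured after the self-test 
would show whether the read count has moved past 4.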

Thanks for your help!

Cheers,
Grant




Non-medium error count:       26


Nigel Wade wrote:
> Grant Ozolins wrote:
>> Hi all,
>>
>> We encountered an unusual condition with the LSI SCSI drivers last 
>> night - we got an "attempted task abort", followed by about 10 
>> minutes of no messages (I wasn't logged in at the time but believe 
>> the system was unresponsive), followed by several minutes more of 
>> similar messages, repeated:
>>
>
> It looks to me like whatever the LSI device driver was attempting to 
> talk to is failing. The apparent sequence is that a SCSI command has 
> failed to work correctly (a read, by the looks of it), there is a SCSI 
> command abort followed by the device driver issuing Test Unit Ready. 
> The TUR either took 10 minutes or the device driver timed out after 
> 10 minutes. Another read command then failed in a similar way, resulting 
> in the same sequence of events. At some point, presumably whatever was 
> attempting to read the device gave up and all went back to normal.
>
> Does that device normally work ok? It might be going faulty, or there 
> might be a cabling or termination issue. I doubt it's a device driver 
> fault, it looks to me like a hardware read error/timeout issue, 
> followed by re-tries, and/or additional failed attempts.
>
> If daytime TV made shows about computers, this one would be "When SCSI 
> devices go bad."
>

-- 
Grant Ozolins <grant.ozolins at firebox.com>
Senior Web Developer
Firebox.com
+44 (0)20 8678 5581



