Seagate disk problems (NCQ bug???)

Robin Laing Robin.Laing at drdc-rddc.gc.ca
Wed Apr 29 15:48:15 UTC 2009


Wolfgang S. Rupprecht wrote:
> After running flawlessly for 6+ months I just had my Seagate
> ST31500343AS (w. SD35 firmware) flake out.  Does this look like the NCQ
> bug or just a random event?  The final error msg was around the time the
> machine hung hard.
> 

> Apr 28 06:41:26 arbol kernel: ata1: exception Emask 0x10 SAct 0x0 SErr 0x90200 action 0xe frozen
> Apr 28 06:41:26 arbol kernel: ata1: irq_stat 0x00400000, PHY RDY changed
> Apr 28 06:41:26 arbol kernel: ata1: SError: { Persist PHYRdyChg 10B8B }
> Apr 28 06:41:26 arbol kernel: ata1: hard resetting link
> Apr 28 06:41:28 arbol kernel: ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
> Apr 28 06:41:33 arbol kernel: ata1.00: qc timeout (cmd 0xec)
> Apr 28 06:41:33 arbol kernel: ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
> Apr 28 06:41:33 arbol kernel: ata1.00: revalidation failed (errno=-5)
> Apr 28 06:41:33 arbol kernel: ata1: hard resetting link
> Apr 28 06:41:34 arbol kernel: ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
> Apr 28 06:41:44 arbol kernel: ata1.00: qc timeout (cmd 0xec)
> Apr 28 06:41:44 arbol kernel: ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
> Apr 28 06:41:44 arbol kernel: ata1.00: revalidation failed (errno=-5)
> Apr 28 06:41:44 arbol kernel: ata1: hard resetting link
> Apr 28 06:41:46 arbol kernel: ata1: softreset failed (device not ready)
> Apr 28 06:41:46 arbol kernel: ata1: failed due to HW bug, retry pmp=0
> Apr 28 06:41:46 arbol kernel: ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
> Apr 28 06:41:46 arbol kernel: ata1.00: configured for UDMA/133
> Apr 28 06:41:46 arbol kernel: ata1: EH complete
> Apr 28 06:41:46 arbol kernel: sd 0:0:0:0: [sda] 2930277168 512-byte hardware sectors (1500302 MB)
> Apr 28 06:41:46 arbol kernel: sd 0:0:0:0: [sda] Write Protect is off
> Apr 28 06:41:46 arbol kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> 
> 
> -wolfgang

I had errors like this when the system load got too high for my machine 
to handle.  I later found out that the motherboard controller was too 
slow.  It is an older system.  I replaced the controllers with SATA cards 
and have had no errors since.

I could predict when the errors were going to occur, and almost predict 
when the system would lock up, by watching the load average reported by 
uptime.
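
If you want to check for the same correlation on your machine, something 
like the following might help.  It is only a rough sketch: the function 
names count_ata_errors and load_is_high are made up for illustration, the 
regular expression only covers the message types that appear in your 
quoted log, and the threshold is an arbitrary value you would have to 
tune.  os.getloadavg() returns the same three load averages that uptime 
prints.

```python
import os
import re

# Matches only the libata message types seen in the quoted log above;
# other kernels or drivers may word these messages differently.
ATA_ERROR = re.compile(
    r"kernel: (ata\d+(?:\.\d+)?): "
    r"(exception Emask|hard resetting link|qc timeout|softreset failed)"
)


def count_ata_errors(lines):
    """Return a dict mapping each ATA port (e.g. 'ata1') to its error-line count."""
    counts = {}
    for line in lines:
        match = ATA_ERROR.search(line)
        if match:
            # Fold device names such as 'ata1.00' into their port, 'ata1'.
            port = match.group(1).split(".")[0]
            counts[port] = counts.get(port, 0) + 1
    return counts


def load_is_high(threshold, loadavg=None):
    """Return True if the 1-minute load average exceeds the threshold.

    loadavg defaults to os.getloadavg(); pass a tuple to test offline.
    """
    if loadavg is None:
        loadavg = os.getloadavg()
    return loadavg[0] > threshold


if __name__ == "__main__":
    sample = [
        "Apr 28 06:41:26 arbol kernel: ata1: hard resetting link",
        "Apr 28 06:41:33 arbol kernel: ata1.00: qc timeout (cmd 0xec)",
        "Apr 28 06:41:46 arbol kernel: ata1: EH complete",
    ]
    print("ATA error lines per port:", count_ata_errors(sample))
    # 4.0 is an arbitrary example threshold, not a recommended value.
    print("Load currently high:", load_is_high(4.0))
```

Feeding it your real log and comparing the timestamps of the error lines 
against periods of high load should show whether your resets follow load 
the way mine did.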

What controller chip is used in your system?


-- 
Robin Laing