On Sat, Mar 14, 2015 at 5:56 PM, Tom Horsley horsley1953@gmail.com wrote:
On Sat, 14 Mar 2015 16:42:37 -0600 Chris Murphy wrote:
If there's a definite latent sector error, it shows up with a 'smartctl -t long', which will be aborted at the first error found. The LBA for this shows up under LBA_of_first_error.
I actually ran one of those when I first started seeing the messages (I've got another going now), and the prev test results were:
# 2 Extended offline Completed without error 00% 17259 -
So that lonely '-' out there apparently says there is no LBA with an error, the overall health assessment says PASSED, yet these have been showing up every half hour or so for a week now:
Mar 14 19:46:52 zooty smartd[812]: Device: /dev/sdc [SAT], 8 Currently unreadable (pending) sectors
Mar 14 19:46:52 zooty smartd[812]: Device: /dev/sdc [SAT], 8 Offline uncorrectable sectors
This is consistent with a single sector on a 512e AF drive. If it's unreadable, somewhere in the journal or messages is a read error or link reset. You could search for "media error" and "hard resetting link".
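For reference, 8 logical sectors of 512 bytes add up to one 4096-byte physical sector, which is why a count of 8 points at a single bad physical sector. The log search suggested above could be sketched roughly as follows. The helper name find_disk_errors is made up for illustration, and the usage lines assume either a systemd journal or a /var/log/messages file, which may not match your setup.

```shell
#!/bin/sh
# Sketch only: filter log text for the two messages mentioned above.
# find_disk_errors is a hypothetical helper; it reads log lines on stdin
# and prints the ones that match, ignoring case.
find_disk_errors() {
    grep -i -E 'media error|hard resetting link'
}

# Possible usage (assumptions about where the logs live):
#   journalctl -k | find_disk_errors
#   find_disk_errors < /var/log/messages
```

The filter reads stdin so the same function works on the journal or on a plain log file.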
What do you get for:
# smartctl -x /dev/sdc
# parted /dev/sdc u s p
It seems to be telling me there is nothing wrong and something wrong at the same time. I'd probably just be happy with the "PASSED" health check if it wasn't constantly spewing these messages :-).
A valid option is to keep the backups current and ignore it until the number goes up again.
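Another option is a non-destructive badblocks run (omit the -w) with -b 4096, to see if you can trigger a read error. A libata error will report a proper LBA. A badblocks error is a 4096-byte block number and will need to be multiplied by 8 to get an LBA. This value then gets plugged into debugfs (this is ext4?) to find out what file is affected; debugfs works in filesystem blocks relative to the start of the partition, so subtract the partition's start sector (from the parted output) and convert first. And then it also gets plugged into
# dd if=/dev/zero of=/dev/sdc bs=4096 seek=$((LBA/8)) count=1
to write over that sector - that'll fix this. Note the count=1: without it dd keeps writing zeros to the end of the drive.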
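The arithmetic above can be sketched in shell as below. This is only an illustration under stated assumptions: 512-byte logical sectors, 4096-byte badblocks blocks, and helper names (bb_to_lba, lba_to_dd_seek, lba_to_fs_block) that are invented here. The partition start sector and filesystem block size have to come from your own parted and filesystem details.

```shell
#!/bin/sh
# Sketch only; assumes a 512e drive and badblocks run with -b 4096.
LOGICAL=512                     # logical sector size in bytes (assumed)
BLOCK=4096                      # badblocks / dd block size in bytes (assumed)
RATIO=$((BLOCK / LOGICAL))      # logical sectors per 4096-byte block

# badblocks block number -> LBA of the first logical sector in that block
bb_to_lba() {
    echo $(( $1 * RATIO ))
}

# LBA -> value for dd's seek= when dd is run with bs=4096
lba_to_dd_seek() {
    echo $(( $1 / RATIO ))
}

# LBA -> filesystem block number, for a lookup in debugfs.
# Arguments: LBA, partition start sector, filesystem block size in bytes.
lba_to_fs_block() {
    echo $(( ($1 - $2) * LOGICAL / $3 ))
}
```

With made-up numbers, `bb_to_lba 1000` gives the LBA for badblocks block 1000, and `lba_to_fs_block "$lba" "$start" 4096` gives the block number to look up in debugfs for a partition starting at sector `$start`.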