Problem with hard drive lock out

Mark LaPierre marklapier at aol.com
Sat Mar 1 15:31:30 UTC 2008


Timothy Murphy wrote:
> Mark LaPierre wrote:
>
>> Well, it's been 24 hours and no reply.  Am I asking too hard a
>> question?  Maybe I should post this to a different list?
>
> You could try "sudo smartctl -a /dev/sdb".
>   
Thank you for the idea. I waited for the drive to go bonkers again 
before messing with it. This is what it looked like in /var/log/messages:

Feb 29 19:27:01 mushroom pulseaudio[2736]: main.c: 
setrlimit(RLIMIT_NICE, (31, 31)) failed: Operation not permitted
Feb 29 19:27:01 mushroom pulseaudio[2736]: main.c: 
setrlimit(RLIMIT_RTPRIO, (9, 9)) failed: Operation not permitted
Feb 29 19:27:01 mushroom pulseaudio[2736]: alsa-util.c: Device (null) 
doesn't support 44100 Hz, changed to 48000 Hz.
Feb 29 19:27:02 mushroom gconfd (mlapier-2728): Resolved address 
"xml:readwrite:/home/mlapier/.gconf" to a writable configuration source 
at position 0
Mar 1 00:28:34 mushroom yum-updatesd-helper: error getting update info: 
failure: repodata/comps-f8.xml from updates: [Errno 256] No more mirrors 
to try.
Mar 1 00:55:31 mushroom kernel: ata3.01: exception Emask 0x0 SAct 0x0 
SErr 0x0 action 0x2 frozen
Mar 1 00:55:31 mushroom kernel: ata3.01: cmd 
a0/00:00:00:00:20/00:00:00:00:00/b0 tag 0 cdb 0x1e data 0
Mar 1 00:55:31 mushroom kernel: res 40/00:03:00:00:20/00:00:00:00:00/b0 
Emask 0x4 (timeout)
Mar 1 00:55:31 mushroom kernel: ata3: soft resetting port
Mar 1 00:55:56 mushroom gconfd (mlapier-2728): Exiting
Mar 1 00:56:01 mushroom kernel: agpgart: Found an AGP 3.0 compliant 
device at 0000:00:00.0.
Mar 1 00:56:01 mushroom kernel: agpgart: X tried to set rate=x12. 
Setting to AGP3 x8 mode.
Mar 1 00:56:01 mushroom kernel: agpgart: Putting AGP V3 device at 
0000:00:00.0 into 8x mode
Mar 1 00:56:01 mushroom kernel: agpgart: Putting AGP V3 device at 
0000:01:00.0 into 8x mode
Mar 1 00:56:01 mushroom kernel: irq 16: nobody cared (try booting with 
the "irqpoll" option)
Mar 1 00:56:01 mushroom kernel:
Mar 1 00:56:01 mushroom kernel: Call Trace:
Mar 1 00:56:01 mushroom kernel: <IRQ> [<ffffffff8106aa4f>] 
__report_bad_irq+0x30/0x72
Mar 1 00:56:01 mushroom kernel: [<ffffffff8106aca0>] 
note_interrupt+0x20f/0x253
Mar 1 00:56:01 mushroom kernel: [<ffffffff8106b58c>] 
handle_fasteoi_irq+0xa9/0xd1
Mar 1 00:56:01 mushroom kernel: [<ffffffff8100e0fc>] do_IRQ+0xf1/0x161
Mar 1 00:56:01 mushroom kernel: [<ffffffff8100adba>] default_idle+0x0/0x3d
Mar 1 00:56:01 mushroom kernel: [<ffffffff8100c0e1>] ret_from_intr+0x0/0xa
Mar 1 00:56:01 mushroom kernel: <EOI> [<ffffffff8100ade3>] 
default_idle+0x29/0x3d
Mar 1 00:56:01 mushroom kernel: [<ffffffff8100ae8b>] cpu_idle+0x94/0xbc
Mar 1 00:56:01 mushroom kernel: [<ffffffff81433baa>] 
start_kernel+0x2cf/0x2db
Mar 1 00:56:01 mushroom kernel: [<ffffffff81433140>] _sinittext+0x140/0x144
Mar 1 00:56:01 mushroom kernel:
Mar 1 00:56:01 mushroom kernel: handlers:
Mar 1 00:56:01 mushroom kernel: [<ffffffff8831b35a>] 
(via_driver_irq_handler+0x0/0x183 [via])
Mar 1 00:56:01 mushroom kernel: Disabling IRQ #16
Mar 1 00:56:02 mushroom kernel: ata3.00: qc timeout (cmd 0xef)
Mar 1 00:56:02 mushroom kernel: ata3.00: failed to IDENTIFY (SPINUP 
failed, err_mask=0x4)
Mar 1 00:56:02 mushroom kernel: ata3.00: revalidation failed (errno=-5)
Mar 1 00:56:02 mushroom kernel: ata3: failed to recover some devices, 
retrying in 5 secs
Mar 1 00:56:06 mushroom shutdown[3798]: shutting down for system reboot
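The disk-related entries are easy to lose among the pulseaudio, agpgart and IRQ messages. A minimal Python sketch for pulling out only the libata kernel lines follows. The function name `ata_events` and the sample list are illustrative assumptions, and the pattern expects each log entry on a single line, as it is in the real /var/log/messages rather than wrapped as in this mail.

```python
import re

# Matches kernel messages from libata, e.g. "kernel: ata3.01: exception ..."
ATA_RE = re.compile(r"kernel: (ata\d+(?:\.\d+)?): (.*)")

def ata_events(lines):
    """Return (port_or_device, message) pairs for libata kernel messages."""
    events = []
    for line in lines:
        match = ATA_RE.search(line)
        if match:
            events.append((match.group(1), match.group(2).strip()))
    return events

# Three entries copied from the log above (unwrapped onto single lines).
sample = [
    "Mar 1 00:55:31 mushroom kernel: ata3: soft resetting port",
    "Mar 1 00:55:56 mushroom gconfd (mlapier-2728): Exiting",
    "Mar 1 00:56:02 mushroom kernel: ata3.00: qc timeout (cmd 0xef)",
]

for device, message in ata_events(sample):
    print(device, "->", message)
```

To run it against the live log, pass an open file instead of the sample list, for example `ata_events(open("/var/log/messages"))`, which normally requires root.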

I was tailing the messages file so I saw it drop out. I went straight to 
/etc/fstab and commented out the lines that apply to the drive. Then I 
rebooted the machine.

When the machine came back up the two partitions on the drive were 
mounted under /media. The two desktop icons were there on my desktop. 
That's what prompted me to go look in mtab where I found the mounts.

That led me to a question: if the partitions are mounted from fstab, why 
are the icons showing up on my desktop? Is the system mounting the 
partitions twice, once on /sdb1 and /sdb2 according to the fstab and 
again under /media? Maybe that's what's causing the problem.
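One way to check the double-mount theory is to look for any device that appears more than once in the mount table. The Python sketch below does that for text in the /etc/mtab format. The function names and the sample table are hypothetical, made up for illustration, and are not the actual contents of this machine's mtab. It only looks at entries whose device field starts with /dev/.

```python
from collections import defaultdict

def mount_points_by_device(mounts_text):
    """Map each /dev/... device to the list of mount points it appears on."""
    seen = defaultdict(list)
    for line in mounts_text.splitlines():
        fields = line.split()
        # mtab fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 2 and fields[0].startswith("/dev/"):
            seen[fields[0]].append(fields[1])
    return dict(seen)

def mounted_more_than_once(mounts_text):
    """Return only the devices that are mounted on two or more mount points."""
    by_device = mount_points_by_device(mounts_text)
    return {dev: points for dev, points in by_device.items() if len(points) > 1}

# Hypothetical mount table, for illustration only.
sample = """\
/dev/sda1 / ext3 rw 0 0
/dev/sdb1 /sdb1 ext3 rw 0 0
/dev/sdb1 /media/disk ext3 rw 0 0
/dev/sdb2 /media/disk-1 ext3 rw 0 0
"""

print(mounted_more_than_once(sample))
```

On a real system the input would be `open("/etc/mtab").read()`. An empty result would suggest the partitions are not being mounted twice.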

Anyway, I set to work on the smartctl as suggested by Tim Murphy. I ran 
the short test and the long test. Here are the results of that effort:

[root@mushroom ~]# smartctl -a /dev/sdb
smartctl version 5.37 [x86_64-redhat-linux-gnu] Copyright (C) 2002-6 
Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Model Family: IBM/Hitachi Deskstar GXP-180 family
Device Model: IC35L090AVV207-0
Serial Number: VNVC00G3C5687G
Firmware Version: V23OA63A
User Capacity: 82,348,277,760 bytes
Device is: In smartctl database [for details use: -P show]
ATA Version is: 6
ATA Standard is: ATA/ATAPI-6 T13 1410D revision 3a
Local Time is: Sat Mar 1 10:08:30 2008 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 (2153) seconds.
Offline data collection
capabilities:                    (0x1b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        No Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        (  36) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   060    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   100   100   050    Pre-fail  Offline      -       0
  3 Spin_Up_Time            0x0007   116   116   024    Pre-fail  Always       -       223 (Average 248)
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       161
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   100   100   020    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0012   095   095   000    Old_age   Always       -       37120
 10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       161
192 Power-Off_Retract_Count 0x0032   099   099   050    Old_age   Always       -       1707
193 Load_Cycle_Count        0x0012   099   099   050    Old_age   Always       -       1707
194 Temperature_Celsius     0x0002   137   137   000    Old_age   Always       -       40 (Lifetime Min/Max 11/68)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error        00%            37112  -
# 2  Short offline       Completed without error        00%            37111  -

Device does not support Selective Self Tests/Logging
[root@mushroom ~]#
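The attribute table can also be checked mechanically. smartctl treats an attribute as failed when its normalized VALUE is less than or equal to THRESH. The Python sketch below parses attribute rows from `smartctl -a` text and lists any attribute in that state. The function names are illustrative assumptions. Skipping attributes with a threshold of 000 is a choice made here, since their WHEN_FAILED column would not be informative. The sample rows are copied from the output above.

```python
def parse_attributes(text):
    """Parse SMART attribute rows from `smartctl -a` output into dicts."""
    rows = []
    for line in text.splitlines():
        f = line.split()
        # Columns: ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW...
        if len(f) >= 10 and f[0].isdigit() and f[2].startswith("0x"):
            rows.append({
                "id": int(f[0]),
                "name": f[1],
                "value": int(f[3]),
                "worst": int(f[4]),
                "thresh": int(f[5]),
                "raw": " ".join(f[9:]),
            })
    return rows

def at_or_below_threshold(rows):
    """Names of attributes whose normalized value has reached the threshold."""
    return [r["name"] for r in rows
            if r["thresh"] > 0 and r["value"] <= r["thresh"]]

# Rows copied from the smartctl output above.
sample = """\
3 Spin_Up_Time 0x0007 116 116 024 Pre-fail Always - 223 (Average 248)
5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 0
9 Power_On_Hours 0x0012 095 095 000 Old_age Always - 37120
"""

rows = parse_attributes(sample)
print(at_or_below_threshold(rows))
```

An empty list here agrees with the PASSED overall-health result, although a clean SMART report does not rule out cabling, controller or power problems.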

Someone suggested that it could be a power supply problem and that I 
should test the voltages. Can someone tell me what the voltage readings 
should be and the allowable tolerance?
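As a reference point for that question: the ATX power supply specification allows ±5% on the +5 V and +12 V rails, which are the two rails a drive's power connector carries. The Python sketch below turns that into an allowed range per rail and checks a meter reading against it. The readings in the example are hypothetical, not measurements from this machine, and the ±5% figure should be confirmed against the documentation for the actual supply.

```python
# Nominal voltages for the two rails on a drive power connector.
NOMINAL_RAILS = {"+5V": 5.0, "+12V": 12.0}

# Assumed tolerance: +/-5%, per the ATX specification for these rails.
TOLERANCE = 0.05

def rail_range(nominal, tolerance=TOLERANCE):
    """Return the (low, high) allowed voltage for a rail."""
    return nominal * (1 - tolerance), nominal * (1 + tolerance)

def in_tolerance(rail, measured):
    """True if a measured voltage falls inside the allowed range for the rail."""
    low, high = rail_range(NOMINAL_RAILS[rail])
    return low <= measured <= high

# Hypothetical meter readings, for illustration only.
readings = {"+5V": 5.08, "+12V": 11.20}

for rail, volts in readings.items():
    low, high = rail_range(NOMINAL_RAILS[rail])
    status = "ok" if in_tolerance(rail, volts) else "OUT OF RANGE"
    print(f"{rail}: {volts:.2f} V (allowed {low:.2f} to {high:.2f} V) {status}")
```

A reading taken with the drive idle can look fine while the rail sags under load, so it is worth measuring while the disk is spinning up or busy.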

Thanks for the help, guys and girls.



