failed drive?
poma
pomidorabelisima at gmail.com
Thu Nov 13 20:23:44 UTC 2014
On 13.11.2014 21:18, dustin kempter wrote:
> Hi all, I'm having some issues and I'm a little confused. I was checking
> our servers today and saw something strange: cat /proc/mdstat shows that
> one device, md0, is inactive, and I'm not really sure why. I did a bit more
> digging and testing with smartctl, and it says that the device /dev/sdg
> (part of md0) is failing, estimated to fail within 24 hours. But if I run
> df -h it doesn't even show md0. I was talking to a friend and we
> disagreed: based on what smartctl says, I believe the drive is
> failing but not failed yet; he doesn't think it's a problem with the
> drive. Do you have any thoughts on this? And why would the device (md0)
> suddenly be inactive but still show 2 working devices (sdg, sdh)?
>
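The overall health summary alone doesn't tell you much; a rough sketch of how to get the full SMART picture, using the same /dev/sdg from your output (nothing else assumed):

  smartctl -a /dev/sdg           # full attribute table, ATA error log, self-test log
  smartctl -t long /dev/sdg      # start an extended self-test (runs in the background)
  smartctl -l selftest /dev/sdg  # read the result once the test has finished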
> *(cat /proc/mdstat)*
> [root at csdatastandby3 bin]# cat /proc/mdstat
> Personalities : [raid1] [raid10]
> md125 : active raid10 sdf1[5] sdc1[2] sde1[4] sda1[0] sdb1[1] sdd1[3]
> 11720655360 blocks super 1.2 512K chunks 2 near-copies [6/6] [UUUUUU]
>
> md126 : active raid1 sdg[1] sdh[0]
> 463992832 blocks super external:/md0/0 [2/2] [UU]
>
> md0 : inactive sdh[1](S) sdg[0](S)
> 6306 blocks super external:imsm
>
> unused devices: <none>
> [root at csdatastandby3 bin]#
>
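For what it's worth, an imsm container showing up like that is normal: md0 holds only the Intel metadata, so /proc/mdstat always lists it as inactive with its disks flagged (S), while the real RAID1 volume is the member array md126. A quick way to see the container/volume relationship, using the same device names as above:

  mdadm --detail /dev/md126   # the RAID1 volume built inside the container
  mdadm --examine /dev/sdg    # the IMSM metadata as recorded on the disk itself
  mdadm --detail-platform     # what the Intel RAID option ROM on this box supports

If md126 still shows sdg as active/in-sync, md has not kicked the disk out yet, which matches your reading: failing, not yet failed.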
>
> *(smartctl)*
> [root at csdatastandby3 bin]# smartctl -H /dev/sdg
> smartctl 5.43 2012-06-30 r3573 [x86_64-linux-2.6.32-431.17.1.el6.x86_64]
> (local build)
> Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net
>
> === START OF READ SMART DATA SECTION ===
> SMART overall-health self-assessment test result: FAILED!
> Drive failure expected in less than 24 hours. SAVE ALL DATA.
> Failed Attributes:
> ID# ATTRIBUTE_NAME         FLAG    VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
>   5 Reallocated_Sector_Ct  0x0033  002   002   036    Pre-fail Always  FAILING_NOW 32288
>
> [root at csdatastandby3 bin]#
>
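Reallocated_Sector_Ct at a normalized value of 002 against a threshold of 036, with 32288 raw reallocations, almost certainly means the disk has burned through its spare sectors; FAILING_NOW is a present-tense statement, not a forecast. Whether the md layer has noticed anything yet is a separate question; roughly:

  smartctl -l error /dev/sdg     # ATA error log on the drive itself
  dmesg | grep -i sdg            # kernel-side I/O errors against sdg, if any
  grep -i sdg /var/log/messages  # same, but persisted across reboots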
> *(df -h)*
> [root at csdatastandby3 bin]# df -h
> Filesystem Size Used Avail Use% Mounted on
> /dev/md126p4 404G 4.3G 379G 2% /
> tmpfs 16G 172K 16G 1% /dev/shm
> /dev/md126p2 936M 74M 815M 9% /boot
> /dev/md126p1 350M 272K 350M 1% /boot/efi
> /dev/md125 11T 4.2T 6.1T 41% /data
> [root at csdatastandby3 bin]#
>
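df only lists mounted filesystems, which is why md0 never appears there: the container has no filesystem of its own, and the volume you actually boot from is md126, mounted via its partitions (md126p1 through p4). Something like the following shows the whole stack at once (lsblk may or may not be present on this EL6 box; /proc/partitions always is):

  lsblk -o NAME,SIZE,TYPE,MOUNTPOINT   # disks -> md devices -> partitions -> mount points
  cat /proc/partitions                 # flat list of every block device the kernel knows about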
>
> *(mdadm -D /dev/md0)*
> [root at csdatastandby3 bin]# mdadm -D /dev/md0
> /dev/md0:
> Version : imsm
> Raid Level : container
> Total Devices : 2
>
> Working Devices : 2
>
>
> UUID : 32c1fbb7:4479296b:53c02d9b:666a08f6
> Member Arrays : /dev/md/Volume0
>
> Number Major Minor RaidDevice
>
> 0 8 96 - /dev/sdg
> 1 8 112 - /dev/sdh
> [root at csdatastandby3 bin]#
>
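If you do decide the disk has to go, the usual IMSM swap sequence looks roughly like this; it assumes the layout from your output (volume md126, container md0, failing disk sdg) and that the replacement comes back under the same name, which it may not:

  mdadm --fail /dev/md126 /dev/sdg   # mark the disk failed in the RAID1 volume
  mdadm --remove /dev/md0 /dev/sdg   # drop it from the imsm container
  # physically swap the drive, then:
  mdadm --add /dev/md0 /dev/sdg      # add the new disk to the container; mdmon rebuilds the volume

md126 is a two-disk RAID1, so it should stay up on sdh while this happens, but take a backup first all the same.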
>
> thanks
>
>
> -dustink
>
>
>
Maybe a coffee filter.