Dear All,
Today, I am getting this message at boot time:
"Notice - HD self monitoring system has reported that a parameter has exceeded its normal operating range. Dell recommends that you back up your data regularly. A parameter out of range may or may not indicate a potential Hard. Press F1 to continue , F2 to enter setup"
Is there any reason to be worried?
My computer is about 3 years old.
Thanks in advance,
Paul
On 06/21/2013 11:17 AM, Paul Smith wrote:
<SNIP>
For each drive on your system, try
sudo smartctl -H /dev/sdx
where x is a, b, c, and so on.
If any drive reports poor health, rerun with "--all" in place of -H to get the full report.
However, yeah, it's probably time to get a really good backup and price a replacement drive.
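The check above can be run over every drive in one pass. What follows is a minimal sketch, assuming a POSIX shell with awk; the function name smart_verdict and the /dev/sd? glob are illustrative assumptions, and the loop is left commented out because it needs root and smartmontools installed.

```shell
#!/bin/sh
# Pull the verdict word out of `smartctl -H` text read from stdin.
# Prints PASSED, FAILED!, or UNKNOWN when no verdict line is present.
smart_verdict() {
    awk '/overall-health self-assessment test result:/ { print $NF; found = 1 }
         END { if (!found) print "UNKNOWN" }'
}

# Hypothetical usage (needs root and smartmontools):
# for dev in /dev/sd?; do
#     printf '%s: %s\n' "$dev" "$(smartctl -H "$dev" | smart_verdict)"
# done
```

Drives that report their health in a different wording will show up as UNKNOWN, so treat that result as "look at the full output", not as a pass.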
On Fri, Jun 21, 2013 at 5:21 PM, Steven Stern subscribed-lists@sterndata.com wrote:
<SNIP>
Thanks, Steve. I am getting the following:
# smartctl -H /dev/sda
smartctl 6.1 2013-03-16 r3800 [x86_64-linux-3.9.6-200.fc18.x86_64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: FAILED!
Drive failure expected in less than 24 hours. SAVE ALL DATA.
Failed Attributes:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   013   013   036    Pre-fail  Always   FAILING_NOW 3587
#
Paul
On 06/21/2013 11:24 AM, Paul Smith wrote:
<SNIP>
I tend to trust warnings coming from the hardware itself. Time to replace the drive. Have fun shopping! And don't forget to move your critical data onto a backup medium or into the cloud ASAP.
On Fri, Jun 21, 2013 at 5:38 PM, Steven Stern subscribed-lists@sterndata.com wrote:
<SNIP>
Thanks, Steve and all other respondents.
To me, it is a bit surprising how the failure of the hard disk can be predicted (and with a time projection)!
I have already done a full backup. Should I wait for some days to check out whether the alarm is true or not?
Paul
Once upon a time, Paul Smith phhs80@gmail.com said:
To me, it is a bit surprising how the failure of the hard disk can be predicted (and with a time projection)!
One of the lines you listed was the "reallocated sector count". Basically, when you buy a 500G drive, it really has more space than that. Some of the sectors are reserved and not reported to the system. When the drive detects a bad sector, it relocates it to one of the reserved sectors. Once it has relocated a certain number, it reports the drive as failing (possibly because it is out of reserved sectors to use).
I have already done a full backup. Should I wait for some days to check out whether the alarm is true or not?
Given there were a bunch of reallocated sectors, I'd replace it as soon as possible and then destroy it.
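As a rough illustration of how that verdict is reached: smartmontools treats an attribute as failed when its normalized VALUE is at or below THRESH. The sketch below applies that rule to attribute lines, assuming the ten-column layout of the smartctl output posted earlier in the thread; the function name attr_status is made up for this example.

```shell
#!/bin/sh
# Read smartctl attribute lines on stdin and report each one's status.
# Columns: ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
attr_status() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 10 {
        # An attribute has failed when normalized VALUE <= THRESH.
        status = ($4 + 0 <= $6 + 0) ? "FAILING" : "ok"
        printf "%s value=%d thresh=%d raw=%s %s\n", $2, $4, $6, $10, status
    }'
}

echo "5 Reallocated_Sector_Ct 0x0033 013 013 036 Pre-fail Always FAILING_NOW 3587" | attr_status
```

For the drive in question the normalized value 13 is below the threshold 36, which matches the FAILING_NOW flag; the raw value 3587 is typically the number of sectors remapped so far.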
On Fri, Jun 21, 2013 at 7:46 PM, Chris Adams linux@cmadams.net wrote:
<SNIP>
Thanks, Chris, for your very useful clarification -- I now understand better the things at issue.
Paul
On 06/21/2013 11:57 AM, Paul Smith issued this missive:
<SNIP>
The warnings from smartctl aren't guarantees the drive is dying, but that there's a high probability that it will at some point in the very near future. If you get an alert, the standard recommendation is to back up your data and get a new drive ASAP.
Think of the warning as the "check engine" light on a car. If it goes on, it could mean anything from something as innocuous as "you're out of windscreen washer fluid" to "your gearbox fell out on the tarmac two miles back." Either way, you need to look at the output from the OBD2 scanner to see what's really wrong. The Linux equivalent of the OBD2 scanner is the "smartctl -a -H /dev/sdX" command.
Based on that info, you can decide if you must replace the drive immediately or if you can wait a bit. Given the relatively low cost of new drives, I'd replace it sooner rather than later. In fact, when I see a "deal" on drives, I'll buy a couple so I can have spares for just this sort of situation. I believe in a belt and suspenders as far as my data is concerned.
----------------------------------------------------------------------
- Rick Stevens, Systems Engineer, AllDigital    ricks@alldigital.com -
- AIM/Skype: therps2        ICQ: 22643734            Yahoo: origrps2 -
-                                                                    -
-    Blessed are the peacekeepers...for they shall be shot at        -
-                from both sides.  --A.M. Greeley                    -
----------------------------------------------------------------------
On Fri, Jun 21, 2013 at 10:03 PM, Rick Stevens ricks@alldigital.com wrote:
<SNIP>
Thanks, Rick. I am going to follow your advice.
Paul
Chiming in with some additional information that only *partially* contradicts certain things that have been said in this thread. First off though, the advice that drives are cheap and data is expensive is absolutely correct. Do NOT let anything I say talk you out of making sure any critical data on this drive is backed up.
Given that tee-up, smartctl/smartd reports that the disk has an "uncorrectable bad sector" when there is a read error from the drive for a sector. The error is "uncorrectable" because the sector cannot be read. Note that the detection of a bad read (or write) takes place at the physical and drive firmware level when the CRC is checked. The only thing that the drive has to work with is that there was an attempt to read a sector and that read resulted in a CRC error.
The bad sector is part of a file and only you, the user, can make a determination as to whether the rest of the file is still good or if the bad sector is throwing a CRC error but the file is still usable. That's also why the error is "uncorrectable". The drive doesn't have enough information to fix it and it can't silently remap the sector since it can't read the data. If it did, you would end up with a file with a null sector somewhere in it at the location that corresponds to the bad sector's data.
The drive takes care of write errors through the reallocation process mentioned earlier in the thread (since data is being written, any existing data is being replaced, so the data can be written to a remapped sector). For read errors the drive can only report the problem, since the read error implies that the data cannot be retrieved.
My advice: buy a new drive but run badblocks -w on the old drive once you have your data safely off of it. You will probably find that the badblocks write test (-w) lets the drive see the bad sector being written to and then remaps the bad sector and you end up with a drive that is now completely usable again. Be absolutely sure you have your data off of the drive before running badblocks -w. It will overwrite any data on the drive.
I have "recovered" several drives by doing this. I've also had some that threw errors all over the place. Those became targets.
Cheers, Dave
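Because badblocks -w destroys everything on the target, it can help to have a guard in front of it. The following is only a sketch under stated assumptions: the function name badblocks_plan is hypothetical, the mounted check is a plain prefix match against a mounts table (so it will not notice LVM or device-mapper volumes sitting on the drive), and the function only prints the command instead of running it. The -s and -v flags add a progress display and verbose output.

```shell
#!/bin/sh
# Print the badblocks command for a device, refusing if it looks mounted.
# $1 = whole-disk device, e.g. /dev/sdb
# $2 = mounts table to consult (defaults to /proc/mounts)
badblocks_plan() {
    dev=$1
    mounts=${2:-/proc/mounts}
    # Any mounts entry starting with the device name counts as "in use".
    if grep -q "^$dev" "$mounts" 2>/dev/null; then
        echo "refusing: $dev (or a partition on it) is mounted" >&2
        return 1
    fi
    # Dry run: show the destructive write test, do not execute it.
    echo "badblocks -wsv $dev"
}
```

Calling badblocks_plan /dev/sdb prints the command to run as root once you are satisfied the device name is the old drive and not the new one.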
On Sat, Jun 22, 2013 at 6:48 AM, David G. Miller dave@davenjudy.org wrote:
<SNIP>
Thanks, Dave, for your very clarifying answer. Should I conclude from your words that I have already some corrupted files? If so, is there some way to identify them?
Paul
Paul Smith <phhs80 <at> gmail.com> writes:
On Sat, Jun 22, 2013 at 6:48 AM, David G. Miller <dave <at> davenjudy.org> wrote:
<SNIP>
Paul -
Finding the files that may have been corrupted by a block going bad is a fairly long and involved process described here:
http://smartmontools.sourceforge.net/badblockhowto.html
For almost anything other than text files, finding the file that has a corrupted block doesn't do you any good unless you have a backup copy. But, if you don't, at least you know which file is probably not usable anymore.
For any installed application or OS files, you can always just re-install the package.
Cheers, Dave
Allegedly, on or about 21 June 2013, Tom Horsley sent:
Just remember what happened when HAL told Dave the AE-35 unit would fail in 24 hours :-).
Read the book, it's much more comprehensive than the film (which was shortened, significantly). There's a lot more to that part of the plot than the film conveys.
On 06/22/2013 05:12 AM, Tim wrote:
<SNIP>
ROTFLMAO! The film was based on a short story, *The Sentinel,* by Arthur C. Clarke, who later wrote a novel *based on* the film. Naturally, in that novelization, he had room to add considerable back-story and detail that wasn't in the film.
Tim:
Read the book, it's much more comprehensive than the film (which was shortened, significantly). There's a lot more to that part of the plot than the film conveys.
Joe Zeff:
ROTFLMAO! The film was based on a short story, *The Sentinel,* by Arthur C. Clarke,
I know, and there's several quite similar stories. Like a lot of authors, he seemed to do several variations on a theme.
who later wrote a novel *based on* the film.
Supposedly, the film and the novel were worked on concurrently.
Naturally, in that novelization, he had room to add considerable back-story and detail that wasn't in the film.
Supposedly, the first cut of the film was many hours long.
It's one of my favourites, but I have to be in a fairly lethargic mood to sit still and watch it uninterrupted.
On 22.06.2013 17:26, Joe Zeff wrote:
<SNIP>
Partially based. :)
poma
On Sat, Jun 22, 2013 at 08:26:37AM -0700, Joe Zeff wrote:
<SNIP>
It's been ages since I've read the book, but my recollection is that ACC wrote the book more or less in parallel with the movie, and that Kubrick took liberties therewith.
On 2013/06/21 09:24, Paul Smith wrote:
<SNIP>
Paul, when disks start throwing that error it's typically "months" or less before it does something unfriendly like failing to spin up. Put a replacement drive high on your list of things to do. Done soon enough a "dd" disk copy may salvage your current installation with minimum headache. Wait very long and you may have grown problems in critical files.
{^_^}
On Fri, Jun 21, 2013 at 11:09 PM, jdow jdow@earthlink.net wrote:
<SNIP>
Thanks for your advice, Jdow. I will soon get another disk.
By the way, how can I save my current installation with dd? And how can I restore that onto the new disk?
Paul
On 2013/06/21 15:32, Paul Smith wrote:
<SNIP>
Boot from a rescue or install disk with both drives connected. I am presuming the old drive is /dev/sda and the new drive is /dev/sdb. Make REAL sure the new drive is larger than the old drive. (If it is enough larger you can use the extra space as another partition by playing with the partitioning program of choice.)
dd conv=noerror,notrunc bs=1048576 if=/dev/sda of=/dev/sdb &
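Note the process number that the shell prints when the job starts. With GNU dd you can check progress at any time with kill -USR1 followed by that process ID; dd then prints its I/O statistics and keeps copying.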
{^_^}
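A dry run on ordinary files is a cheap way to get comfortable with the dd invocation before pointing it at real disks. The sketch below is an illustration and not the exact command from the message above: with GNU dd, adding conv=sync pads any block that could not be read with zeros so that later data keeps its offset in the copy, and the smaller block size means one unreadable sector costs less data. The function name clone_image and the scratch files are hypothetical.

```shell
#!/bin/sh
# Copy src to dst block by block, continuing past read errors.
#   noerror: keep going after a read error
#   sync:    pad short or failed reads to the full block size
#   notrunc: do not truncate the destination before writing
# stderr is silenced only to keep this demo quiet; keep it visible on real disks.
clone_image() {
    dd if="$1" of="$2" bs=65536 conv=noerror,sync,notrunc 2>/dev/null
}

# Demo on scratch files whose size is a multiple of the block size.
src=$(mktemp) && dst=$(mktemp)
head -c 131072 /dev/urandom > "$src"
clone_image "$src" "$dst" && cmp -s "$src" "$dst" && echo "copy verified"
rm -f "$src" "$dst"
```

On real hardware the two arguments would be the device nodes, such as /dev/sda and /dev/sdb. Double-check which is which before running it, since dd overwrites the target without asking.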
On 06/21/2013 03:09 PM, jdow wrote:
Paul, when disks start throwing that error it's typically "months" or less before it does something unfriendly like failing to spin up. Put a replacement drive high on your list of things to do. Done soon enough a "dd" disk copy may salvage your current installation with minimum headache. Wait very long and you may have grown problems in critical files.
Even better, if your new drive is larger than the current one, would be Clonezilla: http://clonezilla.org/ because you can have it adjust the size of the partition to match the drive instead of having wasted space at the end.
On Fri, Jun 21, 2013 at 11:34 PM, Joe Zeff joe@zeff.us wrote:
<SNIP>
Thanks, Joe. After having created an iso file that mirrors my disk, how can I clone the new disk with this iso file?
Paul
On Sat, Jun 22, 2013 at 12:02 AM, Paul Smith phhs80@gmail.com wrote:
<SNIP>
I am learning now that I can use Clonezilla again to clone the new disk with the created iso file.
Paul
I apologize for not more closely following this thread; if this has already been suggested, please just take it as reinforcement.
Once, long ago--actually, on Fri, Jun 21, 2013 at 05:09:09PM CDT--jdow (jdow@earthlink.net) said:
Paul, when disks start throwing that error it's typically "months" or less before it does something unfriendly like failing to spin up.
Or less. Remember--when it's out of sectors for remapping, it's near the end.
Put a replacement drive high on your list of things to do.
Not only that--given that disks are (relatively) cheap, and data is expensive, this would be the time to consider putting in at least a RAID1 array. Life gets a lot less hectic when a disk failure is only one in an array.
I just went through this--a mdadm RAID5 with three disks, and one of my disks finally failed. It was just a matter of getting another disk at the best price, while the server limped along on two drives. And limping is only relative--there was no appreciable degradation in performance.
Replacing the disk took about a half-hour, and most of that was physical remove/replace effort. Told it about the new disk, and after a few hours it had rebuilt the array. No muss, no fuss.
Cheers, -- Dave Ihnat President, DMINET Consulting, Inc. dihnat@dminet.com
Paul Smith phhs80@gmail.com writes:
5 Reallocated_Sector_Ct 0x0033 013 013 036 Pre-fail Always FAILING_NOW 3587
What the other respondents missed is that this is the *reallocated* sector count. It is a count of the sectors that have already been replaced by spares. There are no unreadable files currently. The replacement only happens on writes, so you are probably seeing freshly written, pristine files.
As to the haste you should replace the disk with, that all depends on how the sectors got trashed in the first place. If you moved the computer while it was up and writing to the disk, the arm could have been jiggled onto the next track during the write. I know that happens with 3-1/2" drives. I was seeing an ever-increasing reallocated sector count on my desktop machine until I traced the lossage back to me tilting the case forward in order to get easier access to the connectors on the back.
Another computer, my laptop, has had 6 reallocated sectors since the first few weeks after I got it. That was 7+ years ago. The disk is still going strong with no increase in reallocated sector count (or pending reallocation sector count) in all those years. Stable reallocated counts shouldn't bother you too much. It is when they go up that you should be concerned.
On the other hand, I do nightly rsync backups to a spare disk. According to Google, which probably has more disks than the NSA, only half of the disk drive deaths are preceded by smartmon saying anything.
-wolfgang
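The "stable versus growing" distinction is easy to track if the raw count is recorded now and then. Here is a small sketch, assuming attribute lines in the ten-column format printed by smartctl -A; the function names realloc_raw and realloc_trend and the sample line with a raw count of 6 are made up for illustration.

```shell
#!/bin/sh
# Print the raw Reallocated_Sector_Ct from `smartctl -A` text on stdin.
realloc_raw() {
    awk '$2 == "Reallocated_Sector_Ct" { print $10 }'
}

# Compare a previously recorded count ($1) with the current one ($2).
realloc_trend() {
    if [ "$2" -gt "$1" ]; then
        echo "growing: $1 -> $2"
    else
        echo "stable: $2"
    fi
}

# Made-up sample line standing in for real smartctl output.
now=$(echo "5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 6" | realloc_raw)
realloc_trend 6 "$now"
```

In practice the current value would come from piping smartctl -A /dev/sdX into realloc_raw and the previous value from wherever the last reading was saved.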
On Sun, Jun 23, 2013 at 12:01 AM, Wolfgang S. Rupprecht wolfgang.rupprecht@gmail.com wrote:
<SNIP>
I thank you, Wolfgang, and all other people who have helped me so much with this issue.
I now have a new disk installed in my machine, and I succeeded in copying the contents of the old disk to the new one with Clonezilla, which is a great tool for cloning disks.
I will follow your recommendations and I will try to recover the old disk with
badblocks -w
as suggested.
Paul
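Since badblocks -w is a write-mode test that overwrites the whole disk, it is worth refusing to run it on anything that is still mounted. The following is a rough sketch under stated assumptions: the function names and the device name are hypothetical, and the check only looks at the mount table, so it will not notice swap, LVM or RAID members.

```shell
#!/bin/sh
# Succeed if DEV or one of its partitions appears in the mount table.
# The second argument defaults to /proc/mounts and exists only so the
# check can be tried against a sample file.
is_mounted() {
    dev=$1
    mounts=${2:-/proc/mounts}
    awk -v d="$dev" 'index($1, d) == 1 { found = 1 } END { exit !found }' "$mounts"
}

# Destructive write test of a whole disk: everything on it is overwritten.
wipe_test() {
    dev=$1
    if is_mounted "$dev"; then
        echo "$dev is mounted, refusing to run badblocks -w" >&2
        return 1
    fi
    # -w: write-mode test, -s: show progress, -v: verbose
    badblocks -wsv "$dev"
}

# Example (hypothetical device name, needs root): wipe_test /dev/sdb
```

Comparing the "smartctl -A" output from before and after the run shows whether the reallocated sector count is still climbing.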
On 06/21/2013 10:21 AM, Steven Stern wrote:
For each drive on your system, try
sudo smartctl -H /dev/sdx
where x=a,b,c,.... etc
If any indicate poor health, then use "--all" in place of -H
However, yeah, it's probably time to get a really good backup and price a replacement drive.
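The per-drive check can be looped. This is a minimal sketch, assuming the drives show up as /dev/sda, /dev/sdb and so on. It relies on the exit status described in the smartctl man page, where bit 3 (value 8) means the health check returned a failing status. The function names are made up.

```shell
#!/bin/sh
# Succeed if a smartctl exit status has bit 3 (value 8) set, i.e. the
# SMART health check reported the disk as failing.
health_failed() {
    [ $(( $1 & 8 )) -ne 0 ]
}

# Run the health check on every /dev/sd? block device.
check_all_drives() {
    for dev in /dev/sd?; do
        [ -b "$dev" ] || continue    # skip if the glob matched nothing
        smartctl -H "$dev" > /dev/null
        rc=$?
        if health_failed "$rc"; then
            echo "$dev: SMART reports failing, try: smartctl --all $dev"
        else
            echo "$dev: no failing status reported"
        fi
    done
}

# Example (needs root): check_all_drives
```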
I recommend that you get an SSD. I use a Dell 9400 Inspiron and replaced my drive with a 120 GB SSD, and I haven't had any more trouble.
On 06/21/2013 09:24 AM, Lawrence Graves issued this missive:
I recommend that you get an SSD. I use a Dell 9400 Inspiron and replaced my drive with a 120 GB SSD, and I haven't had any more trouble.
If you use SSDs, make danged sure you back up really regularly to a standard drive. When SSDs die, they tend to die quickly (like "Poof!") and you have virtually no grace time to get off what you can. I've had a number of MacBooks here at the office do that.
----------------------------------------------------------------------
- Rick Stevens, Systems Engineer, AllDigital ricks@alldigital.com -
- AIM/Skype: therps2 ICQ: 22643734 Yahoo: origrps2 -
- ...Had this been an actual emergency, we would have fled in terror -
- and you'd be on your own, pal! -
----------------------------------------------------------------------
On 06/21/2013 11:46 AM, Rick Stevens wrote:
If you use SSDs, make danged sure you back up really regularly to a standard drive. When SSDs die, they tend to die quickly (like "Poof!") and you have virtually no grace time to get off what you can. I've had a number of MacBooks here at the office do that.
I agree. Regular backups are a practice to follow no matter what kind of drive you use.