I have a raid0 array, and I'm getting this:
Currently unreadable (pending) sectors detected:
    /dev/sda - 48 Time(s)
    2 unreadable sectors detected

Offline uncorrectable sectors detected:
    /dev/sda - 48 Time(s)
    2 offline uncorrectable sectors detected
What should I do? Run badblocks?
Neal Becker wrote:
I have a raid0 array, and I'm getting this:
...
What should I do? Run badblocks?
With RAID 0 you don't have any redundancy, so my suggestion would be:
1. backup
2. replace drive
3. restore
and hope that it isn't too late.
Mogens
On Thu, Oct 11, 2007 at 07:07:47 -0400, Neal Becker ndbecker2@gmail.com wrote:
I have a raid0 array, and I'm getting this:
Currently unreadable (pending) sectors detected:
    /dev/sda - 48 Time(s)
    2 unreadable sectors detected

Offline uncorrectable sectors detected:
    /dev/sda - 48 Time(s)
    2 offline uncorrectable sectors detected
What should I do? Run badblocks?
Not initially; the drive will handle this. It won't do anything until it gets a good read or the sectors are overwritten. Once that happens it may reallocate the sector, depending on whether the drive firmware thinks the sector is permanently hosed.
You can get the sector numbers by running self tests with smartctl. You can start with short self tests, but sometimes you need to run the long ones.
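As a sketch of that (the self-test commands are real smartctl invocations; the log line below is a made-up example in the shape `smartctl -l selftest` prints, since I obviously can't reproduce your drive's log):

```shell
# Start a short self-test, then read the log once it finishes (a short
# test takes a couple of minutes; long tests can take hours):
#   smartctl -t short /dev/sda
#   smartctl -l selftest /dev/sda
# A failed entry's last column is LBA_of_first_error. Pulling it out of
# an illustrative log line:
line='# 1  Short offline  Completed: read failure  90%  1234  123456789'
lba=$(printf '%s\n' "$line" | awk '{print $NF}')
echo "first failing LBA: $lba"
```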
The big problem is figuring out what you lost. If those blocks are currently part of a file, you have lost some data. I don't know if there is a howto anywhere for figuring out which sectors belong to which files when using RAID 0. If you are using proprietary RAID, it is probably going to be even harder to get this information.
So you will need to decide on a recovery path. If there isn't anything valuable on the machine, you might just overwrite the disks (running dd from a rescue CD is one way to do this), test the drives with badblocks until you are convinced they are OK (or replace them), then reinstall and reload from backups. If you do have valuable data on the machine, start by making a new backup; when you reinstall, you will have to figure out what you want to restore from the new backup and what from your old backups.
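The overwrite-and-retest step might look like this. It is demonstrated here on a scratch image file rather than a real device; on real hardware you would point the same commands at /dev/sda from a rescue CD, after backing up anything you care about:

```shell
# Scratch image standing in for the disk:
img=$(mktemp)
dd if=/dev/zero of="$img" bs=1M count=8 status=none  # overwrite everything
out=$(badblocks -v "$img" 2>&1)                      # read-test it afterwards
echo "$out" | tail -n 1                              # summary line
rm -f "$img"
```

Overwriting gives the drive a chance to reallocate the pending sectors; the badblocks pass afterwards tells you whether any reads still fail.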
You may also want to reconsider which RAID level you are using. Unless you are doing a lot of file I/O that really needs to be fast, RAID 0 is not going to give you much benefit, and it does make recovery harder. RAID 1 allows you to repair the kind of damage you have without too much difficulty, but at the cost of half of your disk space, which may be too high a price for you.
On Thu, Oct 11, 2007 at 08:43:03AM -0500, Bruno Wolff III wrote:
... RAID 1 allows you to repair the kind of damage you have without too much difficulty, but at the cost of half of your disk space, which may be too high a price for you.
Ah...a single 500 GB drive is about $100 or so street price these days. RAID0 over RAID1 is almost never justifiable, except maybe in a RAID10 config.
People time (recovery, reconfig) and data value massively trump disk storage space today.
Cheers, -- Dave Ihnat President, DMINET Consulting, Inc. dihnat@dminet.com
On 12/10/2007, Dave Ihnat dihnat@dminet.com wrote:
On Thu, Oct 11, 2007 at 08:43:03AM -0500, Bruno Wolff III wrote:
... RAID 1 allows you to repair the kind of damage you have without too much difficulty, but at the cost of half of your disk space, which may be too high a price for you.
Ah...a single 500 GB drive is about $100 or so street price these days. RAID0 over RAID1 is almost never justifiable, except maybe in a RAID10 config.
People time (recovery, reconfig) and data value massively trump disk storage space today.
Entirely agreed. One could argue that raid0 actually *increases* your chances of data loss, as you have twice the chance of a drive failing.
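A quick back-of-envelope on that "twice the chance" point, assuming independent failures and an illustrative 97% per-drive survival rate over some period: with striping, any single drive failure loses the whole array, so an n-drive RAID0 survives only with probability p^n.

```shell
# Survival probability of a 2-drive raid0 when each drive survives with p:
surv=$(awk 'BEGIN { p = 0.97; printf "%.4f", p * p }')
echo "2-drive raid0 survival at p=0.97 per drive: $surv"
```

The result is lower than either drive alone, and it only gets worse as you add spindles.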
On 11Oct2007 07:07, Neal Becker ndbecker2@gmail.com wrote:
| I have a raid0 array, and I'm getting this:
| Currently unreadable (pending) sectors detected:
| /dev/sda - 48 Time(s)
| 2 unreadable sectors detected
|
| Offline uncorrectable sectors detected:
| /dev/sda - 48 Time(s)
| 2 offline uncorrectable sectors detected
|
| What should I do? Run badblocks?
As remarked, replacement drives may be the cheap and expedient approach.
But I will remark that my venerable laptop drive started doing this to me some time ago.
Since it had a single ext3 filesystem I went:
e2fsck -c /dev/sda
That ran badblocks, and told the fs to avoid those sectors.
And it's been fine since.
It took a whole day as I recall; once the scan hits the bad sectors Linux stalls for a while retrying them before propagating the failure up for badblocks to see. And of course I've lost some data somewhere.
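For the record, what "e2fsck -c" does is run badblocks read-only over the device and record any hits in the filesystem's bad-block inode, so those blocks are never allocated to a file again. A sketch on a healthy scratch image (no real bad sectors, so the recorded list comes out empty):

```shell
img=$(mktemp)
truncate -s 4M "$img"
mkfs.ext2 -F -q "$img"
# -c runs badblocks and stores any hits in the bad-block inode; e2fsck
# exits nonzero when it modifies the filesystem, hence the || true:
e2fsck -f -y -c "$img" >/dev/null 2>&1 || true
bad=$(dumpe2fs -b "$img" 2>/dev/null | wc -l | tr -d ' ')  # count recorded bad blocks
echo "bad blocks recorded: $bad"
rm -f "$img"
```

On a drive with real pending sectors, dumpe2fs -b afterwards would list the blocks the filesystem is now avoiding.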
The prudent thing is new media, and a more robust RAID (1, or 5 if you've got enough $s and drive bays). The cheap but effective thing, if $s are short and you can stand the downtime, may be an "e2fsck -c".
You're still at risk of course, but since I don't believe that bad sectors spread like a fungus I suspect you're not at much more risk than you already are with RAID0. It may buy you a lot of time while accruing $s and planning your storage upgrade.
Cheers,
On Fri, Oct 12, 2007 at 01:46:56PM +1000, Cameron Simpson wrote:
You're still at risk of course, but since I don't believe that bad sectors spread like a fungus I suspect you're not at much more risk than you already are with RAID0.
Ahh...that may not be quite right. If the bad blocks are just due to a failure in the oxide coating _in situ_, maybe. But I've seen cases where the failure actually resulted from some of the oxide flaking off--head strikes, most commonly--and this *can* "spread like a fungus" as the particulate can damage heads, stick to platter surfaces and cause further damage, etc.
Reflecting, I don't think I've seen that kind of failure since drives have been using cobalt oxides, but nevertheless, hard as it may be, it's still a coating. The general rule still applies: once you start seeing failures on your side of the interface, it means you've had enough failures that the on-drive sector remapping is exhausted; you've already been experiencing multiple sector failures for some time. It's time to change drives.
Cheers, -- Dave Ihnat President, DMINET Consulting, Inc. dihnat@dminet.com