Reproducible Filesystem Corruption on FC4 (Long)

Wilbur Harvey wnh200405 at xphuang.com
Thu Jun 30 00:17:05 UTC 2005


I have a similar but different problem.
I have a system that is used only as a backup target. It is an 
Athlon Shuttle system with two SATA hard drives in a software RAID 
configuration. A SATA-to-PCI adapter supports these SATA drives.
Every week or two the rsync backup fails. Each time it turns out that the 
disk was full, and it was full because of file system corruption. It 
usually takes an hour or two to run fsck to recover the file system, 
often requiring manual intervention.
Just my 2 cents, in case there is some kind of pattern.
Wilbur

Tom Sightler wrote:

>On Wed, 2005-06-29 at 20:19 -0300, Ben Steeves wrote: 
>  
>
>>On 6/29/05, Tom Sightler <ttsig at tuxyturvy.com> wrote:
>>    
>>
>>>I decided to reinstall and try again.  This time, immediately after the
>>>install I ran fsck and found no errors.  I copied my directories from my
>>>backup again, and the corruption also returned.  I repeated again, this
>>>time I booted with ide=nodma before restoring my backup, this caused the
>>>restore to take so long that I wasn't sure it would ever finish.  I did
>>>not get corruption, but the system was far too slow to use with this
>>>option.
>>>      
>>>
>>This really, really sounds like a hardware problem.  I would check
>>your /var/log/messages and see what smartctl has to say about your
>>drives.  I'd also check the status of the drivers for your USB
>>controller chipset, since if it is a software bug, that's probably
>>where the problem lies.
>>    
>>
>
>I would agree that it sounds that way, but I simply don't think this is
>the case.  For one thing, if it were a hardware problem, the system
>wouldn't work with CentOS 4 or FC3 either, but both of those install and
>run fine.  I use this system 12-14 hours a day with CentOS 4 and have
>never experienced a single glitch.
>
>There were absolutely no errors in /var/log/messages or in dmesg
>regarding the hardware.  Everything appeared to be working 100%
>correctly; it just silently corrupted the data, time and time again.
>
>I reinstalled CentOS 4, performed the identical steps, and everything
>works perfectly.  I can also install FC3 and perform the steps without
>issues; however, with FC4 I get silent corruption every time I restore my
>data from the USB device.
>
>I suppose it's possible that there is some issue with reading from the USB
>drive.  I found some notes claiming that recent improvements in the
>usb-storage driver push the hardware harder and can sometimes expose USB
>chipset problems that previously were hidden.  I could possibly buy
>this, but even if the source drive is corrupt, that shouldn't corrupt
>the drive you're writing to, and in my case it's the internal IDE drive
>that's being corrupted.  I can absolutely hammer this drive for days
>with CentOS 4 without even a slight glitch and zero corruption.
>
>I'm going to try again tonight by installing FC4 and then replacing the
>kernel before doing the restore; that should give me a good clue.
>
>Thanks,
>Tom
>
>
>  
>



