Another slip in the FC6 schedule

Jeff Vian jvian10 at charter.net
Tue Oct 17 22:53:34 UTC 2006


On Tue, 2006-10-17 at 18:35 -0400, Dave Jones wrote:
> On Wed, Oct 18, 2006 at 12:05:14AM +0200, Alfredo Ferrari wrote:
>  > Seriously, I believe this is a big issue. Let me summarize:
>  > 
>  > a) there was a kernel update for FC5
>  > b) this kernel has a known bug which could result in corrupting
>  >     ext3 filesystems with 1k block size under heavy load
> 
> it doesn't corrupt filesystems, it crashes instantly when the bug is hit.
> 
>  > c) ... nevertheless it has been pushed out with no special warning
>  > d) practically all /boot partitions are ext3 1k (anaconda generated)
>  > e) many partitions on old machines upgraded from previous versions are
>  >     ext3 1k as well
> 
> /boot partitions don't see anywhere near the sustained IO that is needed
> to hit this bug.  it takes _hours_ of insane amounts of IO to hit it.
> It should be noted that I was the only person to ever see this.
> No bugzilla reports. No upstream reports.  This is a real corner case
> scenario, as usually filesystems that see that kind of IO want the higher
> throughput that a larger blocksize brings.
> 
Who in the world has a large amount of IO on /boot?

Since that is usually a separate filesystem and is usually only 100 MB
in size, it is IME basically a static filesystem that only changes when
the kernel is updated.

I can easily see the reason that bug has not been encountered in the
past.
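
For anyone who wants to check whether they have any 1k ext3
filesystems besides /boot, here is a minimal sketch, assuming a Linux
system with /proc/mounts and Python available. The function names are
my own for illustration, not part of any Fedora tool, and I am assuming
the f_bsize value reported by statvfs matches the ext3 block size.

```python
import os


def ext3_mount_points(mounts_text):
    """Return mount points of ext3 filesystems from /proc/mounts-style text."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields are: device, mount point, fs type, options, dump, pass.
        # Mount points containing spaces are octal-escaped in /proc/mounts
        # and are not decoded here.
        if len(fields) >= 3 and fields[2] == "ext3":
            points.append(fields[1])
    return points


def block_size(path):
    """Return the block size in bytes of the filesystem holding path."""
    return os.statvfs(path).f_bsize


def find_1k_ext3(mounts_file="/proc/mounts"):
    """Return mounted ext3 filesystems whose block size is 1024 bytes."""
    with open(mounts_file) as f:
        text = f.read()
    return [p for p in ext3_mount_points(text) if block_size(p) == 1024]
```

Calling find_1k_ext3() lists the mount points to look at. Whether any
of them sees enough sustained IO to matter is a separate question.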

>  > What was the rationale for releasing an official kernel update under such
>  > dangerous conditions? Just "anaconda doesn't generate 1k partitions (not 
>  > true BTW)"? I still believe Linux is not (yet) Windows and if features are
>  > in the system (like 1k blocksize partitions) people can use them if 
>  > they feel appropriate and they must work. Or perhaps there was a rush to 
>  > push this 2.6.18 kernel out to get some extra guinea pigs finding all 
>  > residual bugs? But this could be fair for the FC6 betas, not for FC5 where 
>  > people are expecting reasonable stability, and in any case no life-threatening
>  > issue like a (known) filesystem corruption bug.
> 
> That code hasn't changed in months, so the 2.6.17 kernel in FC5 likely 
> was already affected by the same bug, and yet despite this, no-one was
> hitting it because of the pathological circumstances needed to hit it.
> 
>  > Now how long do we have to wait before we have an update for FC5 fixing
>  > this critical issue? Or do we have to manually rollback kernels on all 
>  > machines?
> 
> I'm already working on the next update.
> 
> 	Dave
> 
> -- 
> http://www.codemonkey.org.uk
> 



