random crashes

stan gryt2 at q.com
Sun Feb 27 21:54:42 UTC 2011


On Sun, 27 Feb 2011 20:08:58 +0100
Andras Simon <szajmi at gmail.com> wrote:

> On 2/27/11, Patrick Bartek <bartek047 at yahoo.com> wrote:
> 
> > TO: OP
> >
> > If you think it's specifically a Fedora problem (I don't), I would
> > install
> 
> Me neither. It's just that I still think it's possible that it's not a
> HW problem.

I'm going to offer some support for this opinion.  Anecdotal only,
though.  Since around F11 or F12, I've had problems with lockups when
running the stock Fedora kernel.  So for each release, I've compiled a
custom kernel from the src.rpm of the kernels that Fedora distributes.
I trim out support for any hardware I don't have on my system in order
to cut down on the compile time (from an hour to as low as 10 minutes),
and I also tune for performance and drop features that I don't
use.  The lockups then go away.  I never see one again.

For me, the lockups seem to go away when I eliminate SMP.  I think this
is because I have a single-core CPU, and the scheduler doesn't
compensate for this properly.  I have no proof, and no evidence other
than that the lockups stop when I do this.  It could also be an
interaction with something else I've removed.  The kernel is a very
complicated beast.
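
To see whether a given machine is even in the same situation, it helps
to check whether the kernel was built with SMP and how many CPUs the
system reports.  A rough sketch follows; the helper name smp_enabled is
made up for illustration, and the config path /boot/config-$(uname -r)
is an assumption about where the distribution installs it, so verify it
on your own system.

```shell
#!/bin/sh
# Sketch: report whether a kernel config file has SMP enabled.
# smp_enabled is a hypothetical helper name, not a standard tool.
smp_enabled() {
    # $1: path to a kernel config file.
    # An SMP build contains "CONFIG_SMP=y"; a non-SMP build contains
    # "# CONFIG_SMP is not set" instead, which this pattern won't match.
    grep -q '^CONFIG_SMP=y' "$1"
}

# Assumed location of the running kernel's config.
cfg="/boot/config-$(uname -r)"
if [ -r "$cfg" ]; then
    if smp_enabled "$cfg"; then
        echo "running kernel was built with SMP"
    else
        echo "running kernel was built without SMP"
    fi
fi

# Number of processors currently available to the system.
nproc
```

If the config file isn't where the sketch expects it, nothing is
printed for the first check; point smp_enabled at the right file by
hand in that case.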

Starting with the 2.6.35 series used in F14, I have to patch the kernel
in order to compile it without SMP.  This is because some SMP-only code
hasn't been properly fenced in with ifdefs.  I imagine that by this
point there is no one developing the kernel who is actually using a
single-core machine, so it is understandable that they aren't testing
whether non-SMP builds work.  I did open a bugzilla, but it is
unlikely to see any action for the same reason.

Here is the link for building a custom kernel.
http://fedoraproject.org/wiki/Building_a_custom_kernel

Here is the patch you need if you are going to compile with SMP disabled.

--- kernel-2.6.35.noarch/kernel/sched.c 2010-10-16 09:27:21.017080819 -0700
+++ kernel-2.6.35.noarch/kernel/sched.c 2010-10-16 09:31:09.299373307 -0700
@@ -5273,7 +5273,9 @@ void __cpuinit init_idle(struct task_str
        unsigned long flags;
 
        local_irq_save(flags);
+#if defined(CONFIG_SMP)
        double_rq_lock(oldrq, rq);
+#endif
 
        __sched_fork(idle);
        idle->state = TASK_RUNNING;
@@ -5298,7 +5300,9 @@ void __cpuinit init_idle(struct task_str
 #if defined(CONFIG_SMP) && defined(__ARCH_WANT_UNLOCKED_CTXSW)
        idle->oncpu = 1;
 #endif
+#if defined(CONFIG_SMP)
        double_rq_unlock(oldrq, rq);
+#endif
        local_irq_restore(flags);
 
        /* Set the preempt count _outside_ the spinlocks! */

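Before wiring a patch like this into a build, it can be worth checking
that it still applies to the source tree you actually have.  A minimal
sketch, assuming GNU patch and assuming the patch text has been saved
to a file; the helper name patch_applies is made up for illustration.
Mail clients often turn tabs into spaces, which is why -l is included.

```shell
#!/bin/sh
# Sketch: check whether a patch applies cleanly without touching the tree.
# patch_applies is a hypothetical helper name; the flags are GNU patch's.
patch_applies() {
    # $1: top of the kernel source tree (e.g. kernel-2.6.35.noarch)
    # $2: absolute path to the saved patch file
    # -p1       strips the leading path component from the file names
    # -N        refuses patches that look reversed or already applied
    # -l        matches loosely when tabs and spaces have been munged
    # --dry-run reports the result without modifying any files
    ( cd "$1" && patch -p1 -N -l --dry-run --silent < "$2" )
}
```

It would be called as "patch_applies /path/to/kernel-2.6.35.noarch
/path/to/saved.patch && echo applies" (both paths are placeholders).
An exit status of zero only says the hunks match; it says nothing
about whether the result compiles.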


You could also try compiling a vanilla kernel, as I've seen reports on
this list from people who use the latest and greatest from kernel.org
without any problems.  That does drop any fixes that Fedora / RH have
made that haven't made it into the mainline kernel yet, though.

