Recurring problem on 6-core FC14 system
William Austin
airedad at att.net
Sun Mar 6 05:38:31 UTC 2011
Recently I replaced MB/CPU/Memory on my main workstation, and a few
days later I installed FC14. The problems started then. I have tried
looking this one up in bugzilla, but so far with no luck.
The system ran cleanly for 4 days on FC13, and the problems I am describing
started only after the new install.
THE PROBLEM (h/w s/w details after this section):
What is happening is that processes are dying randomly. So far there have
been no core dumps, so I can't go into it that way. I did start a background
process, however, which dumps dmesg to a file every 2 minutes.
Currently I have about 3000 saves of dmesg to look at. Only a couple
of things fall out:
1) 16 times I have hit a combination of both the message
"BUG: unable to handle kernel NULL pointer dereference at 0000000000000049"
"Oops: 0000 [#2] SMP" (or [#14] which I presume means
both 1 & 4)
(16 out of 3000 is statistically below the noise threshold)
2) 88 times I got the message:
"last sysfs file: /sys/devices/pci0000:00/0000:00:11.0/host2/target2:0:0/2:0:0:0/block/sdb/sdb6/stat"
Those were the only consistent problems which came out of the dmesg
outputs - the 'tainted' process has most commonly been ps (54 times),
followed by jackd (11 times) and plasma-desktop (4 times)
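For anyone who wants to repeat the collection and counting described above, here is a minimal sketch in Python. It is not the script actually used on this box. The file naming scheme (dmesg-*.txt), the function names, and the assumption that each oops report contains a "comm: <name>" line naming the process are all illustrative guesses. Since consecutive dumps of the kernel ring buffer overlap, the numbers are counts of snapshot files containing a message, not counts of distinct events.

```python
import re
import subprocess
import sys
import time
from collections import Counter
from pathlib import Path

# Patterns based on the messages quoted above.
PATTERNS = {
    "null_deref": re.compile(r"BUG: unable to handle kernel NULL pointer dereference"),
    "oops": re.compile(r"Oops: \d+ \[#\d+\]"),
    "last_sysfs": re.compile(r"last sysfs file:"),
}
# Assumption: the oops report names the process on a "comm: <name>" line.
COMM = re.compile(r"\bcomm: (\S+)")


def save_snapshot(out_dir):
    """Write the current dmesg output to a timestamped file (hypothetical naming)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    text = subprocess.run(["dmesg"], capture_output=True, text=True, check=True).stdout
    path = out / time.strftime("dmesg-%Y%m%d-%H%M%S.txt")
    path.write_text(text)
    return path


def tally(snapshot_dir):
    """Count snapshot files containing each pattern and each process name."""
    files_with = Counter()
    procs = Counter()
    for path in sorted(Path(snapshot_dir).glob("dmesg-*.txt")):
        text = path.read_text(errors="replace")
        for name, rx in PATTERNS.items():
            if rx.search(text):
                files_with[name] += 1
        # A set, so a process is counted once per file, not once per line.
        procs.update(set(COMM.findall(text)))
    return files_with, procs


if __name__ == "__main__" and len(sys.argv) == 3:
    mode, directory = sys.argv[1], sys.argv[2]
    if mode == "collect":
        while True:  # one snapshot every 2 minutes, until interrupted
            save_snapshot(directory)
            time.sleep(120)
    elif mode == "tally":
        hits, procs = tally(directory)
        print(dict(hits))
        print(procs.most_common())
```

Run it with "collect DIR" to gather snapshots and "tally DIR" to summarize them. Reading dmesg may require root on systems that restrict access to the kernel log.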
I would have thought the message about the problems with /sdb6 was
significant but I tried installing on a different drive, and the same
problem continues. Then it hit me - I have accounting turned on and
that's where /var/log lives .... so that one was probably a false lead.
Finally, I suspected a memory problem, but memtest86 found no
errors.
I still have about a week to return MB or CPU or Memory if I could
narrow the problem down to one of them - but without being able to
narrow the problem down to a specific component, I can't really do that.
Any suggestions would be greatly appreciated. I don't want to be stuck with a buggy system.
H/W S/W DETAILS:
Here are some of the pertinent details for this box:
(BTW: nothing is overclocked)
MB: Gigabyte 890XA-UD3
CPU: AMD Phenom(tm) II X6 1090T Processor
Mem: 8GB Kingston HyperX Blu 4GB 240-Pin DDR3 SDRAM DDR3 1333
HDs: 5 SATA 3 drives, 3.75TB total
Installed cards:
- D-Link System Inc DGE-560T PCI Express Gigabit Ethernet Adapter (rev 13)
- Adaptec AHA-2940U2/U2W
- Creative Labs SB Audigy (rev 04)
- JMicron Technology Corp. JMB362/JMB363 Serial ATA Controller*
(* for a set of 4 2GB drives to hold data for analysis - drives not
currently on system)
Alien drivers: I'm using the nvidia drivers now, but the same thing
happened with the nouveau and vesa drivers, so I don't think that's it.
Finally, I'm a computer geek for a living, and I also build about 6-8
new systems a year for myself or friends and upgrade 12-15. I've been
doing this for at least 15 years and consider myself relatively
experienced with working at the hardware level.
The install was a fresh install, wiping out the previous FC13
installation.
Thanks,
-- William
william w. austin airedad at att.net
======================================================================
"life is just another phase i'm going through... this time anyway"