I'm getting some strange things happening on F10 lately. In a terminal session, if I run "ps -ef" or "w" or "ls" or anything of that nature, the command runs (sometimes) but never returns the prompt. Also, my loadavg seems to go sky high but "top" doesn't necessarily show anything using much CPU (actually, top will often show that my CPU is 90+% idle while my loadavg is upwards of 10). I can usually shut down my "gui"fied apps (like Thunderbird, Firefox, VirtualBox), but if I try to reboot, the reboot hangs while trying to shut down "automount" and I have to cycle the power on the box. I'm running the latest updates, including kernel 2.6.27.15-170.2.24.fc10.i686. Any thoughts or suggestions are appreciated.
Thanks.
Kevin
Kevin Martin writes:
I'm getting some strange things happening on F10 lately. In a terminal session if I "ps -ef" or "w" or "ls" or anything of that nature the command runs (sometimes) but never returns the prompt...
Stuff like that usually indicates a kernel bug. Try rolling back to an earlier kernel, and see if it helps.
Perhaps a similar kernel bug that nailed me on one of my x86_64 servers has now migrated to i686 land. On one of my servers, 2.6.27.12 was the first kernel that could actually survive under load. Earlier kernels kept falling apart on me, with processes getting stuck in permanent TASK_UNINTERRUPTIBLE states, which prevented me from updating to F10 from F9. That was barrels of laughs to deal with.
Roll back to an earlier kernel, then just keep trying new kernels as they become released, and cross your fingers.
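A note on why the load average and top can disagree: on Linux the load average counts tasks in uninterruptible sleep (state D, the TASK_UNINTERRUPTIBLE mentioned above) as well as runnable ones, so a handful of stuck processes can push it to 10 while the CPU sits idle. Since "ps" itself may hang, the following is a minimal sketch that reads /proc/<pid>/stat directly instead. The function names are made up for illustration, and it assumes that reading the stat files does not block on the affected system, which is usually but not necessarily the case.

```shell
# Print the state letter from one /proc/<pid>/stat line.
# The command name sits in parentheses and may itself contain spaces or ")",
# so strip everything up to the last ") " before reading the state field.
stat_state() {
    rest=${1##*") "}
    printf '%s\n' "${rest%% *}"
}

# List the PID and command name of every task currently in state D.
list_blocked() {
    for f in /proc/[0-9]*/stat; do
        [ -r "$f" ] || continue
        line=$(cat "$f" 2>/dev/null) || continue   # the task may have exited
        [ "$(stat_state "$line")" = "D" ] || continue
        pid=${line%% *}
        name=${line#*"("}
        name=${name%")"*}
        printf '%s\t%s\n' "$pid" "$name"
    done
}

list_blocked
```

If the same PIDs show up in this list every time the prompt fails to come back, that points at whatever those tasks are waiting on (a disk, a mount, a driver) rather than at CPU load.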
On Tue, 24 Feb 2009 18:42:53 -0500 Sam Varshavchik wrote:
I'm getting some strange things happening on F10 lately. In a terminal session if I "ps -ef" or "w" or "ls" or anything of that nature the command runs (sometimes) but never returns the prompt...
Stuff like that usually indicates a kernel bug. Try rolling back to an earlier kernel, and see if it helps.
Or a hardware bug. I always try things like running memtest86+ and examining logs for disk errors or SMART messages, or use smartctl to run tests on the disk.
I recently solved a completely frozen system problem by changing SATA cables and switching to a different SATA port (so don't neglect checking the simple stuff either).
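For the log check suggested above, here is a rough sketch of a filter for kernel messages that often accompany disk or cabling trouble. The pattern list is an assumption based on typical libata and block-layer messages and is certainly not exhaustive; disk_error_lines is a made-up name, and /dev/sda in the comments is only a placeholder for whichever disk is suspect.

```shell
# Read kernel log text on stdin and keep lines that look like disk trouble.
disk_error_lines() {
    grep -iE 'I/O error|medium error|ata[0-9]+(\.[0-9]+)?: .*(exception|failed|resetting link)'
}

# Usage:
#   dmesg | disk_error_lines
#   disk_error_lines < /var/log/messages
#
# SMART checks with smartctl (run as root):
#   smartctl -H /dev/sda            # overall health verdict
#   smartctl -t short /dev/sda      # start a short self-test
#   smartctl -l selftest /dev/sda   # read the self-test log afterwards
```

An empty result does not prove the hardware is fine, since a machine that locks up hard may never get the message written to disk.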
On Tue, 2009-02-24 at 17:29 -0600, Kevin Martin wrote:
I'm getting some strange things happening on F10 lately. In a terminal session if I "ps -ef" or "w" or "ls" or anything of that nature the command runs (sometimes) but never returns the prompt...
Does this happen all the time? If not, do you have any hard mounts from NFS servers?
If it does happen all the time, try "strace ps -ef" and watch the output.
poc
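One way to act on the strace suggestion when the terminal is going to hang anyway is to send the trace to a file and inspect it from another shell. The sketch below assumes strace writes the start of a system call to the file before the call returns, so a blocked call typically shows up as a last line with no " = result" part. The helper name and the /tmp/ps.trace path are arbitrary choices for illustration.

```shell
# Print trace lines that never got a return value, skipping strace's own
# "+++ exited +++" and "--- SIGNAL ---" lines. $1 is the trace file.
unfinished_calls() {
    grep -vE ' = |^\+\+\+ |^--- ' "$1"
}

# Usage:
#   strace -o /tmp/ps.trace ps -ef > /dev/null &
#   sleep 10
#   unfinished_calls /tmp/ps.trace
#   tail -n 5 /tmp/ps.trace    # the calls leading up to the hang
```

If the unfinished call is a read or open on something under /proc/<pid>/, that PID is a good candidate for the process that is actually stuck.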
On Tue, Feb 24, 2009 at 3:29 PM, Kevin Martin kevintm@ameritech.net wrote:
I'm getting some strange things happening on F10 lately. In a terminal session if I "ps -ef" or "w" or "ls" or anything of that nature the command runs (sometimes) but never returns the prompt...
Try connecting remotely: are you able to log in cleanly? Can you issue commands such as 'ps -ef' in the remote terminal? I had a similar problem a while ago and it was related to a large number of ssh processes "hanging around". I killed them all and everything was fine. Check the logs; your problem could have several other causes. As mentioned earlier, don't forget the simple stuff. Check that the /, /tmp and /var filesystems are OK and not full. Some systems slow down when they overheat, so check the fans. If you don't have enough physical memory, make sure you have enough swap space.
~af
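The "simple stuff" checks above can be scripted roughly as follows. This is only a sketch: over_threshold is a made-up helper, and the 90% limit is an arbitrary assumption rather than a recommended value. It relies on the POSIX output format of "df -P", where the fifth column is the usage percentage and the sixth is the mount point.

```shell
# Read "df -P" output on stdin and print mount points at or above a usage
# limit given as $1 (in percent, default 90).
over_threshold() {
    awk -v limit="${1:-90}" '
        NR > 1 {
            use = $5
            sub(/%/, "", use)
            if (use + 0 >= limit) print $6, $5
        }'
}

# Usage:
#   df -P / /tmp /var | over_threshold 90
#
# Related checks mentioned in the message:
#   free -m                    # physical memory and swap, in MB
#   ps -C sshd -o pid,etime    # leftover sshd processes and their age
```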
Kevin Martin wrote:
I'm getting some strange things happening on F10 lately. In a terminal session if I "ps -ef" or "w" or "ls" or anything of that nature the command runs (sometimes) but never returns the prompt...
Check the %wa field in top - most likely the CPU is waiting on I/O (probably the hard disk). I always look at %id to get the amount of CPU actually in use, because %wa is not added to %system or %user.
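If top itself is sluggish or hangs, roughly the same %wa figure can be derived from two samples of the first line of /proc/stat. The sketch below assumes the usual field order on that line (user, nice, system, idle, iowait, irq, softirq, steal) and uses integer arithmetic, so treat the result as an approximation. iowait_percent is a made-up name.

```shell
# Given two "cpu ..." lines from /proc/stat taken some seconds apart,
# print the share of elapsed CPU time spent in iowait, as a whole percent.
iowait_percent() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        NR == 1 { for (i = 2; i <= 9; i++) first[i] = $i }
        NR == 2 {
            total = 0
            for (i = 2; i <= 9; i++) total += $i - first[i]
            wait = $6 - first[6]      # field 6 is iowait
            if (total <= 0) { print 0; exit }
            printf "%d\n", 100 * wait / total
        }'
}

# Usage:
#   a=$(head -n 1 /proc/stat); sleep 5; b=$(head -n 1 /proc/stat)
#   echo "iowait: $(iowait_percent "$a" "$b")%"
```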
Instead of power cycling, use the magic SysRq sequence REISUB - it should be easier on your file system. You need to enable it first with "kernel.sysrq = 1" in /etc/sysctl.conf.
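For reference, REISUB is typed while holding Alt and SysRq: R takes the keyboard out of raw mode, E sends SIGTERM to all processes, I sends SIGKILL, S syncs the disks, U remounts filesystems read-only, and B reboots. The sketch below checks whether SysRq is currently enabled; sysrq_state is a made-up helper, and the commented commands assume a root shell.

```shell
# Report how the running kernel treats magic SysRq.
# $1 is the file to read; it defaults to the live setting under /proc.
sysrq_state() {
    value=$(cat "${1:-/proc/sys/kernel/sysrq}" 2>/dev/null) || { echo "unknown"; return 0; }
    case $value in
        0) echo "disabled" ;;
        1) echo "all functions enabled" ;;
        *) echo "bitmask $value" ;;
    esac
}

sysrq_state

# As root, to enable it right away without a reboot:
#   sysctl -w kernel.sysrq=1
# As root, while commands are hanging, to log all blocked (state D) tasks:
#   echo w > /proc/sysrq-trigger
#   dmesg | tail -n 50
```

The "w" dump may be the more useful part here, since it records in the kernel log where each stuck task is waiting.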
Konstantin Svist wrote:
Kevin Martin wrote:
I'm getting some strange things happening on F10 lately. In a terminal session if I "ps -ef" or "w" or "ls" or anything of that nature the command runs (sometimes) but never returns the prompt...
Check the %wa field in top - most likely the CPU is busy with some I/O accesses (probably the HD). I always look at the %id to get the % CPU amount actually used because %wa is not added to %system or %user
Instead of power cycling, use REISUB - it should be easier on your file system (need to enable it with "kernel.sysrq = 1" in /etc/sysctl.conf)
Looks like I misread - now I see that you wrote that idle is 90+... REISUB still applies though - http://fosswire.com/2007/09/08/fix-a-frozen-system-with-the-magic-sysrq-keys...
On 02/24/2009 07:39 PM, Konstantin Svist wrote:
Konstantin Svist wrote:
Kevin Martin wrote:
I'm getting some strange things happening on F10 lately. In a terminal session if I "ps -ef" or "w" or "ls" or anything of that nature the command runs (sometimes) but never returns the prompt...
Instead of power cycling, use REISUB - it should be easier on your file system (need to enable it with "kernel.sysrq = 1" in /etc/sysctl.conf)
Thanks for the tip. I edited sysctl.conf and rebooted. Using the key sequence, I can reboot my machine when things are good. But just a few minutes ago it locked up completely. I was looking at a frozen screen, and trying Alt-SysRq REISUB did nothing. I had to hit the power switch. Also, I'm unable to ssh into the box when it's locked up.
Nothing seems to hit /var/log/messages right before the lockups. Is there any other location I can look at for post-mortem clues?
--
Steve
Patrick O'Callaghan wrote:
On Tue, 2009-02-24 at 17:29 -0600, Kevin Martin wrote:
I'm getting some strange things happening on F10 lately. In a terminal session if I "ps -ef" or "w" or "ls" or anything of that nature the command runs (sometimes) but never returns the prompt...
Does this happen all the time? If not, do you have any hard mounts from NFS servers?
If it does happen all the time, try "strace ps -ef" and watch the output.
poc
No, no NFS mounts at all and it seems to happen sporadically.
dmesg doesn't show anything of particular interest at these times either (and it is one of the only commands that will finish and give back a prompt).
Kevin
On Wed, 2009-02-25 at 19:24 -0600, Kevin Martin wrote:
Patrick O'Callaghan wrote:
On Tue, 2009-02-24 at 17:29 -0600, Kevin Martin wrote:
I'm getting some strange things happening on F10 lately. In a terminal session if I "ps -ef" or "w" or "ls" or anything of that nature the command runs (sometimes) but never returns the prompt...
Does this happen all the time? If not, do you have any hard mounts from NFS servers?
If it does happen all the time, try "strace ps -ef" and watch the output.
poc
No, no NFS mounts at all and it seems to happen sporadically.
dmesg doesn't show anything of particular interest at these times either (and it is one of the only commands that will finish and give back a prompt).
I recall a long time ago having a problem that this one sounds a bit like. It turned out to be bad memory.
Any changes to your memory configuration recently? Have you run memtest?
Kevin