F29 updated.
For the last couple of days I've turned on my monitor in the morning to find Plasma unresponsive, i.e. clicking on the bottom panel does nothing, neither does right-clicking on the background.
The machine is left on at night (not suspended or hibernating, just the monitor turned off), and today the clock widget shows a frozen time of 4:55am. Nothing is running overnight except the DE and my backup software (rsnapshot) which does run at around that time though it has been doing that for years without issue and according to the logs completed on time as usual.
'killall -s SIGSEGV plasmashell' (which in the past would kick it into restarting) is ineffective. Maybe the magic incantation has changed, but as none of this stuff is documented in man pages I don't know.
'kill <PID>' does kill it, but how do I run it again without logging out? Simply running plasmashell from the command line does put up the widget panel but it's still frozen, and the command never returns.
poc
On 11/6/18 7:24 PM, Patrick O'Callaghan wrote:
F29 updated.
For the last couple of days I've turned on my monitor in the morning to find Plasma unresponsive, i.e. clicking on the bottom panel does nothing, neither does right-clicking on the background.
So, you can move the mouse?
The machine is left on at night (not suspended or hibernating, just the monitor turned off), and today the clock widget shows a frozen time of 4:55am. Nothing is running overnight except the DE and my backup software (rsnapshot) which does run at around that time though it has been doing that for years without issue and according to the logs completed on time as usual.
'killall -s SIGSEGV plasmashell' (which in the past would kick it into restarting) is ineffective. Maybe the magic incantation has changed, but as none of this stuff is documented in man pages I don't know.
And where are you typing these commands? Doing a Ctrl-Alt-F2?
'kill <PID>' does kill it, but how do I run it again without logging out? Simply running plasmashell from the command line does put up the widget panel but it's still frozen, and the command never returns.
Back in the F26 or F27 days I was having some issues. Forgot the circumstances but back then I was advised that doing
kquitapp plasmashell
plasmashell &
Was the "preferred" method of killing/restarting.
My system is up 24/7 as well and most of the times I turn off the monitors. No issues for me.
On Tue, 2018-11-06 at 19:35 +0800, Ed Greshko wrote:
On 11/6/18 7:24 PM, Patrick O'Callaghan wrote:
F29 updated.
For the last couple of days I've turned on my monitor in the morning to find Plasma unresponsive, i.e. clicking on the bottom panel does nothing, neither does right-clicking on the background.
So, you can move the mouse?
The machine is left on at night (not suspended or hibernating, just the monitor turned off), and today the clock widget shows a frozen time of 4:55am. Nothing is running overnight except the DE and my backup software (rsnapshot) which does run at around that time though it has been doing that for years without issue and according to the logs completed on time as usual.
'killall -s SIGSEGV plasmashell' (which in the past would kick it into restarting) is ineffective. Maybe the magic incantation has changed, but as none of this stuff is documented in man pages I don't know.
And where are you typing these commands? Doing a Ctrl-Alt-F2?
I used Shift-Tab to get to an open Konsole. Note that everything except Plasma still works, my other windows are running etc. I even sent that report from Evolution before logging out.
'kill <PID>' does kill it, but how do I run it again without logging out? Simply running plasmashell from the command line does put up the widget panel but it's still frozen, and the command never returns.
Back in the F26 or F27 days I was having some issues. Forgot the circumstances but back then I was advised that doing
kquitapp plasmashell
plasmashell &
Yes, I remembered too late that I even have a script for that (BTW it's now kquitapp5). However:
It just happened again, this time when I came back from my lunch break. At least that eliminates my backup system from suspicion.
This time I just logged out using Ctrl-Alt-Del, then logged in again. Plasma started up but was still frozen (except the clock widget now showed the updated time). I tried the same thing again, with no difference.
Doing it a third time would indicate insanity, so this time I switched to another console (Shift-Alt-Fn), logged in as root and tried a 'telinit 3'. The DE disappeared, I ran 'telinit 5' and the DE didn't come back. However I now had a GUI cursor on my root console window, so presumably Xorg was running even though some component of KDE was not.
I rebooted.
poc
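For reference, the quit-and-relaunch sequence discussed above can be wrapped in a small script. This is only a sketch, assuming Plasma 5 on X11 with kquitapp5 and killall installed; the function names and the DRY_RUN switch are made up for illustration. It restarts the shell but cannot fix whatever made the panel freeze in the first place.

```shell
#!/bin/sh
# Sketch: restart plasmashell from a terminal inside the running session.
# Assumption: Plasma 5, where the quit helper is called kquitapp5.
# Set DRY_RUN=1 to print the commands instead of running them.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

restart_plasmashell() {
    # Ask plasmashell to exit cleanly; fall back to a plain SIGTERM.
    run kquitapp5 plasmashell || run killall plasmashell
    # Relaunch detached so the terminal prompt comes back.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "nohup plasmashell >/dev/null 2>&1 &"
    else
        nohup plasmashell >/dev/null 2>&1 &
    fi
}
```

Source the file in Konsole and call restart_plasmashell; with DRY_RUN=1 it only prints what it would run.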
Il 11/6/18 12:24 PM, Patrick O'Callaghan ha scritto:
F29 updated.
For the last couple of days I've turned on my monitor in the morning to find Plasma unresponsive, i.e. clicking on the bottom panel does nothing, neither does right-clicking on the background.
I had this kind of problem at an earlier stage of F29 development under a QEMU virtual machine. At that time I remember I was able to resurrect Plasma by manually locking the screen from Konsole (with 'qdbus org.freedesktop.ScreenSaver /ScreenSaver Lock') and then unlocking it again.
After a while I stopped experiencing this problem, so I never reported it. You can try my workaround to check whether it's the same problem I had.
Mattia
On Tue, 2018-11-06 at 16:21 +0000, Mattia Verga wrote:
Il 11/6/18 12:24 PM, Patrick O'Callaghan ha scritto:
F29 updated.
For the last couple of days I've turned on my monitor in the morning to find Plasma unresponsive, i.e. clicking on the bottom panel does nothing, neither does right-clicking on the background.
I had this kind of problem at an earlier stage of F29 development under a QEMU virtual machine. At that time I remember I was able to resurrect Plasma by manually locking the screen from Konsole (with 'qdbus org.freedesktop.ScreenSaver /ScreenSaver Lock') and then unlocking it again.
After a while I stopped experiencing this problem, so I never reported it. You can try my workaround to check whether it's the same problem I had.
I'll try it if/when it happens again, thanks.
poc
On Tuesday, 6 November 2018 12:24:17 CET Patrick O'Callaghan wrote:
'killall -s SIGSEGV plasmashell' (which in the past would kick it into restarting) is ineffective. Maybe the magic incantation has changed, but as none of this stuff is documented in man pages I don't know.
'kill <PID>' does kill it, but how do I run it again without logging out? Simply running plasmashell from the command line does put up the widget panel but it's still frozen, and the command never returns.
Not that it is going to solve your problem, but have you tried:
kquitapp plasmashell
Are you running kwayland?
Is your session getting locked?
On Tue, 2018-11-06 at 18:17 +0100, Marc Deop i Argemí wrote:
On Tuesday, 6 November 2018 12:24:17 CET Patrick O'Callaghan wrote:
'killall -s SIGSEGV plasmashell' (which in the past would kick it into restarting) is ineffective. Maybe the magic incantation has changed, but as none of this stuff is documented in man pages I don't know.
'kill <PID>' does kill it, but how do I run it again without logging out? Simply running plasmashell from the command line does put up the widget panel but it's still frozen, and the command never returns.
Not that it is going to solve your problem, but have you tried:
kquitapp plasmashell
See previous answer in the thread. It's kquitapp5 by the way (they seem to be different).
Are you running kwayland?
No.
Is your session getting locked?
No. Everything works except Plasma.
poc
Patrick O'Callaghan wrote:
'kill <PID>' does kill it, but how do I run it again without logging out? Simply running plasmashell from the command line does put up the widget panel but it's still frozen, and the command never returns.
You can (re)launch 'plasmashell' via krunner (ALT-F2)
That said, may be worth trying to get a backtrace out of the deadlocked plasmashell process, if you have terminal access, run:
gdb -p <PID>
and issue command: thread apply all bt
Getting a worthwhile backtrace may require installing -debuginfo packages, via something like: dnf debuginfo-install plasma-workspace (warning, it's a lot, but will give much better results)
-- Rex
On Wed, 2018-11-07 at 11:18 -0600, Rex Dieter wrote:
Patrick O'Callaghan wrote:
'kill <PID>' does kill it, but how do I run it again without logging out? Simply running plasmashell from the command line does put up the widget panel but it's still frozen, and the command never returns.
You can (re)launch 'plasmashell' via krunner (ALT-F2)
I had another freeze overnight. This time I tried 'kquitapp5 plasmashell' and it did nothing. I also tried killing it with SEGV and again nothing. A straight kill did manage to kill it.
However, re-running with 'plasmashell &' just brought back a frozen widget panel. It looks like there is something more fundamental going on.
I've now installed this morning's updates which include a bunch of KDE stuff (mostly kf5* I think), and rebooted the machine just in case. I'll report back if the problem recurs.
That said, may be worth trying to get a backtrace out of the deadlocked plasmashell process, if you have terminal access, run:
gdb -p <PID>
and issue command: thread apply all bt
Getting a worthwhile backtrace may require installing -debuginfo packages, via something like: dnf debuginfo-install plasma-workspace (warning, it's a lot, but will give much better results)
I'll do that if and when it happens again.
poc
On Wed, 2018-11-07 at 11:18 -0600, Rex Dieter wrote:
Patrick O'Callaghan wrote:
'kill <PID>' does kill it, but how do I run it again without logging out? Simply running plasmashell from the command line does put up the widget panel but it's still frozen, and the command never returns.
You can (re)launch 'plasmashell' via krunner (ALT-F2)
That said, may be worth trying to get a backtrace out of the deadlocked plasmashell process, if you have terminal access, run:
gdb -p <PID>
and issue command: thread apply all bt
Getting a worthwhile backtrace may require installing -debuginfo packages, via something like: dnf debuginfo-install plasma-workspace (warning, it's a lot, but will give much better results)
Another freeze, this time while I was using the machine.
Gdb didn't give any results at all:
$ pgrep -fl plasma
7619 plasmashell
9471 plasma-browser-
$ gdb -p 7619
GNU gdb (GDB) Fedora 8.2-3.fc29
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at:
http://www.gnu.org/software/gdb/documentation/.
For help, type "help".
Type "apropos word" to search for commands related to "word".
Attaching to process 7619
thread apply all bt
That's it. gdb froze. I then did a Ctrl-Z and
$ bg
$ pgrep gdb
1049
$ strace -p 1049
strace: Process 1049 attached
wait4(7619,
and again, that's it. So gdb is hung waiting for the plasmashell process. This looks like a kernel-level hard wait. A look at 'ps' shows:
$ ps axl|grep plasmashell
...
0 1000 7619 1 20 0 3259992 102920 - Dl ? 0:57 /usr/bin/plasmashell
so the STAT value is Dl (uninterruptible sleep, multi-threaded), while the WCHAN column shows only "-".
That's all I've got for now.
poc
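The 'D' in the ps output above marks a process in uninterruptible sleep, which is why neither gdb nor SIGSEGV gets anywhere. A quick way to see every process in that state, together with the kernel function it is waiting in, might look like the sketch below. The function names are made up for illustration, and the column list assumes the procps version of ps.

```shell
#!/bin/sh
# Sketch: list processes stuck in uninterruptible sleep (STAT starts with "D").

# filter_dstate reads "PID STAT WCHAN COMMAND" lines on stdin (header included)
# and prints only the rows whose STAT field begins with D.
filter_dstate() {
    awk 'NR > 1 && $2 ~ /^D/ { print $1, $2, $3, $4 }'
}

# list_dstate feeds live ps output through the filter.
# wchan:32 widens the column so the kernel symbol name is not truncated.
list_dstate() {
    ps -eo pid,stat,wchan:32,comm | filter_dstate
}
```

Running list_dstate while the panel is frozen should show plasmashell if it is still blocked. Where the kernel provides it, reading /proc/<PID>/stack as root gives the full kernel stack of the blocked task, which is probably more useful for a bug report than a userspace backtrace.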
On Thu, 2018-11-08 at 17:25 +0000, Patrick O'Callaghan wrote:
On Wed, 2018-11-07 at 11:18 -0600, Rex Dieter wrote:
Patrick O'Callaghan wrote:
'kill <PID>' does kill it, but how do I run it again without logging out? Simply running plasmashell from the command line does put up the widget panel but it's still frozen, and the command never returns.
You can (re)launch 'plasmashell' via krunner (ALT-F2)
That said, may be worth trying to get a backtrace out of the deadlocked plasmashell process, if you have terminal access, run:
gdb -p <PID>
and issue command: thread apply all bt
Getting a worthwhile backtrace may require installing -debuginfo packages, via something like: dnf debuginfo-install plasma-workspace (warning, it's a lot, but will give much better results)
Another freeze, this time while I was using the machine.
Gdb didn't give any results at all:
$ pgrep -fl plasma
7619 plasmashell
9471 plasma-browser-
$ gdb -p 7619
GNU gdb (GDB) Fedora 8.2-3.fc29
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at:
http://www.gnu.org/software/gdb/documentation/.
For help, type "help".
Type "apropos word" to search for commands related to "word".
Attaching to process 7619
thread apply all bt
That's it. gdb froze. I then did a Ctrl-Z and
$ bg
$ pgrep gdb
1049
$ strace -p 1049
strace: Process 1049 attached
wait4(7619,
and again, that's it. So gdb is hung waiting for the plasmashell process. This looks like a kernel-level hard wait. A look at 'ps' shows:
$ ps axl|grep plasmashell
...
0 1000 7619 1 20 0 3259992 102920 - Dl ? 0:57 /usr/bin/plasmashell
so the STAT value is Dl (uninterruptible sleep, multi-threaded), while the WCHAN column shows only "-".
Following up on this last, I wondered if the problem could be NFS-related. I have an NFS volume only used for backup and mounted from my NAS, so tried using 'fuser' to pin down what could be using it, but 'fuser' itself just sat there with no results. Curiouser and curiouser.
I masked out the NFS mount from /etc/fstab and rebooted. I've now been running for 48 hours with no freezes or hangs, so I'm tentatively calling this an NFS bug. I might try re-enabling it using automount just for kicks.
Not sure what else I can provide to BZ to make it debuggable, as plasmashell seems to be the only thing that triggers it.
poc
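Since 'fuser' hanging is itself a hint that the mount is not answering, one way to check a mount point without tying up the terminal is to put a time limit on the probe. The following is a rough sketch, assuming GNU coreutils (timeout and stat); probe_mount is a made-up name, and a stat may occasionally be answered from the client's attribute cache even when the server is unreachable.

```shell
#!/bin/sh
# Sketch: check whether a directory responds to stat within a time limit.

# probe_mount DIR [SECONDS]
# Succeeds if stat on DIR returns within SECONDS (default 5).
# SIGKILL is used because a task blocked on NFS usually ignores
# everything except fatal signals.
probe_mount() {
    timeout -s KILL "${2:-5}" stat -t -- "$1" >/dev/null 2>&1
}
```

For example, 'probe_mount /path/to/nfs/mount 10 || echo "mount not responding"', with the path replaced by the real mount point.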
On 11/13/18 7:33 AM, Patrick O'Callaghan wrote:
Following up on this last, I wondered if the problem could be NFS-related. I have an NFS volume only used for backup and mounted from my NAS, so tried using 'fuser' to pin down what could be using it, but 'fuser' itself just sat there with no results. Curiouser and curiouser.
I masked out the NFS mount from /etc/fstab and rebooted. I've now been running for 48 hours with no freezes or hangs, so I'm tentatively calling this an NFS bug. I might try re-enabling it using automount just for kicks.
Not sure what else I can provide to BZ to make it debuggable, as plasmashell seems to be the only thing that triggers it.
I've got 4 NFS mounts on one system and 1 NFS mount on another. In the case of the 1 mount it is actually on a system using an IPv6 tunnel and the mount is done on IPv6. This means the network path is from Taiwan, through the US, and back to Taiwan. (This is just for testing).
Neither of my systems are having any freezes and both are up 24/7.
I use...
nfs4 rw,soft,fg,x-systemd.automount
in my fstab
On Tue, 2018-11-13 at 08:17 +0800, Ed Greshko wrote:
On 11/13/18 7:33 AM, Patrick O'Callaghan wrote:
Following up on this last, I wondered if the problem could be NFS-related. I have an NFS volume only used for backup and mounted from my NAS, so tried using 'fuser' to pin down what could be using it, but 'fuser' itself just sat there with no results. Curiouser and curiouser.
I masked out the NFS mount from /etc/fstab and rebooted. I've now been running for 48 hours with no freezes or hangs, so I'm tentatively calling this an NFS bug. I might try re-enabling it using automount just for kicks.
Not sure what else I can provide to BZ to make it debuggable, as plasmashell seems to be the only thing that triggers it.
I've got 4 NFS mounts on one system and 1 NFS mount on another. In the case of the 1 mount it is actually on a system using an IPv6 tunnel and the mount is done on IPv6. This means the network path is from Taiwan, through the US, and back to Taiwan. (This is just for testing).
Neither of my systems are having any freezes and both are up 24/7.
I use...
nfs4 rw,soft,fg,x-systemd.automount
in my fstab
Thanks Ed. My entry is:
nfs user,rw,async,comment=systemd.mount 0 0
I'll try using your parameters and see what happens. The NAS is an old device, probably with NFS3 (or even 2).
poc
On Tue, 2018-11-13 at 09:16 +0000, Patrick O'Callaghan wrote:
On Tue, 2018-11-13 at 08:17 +0800, Ed Greshko wrote:
On 11/13/18 7:33 AM, Patrick O'Callaghan wrote:
Following up on this last, I wondered if the problem could be NFS-related. I have an NFS volume only used for backup and mounted from my NAS, so tried using 'fuser' to pin down what could be using it, but 'fuser' itself just sat there with no results. Curiouser and curiouser.
I masked out the NFS mount from /etc/fstab and rebooted. I've now been running for 48 hours with no freezes or hangs, so I'm tentatively calling this an NFS bug. I might try re-enabling it using automount just for kicks.
Not sure what else I can provide to BZ to make it debuggable, as plasmashell seems to be the only thing that triggers it.
I've got 4 NFS mounts on one system and 1 NFS mount on another. In the case of the 1 mount it is actually on a system using an IPv6 tunnel and the mount is done on IPv6. This means the network path is from Taiwan, through the US, and back to Taiwan. (This is just for testing).
Neither of my systems are having any freezes and both are up 24/7.
I use...
nfs4 rw,soft,fg,x-systemd.automount
in my fstab
Thanks Ed. My entry is:
nfs user,rw,async,comment=systemd.mount 0 0
I'll try using your parameters and see what happens. The NAS is an old device, probably with NFS3 (or even 2).
Nope, just froze again. This is with your options but I don't think that matters.
poc
On Tue, 2018-11-13 at 17:00 +0000, Patrick O'Callaghan wrote:
nfs user,rw,async,comment=systemd.mount 0 0
I'll try using your parameters and see what happens. The NAS is an old device, probably with NFS3 (or even 2).
Nope, just froze again. This is with your options but I don't think that matters.
Or maybe it does. After freezing for an appreciable time (minutes) it has now spontaneously unfrozen itself. I'll keep an eye on it and see if it happens again.
poc
Patrick O'Callaghan wrote:
On Tue, 2018-11-13 at 17:00 +0000, Patrick O'Callaghan wrote:
nfs user,rw,async,comment=systemd.mount 0 0
I'll try using your parameters and see what happens. The NAS is an old device, probably with NFS3 (or even 2).
Nope, just froze again. This is with your options but I don't think that matters.
Or maybe it does. After freezing for an appreciable time (minutes) it has now spontaneously unfrozen itself. I'll keep an eye on it and see if it happens again.
I suspect it may be one or both of these that made a difference: soft, automount
(I'd bet on soft... ie, the mount becomes unresponsive, but now your system can continue despite that)
-- Rex
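To expand on the soft/hard distinction: on a hard mount (the default) the client retries NFS requests indefinitely, so anything touching the mount blocks until the server answers, whereas a soft mount gives up after a limited number of retransmissions and returns an error to the caller. To confirm which behaviour is actually in effect, the mounted options can be inspected, roughly as in this sketch. It assumes findmnt from util-linux; has_opt and report_nfs_mounts are made-up names.

```shell
#!/bin/sh
# Sketch: report whether each mounted NFS filesystem is soft or hard.

# has_opt OPTIONS NAME
# Succeeds if NAME is one of the comma-separated OPTIONS.
has_opt() {
    case ",$1," in
        *",$2,"*) return 0 ;;
        *) return 1 ;;
    esac
}

# report_nfs_mounts prints "<target>: soft" or "<target>: hard" per NFS mount.
# findmnt -r gives raw output, -n drops the header line.
report_nfs_mounts() {
    findmnt -rn -t nfs,nfs4 -o TARGET,OPTIONS | while read -r target opts; do
        if has_opt "$opts" soft; then
            echo "$target: soft"
        else
            echo "$target: hard"
        fi
    done
}
```

With an automount entry the filesystem only shows up in this list once something has accessed the mount point.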
On Tue, 2018-11-13 at 14:25 -0600, Rex Dieter wrote:
Patrick O'Callaghan wrote:
On Tue, 2018-11-13 at 17:00 +0000, Patrick O'Callaghan wrote:
nfs user,rw,async,comment=systemd.mount 0 0
I'll try using your parameters and see what happens. The NAS is an old device, probably with NFS3 (or even 2).
Nope, just froze again. This is with your options but I don't think that matters.
Or maybe it does. After freezing for an appreciable time (minutes) it has now spontaneously unfrozen itself. I'll keep an eye on it and see if it happens again.
I suspect it may be one or both of these that made a difference: soft, automount
(I'd bet on soft... ie, the mount becomes unresponsive, but now your system can continue despite that)
Perhaps. It just did it again. This time the pause was around 10-12 minutes. Note that the only thing that stops (apparently) is plasmashell. The rest of the system continues working normally and I can switch desktops using Ctrl-Fn with no problem.
poc
On 11/14/18 6:10 AM, Patrick O'Callaghan wrote:
On Tue, 2018-11-13 at 14:25 -0600, Rex Dieter wrote:
Patrick O'Callaghan wrote:
On Tue, 2018-11-13 at 17:00 +0000, Patrick O'Callaghan wrote:
nfs user,rw,async,comment=systemd.mount 0 0
I'll try using your parameters and see what happens. The NAS is an old device, probably with NFS3 (or even 2).
Nope, just froze again. This is with your options but I don't think that matters.
Or maybe it does. After freezing for an appreciable time (minutes) it has now spontaneously unfrozen itself. I'll keep an eye on it and see if it happens again.
I suspect it may be one or both of these that made a difference: soft, automount
(I'd bet on soft... ie, the mount becomes unresponsive, but now your system can continue despite that)
Perhaps. It just did it again. This time the pause was around 10-12 minutes. Note that the only thing that stops (apparently) is plasmashell. The rest of the system continues working normally and I can switch desktops using Ctrl-Fn with no problem.
Is your mount point within your home directory space?
If so, can you mount it elsewhere to see if the problem continues?
On Wed, 2018-11-14 at 06:21 +0800, Ed Greshko wrote:
On 11/14/18 6:10 AM, Patrick O'Callaghan wrote:
On Tue, 2018-11-13 at 14:25 -0600, Rex Dieter wrote:
Patrick O'Callaghan wrote:
On Tue, 2018-11-13 at 17:00 +0000, Patrick O'Callaghan wrote:
nfs user,rw,async,comment=systemd.mount 0 0
I'll try using your parameters and see what happens. The NAS is an old device, probably with NFS3 (or even 2).
Nope, just froze again. This is with your options but I don't think that matters.
Or maybe it does. After freezing for an appreciable time (minutes) it has now spontaneously unfrozen itself. I'll keep an eye on it and see if it happens again.
I suspect it may be one or both of these that made a difference: soft, automount
(I'd bet on soft... ie, the mount becomes unresponsive, but now your system can continue despite that)
Perhaps. It just did it again. This time the pause was around 10-12 minutes. Note that the only thing that stops (apparently) is plasmashell. The rest of the system continues working normally and I can switch desktops using Ctrl-Fn with no problem.
Is your mount point within your home directory space?
If so, can you mount it elsewhere to see if the problem continues?
No, it's under a root directory.
poc
On 11/14/18 5:51 PM, Patrick O'Callaghan wrote:
On Wed, 2018-11-14 at 06:21 +0800, Ed Greshko wrote:
On 11/14/18 6:10 AM, Patrick O'Callaghan wrote:
On Tue, 2018-11-13 at 14:25 -0600, Rex Dieter wrote:
Patrick O'Callaghan wrote:
On Tue, 2018-11-13 at 17:00 +0000, Patrick O'Callaghan wrote:
nfs user,rw,async,comment=systemd.mount 0 0
I'll try using your parameters and see what happens. The NAS is an old device, probably with NFS3 (or even 2).
Nope, just froze again. This is with your options but I don't think that matters.
Or maybe it does. After freezing for an appreciable time (minutes) it has now spontaneously unfrozen itself. I'll keep an eye on it and see if it happens again.
I suspect it may be one or both of these that made a difference: soft, automount
(I'd bet on soft... ie, the mount becomes unresponsive, but now your system can continue despite that)
Perhaps. It just did it again. This time the pause was around 10-12 minutes. Note that the only thing that stops (apparently) is plasmashell. The rest of the system continues working normally and I can switch desktops using Ctrl-Fn with no problem.
Is your mount point within your home directory space?
If so, can you mount it elsewhere to see if the problem continues?
No, it's under a root directory.
I see. How about a symlink from your home directory to it?
The reason I ask is that I had an issue a while back, can't recall the full details, where I'd have a pause of several seconds when there was a symlink to a mount point. Seemed like it may have been some sort of "look ahead" issue. I decided the symlink wasn't buying me anything so I removed it and really didn't investigate.
Does your NAS have any logs which may indicate a connection issue at some point?
On Wed, 2018-11-14 at 18:00 +0800, Ed Greshko wrote:
On 11/14/18 5:51 PM, Patrick O'Callaghan wrote:
On Wed, 2018-11-14 at 06:21 +0800, Ed Greshko wrote:
On 11/14/18 6:10 AM, Patrick O'Callaghan wrote:
On Tue, 2018-11-13 at 14:25 -0600, Rex Dieter wrote:
Patrick O'Callaghan wrote:
On Tue, 2018-11-13 at 17:00 +0000, Patrick O'Callaghan wrote:
nfs user,rw,async,comment=systemd.mount 0 0
I'll try using your parameters and see what happens. The NAS is an old device, probably with NFS3 (or even 2).
Nope, just froze again. This is with your options but I don't think that matters.
Or maybe it does. After freezing for an appreciable time (minutes) it has now spontaneously unfrozen itself. I'll keep an eye on it and see if it happens again.
I suspect it may be one or both of these that made a difference: soft, automount
(I'd bet on soft... ie, the mount becomes unresponsive, but now your system can continue despite that)
Perhaps. It just did it again. This time the pause was around 10-12 minutes. Note that the only thing that stops (apparently) is plasmashell. The rest of the system continues working normally and I can switch desktops using Ctrl-Fn with no problem.
Is your mount point within your home directory space?
If so, can you mount it elsewhere to see if the problem continues?
No, it's under a root directory.
I see. How about a symlink from your home directory to it?
The reason I ask is that I had an issue a while back, can't recall the full details, where I'd have a pause of several seconds when there was a symlink to a mount point. Seemed like it may have been some sort of "look ahead" issue. I decided the symlink wasn't buying me anything so I removed it and really didn't investigate.
No. (I double-checked to be sure). And we're not talking about several seconds but many minutes or even hours.
Does your NAS have any logs which may indicate a connection issue at some point?
No. When this has happened I've been able to log into the NAS with no problem. Its logs show nothing wrong.
It seems to be completely random. The panel freezes when I'm not even using it and I only notice when I try and click on something or notice the clock widget is showing the wrong time. Debugging doesn't work because gdb can't attach to a process in 'D' state (IIRC because ptrace() requires the kernel-level process to run code and it's blocked). Equally mysterious is how it suddenly unblocks itself.
poc
On 11/14/18 8:08 PM, Patrick O'Callaghan wrote:
Does your NAS have any logs which may indicate a connection issue at some point?
No. When this has happened I've been able to log into the NAS with no problem. Its logs show nothing wrong.
It seems to be completely random. The panel freezes when I'm not even using it and I only notice when I try and click on something or notice the clock widget is showing the wrong time. Debugging doesn't work because gdb can't attach to a process in 'D' state (IIRC because ptrace() requires the kernel-level process to run code and it's blocked). Equally mysterious is how it suddenly unblocks itself.
I see. I must admit I forgot you can get to other applications on your desktop.
What is your collection of widgets on your desktop and/or in the systray?
On 11/14/18 8:08 PM, Patrick O'Callaghan wrote:
Does your NAS have any logs which may indicate a connection issue at some point?
No. When this has happened I've been able to log into the NAS with no problem. Its logs show nothing wrong.
It seems to be completely random. The panel freezes when I'm not even using it and I only notice when I try and click on something or notice the clock widget is showing the wrong time. Debugging doesn't work because gdb can't attach to a process in 'D' state (IIRC because ptrace() requires the kernel-level process to run code and it's blocked). Equally mysterious is how it suddenly unblocks itself.
Oh, and I guess, one other thing.
How about creating a new user and leave that vanilla user logged in over night?
On Wed, 2018-11-14 at 20:27 +0800, Ed Greshko wrote:
On 11/14/18 8:08 PM, Patrick O'Callaghan wrote:
Does your NAS have any logs which may indicate a connection issue at some point?
No. When this has happened I've been able to log into the NAS with no problem. Its logs show nothing wrong.
It seems to be completely random. The panel freezes when I'm not even using it and I only notice when I try and click on something or notice the clock widget is showing the wrong time. Debugging doesn't work because gdb can't attach to a process in 'D' state (IIRC because ptrace() requires the kernel-level process to run code and it's blocked). Equally mysterious is how it suddenly unblocks itself.
Oh, and I guess, one other thing.
How about creating a new user and leave that vanilla user logged in over night?
Yes, that's about all I can think of for now.
poc
On Wed, 2018-11-14 at 15:58 +0000, Patrick O'Callaghan wrote:
On Wed, 2018-11-14 at 20:27 +0800, Ed Greshko wrote:
On 11/14/18 8:08 PM, Patrick O'Callaghan wrote:
Does your NAS have any logs which may indicate a connection issue at some point?
No. When this has happened I've been able to log into the NAS with no problem. Its logs show nothing wrong.
It seems to be completely random. The panel freezes when I'm not even using it and I only notice when I try and click on something or notice the clock widget is showing the wrong time. Debugging doesn't work because gdb can't attach to a process in 'D' state (IIRC because ptrace() requires the kernel-level process to run code and it's blocked). Equally mysterious is how it suddenly unblocks itself.
Oh, and I guess, one other thing.
How about creating a new user and leave that vanilla user logged in over night?
Yes, that's about all I can think of for now.
I did that and it froze within about 20 minutes of logging in. This was in a parallel session while I continued working in my main session. That test session has nothing else running and is still frozen after about 18 hours.
Meanwhile my normal session was frozen for a few minutes when I came back this morning but has since woken up spontaneously.
All this is with the NFS NAS automounted but not being used.
poc
On 11/15/18 6:52 PM, Patrick O'Callaghan wrote:
On Wed, 2018-11-14 at 15:58 +0000, Patrick O'Callaghan wrote:
On Wed, 2018-11-14 at 20:27 +0800, Ed Greshko wrote:
Oh, and I guess, one other thing.
How about creating a new user and leave that vanilla user logged in over night?
Yes, that's about all I can think of for now.
I did that and it froze within about 20 minutes of logging in. This was in a parallel session while I continued working in my main session. That test session has nothing else running and is still frozen after about 18 hours.
Meanwhile my normal session was frozen for a few minutes when I came back this morning but has since woken up spontaneously.
All this is with the NFS NAS automounted but not being used.
I think I would use that term "not being used" lightly.
I only use NFSv4 and I am currently not reading or writing to the NAS. However, looking at network traffic with Wireshark I see my system sending periodic TCP Keep-Alive packets as well as NFS Renew CID packets. I would guess NFSv3 does something similar.
That said, I don't see how those interactions would cause Plasma to freeze.
On Thu, 2018-11-15 at 20:05 +0800, Ed Greshko wrote:
On 11/15/18 6:52 PM, Patrick O'Callaghan wrote:
On Wed, 2018-11-14 at 15:58 +0000, Patrick O'Callaghan wrote:
On Wed, 2018-11-14 at 20:27 +0800, Ed Greshko wrote:
Oh, and I guess, one other thing.
How about creating a new user and leave that vanilla user logged in over night?
Yes, that's about all I can think of for now.
I did that and it froze within about 20 minutes of logging in. This was in a parallel session while I continued working in my main session. That test session has nothing else running and is still frozen after about 18 hours.
Meanwhile my normal session was frozen for a few minutes when I came back this morning but has since woken up spontaneously.
All this is with the NFS NAS automounted but not being used.
I think I would use that term "not being used" lightly.
I only use NFSv4 and I am currently not reading or writing to the NAS. However, in looking at network traffic with wireshark I see my system sending periodic TCP Keep-Alive packets as well as NFS Renew CID packets. I would guess NFSv3 does something similar.
The NAS spins down when not being used, though I guess it could still respond to this kind of thing.
That said, I don't see how those interactions would cause Plasma Freeze.
Indeed not. Unless Plasma is for some reason checking all filesystems, but I don't think it's that stupid.
I'm running strace on the plasmashell process to see if anything jumps out when it freezes/unfreezes.
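For anyone trying the same thing, a long trace is easier to read if the slow calls are filtered out first. The following is only a sketch: it assumes the trace was recorded with strace's -T option (which appends the time spent in each call as <seconds>), and the function name and threshold are made up for illustration.

```python
import re
import sys

# With "strace -T", each completed call ends in "<seconds>", e.g. "<0.000034>".
DURATION = re.compile(r"<(\d+\.\d+)>\s*$")
# First "name(" on the line; works with or without a leading PID prefix.
SYSCALL = re.compile(r"(\w+)\(")


def slow_calls(lines, threshold=1.0):
    """Return (syscall, seconds) for every call that took >= threshold seconds."""
    slow = []
    for line in lines:
        match = DURATION.search(line)
        if not match:
            continue  # signal lines, unfinished calls, exit notices
        seconds = float(match.group(1))
        if seconds >= threshold:
            name = SYSCALL.search(line)
            slow.append((name.group(1) if name else "?", seconds))
    return slow


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python slow_calls.py trace.log [threshold_seconds]
    limit = float(sys.argv[2]) if len(sys.argv) > 2 else 1.0
    with open(sys.argv[1], errors="replace") as trace:
        for syscall, seconds in slow_calls(trace, limit):
            print(f"{seconds:12.6f}s  {syscall}")
```

A trace suitable for this could be recorded with something like 'strace -f -T -tt -o trace.log -p <PID>'. Only completed calls carry a duration, so a call that is still blocked when the trace stops will not be reported here and has to be looked for at the end of the log.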
poc
On Thu, 2018-11-15 at 14:59 +0000, Patrick O'Callaghan wrote:
On Thu, 2018-11-15 at 20:05 +0800, Ed Greshko wrote:
On 11/15/18 6:52 PM, Patrick O'Callaghan wrote:
On Wed, 2018-11-14 at 15:58 +0000, Patrick O'Callaghan wrote:
On Wed, 2018-11-14 at 20:27 +0800, Ed Greshko wrote:
Oh, and I guess, one other thing.
How about creating a new user and leave that vanilla user logged in over night?
Yes, that's about all I can think of for now.
I did that and it froze within about 20 minutes of logging in. This was in a parallel session while I continued working in my main session. That test session has nothing else running and is still frozen after about 18 hours.
Meanwhile my normal session was frozen for a few minutes when I came back this morning but has since woken up spontaneously.
All this is with the NFS NAS automounted but not being used.
I think I would use that term "not being used" lightly.
I only use NFSv4 and I am currently not reading or writing to the NAS. However, in looking at network traffic with wireshark I see my system sending periodic TCP Keep-Alive packets as well as NFS Renew CID packets. I would guess NFSv3 does something similar.
The NAS spins down when not being used, though I guess it could still respond to this kind of thing.
That said, I don't see how those interactions would cause Plasma Freeze.
Indeed not. Unless Plasma is for some reason checking all filesystems, but I don't think it's that stupid.
I'm running strace on the plasmashell process to see if anything jumps out when it freezes/unfreezes.
Just an update in case anyone is wondering: I haven't had a freeze in several days now, since updating to kernel 4.19.2. Go figure ...
poc
PS Of course as soon as I send this I'll no doubt get a freeze. Such is the Law of Murphy.
On Mon, 2018-11-19 at 22:22 +0000, Patrick O'Callaghan wrote:
Just an update in case anyone is wondering: I haven't had a freeze in several days now, since updating to kernel 4.19.2. Go figure ...
poc
PS Of course as soon as I send this I'll no doubt get a freeze. Such is the Law of Murphy.
Yep, that happened. Getting really old.
poc
On Wed, Nov 14, 2018 at 7:59 AM Patrick O'Callaghan pocallaghan@gmail.com wrote:
On Wed, 2018-11-14 at 18:00 +0800, Ed Greshko wrote:
On 11/14/18 5:51 PM, Patrick O'Callaghan wrote:
On Wed, 2018-11-14 at 06:21 +0800, Ed Greshko wrote:
On 11/14/18 6:10 AM, Patrick O'Callaghan wrote:
On Tue, 2018-11-13 at 14:25 -0600, Rex Dieter wrote:
Patrick O'Callaghan wrote:
On Tue, 2018-11-13 at 17:00 +0000, Patrick O'Callaghan wrote:
nfs user,rw,async,comment=systemd.mount 0 0
I'll try using your parameters and see what happens. The NAS is an old device, probably with NFS3 (or even 2).
Nope, just froze again. This is with your options but I don't think that matters.
Or maybe it does. After freezing for an appreciable time (minutes) it has now spontaneously unfrozen itself. I'll keep an eye on it and see if it happens again.
I suspect it may be one or both of these that made a difference: soft, automount
(I'd bet on soft... i.e., the mount becomes unresponsive, but now your system can continue despite that)
Perhaps. It just did it again. This time the pause was around 10-12 minutes. Note that the only thing that stops (apparently) is plasmashell. The rest of the system continues working normally and I can switch desktops using Ctrl-Fn with no problem.
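Since the soft option is the suspected difference, it may help to confirm which options the kernel actually applied to each NFS mount rather than relying on what is in fstab. A minimal sketch, assuming a Linux system where /proc/mounts lists the effective options; the function name is invented for this example:

```python
from pathlib import Path


def nfs_mounts_without_soft(mounts_text):
    """Return mount points of NFS filesystems whose options lack 'soft'.

    mounts_text is the content of /proc/mounts: one mount per line, with
    fields "device mountpoint fstype options dump pass".
    """
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        _device, mount_point, fstype, options = fields[:4]
        if fstype not in ("nfs", "nfs4"):
            continue  # also skips the autofs placeholder of an automount
        if "soft" not in options.split(","):
            result.append(mount_point)
    return result


if __name__ == "__main__":
    mounts = Path("/proc/mounts")
    if mounts.exists():
        for mount_point in nfs_mounts_without_soft(mounts.read_text()):
            print(f"{mount_point}: mounted without 'soft'")
```

Note that an automounted share only appears with type nfs or nfs4 while it is actually mounted, so this has to be run after the share has been accessed.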
Is your mount point within your home directory space?
If so, can you mount it elsewhere to see if the problem continues?
No, it's under a root directory.
I see. How about a symlink from your home directory to it?
The reason I ask is that I had an issue a while back, can't recall the full details, where I'd have a pause of several seconds when there was a symlink to a mount point. Seemed like it may have been some sort of "look ahead" issue. I decided the symlink wasn't buying me anything so I removed it and really didn't investigate.
No. (I double-checked to be sure). And we're not talking about several seconds but many minutes or even hours.
Does your NAS have any logs which may indicate a connection issue at some point?
No. When this has happened I've been able to log into the NAS with no problem. Its logs show nothing wrong.
It seems to be completely random. The panel freezes when I'm not even using it and I only notice when I try and click on something or notice the clock widget is showing the wrong time. Debugging doesn't work because gdb can't attach to a process in 'D' state (IIRC because ptrace() requires the kernel-level process to run code and it's blocked). Equally mysterious is how it suddenly unblocks itself.
I just started following this thread because I think I might be experiencing this myself. Except I'm still back on F26. And in my case, I don't have any NFS mounts.
My symptoms and usage are the same as you described in your last paragraph above.
For me it happens randomly about once a week now. ISTR this started about 2 months ago ???
I haven't pursued it, but: killall plasmashell; kstart plasmashell; resolves it. Next time it happens (hasn't happened in the last 3 days) I'll let it sit, and see if it un-freezes.
On Wed, 2018-11-14 at 10:09 -0500, Fulko Hew wrote:
It seems to be completely random. The panel freezes when I'm not even using it and I only notice when I try and click on something or notice the clock widget is showing the wrong time. Debugging doesn't work because gdb can't attach to a process in 'D' state (IIRC because ptrace() requires the kernel-level process to run code and it's blocked). Equally mysterious is how it suddenly unblocks itself.
I just started following this thread because I think I might be experiencing this myself. Except I'm still back on F26. And in my case, I don't have any NFS mounts.
My symptoms and usage are the same as you described in your last paragraph above.
For me it happens randomly about once a week now. ISTR this started about 2 months ago ???
I haven't pursued it, but: killall plasmashell; kstart plasmashell; resolves it. Next time it happens (hasn't happened in the last 3 days) I'll let it sit, and see if it un-freezes.
As I mentioned earlier in the thread, I've tried 'killall plasmashell; plasmashell' (i.e. without using kstart) and that doesn't work, i.e. when plasmashell comes back up it's still frozen. However that was when I wasn't using automount so it may be different now. Too many variables.
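On the separate point of the command never returning when plasmashell is started by hand: that is just the process staying in the foreground of the terminal. The sketch below shows one generic way to start a long-running program detached from the terminal. It is an illustration only, says nothing about what kstart does internally, and does not address the freeze itself; the function name is made up.

```python
import subprocess
import sys


def launch_detached(argv):
    """Start argv in its own session with no ties to the calling terminal."""
    return subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # new session, so closing the terminal won't signal it
    )


if __name__ == "__main__":
    if len(sys.argv) > 1:
        # Usage (from inside the graphical session): python launch.py plasmashell
        child = launch_detached(sys.argv[1:])
        print(f"started {sys.argv[1]} as PID {child.pid}")
    else:
        print("usage: launch.py COMMAND [ARGS...]")
```

This needs to be run from within the desktop session so that the child inherits the session's display environment.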
poc
Fulko Hew wrote:
I just started following this thread because I think I might be experiencing this myself. Except I'm still back on F26. And in my case, I don't have any NFS mounts.
My symptoms and usage are the same as you described in your last paragraph above.
For me it happens randomly about once a week now. ISTR this started about 2 months ago ???
I haven't pursued it, but: killall plasmashell; kstart plasmashell; resolves it. Next time it happens (hasn't happened in the last 3 days) I'll let it sit, and see if it un-freezes.
I'd encourage you to try to get a backtrace out of the deadlocked plasmashell as mentioned elsewhere in this thread.
With NFS out of the picture, your cause is very likely different. (and I can offer an educated guess based on experience: video driver deadlocks)
-- Rex
On Wed, Nov 14, 2018 at 2:29 PM Rex Dieter rdieter@gmail.com wrote:
Fulko Hew wrote:
I just started following this thread because I think I might be experiencing this myself. Except I'm still back on F26. And in my case, I don't have any NFS mounts.
My symptoms and usage are the same as you described in your last paragraph above.
For me it happens randomly about once a week now. ISTR this started about 2 months ago ???
I haven't pursued it, but: killall plasmashell; kstart plasmashell; resolves it. Next time it happens (hasn't happened in the last 3 days) I'll let it sit, and see if it un-freezes.
I'd encourage you to try to get a backtrace out of the deadlocked plasmashell as mentioned elsewhere in this thread.
With NFS out of the picture, your cause is very likely different. (and I can offer an educated guess based on experience: video driver deadlocks)
OK. I'll try it next time it happens.
Patrick O'Callaghan wrote:
F29 updated.
For the last couple of days I've turned on my monitor in the morning to find Plasma unresponsive, i.e. clicking on the bottom panel does nothing, neither does right-clicking on the background.
The machine is left on at night (not suspended or hibernating, just the monitor turned off), and today the clock widget shows a frozen time of 4:55am. Nothing is running overnight except the DE and my backup software (rsnapshot) which does run at around that time though it has been doing that for years without issue and according to the logs completed on time as usual.
'killall -s SIGSEGV plasmashell' (which in the past would kick it into restarting) is ineffective. Maybe the magic incantation has changed, but as none of this stuff is documented in man pages I don't know.
'kill <PID>' does kill it, but how do I run it again without logging out? Simply running plasmashell from the command line does put up the widget panel but it's still frozen, and the command never returns.
poc
Since F29 I have seen plasma freezing every day. Nothing to do with NFS, I don't use it.
On Tue, 2018-11-20 at 13:44 -0500, Neal Becker wrote:
Patrick O'Callaghan wrote:
F29 updated.
For the last couple of days I've turned on my monitor in the morning to find Plasma unresponsive, i.e. clicking on the bottom panel does nothing, neither does right-clicking on the background.
The machine is left on at night (not suspended or hibernating, just the monitor turned off), and today the clock widget shows a frozen time of 4:55am. Nothing is running overnight except the DE and my backup software (rsnapshot) which does run at around that time though it has been doing that for years without issue and according to the logs completed on time as usual.
'killall -s SIGSEGV plasmashell' (which in the past would kick it into restarting) is ineffective. Maybe the magic incantation has changed, but as none of this stuff is documented in man pages I don't know.
'kill <PID>' does kill it, but how do I run it again without logging out? Simply running plasmashell from the command line does put up the widget panel but it's still frozen, and the command never returns.
poc
Since F29 I have seen plasma freezing every day. Nothing to do with NFS, I don't use it.
That's useful to know. I've turned off my NFS mounts completely to see what happens but my own reason for suspecting it was the 'D'-state WCHAN value plasmashell gets into. If it's not NFS then it's almost certainly some other kernel bug. This is a situation that should never arise even with a badly-behaved program, but so far I haven't found anything else which triggers it.
I find I have to reboot the machine to clear it, which again is typical of this kind of bug.
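For others who want to check whether they are seeing the same thing, the process state and wait channel can be read straight from /proc while the panel is frozen. This is a rough sketch for Linux only; the helper names are invented here, and the wait channel may read as 0 or be unavailable depending on the kernel configuration and permissions.

```python
import sys
from pathlib import Path


def parse_state(stat_line):
    """Return the one-letter state field from a /proc/<pid>/stat line.

    The command name is wrapped in parentheses and may itself contain
    spaces or ')', so split after the LAST ')' before taking fields.
    """
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def describe(pid):
    """Collect the state and wait channel for one PID."""
    proc = Path("/proc") / str(pid)
    state = parse_state((proc / "stat").read_text())
    try:
        wchan = (proc / "wchan").read_text().strip()
    except OSError:
        wchan = ""  # not readable on this system
    return {"pid": pid, "state": state, "wchan": wchan}


if __name__ == "__main__":
    # Usage: python procstate.py <PID> [<PID> ...]
    for arg in sys.argv[1:]:
        print(describe(int(arg)))
```

A state of D means uninterruptible sleep, and the wchan value names the kernel function the process is sleeping in, which is the detail worth including in a bug report.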
poc
On Tue, 2018-11-06 at 11:24 +0000, Patrick O'Callaghan wrote:
F29 updated.
For the last couple of days I've turned on my monitor in the morning to find Plasma unresponsive, i.e. clicking on the bottom panel does nothing, neither does right-clicking on the background.
[...]
I've reported it: https://bugs.kde.org/show_bug.cgi?id=401272