I'm frequently rebooting my new Fedora 30 install as I test things, and on one reboot I got the entire boot process held up by a stop job for rngd.service.
I'm rebooting for God's sake. Why do you need to stop the reboot process to wait till you've gathered enough entropy which will be thrown away immediately on reboot?
Can an individual service file be configured to exit immediately on reboot?
I guess this is probably this bug:
Tom Horsley writes:
I'm frequently rebooting my new Fedora 30 install as I test things, and on one reboot I got the entire boot process held up by a stop job for rngd.service.
I'm rebooting for God's sake. Why do you need to stop the reboot process to wait till you've gathered enough entropy which will be thrown away immediately on reboot?
Can an individual service file be configured to exit immediately on reboot?
There's a default 90 second timeout on stopping a service, before it gets forcefully stopped.
Maybe 1 in every 20 of my reboots gets held up for "stopping user processes".
Apparently, I am asked to believe that selecting "Reboot" from the desktop does not always stop every user process, for some reason, even though X is completely killed, as well as everything that should be started from it, or from the user login.
I just wait 90 seconds, in those instances, and write it off as yet another systemd brain damage.
According to the systemd.service man page, TimeoutStopSec sets this timeout. So you can add that to rngd.service, I suppose. Or, if you want to bring out the big hammer, set DefaultTimeoutStopSec in /etc/systemd/system.conf, effectively changing the default timeout for everything.
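For reference, a minimal sketch of both options. The drop-in file name and the timeout values are illustrative assumptions, not anything taken from the rngd package, so pick values that suit your system. A drop-in leaves the packaged rngd.service untouched; `systemctl edit rngd.service` will create one for you, or you can create the file by hand and then run `systemctl daemon-reload`.

```ini
# Option 1: per-service override.
# File: /etc/systemd/system/rngd.service.d/stop-timeout.conf  (hypothetical file name)
[Service]
# Wait at most 5 seconds for rngd to exit on stop; after that systemd force-kills it.
TimeoutStopSec=5

# Option 2: the "big hammer".
# File: /etc/systemd/system.conf
[Manager]
# Default stop timeout for every unit that does not set its own TimeoutStopSec.
DefaultTimeoutStopSec=10s
```

The per-service form is the narrower change: it only shortens the wait for rngd and leaves the default 90 seconds in place for services that may really need time to shut down cleanly.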
How can you possibly get stopping a piddly daemon, like rngd, wrong? Who knows. It's brain damage.
On Sat, 04 May 2019 12:32:59 -0400 Sam Varshavchik wrote:
Maybe 1 in every 20 of my reboots gets held up for "stopping user processes".
That happens to me so often that I built an entire big hammer from scratch just to hit the system with when I reboot:
https://tomhorsley.com/game/punch.html
I've added more stuff to my pre-reboot script documented there from time to time. Apparently apache can imagine it is streaming to my tablet or something and won't shut down, so I now kill off apache in there before rebooting. Perhaps I'll need to add a kill -9 of rngd as well...
On 5/4/19 9:32 AM, Sam Varshavchik wrote:
I just wait 90 seconds, in those instances, and write it off as yet another systemd brain damage.
According to the systemd.service man page, TimeoutStopSec sets this timeout. So you can add that to rngd.service, I suppose. Or, if you want to bring out the big hammer, set DefaultTimeoutStopSec in /etc/systemd/system.conf, effectively changing the default timeout for everything.
How can you possibly get stopping a piddly daemon, like rngd, wrong? Who knows. It's brain damage.
As usual, it is not a systemd problem, unless you consider that trying to do a clean shutdown is brain damage. The rngd process gets stuck sometimes (see the above mentioned bug) and systemd waits nicely for it to stop, but finally gives up and force kills it.
Samuel Sieb writes:
On 5/4/19 9:32 AM, Sam Varshavchik wrote:
How can you possibly get stopping a piddly daemon, like rngd, wrong? Who knows. It's brain damage.
As usual, it is not a systemd problem, unless you consider that trying to do a clean shutdown is brain damage. The rngd process gets stuck sometimes (see the above mentioned bug) and systemd waits nicely for it to stop, but finally gives up and force kills it.
In the good-old days, when integrating some new gizmo like rngd, by the nature of the beast you'll always check into how it works and make a minimal effort to learn its basics. Basic due diligence. From the linked bugzilla bug, it seems that rngd was coded to ignore signals. So, having learned that factoid, one would code the initscript to sigkill it, with no other options available. One would even likely run it, and test it, to be sure it works. Or maybe even not wasting time trying to stop it. The system's coming down. Who cares.
But now, the mindset is completely different, and that's what I am describing as the brain damage. All you need is a service file with an ExecStart, to launch the thing. That's it. Nothing else needs to be done. Don't worry about it. Systemd Knows Best. It'll take care of stopping it. You don't need to do due diligence any more. Just trust the systemd to stop it. And that's how you end up with 90 second shutdown delays.
On 5/4/19 4:13 PM, Sam Varshavchik wrote:
In the good-old days, when integrating some new gizmo like rngd, by the nature of the beast you'll always check into how it works and make a minimal effort to learn its basics. Basic due diligence. From the linked bugzilla bug, it seems that rngd was coded to ignore signals.
I don't think the conversation on that bug supports that conclusion. This looks like it's simply a bug in rngd that causes an intermittent failure to terminate. (If it intentionally ignored signals, the failure would not be intermittent.)
We don't need tortured logic to blame systemd. It's doing the right thing. There is a bug in rngd, and systemd is exposing that bug so that it can be fixed. That's how software should work. "Errors should never pass silently." (Zen of Python #10. Hello from PyCon!)
On Sat, 4 May 2019 18:58:32 -0700 Gordon Messmer wrote:
We don't need tortured logic to blame systemd. It's doing the right thing.
Though a sane person might ask, "Why is it the right thing to wait for a service gathering information which will be utterly discarded on the reboot anyway?"
On 05/04/2019 08:20 PM, Tom Horsley wrote:
On Sat, 4 May 2019 18:58:32 -0700 Gordon Messmer wrote:
We don't need tortured logic to blame systemd. It's doing the right thing.
Though a sane person might ask, "Why is it the right thing to wait for a service gathering information which will be utterly discarded on the reboot anyway?"
Because systemd has no way of knowing what the service is doing or that it's safe to kill it without waiting for it to finish.
On Sat, 4 May 2019 22:12:11 -0600 Joe Zeff wrote:
Because systemd has no way of knowing what the service is doing or that it's safe to kill it without waiting for it to finish.
But the service knows that. Why isn't there a way to tell systemd that in the .service file?
Tom Horsley writes:
On Sat, 4 May 2019 22:12:11 -0600 Joe Zeff wrote:
Because systemd has no way of knowing what the service is doing or that it's safe to kill it without waiting for it to finish.
But the service knows that. Why isn't there a way to tell systemd that in the .service file?
Yes, there is: the TimeoutStopSec setting in the [Service] section; see the systemd.service man page.
But you have to know where it's buried in systemd's documentation. But this goes towards the mainstream mindset: chuck a start command into the service file, and forget about it. You're done. Systemd Knows Best.
On 5/5/19 6:07 AM, Tom Horsley wrote:
But the service knows that. Why isn't there a way to tell systemd that in the .service file?
There's no use case for it. rngd is expected to terminate (more or less) immediately after it gets SIGTERM. If there were another directive to ignore shutdown status (as there is already a timeout setting), no one would have put that one in rngd's service file either.
systemd is doing what it was designed to do. It's operating correctly. Why is it more important to you to look for a reason to blame systemd than it is to simply correct the error in rngd? Does blaming systemd make a better community?
Allegedly, on or about 4 May 2019, Tom Horsley sent:
Though a sane person might ask, "Why is it the right thing to wait for a service gathering information which will be utterly discarded on the reboot anyway?"
Well, much as I hate to defend systemd, *it* doesn't know that *that* service is one it can kill with impunity. Only that service would know whether it was okay to do so.