when startup delays become bugs

Adam Williamson awilliam at redhat.com
Wed May 15 01:54:12 UTC 2013


On Tue, 2013-05-14 at 18:26 -0600, Chris Murphy wrote:

> But the sendmail issue is a distraction from the bigger question I was
> asking which is at what threshold are service startup times reasonable
> vs not reasonable and are their maintainers looking at this or do
> testers need to file bugs on them?

It seems like a futile question. Not all services are created equal.
Some services just touch one file and then they're done; some services
are massively complex bits of the system. Ultimately, every part of
system startup after the initramfs is loaded is a systemd unit. So it
seems like a pointless exercise to try to define a single cutoff point
for 'When Is A Service Taking Too Long'.

This is ultimately performance optimization, and performance
optimization is rarely as simple as 'X takes 10 seconds and we know all
Xs should take 5 or less, fix it!'

Later in your mail you made, I think, a better effort at a starting
point:

"What's initiating my gripe is that it's taking upwards of a minute to be
able to ssh into the system. I get a login prompt on the system's
display itself, before I'm able to ssh into it, which is the opposite of
my expectation."

So those are the actual issues you're trying to address:

1. Why can I log in locally before I can ssh in?
2. Can the amount of time before I am able to ssh in be reduced?

I think you need to start over with those questions, and take a broader
approach to finding the answers. 'systemd-analyze blame' is a
reasonable starting point, but if you just list out the results and
basically say "can I send these numbers to maintainers and demand they
fix their shit?", the answer is probably "no". The more useful next step
would be to look at the big numbers, and try to verify, first of all,
which of them actually delay your practical interaction with the system.
As we've established, sendmail's one-minute delay when the hostname is
not fully qualified looks ugly, but does not actually delay perceived
startup at all, because neither local nor remote login requires
sendmail.service to be up. So you should ask the same question of all
the other services with long start times: does this service actually
need to be up before I can log in / ssh in? If not, you can ignore it,
for the purposes of the question.
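
As a rough sketch of that filtering step (an illustration, not something
taken from this thread): the Python below parses text in the style of
'systemd-analyze blame' output and keeps only the slow units that appear
in a set of units you have decided sshd.service actually waits on. The
sample output, the REQUIRED_FOR_SSH set, the one-second threshold and
the function names are all made up for the example. On a real system you
would have to work out the required set yourself, for instance from
'systemd-analyze critical-chain sshd.service' where your systemd version
provides it.

```python
import re

# One duration token as printed by 'systemd-analyze blame',
# e.g. "1min", "2.345s" or "204ms". The exact format is an assumption.
_TOKEN = re.compile(r"^(\d+(?:\.\d+)?)(min|ms|us|s)$")
_SECONDS = {"min": 60.0, "s": 1.0, "ms": 0.001, "us": 0.000001}


def parse_blame(text):
    """Return {unit_name: seconds} parsed from blame-style text."""
    result = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # blank or unexpected line
        *durations, unit = parts
        total = 0.0
        for token in durations:
            match = _TOKEN.match(token)
            if match is None:
                break  # not a duration we understand; skip this line
            total += float(match.group(1)) * _SECONDS[match.group(2)]
        else:
            result[unit] = total
    return result


def slow_units_that_matter(blame_text, required_units, threshold=1.0):
    """List (unit, seconds) for required units at or above threshold, slowest first."""
    times = parse_blame(blame_text)
    hits = [(u, t) for u, t in times.items()
            if u in required_units and t >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample data, not measurements from any real system.
SAMPLE_BLAME = """\
  1min 204ms sendmail.service
      5.100s NetworkManager.service
      2.300s firewalld.service
       450ms sshd.service
"""

# Hypothetical: the units you have verified must be up before ssh works.
REQUIRED_FOR_SSH = {
    "NetworkManager.service",
    "firewalld.service",
    "sshd.service",
}

if __name__ == "__main__":
    for unit, seconds in slow_units_that_matter(SAMPLE_BLAME, REQUIRED_FOR_SSH):
        print(f"{seconds:8.3f}s  {unit}")
```

With this made-up data, sendmail.service drops out despite having the
largest number, because it is not in the required set. That is the whole
point of the check: a big number only matters if it sits in front of the
thing you are waiting for.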

Once you have the list of services with significant start times which
delay interaction for you, the next step is to investigate why they take
a long time to start up. It might be inevitable, or the result of a
local configuration issue that you can change, or it might be a bug. You
really just need to take a look at the service file, see what it does,
see if you can figure out specifically what part(s) of that service's
startup process are slow, and if there's anything that can be done to
improve it.

Like I said, performance optimization is rarely simple. You're usually
dealing with a very complex system, which inevitably means you need to
isolate the places where you can make a practical difference, and
evaluate whether that's possible.
-- 
Adam Williamson
Fedora QA Community Monkey
IRC: adamw | Twitter: AdamW_Fedora | identi.ca: adamwfedora
http://www.happyassassin.net


