when startup delays become bugs

Dan Williams dcbw at redhat.com
Tue May 14 22:30:54 UTC 2013


On Tue, 2013-05-14 at 15:51 -0600, Chris Murphy wrote:
> This is not intended to be snarky, but I admit it could sound like it is.  When are long startup times for services considered to be bugs in their own right?
> 
> 
> [root@f19q ~]# systemd-analyze blame
>       1min 444ms sm-client.service
>       1min 310ms sendmail.service
>         18.602s firewalld.service
>          13.882s avahi-daemon.service
>          12.944s NetworkManager-wait-online.service

Is anything waiting on NetworkManager-wait-online in your install?  That
service is really intended for servers where you want to block Apache,
Sendmail, or a database from starting until you're sure your networking
is fully configured.  If you don't have any services that require the
network to be up, then you can mask NetworkManager-wait-online and
startup will be more parallel.
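
One rough way to answer the "is anything waiting on it" question is to ask
systemd for the reverse dependencies of the unit.  The sketch below assumes
a Python 3.9+ interpreter and a systemctl that supports
"list-dependencies --reverse"; the helper names and the tree-character
stripping are my own guesses at a tolerant parser, not anything NM or
systemd ships.

```python
import subprocess

UNIT = "NetworkManager-wait-online.service"


def parse_dependents(listing: str, unit: str = UNIT) -> list[str]:
    """Pull unit names out of `systemctl list-dependencies --reverse` text.

    The exact decoration (tree lines, status dots) varies between systemd
    versions, so this simply strips those characters from both ends of
    every line and drops the queried unit itself.
    """
    names = []
    for line in listing.splitlines():
        name = line.strip(" \t│├└─●○×*")
        if name and name != unit:
            names.append(name)
    return names


def dependents_of(unit: str = UNIT) -> list[str]:
    """Run systemctl and return the units that pull in `unit`."""
    out = subprocess.run(
        ["systemctl", "list-dependencies", "--reverse", unit],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_dependents(out, unit)
```

If dependents_of() comes back empty, nothing is asking for the unit and
"systemctl mask NetworkManager-wait-online.service" should be safe to try.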

>          12.715s restorecond.service
>           2.911s abrt-uefioops.service
>           2.792s NetworkManager.service

For NM's part, it does a bunch of setup before daemonizing and creating
its D-Bus service, like reading your network config files.  This all
takes about 1 second on my SSD-enabled machine, but I've also got about
50 network config files.  Check out your systemd journal, and look for
the time spent between "NetworkManager (version
0.9.8.1-3.git20130510.fc19) is starting" and anything NM says about a
"killswitch"; that's the time that NM may potentially block services
that depend on the network.
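
To put a number on that window, something like the following could be fed
the output of "journalctl -b -u NetworkManager.service -o short-monotonic".
It is only a sketch: it assumes the monotonic "[seconds]" prefix that output
mode prints, and it matches on the two strings mentioned above ("is
starting" and "killswitch"); the function name is made up for illustration.

```python
import re

# Matches the "[   12.345678]" prefix of `journalctl -o short-monotonic` lines.
STAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")


def nm_startup_seconds(lines):
    """Seconds between NM's "is starting" line and its first killswitch line.

    Returns None if either message is missing from the input.
    """
    start = None
    for line in lines:
        m = STAMP.match(line)
        if not m:
            continue  # skip lines without a monotonic timestamp
        t = float(m.group(1))
        if start is None and "is starting" in line:
            start = t
        elif start is not None and "killswitch" in line.lower():
            return t - start
    return None
```

Usage would be along the lines of
nm_startup_seconds(open("nm-journal.txt")), where nm-journal.txt is a
hypothetical file holding the saved journalctl output.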

The purpose of this behavior is to present a consistent, useful data
model as soon as the D-Bus interface is created.  Creating the D-Bus
interface first and then doing all of the above would generate a huge
number of change events over D-Bus that a bunch of clients would have
to listen for, wake up for, and process.
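
As a toy illustration of that trade-off (this is not NM's code, and the
Model class and both functions are invented for the example), compare how
many change events a listener sees when the model is published before
versus after the config files are loaded:

```python
class Model:
    """Toy data model that notifies listeners on every change."""

    def __init__(self):
        self.items = {}
        self.listeners = []  # stays empty until the model is "published"

    def add(self, key, value):
        self.items[key] = value
        for notify in self.listeners:
            notify(key)


def publish_then_load(configs):
    """Expose the model first; every loaded config becomes an event."""
    events = []
    model = Model()
    model.listeners.append(events.append)
    for name in configs:
        model.add(name, {})
    return len(events)


def load_then_publish(configs):
    """Load everything first; clients see the finished model, no events."""
    events = []
    model = Model()
    for name in configs:
        model.add(name, {})
    model.listeners.append(events.append)
    return len(events)
```

With 50 config names, the first ordering produces 50 notifications and the
second produces none, at the cost of the interface appearing later.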

>           2.634s spice-vdagentd.service
>           2.589s iprinit.service
>           2.583s iprupdate.service
>           2.319s chronyd.service
> 
> 
> 10 seconds for a service seems obscene. 1 minute is so bad it's hilarious, but also really annoying. I feel like filing a bug against anything that takes more than 1/2 second but maybe that's being overly generous (by filing the bug, that is).
> 
> In sendmail's defense, the time is about the same on F18. (It's consistently a bit faster in an F19 VM running on the same F18 system as host.)
> 
> But firewalld goes from 7 seconds to 18 seconds? Why? avahi-daemon, restorecond, all are an order of magnitude longer on F19 than F18. It's a 3+ minute userspace hit on the startup time where the kernel takes 1.9 seconds. Off hand this doesn't seem reasonable, especially sendmail. If the time can't be brought down by a lot, can it ship disabled by default?

Is the firewall loading more rules now?  If it stores any persistent
configuration, those rules have to get pushed to the kernel, and the
kernel API for doing that has been pretty slow in the past; the more
rules, the longer it takes.  Not sure if that's still the case though.
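
A quick way to check the "more rules" theory would be to count the rules in
an iptables-save dump on each release and compare.  The sketch below assumes
the usual iptables-save layout ("*table" headers and "-A" rule lines);
count_rules is a name I made up, and it says nothing about how firewalld
itself loads rules.

```python
from collections import Counter


def count_rules(dump: str) -> Counter:
    """Count "-A" rule lines per table in iptables-save style text."""
    counts = Counter()
    table = None
    for line in dump.splitlines():
        line = line.strip()
        if line.startswith("*"):
            table = line[1:]          # e.g. "*filter" -> "filter"
        elif line.startswith("-A ") and table is not None:
            counts[table] += 1        # one appended rule in this table
    return counts
```

Running it on dumps from F18 and F19 and comparing sum(counts.values())
would show whether the rule count grew along with the startup time.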

Dan


