when startup delays become bugs

Lennart Poettering mzerqung at 0pointer.de
Thu May 16 23:09:06 UTC 2013


On Thu, 16.05.13 14:41, Chris Adams (linux at cmadams.net) wrote:

> Once upon a time, Heiko Adams <heiko.adams at gmail.com> said:
> > My top 6 of extreme long starting services are:
> > $ systemd-analyze blame
> >          27.652s NetworkManager.service
> >          27.072s chronyd.service
> >          27.015s avahi-daemon.service
> >          26.899s tuned.service
> >          26.647s restorecond.service
> >          23.512s lightdm.service
> 
> Some people seem to be seeing some extremely long startup times for some
> services.  I can't see what would make some of these take that long.
> I'm wondering:
> 
> - is there some common bug that is causing a major slowdown for some of
>   these services?
> 
> - is there possibly a bug in how systemd is measuring and/or reporting
>   these times?
> 
> Some of the services you list could be running into some type of network
> timeout, but tuned? restorecond?  Those shouldn't be hitting the network
> AFAIK.
> 
> Rather than focusing on individual services, it would seem to me like a
> good idea to see if there is some underlying issue at work here.

So, the blame chart should not be misunderstood. It simply tells you how
much time passed between systemd forking off the process and the service
completing its initialization and telling systemd about it. Within that
time many things might be happening, and the service might simply be
waiting for some resource to become available rather than being slow on
its own. That resource could be the CPU or IO or some other service or
device. For example, readahead might monopolize IO for some time during
boot, so that other processes get starved. These dependencies and
resource constraints are not visible in "systemd-analyze blame".
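
To illustrate with the numbers quoted above, here is a rough sketch in
Python. It is not part of systemd: parse_blame is a hypothetical helper
name, and the parser assumes the "<time> <unit>" layout seen in the
quoted output. It adds up the six entries and looks at how close
together the five slowest ones are. The sum is only meaningful as a
sanity check: the services start in parallel, so their spans overlap and
do not add up to a boot time.

```python
import re

# Seconds per suffix; assumed to cover the time spans that appear in blame output.
_UNITS = {"min": 60.0, "s": 1.0, "ms": 0.001, "us": 1e-6}
_SPAN = re.compile(r"(\d+(?:\.\d+)?)(min|ms|us|s)")

SAMPLE = """\
$ systemd-analyze blame
         27.652s NetworkManager.service
         27.072s chronyd.service
         27.015s avahi-daemon.service
         26.899s tuned.service
         26.647s restorecond.service
         23.512s lightdm.service
"""


def parse_blame(text):
    """Return [(seconds, unit_name), ...] for lines shaped like '<time> <unit>'."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue
        *spans, name = parts
        seconds = 0.0
        for span in spans:
            match = _SPAN.fullmatch(span)
            if match is None:
                break  # not a time span, e.g. the shell prompt line
            seconds += float(match.group(1)) * _UNITS[match.group(2)]
        else:
            rows.append((seconds, name))
    return rows


rows = parse_blame(SAMPLE)
total = sum(seconds for seconds, _ in rows)
slowest_five = sorted((seconds for seconds, _ in rows), reverse=True)[:5]
spread = slowest_five[0] - slowest_five[-1]

# The spans overlap, so this sum is far larger than any real boot time.
print(f"sum of per-unit times: {total:.3f}s")
# A small spread means these units became ready at nearly the same moment.
print(f"spread among the five slowest: {spread:.3f}s")
```

When several unrelated services all report nearly the same time, that
may hint that they were waiting on the same thing, but the numbers alone
cannot say what that thing was.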

If you want to track this down, try systemd-bootchart. Simply boot with
"init=/usr/lib/systemd/systemd-bootchart" on the kernel command line;
see systemd-bootchart(1) for details. It will plot your CPU and IO
consumption and can show you resource usage by process, which usually
tells you more than a simple "systemd-analyze blame".
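
To make the difference between "busy" and "waiting for IO" a bit more
concrete, here is a small sketch in Python. It is only an illustration
and not a description of how systemd-bootchart works internally. It
takes two samples of the aggregate "cpu" line from /proc/stat and
computes which share of the elapsed CPU time was spent working and
which share was spent idle with IO outstanding. The function name
cpu_fractions and the sample values are made up for the example.

```python
def cpu_fractions(before, after):
    """Return (busy, iowait) as fractions of CPU time elapsed between two
    aggregate 'cpu' lines taken from /proc/stat."""

    def fields(line):
        parts = line.split()
        if not parts or parts[0] != "cpu":
            raise ValueError("expected the aggregate 'cpu' line")
        return [int(value) for value in parts[1:]]

    old, new = fields(before), fields(after)
    # Only the first eight counters are used:
    # user nice system idle iowait irq softirq steal.
    # Guest time is already accounted for in user and nice.
    delta = [b - a for a, b in zip(old, new)][:8]
    total = sum(delta)
    if total <= 0:
        return 0.0, 0.0
    idle, iowait = delta[3], delta[4]
    busy = (total - idle - iowait) / total
    return busy, iowait / total


# Made-up samples: most of the elapsed time went into waiting for IO.
before = "cpu 100 0 50 800 50 0 0 0 0 0"
after = "cpu 150 0 70 820 160 0 0 0 0 0"
busy, iowait = cpu_fractions(before, after)
print(f"busy: {busy:.2f}  iowait: {iowait:.2f}")
```

A boot where iowait dominates points at storage contention rather than
at any single slow service, which is the kind of thing the bootchart
makes visible over time and per process.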

For the super slow run above I'd be quite interested to have a look at
the bootchart actually. (Heiko? Can you upload that?)

Lennart

-- 
Lennart Poettering - Red Hat, Inc.

