Is this proof that systemd is completely broken?
Stephen Morris
samorris at netspace.net.au
Sun Jul 13 00:57:46 UTC 2014
Hi Sam,
I don't know anything about systemd, nor have I read the rest of
the responses to this, but looking only at the logic of your
named-chroot.service statements, it seems to me that you are
requesting that named-chroot.service be started after network.target but
before nss-lookup.target, and that it also needs nss-lookup.target to be
active. To me that looks like a deadly embrace. Based on what I think
you are saying in your email, I would have thought that logically your
"Before" statement should be removed and your "After" statement should
read "After=network.target network-online.target nss-lookup.target",
but then I am not sure how systemd works.
regards,
Steve
On 07/13/2014 12:00 AM, Sam Varshavchik wrote:
> Now that I have your attention, the background is as follows. This is
> a server with only statically configured network interfaces.
> NetworkManager is not installed. All network interfaces are statically
> configured via /etc/sysconfig/network-scripts.
>
> The server is regularly updated to current Fedora packages. For the
> last month or so, the server has reliably failed to come up in a sane
> state. Once it responds to pings and I ssh in to examine the
> aftermath, the boot logs of all network services (including
> named-chroot, httpd, dhcpd, and privoxy) are consistent: each claims
> that no network interfaces were up at the time it was started.
>
> After finally getting pissed about having to manually re-brain the
> server, each time it boots, I attached a console monitor, and observed
> that the boot goes /very/ quickly, and the console login prompt comes
> up about 20-30 seconds before the server even starts responding to
> pings. It looks like the multi-user target is reached long before
> networking even comes up.
>
> Last week, I commented on the following curiosity: after sifting
> through systemd's documentation, I found that it claims that
> "network.target" gets reached only after basic networking is up, and
> "network-online.target" gets reached only after all network interfaces
> are initialized.
>
> Problem number one is that all servers specify "After=network.target",
> when, according to how I interpret this, they should all really
> specify "After=network-online.target".
>
> After that, it came to my attention that there's a NetworkManager
> optional subpackage that installs a service that waits for network
> interfaces to come up, and it's specified as "Before=network.target
> network-online.target". It seems fairly obvious to me that it should
> really be "Before=network-online.target" and "After=network.target",
> with all other services that require a functioning network specifying
> "After=network-online.target". That made logical sense to me, but it
> seems that this confusing arrangement makes logical sense to someone
> else, so, whatever. I do not have NetworkManager installed, but, I
> figure, why not take a crack at whipping up a dirty hack that
> basically does the same thing?
>
> But the unexpected result from the hack is that it seems to provide
> solid proof that systemd's dependency resolution is not working. Before
> I Bugzilla this (as little hope as one might have of getting
> anything useful done by Bugzillaing this), I'd like to hear some
> consensus that I am interpreting the following data right. Who knows,
> I might actually have made a mistake somewhere.
>
> Let's take a look at what named-chroot.service says:
>
> [Unit]
> Description=Berkeley Internet Name Domain (DNS)
> Wants=nss-lookup.target
> Before=nss-lookup.target
> After=network.target
>
> Are we all in agreement that named-chroot.service should only be
> started after network.target gets reached? Ok.
>
> Now, here's my hack, which is basically a clone of that NetworkManager
> subpackage:
>
> # cat /etc/systemd/system/wait-for-network.service
> [Unit]
> Description=Wait for network ports to be initialized
> Before=network.target network-online.target
>
> [Service]
> Type=oneshot
> ExecStart=/root/bin/wait-for-network
>
> [Install]
> WantedBy=multi-user.target
>
> Are we all in agreement that:
>
> 1) This is a one-shot service, and according to systemd's
> documentation, systemd must wait until this script is complete, before
> it's considered started.
>
> 2) Until it's complete, network.target isn't reached.
>
> 3) Therefore, this script must finish before systemd should start
> named-chroot.service
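
To make sure I follow points 1) to 3), here is a minimal Python sketch of
the start order they imply. It is only a toy model of the Before=/After=
lines quoted above, not how systemd itself resolves dependencies: the UNITS
table and the expected_start_order() helper are names made up for
illustration, and the model assumes that Wants= and WantedBy= pull units in
without ordering them.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Ordering declarations copied from the two unit files quoted in this thread.
# Wants= and WantedBy= are left out on purpose: this toy model assumes they
# pull units in without imposing any ordering.
UNITS = {
    "named-chroot.service": {
        "Before": ["nss-lookup.target"],
        "After": ["network.target"],
    },
    "wait-for-network.service": {
        "Before": ["network.target", "network-online.target"],
        "After": [],
    },
}


def expected_start_order(units):
    """Return one start order that satisfies every Before=/After= line."""
    sorter = TopologicalSorter()
    for name, deps in units.items():
        sorter.add(name)
        for earlier in deps["After"]:
            sorter.add(name, earlier)  # `name` starts after `earlier`
        for later in deps["Before"]:
            sorter.add(later, name)    # `later` starts after `name`
    # static_order() raises graphlib.CycleError if the declarations loop.
    return list(sorter.static_order())


order = expected_start_order(UNITS)
print(" -> ".join(order))
```

Under those assumptions wait-for-network.service always sorts ahead of
network.target, and named-chroot.service always sorts after it, which is
the order you were expecting to see.
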
>
> Yet, after testing this script, then activating it, the server still
> came up utterly brainless after the reboot. The results:
>
> systemctl status named-chroot.service reports:
>
> named-chroot.service - Berkeley Internet Name Domain (DNS)
> Loaded: loaded (/usr/lib/systemd/system/named-chroot.service; enabled)
> Active: active (running) since Sat 2014-07-12 09:24:29 EDT; 3min 28s ago
> …
>
> So, systemd started named-chroot.service at 09:24:29.
>
> My script logs the current timestamp. The output from
> /root/bin/wait-for-network was as follows:
>
> Sat Jul 12 09:24:27 2014
> Interface: lo is up
> Sat Jul 12 09:24:32 2014
> Interface: lan0 is up
> Interface: lo is up
> Interface: wan0 is down
> Sat Jul 12 09:24:37 2014
> Interface: lan0 is up
> Interface: lo is up
> Interface: wan0 is up
>
> systemd started this script at 09:24:27. This script spun its wheels
> until 09:24:37, at which time all network interfaces finally came up.
> I'm happy to post the contents of this short script; however I don't
> think that it's relevant here, because the problem is that this script
> was running when systemd decided to run named-chroot.service, even
> though, according to the above, this should not happen.
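
The overlap can be checked with a few lines of Python. This is just
arithmetic on the timestamps quoted above; started_while_running() is a
hypothetical helper, and it assumes that the first and last timestamps in
the script's output bracket the time the script was running.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"


def started_while_running(start, first_log, last_log):
    """True if `start` falls inside the window [first_log, last_log)."""
    s, a, b = (datetime.strptime(t, FMT) for t in (start, first_log, last_log))
    return a <= s < b


# Timestamps taken from the script output and the systemctl status above.
script_first = "2014-07-12 09:24:27"  # first line logged by wait-for-network
script_last = "2014-07-12 09:24:37"   # all interfaces reported up
named_start = "2014-07-12 09:24:29"   # "Active: ... since" for named-chroot

overlap = started_while_running(named_start, script_first, script_last)
print(overlap)  # prints True
```

So, taking the logged times at face value, named-chroot.service started
two seconds after the script began and eight seconds before it finished.
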
>
> So, either I'm misreading the description of "oneshot" in
> systemd.service(5); and "Before" and "After" in systemd.unit(5), or
> systemd is broken completely. I think that my understanding of
> systemd's documentation is very reasonable. So, either systemd is
> broken, or, if it's supposedly working how it should be working, its
> documentation is crap, and is impossible to follow. I see no other
> possibilities.
>