Now that I have your attention, the background is as follows. This is a server with only statically configured network interfaces. NetworkManager is not installed. All network interfaces are statically configured via /etc/sysconfig/network-scripts.
The server is regularly updated to current Fedora packages. For the last month or so, the server has failed to reliably come up in a sane state. After it responds to pings, and after ssh-ing in and examining the aftermath, the boot logs of all network services (named-chroot, httpd, dhcpd, and privoxy) are consistent: they claim that no network interfaces were up at the time the services were started.
After finally getting pissed about having to manually re-brain the server, each time it boots, I attached a console monitor, and observed that the boot goes /very/ quickly, and the console login prompt comes up about 20-30 seconds before the server even starts responding to pings. Looks like the multi-user target is reached way long before networking even comes up.
Last week, I commented on the following curiosity: after sifting through systemd's documentation, I found that it claims that "network.target" gets reached only after basic networking is up, and "network-online.target" gets reached only after all network interfaces are initialized.
Problem number one is that all servers specify "After=network.target", when, according to how I interpret this, they should all really specify "After=network-online.target".
After that, it came to my attention that there's a NetworkManager optional subpackage that installs a service that waits for network interfaces to come up, and it's specified as "Before=network.target network-online.target". It seems fairly obvious to me that it should really be "Before=network-online.target" and "After=network.target", with all other services that require a functioning network specifying "After=network-online.target". That made logical sense to me, but it seems that this confusing arrangement makes logical sense to someone else, so, whatever. I do not have NetworkManager installed, but, I figure, why not take a crack at whipping up a dirty hack that basically does the same thing?
But the unexpected result from the hack is that it seems to provide solid proof that systemd's dependency resolution is not working, but before I Bugzilla this (as little hope one might have from getting anything useful done by Bugzillaing this), I'd like to hear some consensus that I am interpreting the following data right. Who knows, I might actually have made a mistake, somewhere.
Let's take a look at what named-chroot.service says:
[Unit]
Description=Berkeley Internet Name Domain (DNS)
Wants=nss-lookup.target
Before=nss-lookup.target
After=network.target
Are we all in agreement that named-chroot.service should only be started after network.target gets reached? Ok.
Now, here's my hack, which is basically a clone of that NetworkManager subpackage:
# cat /etc/systemd/system/wait-for-network.service
[Unit]
Description=Wait for network ports to be initialized
Before=network.target network-online.target

[Service]
Type=oneshot
ExecStart=/root/bin/wait-for-network

[Install]
WantedBy=multi-user.target
Are we all in agreement that:
1) This is a one-shot service, and according to systemd's documentation, systemd must wait until this script is complete, before it's considered started.
2) Until it's complete, network.target isn't reached.
3) Therefore, this script must finish before systemd should start named-chroot.service.
Yet, after testing this script, then activating it, the server still came up utterly brainless after the reboot. The results:
systemctl status named-chroot.service reports:
named-chroot.service - Berkeley Internet Name Domain (DNS)
   Loaded: loaded (/usr/lib/systemd/system/named-chroot.service; enabled)
   Active: active (running) since Sat 2014-07-12 09:24:29 EDT; 3min 28s ago
   …
So, systemd started named-chroot.service at 09:24:29.
My script logs the current timestamp. The output from /root/bin/wait-for-network was as follows:

Sat Jul 12 09:24:27 2014
Interface: lo is up
Sat Jul 12 09:24:32 2014
Interface: lan0 is up
Interface: lo is up
Interface: wan0 is down
Sat Jul 12 09:24:37 2014
Interface: lan0 is up
Interface: lo is up
Interface: wan0 is up
systemd started this script at 09:24:27. This script spun its wheels until 09:24:37, at which time all network interfaces finally came up. I'm happy to post the contents of this short script; however I don't think that it's relevant here, because the problem is that this script was running when systemd decided to run named-chroot.service, even though, according to the above, this should not happen.
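Since the script itself was not posted, here is a minimal sketch of what such a poller might look like. It is not the original /root/bin/wait-for-network: the function names, the SYSFS_NET, MAX_TRIES and INTERVAL variables, the --wait flag, and the choice to read /sys/class/net/*/operstate (counting "unknown" as up, which is what lo usually reports) are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical stand-in for /root/bin/wait-for-network. The original script
# was never posted, so every name and default below is an assumption.

# Directory with one entry per interface (overridable, e.g. for a dry run).
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

# Print "up" or "down" for one interface, based on its operstate file.
# Assumption: "unknown" counts as up, since lo usually reports that.
iface_state() {
    state=$(cat "$SYSFS_NET/$1/operstate" 2>/dev/null)
    case "$state" in
        up|unknown) echo up ;;
        *) echo down ;;
    esac
}

# Log every interface; return 0 only if all of them are up.
all_up() {
    rc=0
    for path in "$SYSFS_NET"/*; do
        [ -d "$path" ] || continue
        name=${path##*/}
        st=$(iface_state "$name")
        echo "Interface: $name is $st"
        [ "$st" = up ] || rc=1
    done
    return $rc
}

# Poll until everything is up, or give up after MAX_TRIES attempts.
wait_for_network() {
    tries=0
    while [ "$tries" -lt "${MAX_TRIES:-12}" ]; do
        date
        all_up && return 0
        tries=$((tries + 1))
        sleep "${INTERVAL:-5}"
    done
    return 1
}

# Only poll when explicitly asked to, so the functions can be sourced.
if [ "${1:-}" = "--wait" ]; then
    wait_for_network
fi
```

With Type=oneshot, ExecStart= would call a script like this with --wait, so the unit would only count as started once wait_for_network returns.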
So, either I'm misreading the description of "oneshot" in systemd.service(5); and "Before" and "After" in systemd.unit(5), or systemd is broken completely. I think that my understanding of systemd's documentation is very reasonable. So, either systemd is broken, or, if it's supposedly working how it should be working, its documentation is crap, and is impossible to follow. I see no other possibilities.
On Sat, 12 Jul 2014 10:00:45 -0400 Sam Varshavchik wrote:
Now, here's my hack, which is basically a clone of that NetworkManager subpackage:
You're willing to invest a lot more time in systemd than I am :-). I just put a batch of systemctl restart commands in the rc.local file with different time delays. It is the only way I've gotten all the network services to start reliably.
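The rc.local approach described above might look roughly like the following sketch. The service names are taken from the original post; the restart_after helper, the RESTART_CMD variable, and the delays are invented for illustration and are not the actual rc.local from that machine.

```shell
#!/bin/sh
# Sketch of staggered restarts from rc.local. The helper name, the
# RESTART_CMD override and the delays are assumptions, not the real file.

RESTART_CMD="${RESTART_CMD:-systemctl restart}"

# Restart one service after a delay, in the background so that the rest of
# rc.local (and the boot) is not held up while waiting.
restart_after() {
    delay=$1
    service=$2
    ( sleep "$delay"; $RESTART_CMD "$service" ) &
}

# In /etc/rc.d/rc.local one might then write, for example:
#   restart_after 30 named-chroot.service
#   restart_after 35 dhcpd.service
#   restart_after 40 httpd.service
```

This papers over the ordering problem rather than fixing it: the delays have to be longer than the slowest interface takes to come up, and nothing checks that they were.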
Tom Horsley writes:
You're willing to invest a lot more time in systemd than I am :-). I just put a batch of systemctl restart commands in the rc.local file with different time delays. It is the only way I've gotten all the network services to start reliably.
Yeah, well, I don't really mind spending a few hours hacking up some script to workaround some stupid bug.
But this is now looking more than just a stupid bug, after I saw how the server immediately booted up to a console login prompt, but still failed to respond to pings for another thirty or so seconds, until the network stack came up. If I wanted to boot an OS that gave me a user desktop before it finished setting up its network connections, I'd boot Windows 7.
On Sat, Jul 12, 2014 at 12:08 PM, Sam Varshavchik mrsam@courier-mta.com wrote:
Yeah, well, I don't really mind spending a few hours hacking up some script to workaround some stupid bug.
But this is now looking more than just a stupid bug, after I saw how the server immediately booted up to a console login prompt, but still failed to respond to pings for another thirty or so seconds, until the network stack came up. If I wanted to boot an OS that gave me a user desktop before it finished setting up its network connections, I'd boot Windows 7.
Is there a rule that says that a system's network should be set up before a getty login (which isn't a user desktop) is available?
Tom H writes:
Is there a rule that says that a system's network should be set up before a getty login (which isn't a user desktop) is available?
No, that's just the most visible symptom of major systemd fubarage.
On Sat, 2014-07-12 at 11:36 -0400, Tom Horsley wrote:
You're willing to invest a lot more time in systemd than I am :-). I just put a batch of systemctl restart commands in the rc.local file with different time delays. It is the only way I've gotten all the network services to start reliably.
OMG :) That sounds like the SysV scripts should be rewritten by the sysadmins if they want their system. But have you tried to reproduce your issue on a 'fresh install' system? It could be a compatibility issue between systemd and the sysvinit script.
Balint
On Jul 12 10:00, Sam Varshavchik wrote:
Now that I have your attention, the background is as follows. This is a server with only statically configured network interfaces. NetworkManager is not installed. All network interfaces are statically configured via /etc/sysconfig/network-scripts.
The server is regularly updated to current Fedora packages. For the last month or so, the server has failed to reliably come up in a sane state. After it responds to pings, and after ssh-ing in and examining the aftermath, the boot logs of all network services (named-chroot, httpd, dhcpd, and privoxy) are consistent: they claim that no network interfaces were up at the time the services were started.
This is probably not exactly what you're looking for, but I had the same or, at least, a similar problem.
The network init script creates four interfaces. All network services which listen on 0.0.0.0 or :: start up fine, even if they start too early. Network services which are supposed to listen to explicit network addresses often fail to start, because the bind(2) calls fail.
What I did was to screw a new target between network-online and the affected services, like this:
$ cat /etc/systemd/system/multi-user.target.wants/network-waiter.target
# Wait for Network being online.
# Let all network services depend on this one.

[Unit]
Description=Fake target to make sure network is online before starting dependent services
After=network-online.target
Before=dovecot.service named.service postfix.service sshd.service
This seems to help. It sure would be nice if these service files would be corrected instead.
Corinna
Corinna Vinschen writes:
This is probably not exactly what you're looking for, but I had the same or, at least, a similar problem.
The network init script creates four interfaces. All network services which listen on 0.0.0.0 or :: start up fine, even if they start too early. Network services which are supposed to listen to explicit network addresses often fail to start, because the bind(2) calls fail.
What I did was to screw a new target between network-online and the affected services, like this:
$ cat /etc/systemd/system/multi-user.target.wants/network-waiter.target
# Wait for Network being online.
# Let all network services depend on this one.

[Unit]
Description=Fake target to make sure network is online before starting dependent services
After=network-online.target
Before=dovecot.service named.service postfix.service sshd.service
This seems to help. It sure would be nice if these service files would be corrected instead.
This is certainly a valid workaround for the bug.
But this wasn't my point. I am looking for validation, based on the data I posted, that systemd's dependency resolution is broken. Systemd's fancy dependency resolution is supposed to be its flagship feature. I installed a service that should execute before reaching network.target. Systemd should wait until my service finished executing, before reaching network.target, and then starting all the services that should be started after reaching network.target. Yet, the timestamps I collected from my log files appear to prove, fairly conclusively, that systemd is starting services after network.target before my service finishes executing.
Fail.
I'm perfectly open to the possibility that I am misinterpreting something. If so, please someone tell me what detail I missed; and how the data I posted actually makes sense.
I'll wait.
On Jul 12 12:01, Sam Varshavchik wrote:
Corinna Vinschen writes: [something] This is certainly a valid workaround for the bug.
But this wasn't my point. I am looking for validation, based on the data I posted, that systemd's dependency resolution is broken. [...]
Hmm, ok. Let's try.
Problem number one is that all servers specify "After=network.target", when, according to how I interpret this, they should all really specify "After=network-online.target".
Yes and no. The problem is not that network.target is always wrong, but it's wrong for services which don't use Linux-specific functionality like rtnetlink or IP_FREEBIND. OpenSSH's sshd, for instance, doesn't utilize these Linux-specials so it clearly should depend on network-online.target. It really doesn't help to blame OpenSSH and ignore the server's capabilities when writing the sshd.service file, IMHO. For services which are capable of handling network interface changes, network.target is fine.
[Unit]
Description=Wait for network ports to be initialized
Before=network.target network-online.target

[Service]
Type=oneshot
ExecStart=/root/bin/wait-for-network

[Install]
WantedBy=multi-user.target
AFAIU Before/After only define the order, not requirements. From the man page systemd.unit:
Before=, After=
    A space-separated list of unit names. Configures ordering dependencies between units. If a unit foo.service contains a setting Before=bar.service and both units are being started, bar.service's start-up is delayed until foo.service is started up.
Note especially the last line I quoted. network.target is not necessarily reached only after the SysV Init network service script has run to completion (see http://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/). So your service runs as designed, AFAICS.
Corinna
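To illustrate the ordering-versus-requirement point: Before= and After= only take effect between units that are both part of the same start-up transaction. One way to rule out any doubt about whether network.target is involved at all would be to tie an affected service directly to wait-for-network.service with a drop-in, roughly as sketched below. This is only a sketch under assumptions: the write_dropin helper and the drop-in file name are made up, and it does not explain the timestamps posted earlier in the thread.

```shell
#!/bin/sh
# Sketch: write a drop-in that both pulls in and orders after the wait unit.
# wait-for-network.service is the unit from this thread; the helper name and
# the drop-in file name are hypothetical.

write_dropin() {
    unit=$1                           # e.g. named-chroot.service
    base=${2:-/etc/systemd/system}    # overridable so it can be tried safely
    dir="$base/$unit.d"
    mkdir -p "$dir" || return 1
    # Wants= adds the requirement, After= adds the ordering; both are needed.
    cat > "$dir/wait-for-network.conf" <<'EOF'
[Unit]
Wants=wait-for-network.service
After=wait-for-network.service
EOF
}

# Usage (as root), followed by a reload so systemd rereads its unit files:
#   write_dropin named-chroot.service && systemctl daemon-reload
```

The same two lines could instead name network-online.target, which is closer to what the original post argues the shipped service files should do.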
So, I would encourage you to ask the systemd list: http://lists.freedesktop.org/mailman/listinfo/systemd-devel for a probably more detailed answer than you might get here.
Or... file a bug if you think you have found one. https://bugzilla.redhat.com/enter_bug.cgi?product=Fedora&version=20&...
My theory of what you are seeing revolves around the fact that the /etc/init.d/network script is NOT a systemd unit file, it's an old sysvinit script which systemd runs under compatibility.
From http://www.freedesktop.org/wiki/Software/systemd/NetworkTarget
"network-online.target is a target that actively waits until the network is "up", where the definition of "up" is defined by the network management software"
and I think /etc/init.d/network defines "up" in a way that doesn't mean that the interfaces have IPs and are passing packets. Just a guess. Alternately you found a nice bug. ;)
kevin
Kevin Fenzi writes:
So, I would encourage you to ask the systemd list: http://lists.freedesktop.org/mailman/listinfo/systemd-devel for a probably more detailed answer than you might get here.
There's a very specific reason why I don't wish to do that.
Or... file a bug if you think you have found one. https://bugzilla.redhat.com/enter_bug.cgi?product=Fedora&version=20&component=systemd
My theory of what you are seeing revolves around the fact that the /etc/init.d/network script is NOT a systemd unit file, it's an old sysvinit script which systemd runs under compatibility.
Given the criticality of /etc/rc.d/init.d/network, and the fact that a bucketload of servers are NOT going to run properly unless that script does its job, I would think that a little bit more consideration should've been given to make certain that it gets properly integrated into systemd; rather than waving it off as some legacy script.
From http://www.freedesktop.org/wiki/Software/systemd/NetworkTarget
"network-online.target is a target that actively waits until the network is "up", where the definition of "up" is defined by the network management software"
and I think /etc/init.d/network defines "up" in a way that doesn't mean that the interfaces have IPs and are passing packets. Just a guess.
I get that. I read that ten times already.
But it's a moot point. Whatever "network up" means or does not mean is irrelevant. One service is declared to run "Before" a target. A second service is declared to run "After" a target. According to my timestamps, systemd is starting the second service before the first one is finished.
What the target does, if anything, I don't think it really matters.
Alternately you found a nice bug. ;)
Again, out of abundance of caution, I'll postpone claiming another scalp until a few more eyes look at what I have, and agree. But, presuming that I am right, this isn't exactly a bug. It's a major fail.
At this point, it doesn't look like systemd is really doing much of a dependency tracking and resolution. It just compiles a list of stuff to kick off, to reach a certain target. Then forks off and runs them all at once. Or almost, all at once; I do see a few seconds' of pauses, in the boot output, when systemd stops reporting that something has stopped or started, for a second or two. But, as a practical matter, everything just gets started.
Well, that's one way to speed up system boot.
P.S. On another server of mine, systemd figures out exactly the right moment to fork off innd so that innd starts listening on 127.0.0.1, but not on ::1 because it hasn't come up yet.
On every boot.
That's awesome.
On Sat, 12 Jul 2014 13:05:59 -0400 Sam Varshavchik mrsam@courier-mta.com wrote:
Kevin Fenzi writes:
So, I would encourage you to ask the systemd list: http://lists.freedesktop.org/mailman/listinfo/systemd-devel for a probably more detailed answer than you might get here.
There's a very specific reason why I don't wish to do that.
Fair enough.
...snip...
Alternately you found a nice bug. ;)
Again, out of abundance of caution, I'll postpone claiming another scalp until a few more eyes look at what I have, and agree. But, presuming that I am right, this isn't exactly a bug. It's a major fail.
Well, I have a machine here using network and haven't had any issues with it. You yourself said (correct me if I'm wrong) that it only started doing this for you recently? Sounds like a bug to me...
Anyhow, good luck.
kevin
On 07/12/14 09:43, Kevin Fenzi wrote:
So, I would encourage you to ask the systemd list: http://lists.freedesktop.org/mailman/listinfo/systemd-devel for a probably more detailed answer than you might get here.
Or... file a bug if you think you have found one. https://bugzilla.redhat.com/enter_bug.cgi?product=Fedora&version=20&...
More likely, they'll try to set_him_straight than address the actual problem ;-)
On Sat, 12 Jul 2014, Edward M wrote:
More likely, they'll try to set_him_straight than address the actual problem ;-)
I really am starting to see this as an example of the Microsoft-ish philosophy of "tell the users where they must go and let them catch up as they can." It works when the users don't have any other choice. But then, I guess Gentoo is the only distro left that hasn't adopted systemd, or will be doing so shortly. I seriously considered switching to Debian, but no joy.
billo
On Sat, 12 Jul 2014 18:43:34 +0000 (UTC) Bill Oliver vendor@billoblog.com wrote:
I really am starting to see this as an example of the Microsoft-ish philosophy of "tell the users where they must go and let them catch up as they can."
What would you suggest people do when they find a bug? Reporting it to the people who can fix it is very much more likely to get it fixed or worked around than complaining about it in a place where no one will fix it.
It works when the users don't have any other choice. But then, I guess Gentoo is the only distro left that hasn't adopted systemd, or will be doing so shortly. I seriously considered switching to Debian, but no joy.
I think slackware has stated that they don't intend to move to systemd.
Good luck wherever you end up.
kevin
Kevin Fenzi kevin@scrye.com writes:
What would you suggest people do when they find a bug? Reporting it to the people who can fix it is very much more likely to get it fixed or worked around than complaining about it in a place where no one will fix it.
Seeing the recent reactions in this list, what's the difference?
I'm affected by a systemd bug, that I have no intention of reporting. My workaround takes me far less time, than getting through the endless layers of fanboys would.
On 07/12/2014 12:52 PM, Anders Wegge Keller wrote:
I'm affected by a systemd bug, that I have no intention of reporting. My workaround takes me far less time, than getting through the endless layers of fanboys would.
You don't care, then, that if you're affected by this bug others are and that it needs to be swatted. If you're not willing to help others, why should anybody bother to help you?
Joe Zeff joe@zeff.us writes:
You don't care, then, that if you're affected by this bug others are and that it needs to be swatted. If you're not willing to help others, why should anybody bother to help you?
Why bother? When trying to diagnose the problem, I found several well-documented bug reports with the same content. Some were almost old enough to get rid of their diapers. They haven't been resolved, so why should I spend time on another futile report?
On 07/12/2014 01:20 PM, Anders Wegge Keller wrote:
Why bother? When trying to diagnose the problem, I found several well-documented bug reports with the same content. Some were almost old enough to get rid of their diapers. They haven't been resolved, so why should I spend time on another futile report?
Maybe if enough people report the same bug the devs will get tired of looking at them, get off their asses and Do Something about it. Ever heard of a squeaky wheel?
Joe Zeff joe@zeff.us writes:
Maybe if enough people report the same bug the devs will get tired of looking at them, get off their asses and Do Something about it. Ever heard of a squeaky wheel?
In this case, the wheel will get a pail of water, when the squeak starts smoldering. So - even if I jinx my effort by stating the intention - my best option is angst-inducing vague posts on a non-technical mailing list.
Mind that the primary systemd developer is employed by the company that thought udev was just the right thing to push on a server distro. I can rant about that for hours, but sadly, I haven't found a place where such things are on topic.
On 07/12/2014 01:34 PM, Anders Wegge Keller wrote:
In this case, the wheel will get a pail of water, when the squeak starts smoldering. So - even if I jinx my effort by stating the intention - my best option is angst-inducing vague posts on a non-technical mailing list.
I may well be wrong, but it's beginning to look as though you're more interested in complaining than in getting something done. Go ahead if that's what you want, but I doubt if I'll be responding in the future.
Joe Zeff joe@zeff.us writes:
I may well be wrong, but it's beginning to look as though you're more interested in complaining than in getting something done. Go ahead if that's what you want, but I doubt if I'll be responding in the future.
I have stopped worrying about "getting anything done". I just answered a rhetorical question.
On 12 Jul 2014 21:52:22 +0200 Anders Wegge Keller wegge@wegge.dk wrote:
Seeing the recent reactions in this list, what's the difference?
I'm not following your logic here.
Given 2 ways of reacting to a bug:
A) complain about it on a users mailing list that few developers follow.
B) report a bug on it that developers will see and be able to respond to.
You are suggesting that because everyone did A and 0 people did B, there's no difference between them? How can you tell if you dismiss the second option?
I'm affected by a systemd bug, that I have no intention of reporting. My workaround takes me far less time, than getting through the endless layers of fanboys would.
Alright. Good luck.
kevin
Kevin Fenzi kevin@scrye.com writes:
I'm not following your logic here.
Given 2 ways of reacting to a bug:
A) complain about it on a users mailing list that few developers follow.
B) report a bug on it that developers will see and be able to respond to.
C) Report a bug and be ignored, told to fsck off to someplace else, and be ridiculed to boot.
If you won't admit having seen option C on the bug tracker, I have no need for your opinions.
On 12 Jul 2014 22:24:13 +0200 Anders Wegge Keller wegge@wegge.dk wrote:
I'm not following your logic here.
Given 2 ways of reacting to a bug:
A) complain about it on a users mailing list that few developers follow.
B) report a bug on it that developers will see and be able to respond to.
C) Report a bug and be ignored, told to fsck off to someplace else, and be ridiculed to boot.
If you won't admit having seen option C on the bug tracker, I have no need for your opinions.
Sure, that happens sadly. ;(
However, with a bug report (even if it's closed or not acted on):
* If the package gets taken over by someone else, they can see the problem and look to fixing it.
* Other interested parties can see the bug and add their data to it, note that it's still a problem, add workarounds, propose patches.
* There's a solid record of the issue/reports. "look at bug 1234" is much better than "read 10,000 posts on the users list around june/july and try and find the actual bug report in the off topic stuff"
* Even just adding yourself to the bug report can indicate that the problem is more widespread than the one user.
Anyhow, if you don't want to file or add info to bugs, that's your decision.
kevin
Kevin Fenzi writes:
Sure, that happens sadly. ;(
However, with a bug report (even if it's closed or not acted on):
* If the package gets taken over by someone else, they can see the problem and look to fixing it.
* Other interested parties can see the bug and add their data to it, note that it's still a problem, add workarounds, propose patches.
* There's a solid record of the issue/reports. "look at bug 1234" is much better than "read 10,000 posts on the users list around june/july and try and find the actual bug report in the off topic stuff"
* Even just adding yourself to the bug report can indicate that the problem is more widespread than the one user.
Anyhow, if you don't want to file or add info to bugs, that's your decision.
I think that the forest here is being missed for the trees. Previously, I mentioned that I do not want to submit any systemd bug reports. Besides bitching just to make me feel better, I am more interested in analyzing systemd's bugs and developing temporary workarounds, like the one that involves this dummy service, in order to be able to bring up network services sanely.
I think that systemd should not be fixed. I think it should be dumped, and replaced. It's fundamentally broken, even without this latest fallout. So, why would I want to report systemd bugs, and help improve it? I can't think of any reason why I'd want to do that.
And I suspect that, perhaps subconsciously, he feels exactly the same way. And I can't blame him.
Sam Varshavchik mrsam@courier-mta.com writes:
I think that systemd should not be fixed. I think it should be dumped, and replaced. It's fundamentally broken, even without this latest fallout. So, why would I want to report systemd bugs, and help improve it? I can't think of any reason why I'd want to do that.
And I suspect that, perhaps subconsciously, he feels exactly the same way. And I can't blame him.
Spot on, except for the subconscious part. In my opinion, systemd is broken by design.
On 12 Jul 2014 22:59:31 +0200 Anders Wegge Keller wegge@wegge.dk wrote:
Spot on, except for the subconscious part. In my opinion, systemd is broken by design.
Alright. I'm sorry you think so, and I disagree with you, but that's perfectly fine.
Let's agree to disagree and go back to helping out folks using Fedora.
kevin
On 07/12/2014 10:59 PM, Anders Wegge Keller wrote:
Sam Varshavchik mrsam@courier-mta.com writes:
I think that systemd should not be fixed. I think it should be dumped, and replaced. It's fundamentally broken, even without this latest fallout. So, why would I want to report systemd bugs, and help improve it? I can't think of any reason why I'd want to do that.
And I suspect that, perhaps subconsciously, he feels exactly the same way. And I can't blame him.
Spot on, except for the subconscious part. In my opinion, systemd is broken by design.
Some of you really sound like this, http://goo.gl/ySCTWx :) No pun intended.
Besides why would anyone spend valuable time on outdated network scripts on top of something called http://man7.org/linux/man-pages/man8/systemd-networkd.8.html
poma
Actually, I take this opportunity to thank all systemd associates for their great effort, commitment and understanding. Godspeed.
poma
poma pomidorabelisima@gmail.com writes:
Besides why would anyone spend valuable time on outdated network scripts on top of something called http://man7.org/linux/man-pages/man8/systemd-networkd.8.html
IMO, if you cannot think of applications, where systemd is inappropriate, you do not have the maturity to go out and recommend it in the first place.
On Sun, 13 Jul 2014 10:39:00 +0200 poma wrote:
Besides why would anyone spend valuable time on outdated network scripts
Probably because NetworkManager still has about 10 years to go before it manages to replicate all the functionality of network, and then another 10 years before users figure out how to use it...
H
On Sun, Jul 13, 2014 at 8:53 AM, Tom Horsley wrote:
Probably because NetworkManager still has about 10 years to go before it manages to replicate all the functionality of network, and then another 10 years before users figure out how to use it...
You haven't looked at the latest release yet I guess
http://blogs.gnome.org/dcbw/2014/06/20/well-build-a-dream-house-of-net/
Check out what's available in Rawhide, including the split packages
Rahul
Rahul Sundaram wrote:
You haven't looked at the latest release yet I guess
http://blogs.gnome.org/dcbw/2014/06/20/well-build-a-dream-house-of-net/
I'm afraid it does not reassure me at all when Dan Williams says he will be "granting every wish you dream of". One of the problems with NM, in my view, is that it tries to do too much. I'd like it just to concentrate on one thing, WiFi.
Ken Thompson's adage still applies: "A program should do one thing, and do it well."
On Sun, 13 Jul 2014 19:18:35 +0200 Timothy Murphy wrote:
I'm afraid it does not reassure me at all when Dan Williams says he will be "granting every wish you dream of".
Yea, and my wish and dream is that NetworkManager goes away, and network simply gets better support for wi-fi, which should have been what happened to begin with :-).
On 14/07/14 01:23, Tom Horsley wrote:
On Sun, 13 Jul 2014 19:18:35 +0200 Timothy Murphy wrote:
I'm afraid it does not reassure me at all when Dan Williams says he will be "granting every wish you dream of".
Yea, and my wish and dream is that NetworkManager goes away, and network simply gets better support for wi-fi, which should have been what happened to begin with :-).
Come on Tom, you know that's not how it works. First of all you have to declare the original code legacy and broken because it doesn't support feature Z. Then you have to write a 'complete' replacement which does support feature Z, but it doesn't support feature X and Y. That doesn't matter really, because not many people used feature X and Y anyway. Plus, they're on the roadmap to be implemented at some point somewhere in the future. Besides the legacy system is still around if you need X and Y. You just can't have X, Y and Z.
Hi
On Sun, Jul 13, 2014 at 1:18 PM, Timothy Murphy wrote:
One of the problems with NM, in my view, is that it tries to do too much.
This is one of the things that NM addresses with plugins so you can pick and choose which features you want out of it. Also by integrating with existing tools, you don't have to go all or nothing. As I noted earlier, in Rawhide and Fedora 21, packages have already been split up.
Rahul
Rahul Sundaram writes:
On Sun, Jul 13, 2014 at 1:18 PM, Timothy Murphy wrote:
One of the problems with NM, in my view, is that it tries to do too much.
This is one of the things that NM addresses with plugins so you can pick and choose which features you want out of it. Also by integrating with existing tools, you don't have to go all or nothing. As I noted earlier, in Rawhide and Fedora 21, packages have already been split up.
Just wondering what I have to look forward to after upgrading a working Fedora 20 server, that currently sets up all network interfaces with static IP addresses, from /etc/sysconfig/network-scripts/ifcfg*; including several <ifname>:1 aliases, and including /etc/udev/rules.d/70-persistent-net.rules which assigned the right network interface name to the appropriate hardware network port – which has worked for about a decade now – to Fedora 21.
So, what exactly would be the probability of the server figuring out how to get back up on its network, after upgrading to Fedora 21; just wondering my chances.
It appears that the entire Linux world hates your guts so perhaps you might consider buying a Mac and using Mac OS X?
At the very least might I suggest that you STFU? Since you appear to not be smart enough to deal with this.
Gee. Are there no moderators here?
On Sun, Jul 13, 2014 at 8:50 PM, Sam Varshavchik mrsam@courier-mta.com wrote:
Rahul Sundaram writes:
On Sun, Jul 13, 2014 at 1:18 PM, Timothy Murphy wrote:
One of the problems with NM, in my view, is that it tries to do too much.
This is one of the things that NM addresses with plugins so you can pick and choose which features you want out of it. Also by integrating with existing tools, you don't have to go all or nothing. As I noted earlier, in Rawhide and Fedora 21, packages have already been split up.
Just wondering what I have to look forward to after upgrading a working Fedora 20 server, that currently sets up all network interfaces with static IP addresses, from /etc/sysconfig/network-scripts/ifcfg*; including several <ifname>:1 aliases, and including /etc/udev/rules.d/70-persistent-net.rules which assigned the right network interface name to the appropriate hardware network port – which has worked for about a decade now – to Fedora 21.
So, what exactly would be the probability of the server figuring out how to get back up on its network, after upgrading to Fedora 21; just wondering my chances.
-- users mailing list users@lists.fedoraproject.org To unsubscribe or change subscription options: https://admin.fedoraproject.org/mailman/listinfo/users Fedora Code of Conduct: http://fedoraproject.org/code-of-conduct Guidelines: http://fedoraproject.org/wiki/Mailing_list_guidelines Have a question? Ask away: http://ask.fedoraproject.org
On Sun, 13 Jul 2014 21:01:35 -0400 David Boles thetinsmith@gmail.com wrote:
It appears that the entire Linux world hates your guts so perhaps you might consider buying a Mac and using Mac OS X?
At the very least might I suggest that you STFU? Since you appear to not be smart enough to deal with this.
Gee. Are there no moderators here?
There are.
First, last and final warning that posts like this are not acceptable.
kevin
So then tell me where the heck have you been for the past several weeks? All the time this user has been whining and crying over this 3 or so year old change that he cannot comprehend?
On Sun, Jul 13, 2014 at 9:06 PM, Kevin Fenzi kevin@scrye.com wrote:
On Sun, 13 Jul 2014 21:01:35 -0400 David Boles thetinsmith@gmail.com wrote:
It appears that the entire Linux world hates your guts so perhaps you might consider buying a Mac and using Mac OS X?
At the very least might I suggest that you STFU? Since you appear to not be smart enough to deal with this.
Gee. Are there no moderators here?
There are.
First, last and final warning that posts like this are not acceptable.
kevin
On Sun, 13 Jul 2014 21:38:33 -0400 David Boles thetinsmith@gmail.com wrote:
So then tell me where the heck have you been for the past several weeks? All the time this user has been whining and crying over this 3 or so year old change that he cannot comprehend?
I was not a moderator for this list until the 10th, when the Fedora project leader asked me to close that one thread.
Personal attacks are unacceptable and will result in moderation/removal from the list.
kevin
07/13/2014 06:01 PM, David Boles wrote:
It appears that the entire Linux world hates your guts so perhaps you might consider buying a Mac and using Mac OS X?
At the very least might I suggest that you STFU? Since you appear to not be smart enough to deal with this.
Gee. Are there no moderators here?
-1 IMO that's WAY out of line... list guidelines explicitly proscribe ad hominem attacks.
David Boles writes:
It appears that the entire Linux world hates your guts so perhaps you might consider buying a Mac and using Mac OS X?
At the very least might I suggest that you STFU? Since you appear to not be smart enough to deal with this.
Gee. Are there no moderators here?
Mr. Boles,
You are obviously unfamiliar with Sam Varshavchik mrsam@courier-mta.com.
Do you see the domain name in his email address? Courier is one of the major mail transfer agents out there, a well-developed alternative to postfix. Sam is largely responsible for it.
As such, he has contributed more to free software than, I dare say, you ever will.
'Linux' most certainly does not hate him. Many of us recognize a considerable debt of gratitude.
On Sun, Jul 13, 2014 at 8:50 PM, Sam Varshavchik mrsam@courier-mta.com wrote:
Rahul Sundaram writes:
This is one of the things that NM addresses with plugins so you can pick and choose which features you want out of it. Also by integrating with existing tools, you don't have to go all or nothing. As I noted earlier, in Rawhide and Fedora 21, packages have already been split up.
Just wondering what I have to look forward to after upgrading a working Fedora 20 server, that currently sets up all network interfaces with static IP addresses, from /etc/sysconfig/network-scripts/ifcfg*; including several <ifname>:1 aliases, and including /etc/udev/rules.d/70-persistent-net.rules which assigned the right network interface name to the appropriate hardware network port – which has worked for about a decade now – to Fedora 21.
So, what exactly would be the probability of the server figuring out how to get back up on its network, after upgrading to Fedora 21; just wondering my chances.
"/etc/rc.d/init.d/network" will most likely disappear one day without being replaced by "/usr/lib/systemd/system/network.service"...
You'll have a choice of networkd, NM, or a home-grown "network.service".
For networkd:
I have a Rawhide VM with a udev rule and an ifcfg file (both of which still work). I moved them out of the way and replaced them with:
# cat /etc/systemd/network/net0.link
[Match]
MACAddress=52:54:00:16:16:16

[Link]
Name=net0

# cat /etc/systemd/network/net0.network
[Match]
Name=net0

[Address]
Address=10.0.2.16/24

[Route]
Gateway=10.0.2.2
and enabled "/usr/lib/systemd/system/systemd-networkd.service". The result is:
# ip -4 a sh dev net0
2: net0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    inet 10.0.2.16/24 brd 10.0.2.255 scope global net0
       valid_lft forever preferred_lft forever

# ip r
default via 10.0.2.2 dev net0
10.0.2.0/24 dev net0 proto kernel scope link src 10.0.2.16
For NM:
You'd create:
# cat /etc/NetworkManager/system-connections/net0
[connection]
id=net0
type=802-3-ethernet

[ipv4]
method=manual
address1=10.0.2.16/24,10.0.2.2

[802-3-ethernet]
mac-address=52:54:00:16:16:16
For networkd:
To create an alias, you have to add
[Address]
Label=net0:0
Address=10.0.2.16/24
For NM:
I have no idea how to mimic the label part of "ip a add <ip> dev net0 label net0:0".
poma writes:
Besides why would anyone spend valuable time on outdated network scripts on top of something called http://man7.org/linux/man-pages/man8/systemd-networkd.8.html
yum is unable to find this package. As such, even if it's the greatest thing since sliced bread, it means nothing as far as Fedora is concerned.
On Sun, 13 Jul 2014 20:37:40 -0400 Sam Varshavchik mrsam@courier-mta.com wrote:
poma writes:
Besides why would anyone spend valuable time on outdated network scripts on top of something called http://man7.org/linux/man-pages/man8/systemd-networkd.8.html
yum is unable to find this package. As such, even if it's the greatest thing since sliced bread, it means nothing as far as Fedora is concerned.
It's in f21/rawhide.
kevin
On 14.07.2014 02:49, Kevin Fenzi wrote:
On Sun, 13 Jul 2014 20:37:40 -0400 Sam Varshavchik mrsam@courier-mta.com wrote:
poma writes:
Besides why would anyone spend valuable time on outdated network scripts on top of something called http://man7.org/linux/man-pages/man8/systemd-networkd.8.html
yum is unable to find this package. As such, even if it's the greatest thing since sliced bread, it means nothing as far as Fedora is concerned.
It's in f21/rawhide.
kevin
Besides it's not the package per se,
$ rpm -q --whatprovides /usr/lib/systemd/systemd-networkd
systemd-215-4.git3864c28.20140711.fc21.x86_64
Even though it is not offered as part of the official Fedora stable repo, it is still very possible. I successfully tested it, not only within Fedora 20, and not only this particular version. And not only this particular package. :) OK Sam?
poma
poma writes:
Besides it's not the package per se,
$ rpm -q --whatprovides /usr/lib/systemd/systemd-networkd
systemd-215-4.git3864c28.20140711.fc21.x86_64
Even though it is not offered as part of the official Fedora stable repo, it is still very possible. I successfully tested it, not only within Fedora 20, and not only this particular version. And not only this particular package. :) OK Sam?
Ok, ok, fair enough.
If I get drunk enough, I might even try installing F21's systemd onto F20. But it's going to require a lot of drinking…
On 12.07.2014, Anders Wegge Keller wrote:
C) Report a bug and be ignored, told to fsck off to someplace else, and be ridiculed to boot.
I fully understand your reaction.
I reported a (quite different) bug with systemd and got zero response. After some (longer) time, I finally got a reaction, which was a single comment accusing me of using a "weird" system/configuration (which was stock Fedora, by the way, and thus common to all Fedora users who used this feature).
So next time I'll use my time to work around future bugs or find another solution on my own rather than writing bug reports. The main cause is not the accusation of using a weird system, but the unwillingness to even try to understand or to look deeper. My report wasn't worth it, obviously. After all, it's free software without any guarantee. A short message saying "Well, I see, but unfortunately I don't have the time to look at this any further" or something similar would have been quite OK for me.
On 07/12/2014 11:43 AM, Bill Oliver wrote:
I really am starting to see this as an example of the Microsoft-ish philosophy of "tell the users where they must go and let them catch up as they can." It works when the users don't have any other choice. But then, I guess Gentoo is the only distro left that hasn't adopted systemd, or will be doing so shortly. I seriously considered switching to Debian, but no joy.
If you're really that unhappy about such things as systemd, there's nothing to stop you from rolling your own distro. Start with Linux from Scratch (The Wikipedia article still shows it using Sysvinit.) put in what you think an "old-school" linux should have and call it Steampunk Linux to grab a bit of glamour and avoid looking like you're just unwilling to accept progress. (Not that I'm accusing you of that; from what I can see, you just don't consider systemd to *be* progress.) Who knows? It might even be popular.
On Sat, 12 Jul 2014 13:45:56 -0700 Joe Zeff wrote:
put in what you think an "old-school" linux should have and call it Steampunk Linux to grab a bit of glamour
No, call it "potskedeerf" linux, and make the guiding principle be that nothing advocated by freedesktop is included in the distro :-).
On Sat, 12 Jul 2014, Joe Zeff wrote:
On 07/12/2014 11:43 AM, Bill Oliver wrote:
I really am starting to see this as an example of the Microsoft-ish philosophy of "tell the users where they must go and let them catch up as they can." It works when the users don't have any other choice. But then, I guess Gentoo is the only distro left that hasn't adopted systemd, or will be doing so shortly. I seriously considered switching to Debian, but no joy.
If you're really that unhappy about such things as systemd, there's nothing to stop you from rolling your own distro. Start with Linux from Scratch (The Wikipedia article still shows it using Sysvinit.) put in what you think an "old-school" linux should have and call it Steampunk Linux to grab a bit of glamour and avoid looking like you're just unwilling to accept progress. (Not that I'm accusing you of that; from what I can see, you just don't consider systemd to *be* progress.) Who knows? It might even be popular.
Heh. That reminds me of the head of TSA, I think it was, saying that if people didn't like having their genitals fondled, then they were free to walk from New York to San Francisco -- they didn't *have* to fly.
Yeah, I know. Thanks for that useful suggestion.
billo
On 07/12/2014 02:10 PM, Bill Oliver wrote:
Yeah, I know. Thanks for that useful suggestion.
Any time. And, I wasn't even intending to be sarcastic. I wanted to point out that, unlike Windows, if you don't like the way Linux works, you're free in every sense of the word to Do Something about it, including create a new distro if you really want to.
BTW, it took me a little bit of careful thinking to come up with that name, because I didn't want to come across as trying to make you look like King Canute, trying to hold back the tide. If nothing else, nobody with that attitude has any business being involved with a bleeding edge distro such as Fedora!
On 07/12/2014 11:38 PM, Joe Zeff wrote:
On 07/12/2014 02:10 PM, Bill Oliver wrote:
Yeah, I know. Thanks for that useful suggestion.
Any time. And, I wasn't even intending to be sarcastic. I wanted to point out that, unlike Windows, if you don't like the way Linux works, you're free in every sense of the word to Do Something about it, including create a new distro if you really want to.
BTW, it took me a little bit of careful thinking to come up with that name, because I didn't want to come across as trying to make you look like King Canute, trying to hold back the tide. If nothing else, nobody with that attitude has any business being involved with a bleeding edge distro such as Fedora!
... "Am I the greatest man in the world?" he asked. "O king!" they cried, "there is no one so mighty as you." "Do all things obey me?" he asked. "There is nothing that dares to disobey you, O king!" they said. "The world bows before you, and gives you honor." "Will the sea obey me?" he asked; and he looked down at the little waves which were lapping the sand at his feet. The foolish officers were puzzled, but they did not dare to say "No." "Command it, O king! and it will obey," said one. "Sea," cried Canute, "I command you to come no farther! Waves, stop your rolling, and do not dare to touch my feet!"
King Canute on the Seashore
poma
On Sat, Jul 12, 2014 at 10:00 AM, Sam Varshavchik mrsam@courier-mta.com wrote:
Now that I have your attention, the background is as follows. This is a server with only statically configured network interfaces. NetworkManager is not installed. All network interfaces are statically configured via /etc/sysconfig/network-scripts.
The server is regularly updated to current Fedora packages. For the last month, or so, the server has failed to come up in a sane state, reliably. After it responds to pings, after ssh-ing in, and examining the aftermath, the logs of all network services are consistent, in that they claim that each network service – which includes: named-chroot, httpd, dhcpd, and privoxy – their boot logs claim that no network interfaces were up at the time they're started.
After finally getting pissed about having to manually re-brain the server, each time it boots, I attached a console monitor, and observed that the boot goes /very/ quickly, and the console login prompt comes up about 20-30 seconds before the server even starts responding to pings. Looks like the multi-user target is reached way long before networking even comes up.
Last week, I've commented on the following curiosity: after sifting through systemd's documentation, their documentation claims that "network.target" gets reached only after basic networking is up, and "network-online.target" gets reached only after all network interfaces are initialized.
Problem number one is that all servers specify "After=network.target", when, according to how I interpret this, they should all really specify "After=network-online.target".
After that, it came to my attention that there's a NetworkManager optional subpackage that installs a service that waits for network interfaces to come up, and it's specified as "Before=network.target network-online.target". It seems fairly obvious to me that it should really be "Before=network-online.target" and "After=network.target", with all other services that require a functioning network specifying "After=network-online.target". That made logical sense to me, but it seems that this confusing arrangement makes logical sense to someone else, so, whatever. I do not have NetworkManager installed, but, I figure, why not take a crack at whipping up a dirty hack that basically does the same thing?
But the unexpected result from the hack is that it seems to provide solid proof that systemd's dependency resolution is not working, but before I Bugzilla this (as little hope one might have from getting anything useful done by Bugzillaing this), I'd like to hear some consensus that I am interpreting the following data right. Who knows, I might actually have made a mistake, somewhere.
Let's take a look at what named-chroot.service says:
[Unit]
Description=Berkeley Internet Name Domain (DNS)
Wants=nss-lookup.target
Before=nss-lookup.target
After=network.target
Are we all in agreement that named-chroot.service should only be started after network.target gets reached? Ok.
Now, here's my hack, which is basically a clone of that NetworkManager subpackage:
# cat /etc/systemd/system/wait-for-network.service
[Unit]
Description=Wait for network ports to be initialized
Before=network.target network-online.target

[Service]
Type=oneshot
ExecStart=/root/bin/wait-for-network

[Install]
WantedBy=multi-user.target
Are we all in agreement that:
- This is a one-shot service, and according to systemd's documentation, systemd must wait until this script is complete, before it's considered started.
- Until it's complete, network.target isn't reached.
- Therefore, this script must finish before systemd should start named-chroot.service
Yet, after testing this script, then activating it, the server still came up utterly brainless after the reboot. The results:
systemctl status named-chroot.service reports:
named-chroot.service - Berkeley Internet Name Domain (DNS)
   Loaded: loaded (/usr/lib/systemd/system/named-chroot.service; enabled)
   Active: active (running) since Sat 2014-07-12 09:24:29 EDT; 3min 28s ago
…
So, systemd started named-chroot.service at 09:24:29.
My script logs the current timestamp. The output from /root/bin/wait-for-network was as follows:
Sat Jul 12 09:24:27 2014
Interface: lo is up
Sat Jul 12 09:24:32 2014
Interface: lan0 is up
Interface: lo is up
Interface: wan0 is down
Sat Jul 12 09:24:37 2014
Interface: lan0 is up
Interface: lo is up
Interface: wan0 is up
systemd started this script at 09:24:27. This script spun its wheels until 09:24:37, at which time all network interfaces finally came up. I'm happy to post the contents of this short script; however I don't think that it's relevant here, because the problem is that this script was running when systemd decided to run named-chroot.service, even though, according to the above, this should not happen.
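The overlap can be checked mechanically. A minimal sketch in Python, using only the timestamps quoted above; the variable and function names are illustrative, and the "EDT" suffix is dropped on the assumption that both clocks report the same local time zone:

```python
from datetime import datetime

# Timestamps written by /root/bin/wait-for-network, one per poll.
script_log = [
    "Sat Jul 12 09:24:27 2014",
    "Sat Jul 12 09:24:32 2014",
    "Sat Jul 12 09:24:37 2014",
]

# "Active: ... since" value from systemctl status, time zone suffix removed.
service_started = "Sat 2014-07-12 09:24:29"


def parse_log(stamp):
    """Parse the Perl 'scalar localtime' format used in the script's log."""
    return datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")


def parse_status(stamp):
    """Parse the timestamp format printed by systemctl status."""
    return datetime.strptime(stamp, "%a %Y-%m-%d %H:%M:%S")


polls = [parse_log(s) for s in script_log]
script_begin, script_end = polls[0], polls[-1]
started = parse_status(service_started)

# True if named-chroot.service was started while the oneshot was still running.
overlap = script_begin <= started < script_end
seconds_early = (script_end - started).total_seconds()

print(overlap, seconds_early)
```

With the values from this boot, the service start falls inside the script's run, several seconds before the last poll that saw every interface up.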
So, either I'm misreading the description of "oneshot" in systemd.service(5); and "Before" and "After" in systemd.unit(5), or systemd is broken completely. I think that my understanding of systemd's documentation is very reasonable. So, either systemd is broken, or, if it's supposedly working how it should be working, its documentation is crap, and is impossible to follow. I see no other possibilities.
"NetworkManager-wait-online.service" has:
<begin>
After=NetworkManager.service
Wants=network.target
Before=network.target network-online.target
</end>
Perhaps you should replicate that in "wait-for-network.service" for it to behave as you intend it.
I assume that you can replicate "After=NetworkManager.service" with "After=network.service" even though your network's being brought up by "/etc/rc.d/init.d/network".
What do you have in "/root/bin/wait-for-network"?
Tom H writes:
"NetworkManager-wait-online.service" has:
<begin>
After=NetworkManager.service
Wants=network.target
Before=network.target network-online.target
</end>
Perhaps you should replicate that in "wait-for-network.service" for it to behave as you intend it.
I assume that you can replicate "After=NetworkManager.service" with "After=network.service" even though your network's being brought up by "/etc/rc.d/init.d/network".
s/even though/because/
Ok, I tried the following variation:
[Unit]
Description=Wait for network ports to be initialized
Before=network.target network-online.target
After=network.service
Wants=network.target

[Service]
Type=oneshot
ExecStart=/root/bin/wait-for-network

[Install]
WantedBy=multi-user.target
With this, the first reboot initialized all servers properly, and the console logging during the boot went back to what it was about a month or two ago, before everything went off the rails.
What do you have in "/root/bin/wait-for-network"?
Nothing particularly exciting. I'll attach it below. But whatever it does or does not have, it should not affect the order in which systemd starts processes. It can't possibly affect the order in which systemd starts processes. It has no means of controlling the order; yet, with the previous service file, with having just a "Before=network.target", systemd was clearly launching a different service that stated "After=network.target", before this one finished. Which was broken.
On the last boot, the following script logged that all interfaces were already up when it started. Previously, it showed that only lo was up, and the rest came up after it waited ten seconds. So the difference now is clearly because of "After=network.service", but that alone doesn't explain it. Even without it, this service doesn't officially start, according to the documented specification of a one-shot service, until network.service finished bringing up all the interfaces.
So that leaves the addition of "Wants=network.target". It apparently has an effect on the ordering of services that get started. Now, you read its description, in systemd's documentation and tell me if it says anything about controlling the order of services. The plain reading of the documentation only suggests that this controls what gets started, and not which order anything gets started with.
After grepping around, it became even more obvious how brain-dead systemd is. The new service that I added is the only service that specifies "Wants=network.target". None of the stock Fedora services – including the ones that fail to start properly until the network connections are up – specify "Wants=network.target". No Fedora server package, that I have installed, specifies "Wants=network-online.target" either; but I have some non-Fedora packages that specify "Wants=network-online.target". And until I installed this custom service, nothing else wanted network.target either.
So… Looks like the bug is as follows. The system comes up with systemd configured to bring up multi-user.target. But unless something explicitly specifies Wants=network.target, systemd is going to completely ignore this target, and completely ignore all Before or After network.target specification, from any service that it is starting, and ignore any obvious dependencies that result from Before or After network.target. All without giving any hint as to what it's doing.
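That guess can be illustrated with a toy model. This is a sketch in Python, not systemd's actual algorithm. It assumes, per systemd.unit(5), that Wants= decides which units get started at all, and that Before=/After= only order two units when both are already being started. The dictionaries, the function names, and the assumption that both services are pulled in by multi-user.target are all illustrative:

```python
def transaction(units, goal):
    """Collect every unit pulled in from `goal` by following Wants= only."""
    jobs, todo = set(), [goal]
    while todo:
        name = todo.pop()
        if name in jobs:
            continue
        jobs.add(name)
        todo.extend(units.get(name, {}).get("Wants", []))
    return jobs


def ordering_edges(units, jobs):
    """Return (earlier, later) pairs, ignoring units that are not being started."""
    edges = set()
    for name in jobs:
        spec = units.get(name, {})
        for other in spec.get("After", []):
            if other in jobs:
                edges.add((other, name))
        for other in spec.get("Before", []):
            if other in jobs:
                edges.add((name, other))
    return edges


def must_precede(edges, first, second):
    """True if the ordering edges force `first` to finish before `second` starts."""
    seen, todo = set(), [first]
    while todo:
        node = todo.pop()
        for earlier, later in edges:
            if earlier == node and later not in seen:
                seen.add(later)
                todo.append(later)
    return second in seen


def ordered(units):
    jobs = transaction(units, "multi-user.target")
    edges = ordering_edges(units, jobs)
    return must_precede(edges, "wait-for-network.service", "named-chroot.service")


# First attempt: nothing wants network.target, so it never joins the transaction.
base = {
    "multi-user.target": {"Wants": ["named-chroot.service", "wait-for-network.service"]},
    "named-chroot.service": {"After": ["network.target"]},
    "wait-for-network.service": {"Before": ["network.target", "network-online.target"]},
}

# Second attempt: the only change is Wants=network.target on the new service.
fixed = dict(base)
fixed["wait-for-network.service"] = {
    "Before": ["network.target", "network-online.target"],
    "Wants": ["network.target"],
}

print(ordered(base), ordered(fixed))
```

In this model the first configuration leaves the two services unordered, so they are free to start in parallel, while the second chains wait-for-network.service, network.target and named-chroot.service. Whether this is what actually changed on the server is still only a guess.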
As I said: this used to work, at least for me, reliably, until about a month or two ago, and I see that there have been a couple of systemd updates since. I can almost hear someone already bleating "this is how systemd is designed to work, and the problem is with incorrect service files".
Presuming that my guess as to what the bug is, is correct: this is a bunk answer. Either this was a major change in systemd's behavior, recently, or not. If not: systemd documentation is crap, and this should've been documented in the "Wants" documentation a long time ago. If so: systemd just broke everything, without bothering to announce this major change in behavior. And this is also crap.
But this should not surprise anyone. Recall, earlier this year, Linus calling out systemd's maintainer for pulling the exact same kind of a stunt: breaking something, in this case a kernel boot with the "debug" option, and then acting as if it's the kernel's problem to solve. Same exact snobbery and arrogance.
Well, anyway, here's my script, FWIW, which is really a moot point; since by using After=network.service it's not really necessary, any more.
#! /usr/bin/perl

use IO::File;
use strict;
use warnings;

open(LOG, ">/tmp/wait-for-network.log");

sub get_ifconfig {
    my $fh=IO::File->new;

    open($fh, "-|", "/usr/sbin/ifconfig") or exit 0;

    my $up=0;

    print LOG ((scalar localtime) . "\n");
    while (defined ($_=<$fh>))
    {
        chomp;

        next unless /^([^ ]+):/;

        my $ifname=$1;

        my $found=0;

        while (defined ($_=<$fh>))
        {
            chomp;
            last unless /^ /;
            $found=1 if /^ +inet/;
        }

        print LOG "Interface: $ifname is " . ($found ? "up":"down") . "\n";

        return 0 unless $found; # Some network interface does not have an IP address, yet
        ++$up unless $ifname eq "lo";
    }
    return $up;
}

foreach (1..60)
{
    exit 0 if get_ifconfig;
    sleep(5);
}
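For anyone who wants to test the parsing logic without rebooting, here is a rough Python port of get_ifconfig that works on captured text instead of running /usr/sbin/ifconfig. It is a sketch: the function name and both sample strings are made up for illustration (documentation addresses, not this server's), and the 60 x 5 second retry loop is left out.

```python
import re

# Hypothetical ifconfig output: wan0 is listed but has no address yet.
SAMPLE_PENDING = """\
lan0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.0.2.1  netmask 255.255.255.0  broadcast 192.0.2.255

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0

wan0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        ether 52:54:00:00:00:01  txqueuelen 1000  (Ethernet)
"""

# Same output a few seconds later, once wan0 has an address.
SAMPLE_READY = SAMPLE_PENDING + (
    "        inet 198.51.100.7  netmask 255.255.255.0  broadcast 198.51.100.255\n"
)


def count_configured(ifconfig_text):
    """Return how many non-loopback interfaces have an address.

    Returns 0 as soon as any listed interface has no inet line,
    mirroring 'return 0 unless $found' in the Perl script.
    """
    lines = iter(ifconfig_text.splitlines())
    up = 0
    for line in lines:
        header = re.match(r"([^ ]+):", line)
        if not header:
            continue
        name = header.group(1)
        found = False
        # Consume the indented detail lines that belong to this interface.
        for detail in lines:
            if not detail.startswith(" "):
                break
            if re.match(r" +inet", detail):
                found = True
        if not found:
            return 0
        if name != "lo":
            up += 1
    return up


print(count_configured(SAMPLE_PENDING), count_configured(SAMPLE_READY))
```

A zero result corresponds to the case where the original script sleeps and polls again; any non-zero result is where it exits.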
On Sat, Jul 12, 2014 at 3:16 PM, Sam Varshavchik mrsam@courier-mta.com wrote:
Tom H writes:
"NetworkManager-wait-online.service" has:
<begin>
After=NetworkManager.service
Wants=network.target
Before=network.target network-online.target
</end>
Perhaps you should replicate that in "wait-for-network.service" for it to behave as you intend it.
I assume that you can replicate "After=NetworkManager.service" with "After=network.service" even though your network's being brought up by "/etc/rc.d/init.d/network".
Ok, I tried the following variation:
[Unit]
Description=Wait for network ports to be initialized
Before=network.target network-online.target
After=network.service
Wants=network.target

[Service]
Type=oneshot
ExecStart=/root/bin/wait-for-network

[Install]
WantedBy=multi-user.target
With this, the first reboot initialized all servers properly, and the console logging during the boot went back to what it was about a month or two ago, before everything went off the rails.
Good; although Corinna's "/etc/systemd/system/network-waiter.service" is a cleaner solution.
You could also override the Fedora-provided service with "/etc/systemd/system/named-chroot.service":

[Unit]
Description=Berkeley Internet Name Domain (DNS)
Wants=nss-lookup.target
Before=nss-lookup.target
After=network-online.target
## or network.target network-online.target
But, if I understand correctly, you want to figure out why your unit isn't ordered the way that it ought to be, so you're not just interested in a solution that works.
What do you have in "/root/bin/wait-for-network"?
Nothing particularly exciting. I'll attach it below. But whatever it does or does not have, it should not affect the order in which systemd starts processes. It can't possibly affect the order in which systemd starts processes. It has no means of controlling the order;
I wasn't questioning whether your script was affecting the startup order.
yet, with the previous service file, with having just a "Before=network.target", systemd was clearly launching a different service that stated "After=network.target", before this one finished. Which was broken.
On the last boot, the following script logged that all interfaces were already up when it started. Previously, it showed that only lo0 was up, and the rest came up after it waited ten seconds. So the difference now is clearly because of "After=network.service", but that alone doesn't explain it. Even without it, this service doesn't officially start, according to the documented specification of a one-shot service, until network.service finished bringing up all the interfaces.
So that leaves the addition of "Wants=network.target". It apparently has an effect on the ordering of services that get started. Now, you read its description, in systemd's documentation and tell me if it says anything about controlling the order of services. The plain reading of the documentation only suggests that this controls what gets started, and not which order anything gets started with.
From systemd.special:
network-online.target Units that strictly require a configured network connection should pull in network-online.target (via a Wants= type dependency) and order themselves after it. This target unit is intended to pull in a service that delays further execution until the network is sufficiently set up. What precisely this requires is left to the implementation of the network managing service. Note the distinction between this unit and network.target. This unit is an active unit (i.e. pulled in by the consumer rather than the provider of this functionality) and pulls in a service which possibly adds substantial delays to further execution. In contrast, network.target is a passive unit (i.e. pulled in by the provider of the functionality, rather than the consumer) that usually does not delay execution much. Usually, network.target is part of the boot of most systems, while network-online.target is not, except when at least one unit requires it. Also see Running Services After the Network is up[1] for more information. All mount units for remote network file systems automatically pull in this unit, and order themselves after it. Note that networking daemons that simply provide functionality to other hosts generally do not need to pull this in.
("[1]" is the freedesktop url that Kevin posted.)
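The guidance quoted above ("pull in network-online.target via a Wants= type dependency and order themselves after it") can be sketched as a drop-in for a consumer service. This is only an illustration, assuming a systemd version that supports drop-in directories. The file name wait-online.conf is hypothetical, and named-chroot.service is used only because it is the unit under discussion in this thread. The target delays anything only if some service ordered before it, such as wait-for-network.service, is enabled.

```ini
# /etc/systemd/system/named-chroot.service.d/wait-online.conf
# (hypothetical file name; any *.conf file in this directory is read)
[Unit]
# Requirement: pull the target into the boot transaction...
Wants=network-online.target
# ...and, as a separate setting, order this service after it.
After=network-online.target
```

After adding such a file, "systemctl daemon-reload" makes systemd re-read its unit files. Wants= and After= are independent settings: the first decides what gets started, the second only decides the order among units that are already being started.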
After grepping around, it became even more obvious how brain-dead systemd is. The new service that I added is the only service that specifies "Wants=network.target". None of the stock Fedora services – including the ones that fail to start properly until the network connections are up – specify "Wants=network.target". No Fedora server package, that I have installed, specifies "Wants=network-online.target" either; but I have some non-Fedora packages that specify "Wants=network-online.target". And until I installed this custom service, nothing else wanted network.target either.
"NetworkManager-wait-online.service" has "Wants=network.target", which is why I suggested the same for "wait-for-network.service".
So… Looks like the bug is as follows. The system comes up with systemd configured to bring up multi-user.target. But unless something explicitly specifies Wants=network.target, systemd is going to completely ignore this target, and completely ignore all Before or After network.target specification, from any service that it is starting, and ignore any obvious dependencies that result from Before or After network.target. All without giving any hint as to what it's doing.
As I said: this used to work, at least for me, reliably, until about a month or two ago, and I see that there have been a couple of systemd updates since. I can almost hear someone already bleating "this is how systemd is designed to work, and the problem is with incorrect service files".
Presuming that my guess as to what the bug is, is correct: this is a bunk answer. Either this was a major change in systemd's behavior, recently, or not. If not: systemd's documentation is crap, and this should've been documented in the "Wants" documentation a long time ago. If so: systemd just broke everything, without bothering to announce this major change in behavior. And this is also crap.
But this should not surprise anyone. Recall, earlier this year, Linus calling out systemd's maintainer for pulling the exact same kind of a stunt: breaking something, in this case a kernel boot with the "debug" option, and then acting as if it's the kernel's problem to solve. Same exact snobbery and arrogance.
Well, anyway, here's my script, FWIW, which is really a moot point; since by using After=network.service it's not really necessary, any more.
Thanks. I was just curious what was creating the output that you showed in your earlier email.
Hi Sam, I don't know anything about Systemd, nor have I read the rest of the responses to this, but just looking at the logical interpretation of your named-chroot.service statements it seems to me that you are requesting that named-chroot.service be started after network.target but before nss-lookup.target and that it needs nss-lookup.target to be active, which to me seems to be a deadly embrace. Based on what I think you are saying in your email I would have thought that logically your "before" statement should be removed and your "after" statement should be after network.target network-online.target nss-lookup.target, but then I am not sure how systemd works.
regards, Steve
On 07/13/2014 12:00 AM, Sam Varshavchik wrote:
Now that I have your attention, the background is as follows. This is a server with only statically configured network interfaces. NetworkManager is not installed. All network interfaces are statically configured via /etc/sysconfig/network-scripts.
The server is regularly updated to current Fedora packages. For the last month, or so, the server has failed to come up in a sane state, reliably. After it responds to pings, after ssh-ing in, and examining the aftermath, the logs of all network services are consistent, in that they claim that each network service – which includes: named-chroot, httpd, dhcpd, and privoxy – their boot logs claim that no network interfaces were up at the time they're started.
After finally getting pissed about having to manually re-brain the server, each time it boots, I attached a console monitor, and observed that the boot goes /very/ quickly, and the console login prompt comes up about 20-30 seconds before the server even starts responding to pings. Looks like the multi-user target is reached way long before networking even comes up.
Last week, I've commented on the following curiosity: after sifting through systemd's documentation, their documentation claims that "network.target" gets reached only after basic networking is up, and "network-online.target" gets reached only after all network interfaces are initialized.
Problem number one is that all servers specify "After=network.target", when, according to how I interpret this, they should all really specify "After=network-online.target".
After that, it came to my attention that there's a NetworkManager optional subpackage that installs a service that waits for network interfaces to come up, and it's specified as "Before=network.target network-online.target". It seems fairly obvious to me that it should really be "Before=network-online.target" and "After=network.target", with all other services that require a functioning network specifying "After=network-online.target". That made logical sense to me, but it seems that this confusing arrangement makes logical sense to someone else, so, whatever. I do not have NetworkManager installed, but, I figure, why not take a crack at whipping up a dirty hack that basically does the same thing?
But the unexpected result from the hack is that it seems to provide solid proof that systemd's dependency resolution is not working, but before I Bugzilla this (as little hope one might have from getting anything useful done by Bugzillaing this), I'd like to hear some consensus that I am interpreting the following data right. Who knows, I might actually have made a mistake, somewhere.
Let's take a look at what named-chroot.service says:
[Unit]
Description=Berkeley Internet Name Domain (DNS)
Wants=nss-lookup.target
Before=nss-lookup.target
After=network.target
Are we all in agreement that named-chroot.service should only be started after network.target gets reached? Ok.
Now, here's my hack, which is basically a clone of that NetworkManager subpackage:
# cat /etc/systemd/system/wait-for-network.service
[Unit]
Description=Wait for network ports to be initialized
Before=network.target network-online.target

[Service]
Type=oneshot
ExecStart=/root/bin/wait-for-network

[Install]
WantedBy=multi-user.target
Are we all in agreement that:
- This is a one-shot service, and according to systemd's documentation, systemd must wait until this script is complete, before it's considered started.
- Until it's complete, network.target isn't reached.
- Therefore, this script must finish before systemd should start named-chroot.service.
Yet, after testing this script, then activating it, the server still came up utterly brainless after the reboot. The results:
systemctl status named-chroot.service reports:
named-chroot.service - Berkeley Internet Name Domain (DNS)
   Loaded: loaded (/usr/lib/systemd/system/named-chroot.service; enabled)
   Active: active (running) since Sat 2014-07-12 09:24:29 EDT; 3min 28s ago
…
So, systemd started named-chroot.service at 09:24:29.
My script logs the current timestamp. The output from /root/bin/wait-for-network was as follows:
Sat Jul 12 09:24:27 2014
Interface: lo is up
Sat Jul 12 09:24:32 2014
Interface: lan0 is up
Interface: lo is up
Interface: wan0 is down
Sat Jul 12 09:24:37 2014
Interface: lan0 is up
Interface: lo is up
Interface: wan0 is up
systemd started this script at 09:24:27. This script spun its wheels until 09:24:37, at which time all network interfaces finally came up. I'm happy to post the contents of this short script; however I don't think that it's relevant here, because the problem is that this script was running when systemd decided to run named-chroot.service, even though, according to the above, this should not happen.
So, either I'm misreading the description of "oneshot" in systemd.service(5); and "Before" and "After" in systemd.unit(5), or systemd is broken completely. I think that my understanding of systemd's documentation is very reasonable. So, either systemd is broken, or, if it's supposedly working how it should be working, its documentation is crap, and is impossible to follow. I see no other possibilities.
Stephen Morris writes:
Hi Sam, I don't know anything about Systemd, nor have I read the rest of the responses to this, but just looking at the logical interpretation of your named-chroot.service statements it seems to me that you are requesting that
Well, technically it's not "my" named-chroot.service. This is the standard named-chroot RPM's service file, that it installs. You do a "yum install bind-chroot", that's what you'll end up getting installed.
named-chroot.service be started after network.target but before nss-lookup.target and that it needs nss-lookup.target to be active, which to me seems to be a deadly embrace. Based on what I think you are saying in your email I would have thought that logically your "before" statement should be removed and your "after" statement should be after network.target network-online.target nss-lookup.target, but then I am not sure how systemd works.
Don't feel bad about not knowing how systemd works. You're in good company.
named-chroot.service wants to be started before nss-lookup.target because bind's mission in life is to provide host/IP lookups, and everything that requires working host/IP lookup, according to systemd.special(7), specifies itself to be after nss-lookup.target; so if bind is going to provide host/IP lookup, it needs to come up first. That makes logical sense.
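To illustrate why this is not a circular dependency, here is a minimal sketch of the consumer side. The unit name needs-dns.service and its binary path are hypothetical, made up for this example. Per systemd.special(7), such a service orders itself after nss-lookup.target without pulling it in; the provider (named-chroot.service) is the one that carries both Wants= and Before= for the target.

```ini
# /etc/systemd/system/needs-dns.service (hypothetical unit, for illustration only)
[Unit]
Description=Hypothetical service that needs working name resolution
# Ordering only: start after the name lookup synchronization point.
After=nss-lookup.target

[Service]
# Hypothetical binary path.
ExecStart=/usr/local/bin/needs-dns

[Install]
WantedBy=multi-user.target
```

The resulting order is named-chroot.service, then nss-lookup.target, then needs-dns.service. Wants= on the provider only makes sure that the target is part of the boot; it imposes no ordering of its own, so combining it with Before= does not create a loop.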
But systemd's entire dependency model is headache-inducing. It naturally leads to impossible situations. More than once, I observed systemd spewing errors about circular dependencies, at system boot.
Oy, what a mess. At least today, with invaluable help from fellow sufferers, I think I finally figured out what's wrong with the stock service files installed by a bunch of server RPMs, and how to work around it.
As far as I know, systemd will start the *.service units first, and then the sysvinit scripts. I think that can cause the problem. Maybe if you create a foo.service which points to your sysvinit script and set the right order in the dependency list... maybe, I'm just guessing...
On Sun, 2014-07-13 at 10:57 +1000, Stephen Morris wrote:
Hi Sam, I don't know anything about Systemd, nor have I read the rest of the responses to this, but just looking at the logical interpretation of your named-chroot.service statements it seems to me that you are requesting that named-chroot.service be started after network.target but before nss-lookup.target and that it needs nss-lookup.target to be active, which to me seems to be a deadly embrace. Based on what I think you are saying in your email I would have thought that logically your "before" statement should be removed and your "after" statement should be after network.target network-online.target nss-lookup.target, but then I am not sure how systemd works.
regards, Steve