Hi all,
I have no idea why this might be happening. I've looked at the systemd control file for postfix and compared it to dovecot's. Both want the network.target before starting.
The difference is that dovecot starts correctly. Postfix does not. The problem has only appeared since Saturday, when I probably did a yum update.
Postfix issues an error from one of the pre-commands having to do with the aliases database saying its network interface (and it duly lists the IPv4 address) doesn't exist. Starting postfix manually later works.
Thanks!
On 07/01/2014 01:13 PM, David Benfell issued this missive:
Hi all,
I have no idea why this might be happening. I've looked at the systemd control file for postfix and compared it to dovecot's. Both want the network.target before starting.
"want" is a weakened version of "requires". It doesn't guarantee start sequence and if the target fails to start, it doesn't stop THIS service from starting ("requires" WOULD prevent this service from starting if the target of the "requires" failed).
The difference is that dovecot starts correctly. Postfix does not. The problem has only appeared since Saturday, when I probably did a yum update.
Postfix issues an error from one of the pre-commands having to do with the aliases database saying its network interface (and it duly lists the IPv4 address) doesn't exist. Starting postfix manually later works.
Have you checked the postfix.service file and verified it has "After=network.target" in it? That should make sure the network starts before postfix.
In other words, if you have both "Wants=network.target" and "After=network.target" in the "[Unit]" part of the postfix.service file, the system would try to start network first, then start postfix. Note that because you have "Wants" and not "Requires", postfix would try to start even if network.target fails.
----------------------------------------------------------------------
- Rick Stevens, Systems Engineer, AllDigital ricks@alldigital.com -
- AIM/Skype: therps2 ICQ: 22643734 Yahoo: origrps2 -
- -
- The problem with being poor is that it takes up all of your time -
----------------------------------------------------------------------
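For anyone wanting to check which of these directives a unit file actually carries, here is a minimal sketch. The helper name unit_directive is made up for illustration, and it only reads the one file it is given, so it will not see drop-ins or implicit dependencies.

```shell
# Print the value of one directive (e.g. After, Wants, Requires)
# as written in a unit file.
# $1 = directive name, $2 = path to the unit file.
# Hypothetical helper: it reads only this file, not drop-ins.
unit_directive() {
    awk -F= -v key="$1" '$1 == key { print substr($0, length(key) + 2) }' "$2"
}
```

Something like `unit_directive After /etc/systemd/system/postfix.service` would print the After= line's value, and empty output for Wants or Requires means the directive is not in that file. For the values systemd has actually loaded, `systemctl show -p After -p Wants -p Requires postfix.service` should be the more reliable check.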
I gave up trying to analyze stuff like this soon after systemd appeared, I just have a bunch of things in my /etc/rc.d/rc.local file to restart various services after a brief delay. Things like:
/bin/bash -c 'sleep 5 ; service stunnel restart' > /dev/null 2>&1 < /dev/null &
/bin/bash -c 'sleep 7 ; service postfix restart' > /dev/null 2>&1 < /dev/null &
...
All things which at one time or another failed to start correctly, and I also have the opposite - a script to fix things before actually rebooting so systemd won't hang forever at reboot:
Tom Horsley writes:
I gave up trying to analyze stuff like this soon after systemd appeared,
It definitely does seem like systemd is a specialization all on its own. Having used Arch Linux, I've been fighting it on and off for a while.
For now, I'm modifying the afflicted service files and adding a cron job to run systemctl --failed.
Thanks!
Allegedly, on or about 01 July 2014, David Benfell sent:
For now, I'm modifying the afflicted service files and adding a cron job to run systemctl --failed.
Sounds like a good idea to have something by default (i.e. as part of the system, not just a user kludge) that looks out for failed services post boot, and tries to get them working after a small delay.
In the past, I had to have a network manager script that would restart the NTP server after a network came up. If NTP tried to run and no network was available, it *never* recovered by itself.
Tim writes:
Sounds like a good idea to have something by default (i.e. as part of the system, not just a user kludge) that looks out for failed services post boot, and tries to get them working after a small delay.
I'm about half-way to figuring out how to kludge this with:
systemctl --failed | grep "failed"
I'll finish this tomorrow after it fails again (I reboot daily to limit the impact of memory leaks).
I do dimly recall that there was a cleaner way to do this. But I'm completely failing to find what it is.
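One possible shape for that kludge, as a rough sketch only: pull the unit names out of the `systemctl --failed` listing and restart each one. The function names are made up for illustration, and it assumes a systemctl recent enough to accept --no-legend and --plain.

```shell
# Read `systemctl --failed` style lines on stdin and print the first
# field on each line that looks like a service unit name. Scanning the
# fields (rather than taking $1) tolerates a leading bullet character.
failed_units() {
    awk '{ for (i = 1; i <= NF; i++) if ($i ~ /\.service$/) { print $i; break } }'
}

# Try to restart every failed service once.
# Hypothetical wrapper; needs root, e.g. when run from root's crontab.
retry_failed_units() {
    systemctl --failed --type=service --no-legend --plain | failed_units |
    while read -r unit; do
        systemctl restart "$unit"
    done
}
```

Calling retry_failed_units from a cron job a few minutes after boot would cover the case described here, although it only papers over the ordering problem rather than fixing it.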
I've had similar issues. Google for NetworkManager-wait-online.service and you will probably find something to help. This is a service that uses "nm-online" to wait for NetworkManager to report itself ready before exiting. I can't remember all the details of making use of this, but one of my systems needed it to avoid this sort of issue, where something would try to start before all the network interfaces were actually up.
--Greg
Rick Stevens writes:
On 07/01/2014 01:13 PM, David Benfell issued this missive:
Hi all,
I have no idea why this might be happening. I've looked at the systemd control file for postfix and compared it to dovecot's. Both want the network.target before starting.
"want" is a weakened version of "requires". It doesn't guarantee start sequence and if the target fails to start, it doesn't stop THIS service from starting ("requires" WOULD prevent this service from starting if the target of the "requires" failed).
So this is a bug, then. Neither 'Wants' nor 'Requires' was present. It only had 'After'.
The difference is that dovecot starts correctly. Postfix does not. The problem has only appeared since Saturday, when I probably did a yum update.
Postfix issues an error from one of the pre-commands having to do with the aliases database saying its network interface (and it duly lists the IPv4 address) doesn't exist. Starting postfix manually later works.
Have you checked the postfix.service file and verified it has "After=network.target" in it? That should make sure the network starts before postfix.
Actually, this is backwards. It didn't have the 'Wants', only the 'After'. But following your suggestion above, I am adding a 'Requires' instead of the 'Wants'.
Thanks!
Hi all,
This is still going awry....
[root@munich]/home/benfell# systemctl status postfix
postfix.service - Postfix Mail Transport Agent
   Loaded: loaded (/etc/systemd/system/postfix.service; enabled)
   Active: failed (Result: exit-code) since Wed 2014-07-02 05:00:31 PDT; 9h ago
  Process: 1196 ExecStart=/usr/sbin/postfix start (code=exited, status=1/FAILURE)
  Process: 1194 ExecStartPre=/usr/libexec/postfix/chroot-update (code=exited, status=0/SUCCESS)
  Process: 371 ExecStartPre=/usr/libexec/postfix/aliasesdb (code=exited, status=75)
The restart (several hours later when I discover the problem) succeeds.
[root@munich]/home/benfell# ls -al /etc/systemd/system/multi-user.target.wants/postfix.service
lrwxrwxrwx 1 root root 35 Jul 1 18:42 /etc/systemd/system/multi-user.target.wants/postfix.service -> /etc/systemd/system/postfix.service
[root@munich]/home/benfell# cat /etc/systemd/system/multi-user.target.wants/postfix.service
[Unit]
Description=Postfix Mail Transport Agent
Requires=network.target
After=syslog.target network.target
Conflicts=sendmail.service exim.service

[Service]
Type=forking
PIDFile=/var/spool/postfix/pid/master.pid
EnvironmentFile=-/etc/sysconfig/network
ExecStartPre=-/usr/libexec/postfix/aliasesdb
ExecStartPre=-/usr/libexec/postfix/chroot-update
ExecStart=/usr/sbin/postfix start
ExecReload=/usr/sbin/postfix reload
ExecStop=/usr/sbin/postfix stop

[Install]
WantedBy=multi-user.target
This is the postfix service file as modified per the exchange yesterday. I'm sure glad I have a secondary server for both mail and DNS, because nsd is failing also.
David Benfell writes:
After poking around a bit, it appears the network.target is insufficiently stringent. I have modified postfix.service, nsd.service, and ejabberd.service to use network-online.target instead.
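A drop-in is one way to make this kind of change without editing the unit file itself. The following is a minimal sketch, not a tested fix: the function name and the file name network-online.conf are made up for illustration, and it assumes a systemd version that reads <unit>.d/*.conf drop-in directories. It also assumes something (for example NetworkManager-wait-online.service) is enabled so that network-online.target really waits for the interfaces.

```shell
# Write a drop-in that pulls in network-online.target and orders the
# unit after it.
# $1 = unit name (e.g. postfix.service)
# $2 = base directory (defaults to /etc/systemd/system)
# Hypothetical helper; the drop-in file name is arbitrary.
write_network_online_dropin() {
    unit=$1
    base=${2:-/etc/systemd/system}
    dir="$base/$unit.d"
    mkdir -p "$dir" || return 1
    cat > "$dir/network-online.conf" <<'EOF'
[Unit]
Wants=network-online.target
After=network-online.target
EOF
}
```

After running it as root for postfix.service, nsd.service and ejabberd.service, a `systemctl daemon-reload` would be needed before the next boot picks up the new ordering.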
I will not be surprised if other services also need this change. I don't know what the shortcut is that's taken with network.target, but such a shortcut doesn't sound right to me.
David Benfell writes:
The difference is that dovecot starts correctly. Postfix does not. The problem has only appeared since Saturday, when I probably did a yum update.
I have since discovered that nsd also failed to start and that ejabberd apparently failed to start correctly--pidgin on my desktop somehow managed to connect, but not xabber or yaxim on my Android phone.
Something is not right with the network target.