https://bugzilla.redhat.com/show_bug.cgi?id=1104843
--- Comment #1 from John Eckersberg <jeckersb(a)redhat.com> ---
Some more details on this.
I used systemtap to prove that systemd is explicitly killing the epmd process:
Wed Jun 4 20:18:09 2014 : sh (16202) is exec'ing
"/usr/lib64/erlang/erts-5.10.4/bin/epmd"
Wed Jun 4 20:18:09 2014 : epmd (16202) created 16204
Wed Jun 4 20:18:09 2014 : epmd (16202) is exiting
Wed Jun 4 20:18:09 2014 : epmd (16204) created 16205
Wed Jun 4 20:18:09 2014 : epmd (16204) is exiting
Wed Jun 4 20:18:09 2014 : sh (16203) is exec'ing
"/usr/lib64/erlang/erts-5.10.4/bin/epmd"
Wed Jun 4 20:18:09 2014 : epmd (16203) created 16206
Wed Jun 4 20:18:09 2014 : epmd (16203) is exiting
Wed Jun 4 20:18:09 2014 : epmd (16206) created 16209
Wed Jun 4 20:18:09 2014 : epmd (16206) is exiting
Wed Jun 4 20:18:09 2014 : epmd (16209) is exiting
Wed Jun 4 20:18:09 2014 : sh (16247) is exec'ing
"/usr/lib64/erlang/erts-5.10.4/bin/epmd"
Wed Jun 4 20:18:09 2014 : epmd (16247) created 16248
Wed Jun 4 20:18:09 2014 : epmd (16247) is exiting
Wed Jun 4 20:18:09 2014 : epmd (16248) created 16250
Wed Jun 4 20:18:09 2014 : epmd (16248) is exiting
Wed Jun 4 20:18:09 2014 : epmd (16250) is exiting
SIGKILL was sent to epmd (pid:16205) by systemd (pid:1) uid:0
Wed Jun 4 20:18:12 2014 : epmd (16205) is exiting
I think this results from a combination of:
- The rabbitmq-server systemd unit file uses Type=simple
- It double forks off epmd, so it's no longer parented to the main process
- The unit file has an ExecStartPost line
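To illustrate the double-fork point in the list above, here is a minimal Python sketch (not epmd's actual code, which is C). The function name double_fork_report and its return keys are made up for this example. It shows that after a double fork the grandchild is no longer parented to the process that started it; the kernel hands it to PID 1 or a subreaper.

```python
import os
import time


def double_fork_report(timeout=5.0):
    """Double-fork and report who ends up as the grandchild's parent."""
    read_fd, write_fd = os.pipe()
    caller = os.getpid()

    intermediate = os.fork()
    if intermediate == 0:
        # Intermediate child: fork the would-be daemon, then exit at once.
        os.close(read_fd)
        my_pid = os.getpid()
        if os.fork() == 0:
            # Grandchild: wait until the intermediate process has exited and
            # the kernel has reparented us (to PID 1 or a subreaper).
            deadline = time.monotonic() + timeout
            while os.getppid() == my_pid and time.monotonic() < deadline:
                time.sleep(0.01)
            os.write(write_fd, f"{os.getpid()} {os.getppid()}".encode())
            os._exit(0)
        os._exit(0)

    # Original caller: reap the intermediate child, then read the report.
    os.close(write_fd)
    os.waitpid(intermediate, 0)
    with os.fdopen(read_fd) as pipe:
        grandchild, grandchild_ppid = map(int, pipe.read().split())
    return {
        "caller": caller,
        "intermediate": intermediate,
        "grandchild": grandchild,
        "grandchild_ppid": grandchild_ppid,
    }


if __name__ == "__main__":
    print(double_fork_report())
```

The reported grandchild_ppid differs from both the caller and the intermediate PID, which is why a supervisor cannot rely on parentage alone to tell which command a daemonized process came from.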
systemd does not allow long-running processes to be started from
ExecStartPre/ExecStartPost, so it deliberately kills anything they leave
running. It's hard to see in the log above, but the epmd daemon gets started
by rabbitmq-server[1], then the ExecStartPost command runs to wait on the
pidfile, and finally systemd sends SIGKILL to the epmd process. I believe
systemd thinks the epmd process came from the ExecStartPost command.
Supporting this theory: if I comment out the ExecStartPost line, systemd does
*not* send SIGKILL and everything works as expected.
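Since the log is hard to read by eye, here is a small Python sketch that rebuilds the fork chain of the killed process from the "created" lines. The log lines are copied from the systemtap output above; the helper names parents_from_log and lineage are made up for this example.

```python
import re

# Fork events copied verbatim from the systemtap output in this comment.
LOG = """\
Wed Jun 4 20:18:09 2014 : epmd (16202) created 16204
Wed Jun 4 20:18:09 2014 : epmd (16204) created 16205
Wed Jun 4 20:18:09 2014 : epmd (16203) created 16206
Wed Jun 4 20:18:09 2014 : epmd (16206) created 16209
Wed Jun 4 20:18:09 2014 : epmd (16247) created 16248
Wed Jun 4 20:18:09 2014 : epmd (16248) created 16250
"""

CREATED = re.compile(r"\((\d+)\) created (\d+)")


def parents_from_log(text):
    """Map each child PID to the PID that created it."""
    parents = {}
    for line in text.splitlines():
        match = CREATED.search(line)
        if match:
            parents[int(match.group(2))] = int(match.group(1))
    return parents


def lineage(pid, parents):
    """Return the chain of PIDs from the oldest known ancestor down to pid."""
    chain = [pid]
    while chain[-1] in parents:
        chain.append(parents[chain[-1]])
    return chain[::-1]


if __name__ == "__main__":
    print(lineage(16205, parents_from_log(LOG)))  # [16202, 16204, 16205]
```

The chain for 16205 goes back to the first exec of epmd (pid 16202), and both of its ancestors had already exited by the time the SIGKILL was sent.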
Fortunately, I think the correct fix is to make sure bug 1103524 gets fixed as
soon as possible. If the service type is changed to notify, then both the
ExecStartPre and ExecStartPost lines can be removed, thus avoiding the
undesirable kill behavior.
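For context on what Type=notify asks of a service, here is a minimal Python sketch of the readiness notification. It is only an illustration of the protocol, not the fix tracked in bug 1103524 and not how rabbitmq-server (an Erlang application) would implement it. It assumes the standard behavior that systemd passes a datagram socket address in the NOTIFY_SOCKET environment variable and treats a "READY=1" message as startup complete. The function name sd_notify mirrors the C library call but is defined here by hand.

```python
import os
import socket


def sd_notify(state="READY=1"):
    """Send a state datagram to systemd; return False if no socket is set."""
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        # Not started by systemd with Type=notify, so there is nothing to do.
        return False
    if path.startswith("@"):
        # A leading "@" denotes a Linux abstract-namespace socket.
        path = "\0" + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(state.encode(), path)
    return True
```

With the service reporting readiness itself, systemd has no need for an ExecStartPost command that waits on the pidfile.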
[1] I've explicitly commented out the ExecStartPre cookie race hack in my
testing.