> Date: Thu, 22 Nov 2012 15:42:59 +0100
> From: jhrozek@redhat.com
> To: sssd-devel@lists.fedorahosted.org
> Subject: Re: [SSSD] [PATCH] when monitor_quit is call and the process not exists
>
> On Wed, Nov 21, 2012 at 03:15:43PM -0500, Ariel Barria wrote:
> >
> > Hi.
> > well, I'm not sure that this patch have the behavior that is it hope. You have the word.
> > But when monitor_quit is call and the process not exists, this remains in loop and not allow signal.
> > I try it with Monitor Task and nothing.
> > I had to reboot
>
> Thanks a lot, Ariel, this is quite a bad bug. The concept is good, but
> can you change the patch a little?
>
> instead of:
> if (error != EINTR) {
> kill()
> if (error == ESRCH || error == ECHILD) {
> killed = true;
> }
> }


OK.

> I would use:
> if (error == ECHILD) {
> /* Skip this process */
> killed = true;
> } else if (error != EINTR) {
> /* Keep the block with the kill function around */
> }
>


> Also, are you sure about the ESRCH error code? The waitpid man page only
> mentions ECHILD..


well, really not is waipit.
For example 1
(Thu Nov 22 10:17:44 2012) [sssd] [monitor_quit] (0x0040): Returned with: 1
(Thu Nov 22 10:17:44 2012) [sssd] [monitor_quit] (0x0020): Terminating [pam][7494]
(Thu Nov 22 10:17:44 2012) [sssd] [monitor_quit] (0x0020): Couldn't kill [pam][7494]: [3] [No such process] *********
(Thu Nov 22 10:17:44 2012) [sssd] [monitor_quit] (0x0010): [10][No child processes] while waiting for [pam] ****waitpid
(Thu Nov 22 10:17:44 2012) [sssd] [monitor_quit] (0x0020): Terminating [ssh][7493]
(Thu Nov 22 10:17:44 2012) [sssd] [monitor_quit] (0x0020): Couldn't kill [ssh][7493]: [3] [No such process]
(Thu Nov 22 10:17:44 2012) [sssd] [monitor_quit] (0x0010): [10][No child processes] while waiting for [ssh]
(Thu Nov 22 10:17:44 2012) [sssd] [monitor_quit] (0x0020): Terminating [nss][7492]
(Thu Nov 22 10:17:44 2012) [sssd] [monitor_quit] (0x0020): Couldn't kill [nss][7492]: [3] [No such process]
(Thu Nov 22 10:17:44 2012) [sssd] [monitor_quit] (0x0010): [10][No child processes] while waiting for [nss]
(Thu Nov 22 10:17:44 2012) [sssd] [sbus_remove_watch] (0x2000): 0x9ee5f0/0x9f1430

I think that if the process not exists, is not necessary waitpit. you is agree?
Based on this, the result would be:
Example 2
Returned with: 1
(Thu Nov 22 12:42:37 2012) [sssd] [monitor_quit] (0x0020): Terminating [nss][27350]
(Thu Nov 22 12:42:37 2012) [sssd] [monitor_quit] (0x0020): Couldn't kill [nss][27350]: [No such process]
(Thu Nov 22 12:42:37 2012) [sssd] [monitor_quit] (0x0020): Terminating [ssh][27349]
(Thu Nov 22 12:42:37 2012) [sssd] [monitor_quit] (0x0020): Couldn't kill [ssh][27349]: [No such process]
(Thu Nov 22 12:42:37 2012) [sssd] [monitor_quit] (0x0020): Terminating [pam][27348]
(Thu Nov 22 12:42:37 2012) [sssd] [monitor_quit] (0x0020): Couldn't kill [pam][27348]: [No such process]
(Thu Nov 22 12:42:37 2012) [sssd] [sbus_remove_watch] (0x2000): 0x237a080/0x237beb0


otherwise (only ECHILD), the result would be how the first example.

new patch.
Thanks.

> _______________________________________________
> sssd-devel mailing list
> sssd-devel@lists.fedorahosted.org
> https://lists.fedorahosted.org/mailman/listinfo/sssd-devel