[SSSD] [PATCH] MONITOR: Correctly detect lack of response from services

Wednesday, 14 September 2011

We were incorrectly checking for DBUS_ERROR_TIMEOUT here. The correct
behaviour is to check for DBUS_ERROR_NO_REPLY. This way we will
properly handle the three-tries logic in tasks_check_handler().

D-BUS is rather confusing with these error codes:

DBUS_ERROR_NO_REPLY: No reply to a message expecting one, usually means
a timeout occurred.

DBUS_ERROR_TIMEOUT: Certain timeout errors, possibly ETIMEDOUT on a
socket.

And just for added confusion, there's also:

DBUS_ERROR_TIMED_OUT: Certain timeout errors, e.g. while starting a
service.

DBUS_ERROR_NO_REPLY is the only correct one for our usage. This explains
the intermittent bug we were seeing where the monitor lost communication
with its services (usually the data providers). Because of this loss of
communication, the monitor was unable to notify the providers of changes
to the routing table or resolv.conf, leaving SSSD stuck offline until
it was restarted.
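
To illustrate the distinction, here is a minimal sketch of the check, not
the actual monitor code. All sketch_* names are hypothetical. The error
name strings are the values dbus-protocol.h gives to the three macros,
redefined locally so the example builds without libdbus. The limit of
three comes from the three-tries described above. How the real monitor
treats errors other than a missing reply is not covered here, so the
sketch only reports them to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Values of the D-Bus error-name macros, copied here for illustration. */
#define SKETCH_ERROR_NO_REPLY  "org.freedesktop.DBus.Error.NoReply"
#define SKETCH_ERROR_TIMEOUT   "org.freedesktop.DBus.Error.Timeout"
#define SKETCH_ERROR_TIMED_OUT "org.freedesktop.DBus.Error.TimedOut"

/* Assumed limit, matching the "three-tries" mentioned in the text. */
#define SKETCH_MAX_FAILED_PINGS 3

/* Hypothetical per-service state kept by a monitor. */
struct sketch_service {
    int failed_pings;
};

enum sketch_action {
    SKETCH_OK,          /* service answered; counter reset */
    SKETCH_RETRY,       /* no reply, but tries remain */
    SKETCH_RESTART,     /* no reply for the third time in a row */
    SKETCH_OTHER_ERROR  /* some other error; left to the caller */
};

/* Only NoReply means "the service did not answer in time". */
static bool sketch_is_no_reply(const char *error_name)
{
    return error_name != NULL
        && strcmp(error_name, SKETCH_ERROR_NO_REPLY) == 0;
}

/* error_name is NULL when the ping succeeded. */
static enum sketch_action
sketch_handle_ping(struct sketch_service *svc, const char *error_name)
{
    assert(svc != NULL);

    if (error_name == NULL) {
        svc->failed_pings = 0;
        return SKETCH_OK;
    }

    if (sketch_is_no_reply(error_name)) {
        svc->failed_pings++;
        if (svc->failed_pings >= SKETCH_MAX_FAILED_PINGS) {
            return SKETCH_RESTART;
        }
        return SKETCH_RETRY;
    }

    /* Timeout, TimedOut and everything else are not counted as a try. */
    return SKETCH_OTHER_ERROR;
}
```

A check written against SKETCH_ERROR_TIMEOUT instead would never match
a missing reply, so the counter would never advance. That is the kind
of mismatch this patch corrects.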

This is probably the root cause of
https://bugzilla.redhat.com/show_bug.cgi?id=728343 
