On 09/14/2011 01:22 PM, Stephen Gallagher wrote:
We were incorrectly using DBUS_ERROR_TIMEOUT here. The correct
behaviour is to check for DBUS_ERROR_NO_REPLY. This way we will
properly handle the three-tries in the tasks_check_handler().

D-BUS is rather confusing with these error codes.
DBUS_ERROR_NO_REPLY: No reply to a message expecting one, usually means
a timeout occurred.

DBUS_ERROR_TIMEOUT: Certain timeout errors, possibly ETIMEDOUT on a
socket. 

And just for added confusion, there's also:
DBUS_ERROR_TIMED_OUT: Certain timeout errors, e.g. while starting a
service.

DBUS_ERROR_NO_REPLY is the only correct one for our usage. This explains
the intermittent bug we were seeing where the monitor lost communication
with its services (usually the data providers). Because of this loss of
communication, the monitor was unable to notify the providers of changes
to the routing table or resolv.conf, leading to being stuck offline
until SSSD was restarted.

Is it safe to use strcmp in this case? The string is returned by the library, but if a bug caused a bad string to be returned, we would crash.
Also, is the call that gets the string always guaranteed to return a non-NULL result?
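
To make the concern concrete, here is a minimal sketch of a NULL-safe comparison. It is not the code from the patch: is_no_reply_error() is a hypothetical helper, and the two #define lines only repeat the values from dbus-protocol.h so the snippet compiles without libdbus. As far as I know, dbus_message_get_error_name() returns NULL when the message is not an error message, so a guard like this seems prudent before calling strcmp.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Same values as in dbus-protocol.h; repeated here only so that the
 * sketch builds on its own. Real code would include <dbus/dbus.h>. */
#ifndef DBUS_ERROR_NO_REPLY
#define DBUS_ERROR_NO_REPLY "org.freedesktop.DBus.Error.NoReply"
#endif
#ifndef DBUS_ERROR_TIMEOUT
#define DBUS_ERROR_TIMEOUT "org.freedesktop.DBus.Error.Timeout"
#endif

/* Hypothetical helper (not part of SSSD or libdbus): returns true only
 * when error_name is non-NULL and equals DBUS_ERROR_NO_REPLY. */
static bool is_no_reply_error(const char *error_name)
{
    if (error_name == NULL) {
        /* No error name available; treat it as "not a timeout". */
        return false;
    }
    return strcmp(error_name, DBUS_ERROR_NO_REPLY) == 0;
}
```

If a DBusError is already at hand, dbus_error_has_name() may be a simpler alternative to a manual strcmp, assuming the error was set from the reply first.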


This is probably the root cause of
https://bugzilla.redhat.com/show_bug.cgi?id=728343 


-- 
Thank you,
Dmitri Pal

Sr. Engineering Manager IPA project,
Red Hat Inc.

