On Mon, Aug 13, 2012 at 01:30:21PM +0200, Sumit Bose wrote:
On Fri, Aug 10, 2012 at 06:40:35PM +0200, Jakub Hrozek wrote:
https://fedorahosted.org/sssd/ticket/1460
Please see the commit. I'm wondering if there is still a (small) race condition between the call to pthread_cleanup_pop() and unlocking the mutex. Would it be better to i.e. always call the cleanup handler with pthread_cleanup_pop(1) and disconnect from the fd based on some other condition?
this patch works as expected for me on F17. I have some issues on F16, but this might have other reason I still try to investigate. While reading a bit about pthread I can across robust mutexes http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_mutexattr_... I wonder if we can better avoid races with those and maybe even make the implementation cleaner? E.g. if we get EOWNERDEAD we can call sss_cli_close_socket(); pthread_mutex_consistent(); run the request which finally will call pthread_mutex_unlock().
As discussed briefly on IRC, I agree. I like any approach that removes the pthread_cleanup_push/pop hack.
There are also pthread_mutexattr_getrobust_np() and pthread_mutex_consistent_np() in glibc. I do not have an idea how they differ and which one should be preferred. But the *_np version only need __USE_GNU while the others need __USE_XOPEN2K8.
bye, Sumit
RHEL5 only contains the _np versions, but they still seem to work. If the approach is deemed OK, I'll detect the available versions during configure.
Traditionally the _np suffix means "non-portable GNU extension".
Please see the attached patch. Would that work for you? If so, I'll extend it to work on all supported RHEL releases.