On Mon, Dec 10, 2012 at 12:35:29PM +0100, Pavel Březina wrote:
From 246cf7edc92ed2826cf8295787cfb4e78d28ffd4 Mon Sep 17 00:00:00 2001
From: Pavel Březina <pbrezina@redhat.com>
Date: Mon, 10 Dec 2012 12:23:25 +0100
Subject: [PATCH] let ldap_chpass_uri failover work when using same hostname
https://fedorahosted.org/sssd/ticket/1699
We want to continue with the next server on all errors, not only on ETIMEDOUT.
I'm not quite sure. I think the intent was to retry only on network-related errors and to fail right away on fatal errors like ENOMEM.
This particular ticket was dealing with ECONNREFUSED.
Can you check where the ECONNREFUSED error came from? Maybe we're just missing another special case for lret like the ones at the end of sdap_sys_connect_done().
Also, what about the same condition in auth_bind_user_done()?
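To make the distinction above concrete (retry on network-related errors, fail right away on fatal ones such as ENOMEM), here is a minimal sketch. It is not SSSD code: is_network_error(), classify_connect_result() and the conn_action enum are hypothetical names, and the set of errno values treated as network-related is an assumption chosen for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical helper, not part of SSSD: returns true for errno values
 * that suggest the server is unreachable, so that trying the next
 * server is a sensible reaction. The list is an assumption. */
static bool is_network_error(int err)
{
    switch (err) {
    case ETIMEDOUT:
    case ECONNREFUSED:
    case ECONNRESET:
    case EHOSTUNREACH:
    case ENETUNREACH:
        return true;
    default:
        return false;
    }
}

/* Hypothetical policy values for the caller of a connect request. */
enum conn_action { CONN_OK, CONN_TRY_NEXT, CONN_FATAL };

/* Fail over on network errors, abort on everything else (e.g. ENOMEM). */
static enum conn_action classify_connect_result(int err)
{
    if (err == 0) {
        return CONN_OK;
    }
    return is_network_error(err) ? CONN_TRY_NEXT : CONN_FATAL;
}
```

Under this policy ECONNREFUSED, the error from this ticket, would lead to the next server being tried, while ENOMEM would end the request immediately.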
On Tue, Dec 11, 2012 at 03:53:25PM +0100, Jakub Hrozek wrote:
One more question -- QE has identified this problem as a regression since 1.8.x. The code you changed predates 1.8, I think, so while it might be fixing the symptoms, it is probably not the place that regressed.
On 12/11/2012 03:58 PM, Jakub Hrozek wrote:
I'm not quite sure. I think the intent was to retry only on network-related errors and to fail right away on fatal errors like ENOMEM.
I followed the code in sdap_cli_connect_done(), which uses the same connection API. We don't differentiate between errors there.
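As a rough illustration of that "don't differentiate between errors" behaviour, the sketch below moves on to the next server after any failure. It does not use the SSSD failover API: connect_with_failover(), the connect_fn callback type, the fake_connect() stub and the example URIs are all hypothetical and exist only to show the control flow.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical connect callback: returns 0 on success or an errno value. */
typedef int (*connect_fn)(const char *uri);

/* Try each URI in order and continue with the next one after ANY error,
 * not only ETIMEDOUT. Returns the index of the first server that accepted
 * the connection, or -1 if all of them failed. *last_err is 0 on success,
 * otherwise the error reported by the last server tried. */
static int connect_with_failover(const char *const *uris, size_t count,
                                 connect_fn try_connect, int *last_err)
{
    assert(uris != NULL && try_connect != NULL && last_err != NULL);

    *last_err = 0;
    for (size_t i = 0; i < count; i++) {
        int ret = try_connect(uris[i]);
        if (ret == 0) {
            *last_err = 0;
            return (int)i;
        }
        *last_err = ret;   /* remember the error and try the next server */
    }
    return -1;
}

/* Stub standing in for a real LDAP connect: one server refuses connections. */
static int fake_connect(const char *uri)
{
    return strcmp(uri, "ldap://down.example.com") == 0 ? ECONNREFUSED : 0;
}
```

With the list { "ldap://down.example.com", "ldap://up.example.com" } and fake_connect(), the first server returns ECONNREFUSED and the function returns index 1. The trade-off is that a non-network error is also retried on every remaining server before the request finally fails.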
On 12/11/2012 03:58 PM, Jakub Hrozek wrote:
One more question -- QE has identified this problem as a regression since 1.8.x. The code you changed predates 1.8, I think, so while it might be fixing the symptoms, it is probably not the place that regressed.
I just tried it on 1.8 and the failover doesn't work there either.
As Jakub noticed, the same issue was also present in auth_bind_user_done(). I'm sending a new patch that fixes both places.
On Fri, Dec 14, 2012 at 07:02:00PM +0100, Pavel Březina wrote:
I just tried it on 1.8 and the failover doesn't work there either.
As Jakub noticed, the same issue was also present in auth_bind_user_done(). I'm sending a new patch that fixes both places.
OK, the worst thing that can happen is longer timeouts if we were actually relying on a different error code to set us offline. But I tested the usual failover cases (hostname can't be resolved at all, no LDAP server running on the hostname) and they worked fine.
Ack
On Sat, Dec 15, 2012 at 11:26:20AM +0100, Jakub Hrozek wrote:
OK, the worst thing that can happen is longer timeouts if we were actually relying on a different error code to set us offline. But I tested the usual failover cases (hostname can't be resolved at all, no LDAP server running on the hostname) and they worked fine.
Ack
Pushed to master and sssd-1-9.
sssd-devel@lists.fedorahosted.org