https://fedorahosted.org/sssd/ticket/867
channel->tries is only used when a query fails. We were forcibly setting it to 0, which meant we never retried another server.
On Thu, 2011-05-12 at 13:07 +0200, Jakub Hrozek wrote:
> channel->tries is only used when a query fails. We were forcibly setting it to 0, which meant we never retried another server.
From the c-ares manual:
ARES_OPT_TRIES
    int tries;
    The number of tries the resolver will try contacting each name server before giving up. The default is four tries.
We originally set it to zero because the above entry used to read "the number of retries", not "tries", so zero retries was supposed to mean one try per server.
Can you verify in the c-ares code that this isn't a behavior that CHANGED between versions (in which case we might need to detect the version in use and handle it)?
On 05/12/2011 02:23 PM, Stephen Gallagher wrote:
> Can you verify in the c-ares code that this isn't a behavior that CHANGED between versions (in which case we might need to detect the version in use and handle it)?
The code in question in c-ares is called only when a search fails and reads (simplified):
-------
while (++try < (nservers * tries)) {
    /* Get next server */
    /* Send the query again */
}

/* Mark query as failed */
-------
Since we had tries=0, we skipped the failover part and immediately marked the query as failed.
The code is the same in c-ares 1.6 (RHEL5) and 1.7 (RHEL6 and supported Fedora versions).
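To make the arithmetic concrete, here is a minimal standalone sketch of the simplified loop. It is not c-ares code: resends_after_failure and check_examples are hypothetical helpers, and the sketch assumes the counter starts at 0 for the first attempt and that every resend fails as well.

```c
#include <assert.h>

/*
 * Hypothetical helper, not part of c-ares: counts how many times the
 * simplified failover loop would resend a query after the first attempt
 * has failed. Assumes every resend fails too.
 */
int resends_after_failure(int nservers, int tries)
{
    int try = 0;      /* assumed: the first attempt is try number 0 */
    int resends = 0;

    while (++try < (nservers * tries)) {
        /* Get next server, send the query again (always fails here) */
        resends++;
    }

    /* Mark query as failed */
    return resends;
}

/* The cases discussed in this thread, with two configured servers. */
void check_examples(void)
{
    assert(resends_after_failure(2, 0) == 0); /* tries=0: no failover at all */
    assert(resends_after_failure(2, 1) == 1); /* tries=1: second server is tried */
}
```

With two servers, tries=0 makes the loop condition false on the first check, so the second server is never contacted; tries=1 allows exactly one resend.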
Jakub
On Thu, 2011-05-12 at 17:48 +0200, Jakub Hrozek wrote:
> Since we had tries=0, we skipped the failover part and immediately marked the query as failed.
> The code is the same in c-ares 1.6 (RHEL5) and 1.7 (RHEL6 and supported Fedora versions).
With this in mind, Ack.
On Thu, 2011-05-12 at 11:49 -0400, Stephen Gallagher wrote:
> With this in mind, Ack.
Pushed to master and sssd-1-5.
sssd-devel@lists.fedorahosted.org