[389-users] Error code 51 and replication errors

Wed Oct 22 16:14:37 UTC 2014

On 10/22/2014 10:10 AM, Shilen Patel wrote:
>
> 389-ds-base-1.2.11.32-1.el6.x86_64
>

I would strongly encourage you to use the version provided with EL 6.6, 
which is 389-ds-base-1.2.11.15-47.  It looks like you are using a build 
from the old rmeggins repo or the newer copr repo.  These are really 
only for those users who needed critical fixes or features not yet in 
the "supported" EL6.6 version.  I don't know if that will fix your 
problem, but it will make it a lot easier to support.

>
> Thanks!
>
> — Shilen
>
> From: Rich Megginson <rmeggins at redhat.com>
> Reply-To: "389-users at lists.fedoraproject.org"
> <389-users at lists.fedoraproject.org>
> Date: Wednesday, October 22, 2014 at 12:07 PM
> To: "389-users at lists.fedoraproject.org"
> <389-users at lists.fedoraproject.org>
> Subject: Re: [389-users] Error code 51 and replication errors
>
>     On 10/22/2014 09:54 AM, Shilen Patel wrote:
>>     Hi,
>>
>>     I’m running 1.2.11.32.
>
>     What is the output of rpm -q 389-ds-base?
>
>>     I have 6 replicas (two of which are read-only).  I ran into an
>>     issue where a DELETE operation failed on a server with error code
>>     51 (ldap busy).
>>
>>     [21/Oct/2014:23:44:44 -0400] conn=78160 op=39510 RESULT err=51
>>     tag=107 nentries=0 etime=3 csn=5447282c000300050000
>>
>>
>>     The application retried the delete several times for a couple of
>>     hours (while the server wasn’t getting any other requests) and
>>     the result was always the same (err=51).  Each time that
>>     happened, the error log had the following:
>>
>>     [21/Oct/2014:23:44:44 -0400] - Retry count exceeded in delete
>>
>>
>>     My first question is, what would cause a problem like this?
>>
>>     I simply restarted that directory and then the update succeeded.
>>     However, when the update went to the other 5 servers, they
>>     failed in the same way and the same error was logged in their log
>>     files.  But the update wasn’t retried.  It was just skipped and
>>     future updates via replication succeeded on those 5 servers.
>>
>>     My second question is, what’s the best way to monitor for these
>>     types of replication errors?  In this case,
>>     nsds5replicaLastUpdateStatus did not indicate a problem.
>>     If I had not been looking at the error file on those 5 hosts,
>>     I’m wondering how I would have known that a delete failed to
>>     replicate to them.  If the answer is to just have something
>>     monitoring the error log files, are there specific search strings
>>     to look for to separate out updates that have failed and won’t be
>>     retried from other errors (e.g. temporary connection issues)?
>>     Just curious if there is a best practice here.
>>
>>     Thanks!
>>
>>     — Shilen
>>
>>
>>     --
>>     389 users mailing list
>>     389-users at lists.fedoraproject.org
>>     https://admin.fedoraproject.org/mailman/listinfo/389-users
>
>
>
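
For the log-monitoring question quoted above, here is a rough sketch of a
scanner in Python.  It is not an official tool.  It only looks for the two
patterns that actually appear in this thread (a RESULT line with err=51 in
the access log, and "Retry count exceeded" in the error log), so it should
not be read as a complete list of replication failure messages.  The
function names are made up for illustration, and the CSN field layout
(8 hex digits of timestamp, then 4 each for sequence number, replica ID and
subsequence number) is my assumption about the format; check it against
your own servers before relying on it.

```python
import re
from datetime import datetime, timezone

# Patterns taken from the sample log lines quoted in this thread.
RESULT_RE = re.compile(r"conn=(\d+) op=(\d+) RESULT err=(\d+)")
CSN_RE = re.compile(r"csn=([0-9a-f]{20})")


def decode_csn(csn):
    """Split a 20-hex-digit CSN into fields.

    Assumed layout (not confirmed in this thread):
    8 digits timestamp, 4 sequence number, 4 replica ID, 4 subsequence.
    """
    return {
        "time": datetime.fromtimestamp(int(csn[0:8], 16), tz=timezone.utc),
        "seq": int(csn[8:12], 16),
        "replica_id": int(csn[12:16], 16),
        "subseq": int(csn[16:20], 16),
    }


def failed_updates(access_lines, err_code=51):
    """Yield conn/op/csn for access-log RESULT lines with the given err code."""
    for line in access_lines:
        m = RESULT_RE.search(line)
        if m and int(m.group(3)) == err_code:
            csn = CSN_RE.search(line)
            yield {
                "conn": int(m.group(1)),
                "op": int(m.group(2)),
                "csn": csn.group(1) if csn else None,
            }


def retry_exhausted(error_lines):
    """Return error-log lines containing the message seen in this thread."""
    return [line for line in error_lines if "Retry count exceeded" in line]


if __name__ == "__main__":
    access = [
        "[21/Oct/2014:23:44:44 -0400] conn=78160 op=39510 RESULT err=51 "
        "tag=107 nentries=0 etime=3 csn=5447282c000300050000"
    ]
    for hit in failed_updates(access):
        print(hit, decode_csn(hit["csn"]))
```

Under the assumed layout, decoding the CSN from a failed RESULT line tells
you which replica ID originated the update, which may help when checking
whether the same change was skipped on the other servers.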
