[389-users] Error code 51 and replication errors

Rich Megginson rmeggins at redhat.com
Wed Oct 22 17:46:08 UTC 2014


On 10/22/2014 11:35 AM, Shilen Patel wrote:
> Thanks for the information.  I’m actually running 6.5 not 6.6.  The 
> latest version I’m seeing for 6.5 is 1.2.11.15-34.el6_5.  Is that 
> version for 6.5 about the same (in terms of bug fixes) as 1.2.11.15-47 
> in 6.6?

No, 1.2.11.15-34.el6_5 is not the same as 1.2.11.15-47.  The -47 build 
has a lot more bug fixes.

> If so, I’ll check out 1.2.11.15-34 in 6.5.  Otherwise, I’ll upgrade to 
> 6.6 first.  Appreciate the help.
>
> Thanks!
>
> — Shilen
>
> From: Rich Megginson <rmeggins at redhat.com>
> Reply-To: "389-users at lists.fedoraproject.org" 
> <389-users at lists.fedoraproject.org>
> Date: Wednesday, October 22, 2014 at 1:10 PM
> To: "389-users at lists.fedoraproject.org" 
> <389-users at lists.fedoraproject.org>
> Subject: Re: [389-users] Error code 51 and replication errors
>
>     On 10/22/2014 10:58 AM, Shilen Patel wrote:
>>     1.2.11.15 is a couple of years old?
>
>     Yes and no.  1.2.11.15 was the starting point for EL6. However,
>     many, many features and fixes have been backported from later
>     versions into 1.2.11.15-47 in EL 6.6.
>
>>     I had to upgrade to the latest in copr because of another issue
>>     that I think was fixed in 1.2.11.30.
>
>     Has that issue been fixed in 1.2.11.15-47 in EL 6.6?  I know a lot
>     of 389 community members running on EL6 were using
>     fedorapeople/copr repos because they could not wait until those
>     fixes/features were available in EL 6.6.  Now that EL 6.6 is out,
>     I encourage you (and anyone else in this situation) to stop using
>     fedorapeople/copr builds and instead use 1.2.11.15-47 in EL 6.6.
>
>>     If I’m misunderstanding version numbers in EL vs copr, please let
>>     me know.
>
>     See above.
>
>>     But my main question is the second question regarding best
>>     practices for detecting replication failures and I think that
>>     applies to all versions?
>
>     nsds5replicaLastUpdateStatus is the documented way to get
>     replication status.  The fact that this error is not being
>     reported that way seems like a bug.
>     You can also monitor the errors log.
>
>     As for this particular problem, see
>     https://fedorahosted.org/389/ticket/47409
>
>>
>>     Thanks!
>>
>>     — Shilen
>>
>>     From: Rich Megginson <rmeggins at redhat.com>
>>     Reply-To: "389-users at lists.fedoraproject.org"
>>     <389-users at lists.fedoraproject.org>
>>     Date: Wednesday, October 22, 2014 at 12:14 PM
>>     To: "389-users at lists.fedoraproject.org"
>>     <389-users at lists.fedoraproject.org>
>>     Subject: Re: [389-users] Error code 51 and replication errors
>>
>>         On 10/22/2014 10:10 AM, Shilen Patel wrote:
>>>
>>>         389-ds-base-1.2.11.32-1.el6.x86_64
>>>
>>
>>         I would strongly encourage you to use the version provided
>>         with EL 6.6, which is 389-ds-base-1.2.11.15-47.  It looks
>>         like you are using a build from the old rmeggins repo or the
>>         newer copr repo.  These are really only for those users who
>>         needed critical fixes or features not yet in the "supported"
>>         EL 6.6 version.  I don't know if that will fix your problem,
>>         but it will make it a lot easier to support.
>>
>>
>>>
>>>         Thanks!
>>>
>>>         — Shilen
>>>
>>>         From: Rich Megginson <rmeggins at redhat.com>
>>>         Reply-To: "389-users at lists.fedoraproject.org"
>>>         <389-users at lists.fedoraproject.org>
>>>         Date: Wednesday, October 22, 2014 at 12:07 PM
>>>         To: "389-users at lists.fedoraproject.org"
>>>         <389-users at lists.fedoraproject.org>
>>>         Subject: Re: [389-users] Error code 51 and replication errors
>>>
>>>             On 10/22/2014 09:54 AM, Shilen Patel wrote:
>>>>             Hi,
>>>>
>>>>             I’m running 1.2.11.32.
>>>
>>>             What is the output of rpm -q 389-ds-base?
>>>
>>>>             I have 6 replicas (two of which are read-only).  I ran
>>>>             into an issue where a DELETE operation failed on a
>>>>             server with error code 51 (LDAP_BUSY).
>>>>
>>>>             [21/Oct/2014:23:44:44 -0400] conn=78160 op=39510 RESULT err=51 tag=107 nentries=0 etime=3 csn=5447282c000300050000
>>>>
>>>>
>>>>             The application retried the delete several times for a
>>>>             couple of hours (while the server wasn’t getting any
>>>>             other requests) and the result was always the same
>>>>             (err=51).  Each time that happened, the error log had
>>>>             the following:
>>>>
>>>>             [21/Oct/2014:23:44:44 -0400] - Retry count exceeded in delete
>>>>
>>>>
>>>>             My first question is, what would cause a problem like this?
>>>>
>>>>             I simply restarted that directory server and then the update
>>>>             succeeded.  However, when the update went to the other
>>>>             5 servers, they failed in the same way and the same
>>>>             error was logged in their log files.  But the update
>>>>             wasn’t retried.  It was just skipped and future updates
>>>>             via replication succeeded on those 5 servers.
>>>>
>>>>             My second question is, what’s the best way to monitor
>>>>             for these types of replication errors?  In this
>>>>             case, nsds5replicaLastUpdateStatus did not indicate a
>>>>             problem.  If I had not been looking at the error log
>>>>             on those 5 hosts, I’m wondering how I would have known
>>>>             that a delete failed to replicate to them.  If the
>>>>             answer is to just have something monitoring the error
>>>>             log files, are there specific search strings to look
>>>>             for to separate out updates that have failed and won’t
>>>>             be retried from other errors (e.g. temporary connection
>>>>             issues)?  Just curious if there is a best practice here.
>>>>
>>>>             Thanks!
>>>>
>>>>             — Shilen
>>>>
>>>>
>>>>             --
>>>>             389 users mailing list
>>>>             389-users at lists.fedoraproject.org
>>>>             https://admin.fedoraproject.org/mailman/listinfo/389-users
>>>
>>>
>>>
>>
>>
>>
>
>
>
> --
> 389 users mailing list
> 389-users at lists.fedoraproject.org
> https://admin.fedoraproject.org/mailman/listinfo/389-users
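
The log-monitoring suggestion quoted above could be sketched roughly as follows. This is only an illustration built on the two log lines quoted in this thread: the "Retry count exceeded in delete" message from the errors log and the RESULT line with err=51 and a csn from the access log. The function names are hypothetical, and it is an assumption that other operation types log the same phrase with a different verb. The sketch does not tell apart updates that will be retried from updates that are skipped; nothing in this thread establishes how to make that distinction.

```python
import re
from typing import Iterable, Iterator, Tuple

# Errors-log text quoted in this thread ("Retry count exceeded in delete").
# Assumption: other operation types use the same phrase with another verb.
RETRY_EXCEEDED = re.compile(r"Retry count exceeded in (\w+)")

# Fields taken from the single access-log RESULT line quoted in this thread.
RESULT = re.compile(r"conn=(\d+) op=(\d+) RESULT err=(\d+)")
CSN = re.compile(r"\bcsn=([0-9a-fA-F]+)")


def retry_exceeded(error_log: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (operation, line) for each errors-log line that reports
    an exhausted retry count."""
    for line in error_log:
        match = RETRY_EXCEEDED.search(line)
        if match:
            yield match.group(1), line.rstrip("\n")


def failed_updates(access_log: Iterable[str]) -> Iterator[Tuple[int, int, int, str]]:
    """Yield (conn, op, err, csn) for RESULT lines that carry a CSN
    and a non-zero result code."""
    for line in access_log:
        result = RESULT.search(line)
        csn = CSN.search(line)
        if result and csn and int(result.group(3)) != 0:
            conn, op, err = (int(g) for g in result.groups())
            yield conn, op, err, csn.group(1)
```

Both functions accept any iterable of lines, so an open file object for the instance's errors or access log can be passed in directly. Any hit is a prompt to go and compare replicas by hand, not proof that an update was lost.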
