On 10/22/2014 09:54 AM, Shilen Patel wrote:
Hi,
I’m running 1.2.11.32.
What is the output of rpm -q 389-ds-base?
I have 6 replicas (two of which are read-only). I ran into an issue
where a DELETE operation failed on a server with error code 51
(LDAP_BUSY).
[21/Oct/2014:23:44:44 -0400] conn=78160 op=39510 RESULT err=51 tag=107
nentries=0 etime=3 csn=5447282c000300050000
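For reference, a minimal sketch of pulling the useful fields out of an access-log RESULT line like the one above. It assumes the CSN is 20 hex digits laid out as timestamp (8), sequence number (4), replica ID (4) and sub-sequence (4); that layout is an assumption here, though it agrees with the logged time. The function name parse_result is hypothetical.

```python
import re
from datetime import datetime, timezone

# Matches the conn/op/err fields and a 20-hex-digit CSN on one RESULT line.
RESULT_RE = re.compile(
    r"conn=(?P<conn>\d+) op=(?P<op>\d+) RESULT err=(?P<err>\d+)"
    r".*?csn=(?P<csn>[0-9a-fA-F]{20})"
)


def parse_result(line):
    """Return a dict of fields from a RESULT line, or None if it does not match."""
    m = RESULT_RE.search(line)
    if m is None:
        return None
    csn = m.group("csn")
    return {
        "conn": int(m.group("conn")),
        "op": int(m.group("op")),
        "err": int(m.group("err")),
        # Assumed CSN layout: 8 hex timestamp, 4 seq, 4 replica ID, 4 sub-seq.
        "csn_time": datetime.fromtimestamp(int(csn[0:8], 16), tz=timezone.utc),
        "csn_seq": int(csn[8:12], 16),
        "replica_id": int(csn[12:16], 16),
        "csn_subseq": int(csn[16:20], 16),
    }


# The log entry from this message, joined back onto a single line.
sample = (
    "[21/Oct/2014:23:44:44 -0400] conn=78160 op=39510 RESULT err=51 tag=107 "
    "nentries=0 etime=3 csn=5447282c000300050000"
)
result = parse_result(sample)
print(result["err"], result["replica_id"], result["csn_time"].isoformat())
```

If the layout assumption holds, the replica ID field shows which replica the change originated on, which helps when matching the same CSN in the logs of the other servers.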
The application retried the delete several times for a couple of hours
(while the server wasn’t getting any other requests) and the result
was always the same (err=51). Each time that happened, the error log
had the following:
[21/Oct/2014:23:44:44 -0400] - Retry count exceeded in delete
My first question is, what would cause a problem like this?
I simply restarted that directory server, and then the update succeeded.
However, when the update went to the other 5 servers, they failed in
the same way and the same error was logged in their log files. But
the update wasn’t retried. It was just skipped and future updates via
replication succeeded on those 5 servers.
My second question is, what’s the best way to monitor for these types
of replication errors? In this case, nsds5replicaLastUpdateStatus did
not indicate a problem. If I had not been looking at the error log
on those 5 hosts, I’m wondering how I would have known that a delete
failed to replicate to them. If the answer is to just have something
monitoring the error log files, are there specific search strings to
look for to separate out updates that have failed and won’t be retried
from other errors (e.g. temporary connection issues)? Just curious if
there is a best practice here.
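To make the log-monitoring idea concrete, here is a minimal sketch of what such a check might look like. The only search string it knows is the "Retry count exceeded in ..." message quoted above; whether that string reliably separates skipped updates from transient errors is exactly the open question, so the pattern list is an assumption, and find_suspect_lines is a hypothetical name.

```python
import re

# Assumption: the only pattern taken from this thread. Extend the list once
# it is known which messages mean "update failed and will not be retried".
PATTERNS = [re.compile(r"Retry count exceeded in \w+")]


def find_suspect_lines(lines):
    """Return (line_number, text) for every line matching a known pattern."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(p.search(line) for p in PATTERNS):
            hits.append((number, line.rstrip("\n")))
    return hits


# In-memory stand-in for an error log; the first line is made up for contrast.
sample_lines = [
    "[21/Oct/2014:23:40:01 -0400] - some unrelated message\n",
    "[21/Oct/2014:23:44:44 -0400] - Retry count exceeded in delete\n",
]
for number, text in find_suspect_lines(sample_lines):
    print(f"line {number}: {text}")
```

Against a real server, the same function could be fed an open file handle for the instance's error log (the path depends on the installation) from a cron job or a monitoring plugin.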
Thanks!
— Shilen
--
389 users mailing list
389-users(a)lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/389-users