[389-users] changelog deadlock replication failures with DNA
Mahadevan, Venkat
venkmaha at mail.ubc.ca
Wed Jun 12 20:00:29 UTC 2013
Hello,
While doing multiple adds using POSIX uidNumbers and the DNA plugin,
I have noticed errors such as the following:
[12/Jun/2013:11:43:24 -0700] NSMMReplicationPlugin - changelog program - _cl5WriteOperationTxn: retry (49) the transaction (csn=51b8c148001e02be0000) failed (rc=-30994 (DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock))
[12/Jun/2013:11:43:24 -0700] NSMMReplicationPlugin - changelog program - _cl5WriteOperationTxn: failed to write entry with csn (51b8c148001e02be0000); db error - -30994 DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock
[12/Jun/2013:11:43:24 -0700] NSMMReplicationPlugin - write_changelog_and_ruv: can't add a change for uid=jmeter429,dc=tst,dc=id,dc=ubc,dc=ca (uniqid: e62c908c-d38f11e2-96fdeacd-f14f05d6, optype: 16) to changelog csn 51b8c148001e02be0000
[12/Jun/2013:11:43:36 -0700] NSMMReplicationPlugin - changelog program - _cl5WriteOperationTxn: retry (49) the transaction (csn=51b8c154004002be0000) failed (rc=-30994 (DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock))
[12/Jun/2013:11:43:36 -0700] NSMMReplicationPlugin - write_changelog_and_ruv: can't add a change for uid=jmeter797,dc=tst,dc=id,dc=ubc,dc=ca (uniqid: e62c9143-d38f11e2-96fdeacd-f14f05d6, optype: 16) to changelog csn 51b8c154004002be0000
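To list which entries were affected, the error log can be scanned for the write_changelog_and_ruv lines. What follows is a minimal Python sketch, not part of 389-ds itself. The regular expression is inferred only from the log excerpt above and may not match other versions' log formats, and the function name find_failed_changelog_writes is hypothetical.

```python
import re

# Pattern for the "can't add a change" lines logged by write_changelog_and_ruv.
# Assumption: the layout is exactly as in the excerpt above; other 389-ds
# versions may word or order these fields differently.
FAILED_WRITE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\] NSMMReplicationPlugin - write_changelog_and_ruv: "
    r"can't add a change for (?P<dn>.+?) \(uniqid: (?P<uniqid>[^,]+), "
    r"optype: (?P<optype>\d+)\) to changelog csn (?P<csn>[0-9a-f]+)"
)


def find_failed_changelog_writes(lines):
    """Return one dict per entry that could not be written to the changelog.

    `lines` is any iterable of log lines (a list or an open file object).
    Each dict has the keys: timestamp, dn, uniqid, optype, csn.
    """
    failures = []
    for line in lines:
        match = FAILED_WRITE.match(line.strip())
        if match:
            failures.append(match.groupdict())
    return failures
```

Passing an open error log file to find_failed_changelog_writes (for example the instance's errors file, wherever it lives on the server) and printing the dn and csn of each result gives the list of entries that would need to be checked on the consumers.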
The net effect of these errors is that an entry is added to the replication master but
never syncs down to any of the consumers, presumably because it was not written
to the changelog database correctly. Doing a bit of research, I tracked this down:
https://bugzilla.redhat.com/show_bug.cgi?format=multiple&id=907985
There is also an advisory from Red Hat stating that this bug has been fixed: https://rhn.redhat.com/errata/RHSA-2013-0742.html
"A problem in the lock timing in the DNA plug-in caused a deadlock if the
DNA operation was executed with other plug-ins. This update moves the
release timing of the problematic lock, and the DNA plug-in does not cause
the deadlock. (BZ#929196)"
I am running RHEL 6.4 with the following package installed:
389-ds-base.x86_64 1.2.11.15-14.el6_4 @rhel-x86_64-server-6
So this bug should not be occurring, should it? Should I upgrade to a version of 389-ds-base supplied by EPEL instead of
the one from Red Hat? Any insight is most appreciated. Thank you.
Kind regards,
VM