changelog deadlock replication failures with DNA

Wednesday, 12 June 2013

Hello,

While doing multiple adds using POSIX uidNumbers and the DNA plugin,
I have noticed errors such as the following:

[12/Jun/2013:11:43:24 -0700] NSMMReplicationPlugin - changelog program -
_cl5WriteOperationTxn: retry (49) the transaction (csn=51b8c148001e02be0000) failed
(rc=-30994 (DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock))
[12/Jun/2013:11:43:24 -0700] NSMMReplicationPlugin - changelog program -
_cl5WriteOperationTxn: failed to write entry with csn (51b8c148001e02be0000); db error -
-30994 DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock
[12/Jun/2013:11:43:24 -0700] NSMMReplicationPlugin - write_changelog_and_ruv: can't
add a change for uid=jmeter429,dc=tst,dc=id,dc=ubc,dc=ca (uniqid:
e62c908c-d38f11e2-96fdeacd-f14f05d6, optype: 16) to changelog csn 51b8c148001e02be0000
[12/Jun/2013:11:43:36 -0700] NSMMReplicationPlugin - changelog program -
_cl5WriteOperationTxn: retry (49) the transaction (csn=51b8c154004002be0000) failed
(rc=-30994 (DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock))
[12/Jun/2013:11:43:36 -0700] NSMMReplicationPlugin - write_changelog_and_ruv: can't
add a change for uid=jmeter797,dc=tst,dc=id,dc=ubc,dc=ca (uniqid:
e62c9143-d38f11e2-96fdeacd-f14f05d6, optype: 16) to changelog csn 51b8c154004002be0000
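
For anyone who wants to list which entries were affected, here is a minimal Python sketch that pulls the DN and CSN out of the "can't add a change" messages. The regular expression is only inferred from the excerpt above, and the function name is my own; other server versions may word the message differently.

```python
import re

# Pattern inferred from the "write_changelog_and_ruv" lines quoted above.
# This is an assumption about the message format, not an official one.
FAILED_ADD = re.compile(
    r"write_changelog_and_ruv: can't add a change for (?P<dn>.+?) "
    r"\(uniqid: (?P<uniqid>[0-9a-f-]+), optype: (?P<optype>\d+)\) "
    r"to changelog csn (?P<csn>[0-9a-f]+)"
)


def failed_changelog_writes(log_text):
    """Return (dn, csn) pairs for every failed changelog write in log_text."""
    # Collapse all whitespace so a message wrapped across lines still matches.
    flat = " ".join(log_text.split())
    return [(m.group("dn"), m.group("csn")) for m in FAILED_ADD.finditer(flat)]
```

Passing the contents of the errors log to `failed_changelog_writes` should give the DNs that need to be compared against the consumers.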

The net effect of these errors is that an entry is added on the replication master but
does not sync down to any of the consumers. I assume this is because the change is not
written to the changelog database correctly. Doing a bit of research, I tracked this down:

https://bugzilla.redhat.com/show_bug.cgi?format=multiple&id=907985

There is also an advisory from Red Hat stating that this bug has been fixed:
https://rhn.redhat.com/errata/RHSA-2013-0742.html
"A problem in the lock timing in the DNA plug-in caused a deadlock if the
DNA operation was executed with other plug-ins. This update moves the
release timing of the problematic lock, and the DNA plug-in does not cause
the deadlock. (BZ#929196)"

I am running RHEL 6.4
and 389-ds-base.x86_64              1.2.11.15-14.el6_4 @rhel-x86_64-server-6

So this bug should not be occurring? Should I upgrade to a version of 389-ds-base supplied
by EPEL instead of the one from Red Hat? Any insight is most appreciated. Thank you.

Kind regards,

VM
