On 06/12/2013 10:00 PM, Mahadevan, Venkat wrote:

Hello,

 

While doing multiple adds using POSIX uidNumbers and the DNA plugin, I have noticed errors such as the following:

 

[12/Jun/2013:11:43:24 -0700] NSMMReplicationPlugin - changelog program - _cl5WriteOperationTxn: retry (49) the transaction (csn=51b8c148001e02be0000) failed (rc=-30994 (DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock))

[12/Jun/2013:11:43:24 -0700] NSMMReplicationPlugin - changelog program - _cl5WriteOperationTxn: failed to write entry with csn (51b8c148001e02be0000); db error - -30994 DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock

[12/Jun/2013:11:43:24 -0700] NSMMReplicationPlugin - write_changelog_and_ruv: can't add a change for uid=jmeter429,dc=tst,dc=id,dc=ubc,dc=ca (uniqid: e62c908c-d38f11e2-96fdeacd-f14f05d6, optype: 16) to changelog csn 51b8c148001e02be0000

[12/Jun/2013:11:43:36 -0700] NSMMReplicationPlugin - changelog program - _cl5WriteOperationTxn: retry (49) the transaction (csn=51b8c154004002be0000) failed (rc=-30994 (DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock))

[12/Jun/2013:11:43:36 -0700] NSMMReplicationPlugin - write_changelog_and_ruv: can't add a change for uid=jmeter797,dc=tst,dc=id,dc=ubc,dc=ca (uniqid: e62c9143-d38f11e2-96fdeacd-f14f05d6, optype: 16) to changelog csn 51b8c154004002be0000
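To see which entries were affected, the "can't add a change" lines above can be pulled out of the errors log with a small script. This is only a sketch: the regular expression is derived from the two sample lines in this message, and the function name `failed_changes` is made up for illustration. Other server versions may word the message differently.

```python
import re

# Pattern built from the sample log lines in this thread (an assumption,
# not an official log format definition).
FAILED_CHANGE = re.compile(
    r"write_changelog_and_ruv: can't add a change for (?P<dn>.+?) "
    r"\(uniqid: (?P<uniqid>[0-9a-f-]+), optype: (?P<optype>\d+)\) "
    r"to changelog csn (?P<csn>[0-9a-f]{20})"
)


def failed_changes(lines):
    """Yield (dn, uniqid, csn) for each update that was not written to the changelog."""
    for line in lines:
        match = FAILED_CHANGE.search(line)
        if match:
            yield match.group("dn"), match.group("uniqid"), match.group("csn")


# One of the log lines quoted above, used as sample input.
sample = [
    "[12/Jun/2013:11:43:24 -0700] NSMMReplicationPlugin - write_changelog_and_ruv: "
    "can't add a change for uid=jmeter429,dc=tst,dc=id,dc=ubc,dc=ca "
    "(uniqid: e62c908c-d38f11e2-96fdeacd-f14f05d6, optype: 16) "
    "to changelog csn 51b8c148001e02be0000",
]

for dn, uniqid, csn in failed_changes(sample):
    print(dn, uniqid, csn)
```

In practice the same function can be fed an open file handle on the instance's errors log, and the resulting DNs compared against what is present on each consumer.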

 

Hi Mahadevan,

This means that the server was unable to write the update to the changelog because it ran out of retries after repeated deadlocks.
That causes the operation itself to fail. If this happens on a consumer, the supplier will retry sending the update later, and since there are two different CSNs in the log, I think updates are still progressing on the consumer side, although they may be delayed.
I do not know why it is occurring. Deadlocks are quite rare because, in a default deployment, threads are synchronized by a backend lock.
Is the dse.ldif available somewhere?
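As an aside on the two CSNs: they can be decoded to see when each update was generated and by which replica. The sketch below assumes the usual 389-ds CSN string layout of 20 hex digits (8 for a Unix timestamp, 4 for a sequence number, 4 for the replica ID, 4 for a sub-sequence number); the names `CSN` and `parse_csn` are only illustrative.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class CSN(NamedTuple):
    timestamp: datetime  # when the change was generated (UTC)
    seqnum: int          # sequence number within that second
    replica_id: int      # ID of the replica that originated the change
    subseqnum: int       # sub-sequence number


def parse_csn(text: str) -> CSN:
    """Split a 20-hex-digit CSN string into its fields (assumed layout)."""
    if len(text) != 20:
        raise ValueError(f"expected 20 hex digits, got {len(text)}")
    seconds = int(text[0:8], 16)
    return CSN(
        timestamp=datetime.fromtimestamp(seconds, tz=timezone.utc),
        seqnum=int(text[8:12], 16),
        replica_id=int(text[12:16], 16),
        subseqnum=int(text[16:20], 16),
    )


for raw in ("51b8c148001e02be0000", "51b8c154004002be0000"):
    csn = parse_csn(raw)
    print(raw, csn.timestamp.isoformat(), csn.seqnum, csn.replica_id)
```

Under that assumption, both CSNs come from replica ID 702 and carry timestamps a few seconds before the corresponding error lines, which would be consistent with the server spending that time on retries before giving up.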

best regards
thierry

The net effect of these errors is that an entry is added to the replication master but does not sync down to any of the consumers, I assume because it is not written to the changelog database correctly. Doing a bit of research, I tracked this down:

 

https://bugzilla.redhat.com/show_bug.cgi?format=multiple&id=907985

 

There is also an advisory from Red Hat stating that this bug has been fixed: https://rhn.redhat.com/errata/RHSA-2013-0742.html

“A problem in the lock timing in the DNA plug-in caused a deadlock if the
DNA operation was executed with other plug-ins. This update moves the
release timing of the problematic lock, and the DNA plug-in does not cause
the deadlock. (BZ#929196)”

 

I am running RHEL 6.4 and 389-ds-base.x86_64 1.2.11.15-14.el6_4 @rhel-x86_64-server-6

 

So this bug should not be occurring? Should I upgrade to a version of 389-ds-base supplied by EPEL instead of Red Hat? Any insight is most appreciated. Thank you.

 

Kind regards,

 

VM

 



--
389 users mailing list
389-users@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/389-users