[389-users] Multi-Master Replication doesn't work after a while

Fabio Isgrò fabio.isgro at messinalug.org
Wed Feb 17 16:59:57 UTC 2010


Hi All,

I'm writing to report some strange behavior in 389-ds when it is
configured for multi-master replication (MMR).

The scenario is very simple: an application uses it as its authentication
base, and access to the LDAP server is regulated by a software load
balancer with two nodes.

The strange thing I noticed is that every 48 hours the server must be
killed and restarted because it stops responding to any request, even
though the port is still open and the process is still alive. Sometimes
MMR also stops because of a nonexistent conflict in some entry.

But today the real disaster happened: MMR doesn't work anymore.
Checking the error log on the first node, I found something strange:
> [17/Feb/2010:13:11:45 +0100] NSMMReplicationPlugin - Replication agreement for agmt="cn=srvprd-l011v->srvprd-l012v" (172:389) could not be updated. For replication to take place, please enable the suffix and restart the server
> [17/Feb/2010:13:13:25 +0100] NSMMReplicationPlugin - Total update aborted: Replication agreement for "agmt="cn=srvprd-l011v->srvprd-l012v" (172:389)" can not be updated while the replica is disabled
> [17/Feb/2010:13:13:25 +0100] NSMMReplicationPlugin - (If the suffix is disabled you must enable it then restart the server for replication to take place).
> [17/Feb/2010:13:14:50 +0100] NSMMReplicationPlugin - conn=30964 op=3 repl="ou=ext,o=MYroot": Begin incremental protocol
> [17/Feb/2010:13:14:50 +0100] NSMMReplicationPlugin - conn=30964 op=3 replica="unknown": Unable to acquire replica: error: no such replica
> [17/Feb/2010:13:14:50 +0100] NSMMReplicationPlugin - conn=30964 op=3 repl="ou=ext,o=MYroot": StartNSDS50ReplicationRequest: response=6 rc=0
> [17/Feb/2010:13:16:27 +0100] NSMMReplicationPlugin - Total update aborted: Replication agreement for "agmt="cn=srvprd-l011v->srvprd-l012v" (172:389)" can not be updated while the replica is disabled
> [17/Feb/2010:13:16:27 +0100] NSMMReplicationPlugin - (If the suffix is disabled you must enable it then restart the server for replication to take place).

But both the replica and the suffix are already enabled! The first
thing I did was restart the instance, but it printed this:

Starting dirsrv:
srvprd-l011v... 389-Directory/1.2.5 B2010.012.2033
srvprd-l011v.MYroot.com:389 (/etc/dirsrv/slapd-srvprd-l011v)

[17/Feb/2010:13:32:47 +0100] - 389-Directory/1.2.5 B2010.012.2033 starting up
[17/Feb/2010:13:32:47 +0100] NSMMReplicationPlugin - _replica_init_from_config: failed to create csn generator for replica (cn=replica,cn=\22ou=ext,o=MYroot\22,cn=mapping tree, cn=config)
[17/Feb/2010:13:32:47 +0100] - replica_destroy
[17/Feb/2010:13:32:47 +0100] NSMMReplicationPlugin - Unable to configure replica ou=esterni,o=siae: failed to create csn generator for replica (cn=replica,cn=\22ou=ext,o=MYroot\22,cn=mapping tree, cn=config)
[17/Feb/2010:13:32:47 +0100] NSMMReplicationPlugin - changelog program - _cl5CheckGuardian: found old style of guardian file: bdb/4.3/libreplication-plugin
[17/Feb/2010:13:32:47 +0100] NSMMReplicationPlugin - changelog program - _cl5DBOpen: file d0bbdd82-1dd111b2-a05ca5d6-b0600000_4b699709000000010000.db4 has no matching replica; removing
[17/Feb/2010:13:32:47 +0100] NSMMReplicationPlugin - changelog program - _cl5DBOpen: failed to remove (�ҷ�fU�) file; libdb error - 2 (No such file or directory)
[17/Feb/2010:13:32:47 +0100] NSMMReplicationPlugin - changelog program - _cl5DBOpen: opened 0 existing databases in /var/lib/dirsrv/slapd-srvprd-l011v/changelogdb
[17/Feb/2010:13:32:47 +0100] NSMMReplicationPlugin - Found replication agreement named "cn=srvprd-l011v->srvprd-l012v, cn=replica, cn="ou=ext,o=MYroot", cn=mapping tree, cn=config".
[17/Feb/2010:13:32:47 +0100] NSMMReplicationPlugin - The replication agreement named "cn=srvprd-l011v->srvprd-l012v, cn=replica, cn="ou=ext,o=MYroot", cn=mapping tree, cn=config" could not be correctly parsed. No replication will occur with this replica.
[17/Feb/2010:13:32:47 +0100] NSMMReplicationPlugin - agmtlist_config_init: found 0 replication agreements in DIT
[17/Feb/2010:13:32:47 +0100] - slapd started.  Listening on All Interfaces port 389 for LDAP requests
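
For anyone who wants to pull the same kind of lines out of their own
error log, here is a minimal Python sketch. It is only an illustration:
the function name and the list of failure phrases are my own assumptions,
taken from the excerpts above, and are not an official list of
replication errors.

```python
import re

# Matches error-log lines such as:
# [17/Feb/2010:13:32:47 +0100] NSMMReplicationPlugin - some message
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+NSMMReplicationPlugin - (?P<msg>.*)$"
)

# Assumption: phrases copied from the log excerpts in this mail,
# not a complete catalogue of failure messages.
FAILURE_HINTS = ("failed to", "Unable to", "could not be", "can not be", "aborted")


def replication_failures(lines):
    """Yield (timestamp, message) for replication plugin lines that look like failures."""
    for line in lines:
        # Drop mail-style "> " quoting and the trailing newline, if any.
        line = line.lstrip("> ").rstrip("\n")
        match = LINE_RE.match(line)
        if match and any(hint in match.group("msg") for hint in FAILURE_HINTS):
            yield match.group("ts"), match.group("msg")
```

To use it, open the instance's error log (on my system that would be
something like /var/log/dirsrv/slapd-srvprd-l011v/errors, but the path is
an assumption and may differ on yours), pass the file object to
replication_failures(), and print each pair it yields.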


For business continuity I had to restore MMR as soon as possible, and
the only way I could do it was to remove the replicas and the changelog
and recreate them from the ground up.

What can I do to make MMR more reliable?

Best Regards
Fabio Isgrò




