[389-users] Replication broken after each service restart

Wes Hardin wes.hardin at maxim-ic.com
Thu Jun 28 19:15:38 UTC 2012


To preface this, my issue began after upgrading from 1.2.5.x to 1.2.10.4 about a
month ago, but I did not immediately recognize its severity at the time.

Upon upgrading, I discovered that replication had stopped working.  I
got a message saying the "suffix was not enabled".  I assume(d) this meant
replication was not enabled for the suffix and that I needed to enable it.  From
the 389-console, the "Enable Replica" box was checked and the Replica Role was
"Single Master."  I couldn't see any reason for the sudden failure and the
continued resistance against my attempts to synchronize my consumers.
Eventually, I unchecked "Enable Replica" and (automatically) deleted all my
replication agreements.  I re-enabled the replica and began recreating all my
agreements.  Replication resumed normal operation.

It wasn't until today that I realized the problem still exists.  I rebooted my
single master yesterday and today found that replication had stopped again.  I
exported all my agreements to an LDIF then nuked and rebuilt all my replication
settings as I'd done after the upgrade.  Replication resumed.  Just to test my
theory, I then restarted the server again and once again broke replication.
Luckily I was prepared this time and was quickly able to recover.

Here are some messages showing the rejected attempt to replicate.  The first is
an update, the second is an initialize.  Both seem to indicate replication is
not enabled despite evidence to the contrary.

[28/Jun/2012:08:35:37 -0500] NSMMReplicationPlugin - Replication agreement for
agmt="cn=ausldap001" (ausldap001:389) could not be updated. For replication to
take place, please enable the suffix and restart the server

[28/Jun/2012:09:31:00 -0500] NSMMReplicationPlugin - Total update aborted:
Replication agreement for "agmt="cn=mfnldap002" (mfnldap002:389)" can not be
updated while the replica is disabled
[28/Jun/2012:09:31:00 -0500] NSMMReplicationPlugin - (If the suffix is disabled
you must enable it then restart the server for replication to take place).
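
For anyone wanting to check their own errors log for the same rejections, here
is a rough Python sketch that pulls the agreement name, host, and port out of
NSMMReplicationPlugin messages like the ones above.  The function name and the
regular expression are illustrative assumptions, not part of 389 Directory
Server, and the sketch assumes each message occupies a single line in the real
log (they are only wrapped here by the mail client).

```python
import re

# Matches the agmt="cn=NAME" (HOST:PORT) fragment seen in the messages above.
AGMT_RE = re.compile(r'agmt="cn=([^"]+)"\s+\(([^:)]+):(\d+)\)')


def rejected_agreements(log_text):
    """Return (agreement, host, port) for each replication-plugin line naming an agreement."""
    results = []
    for line in log_text.splitlines():
        if "NSMMReplicationPlugin" not in line:
            continue
        match = AGMT_RE.search(line)
        if match:
            results.append((match.group(1), match.group(2), int(match.group(3))))
    return results


# The two messages quoted above, unwrapped to one line each.
sample = (
    '[28/Jun/2012:08:35:37 -0500] NSMMReplicationPlugin - Replication agreement for '
    'agmt="cn=ausldap001" (ausldap001:389) could not be updated. For replication to '
    'take place, please enable the suffix and restart the server\n'
    '[28/Jun/2012:09:31:00 -0500] NSMMReplicationPlugin - Total update aborted: '
    'Replication agreement for "agmt="cn=mfnldap002" (mfnldap002:389)" can not be '
    'updated while the replica is disabled\n'
)

for agreement, host, port in rejected_agreements(sample):
    print(agreement, host, port)
```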

I have noticed that I have a lot of entries whose DNs include escaped
characters, which I don't remember seeing before.  That is, \3D instead of '='
or \2C instead of ','.  For example:

dn: cn=replica,cn=dc\3Dmaxim-ic\2C dc\3Dcom,cn=mapping tree,cn=config
objectClass: nsDS5Replica
objectClass: top
nsDS5ReplicaRoot: dc=maxim-ic,dc=com
nsDS5ReplicaType: 3
nsDS5Flags: 1
nsDS5ReplicaId: 7777
nsds5ReplicaPurgeDelay: 604800
cn: replica
nsState:: YR4AAAAAAADogOxPAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
nsDS5ReplicaName: 240eb382-c13b11e1-bf5ef100-4e39c57a
nsds5ReplicaChangeCount: 0
nsds5replicareapactive: 0

But based on the following link, I guess that's to be expected.

http://directory.fedoraproject.org/wiki/Upgrade_to_New_DN_Format

There also seems to be some inconsistency in how my root is referenced.
Sometimes it has a space between the two domain components, sometimes it does
not.  You can actually see that in the example above.  The DN value has a space,
the nsDS5ReplicaRoot value has no space.
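
As a rough illustration of that comparison, the following Python sketch decodes
the backslash-hex escapes in the DN and checks the result against the
nsDS5ReplicaRoot value.  The helper names are hypothetical, and the sketch
assumes only single-byte ASCII characters are escaped and that splitting on
every comma is safe for this simple suffix.  It only shows that the two strings
differ in escaping and whitespace; it says nothing about how the server itself
compares them.

```python
import re


def unescape_dn_value(value):
    # Replace each backslash followed by two hex digits with the character it encodes.
    return re.sub(r"\\([0-9A-Fa-f]{2})", lambda m: chr(int(m.group(1), 16)), value)


def normalize_dn(dn):
    # Naive normalization: lower-case and trim spaces around each comma-separated part.
    return ",".join(part.strip().lower() for part in dn.split(","))


escaped_value = r"dc\3Dmaxim-ic\2C dc\3Dcom"  # from the DN of the cn=replica entry
replica_root = "dc=maxim-ic,dc=com"           # from nsDS5ReplicaRoot

decoded = unescape_dn_value(escaped_value)
print(decoded)                                               # dc=maxim-ic, dc=com
print(normalize_dn(decoded) == normalize_dn(replica_root))   # True
```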

I'm rather inexperienced in the management of LDAP.  My upgrade from 1.2.5 to
1.2.10 was just a simple "yum upgrade".  I hadn't seen the link about the
DN format or any other upgrade guide.  I'm fully willing to allow that I failed
to take some required step at the time of the upgrade.

Many thanks in advance for any help you can provide.
-- 
/* Wes Hardin */
UNIX/Linux Systems Administrator, IT Engineering Support
Maxim Integrated Products | Innovation Delivered® | www.maxim-ic.com


