We have a couple of known bugs which you might have run into...  We are actively working on them.
#47696 Large Searches Hang - Possibly entryrdn related
#47750 Creating a glue fails if one above level is a conflict or missing
...

You mentioned two servers, ldap1 and ldap2.  Are they both masters?  You referred to ldap2 as a "local consumer".  Does that mean ldap2 is a read-only replica?

It looks to me like ldap1 got broken and ldap2 is still healthy.  You may want to bring ldap1 back in sync with ldap2 and start from there.  If ldap2 is a master, you could re-initialize ldap1 from ldap2 (see the sketch below).
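
A minimal sketch of such an online re-initialization, assuming ldap2 is a master that already has a replication agreement pointing at ldap1 (the agreement name "to-ldap1", the suffix, host, and credentials below are placeholders; adjust them for your deployment):

# reinit.ldif -- triggers a total update of ldap1 over the existing agreement on ldap2
dn: cn=to-ldap1,cn=replica,cn=dc\3Dmycompany\2Cdc\3Dcom,cn=mapping tree,cn=config
changetype: modify
replace: nsds5BeginReplicaRefresh
nsds5BeginReplicaRefresh: start

# run it against ldap2 as Directory Manager
ldapmodify -x -h ldap2.mycompany.com -p 389 -D "cn=Directory Manager" -W -f reinit.ldif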

If ldap2 is a read-only replica, you could export the contents with the db2ldif -r -n <your_backend> command line utility on ldap2 and import the exported LDIF file into ldap1.
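
A minimal sketch of that approach, assuming the backend is named userRoot and using illustrative file paths (db2ldif -r keeps the replication metadata such as state information in the export):

# on ldap2: offline export (stop the instance first, or use db2ldif.pl for an online export)
db2ldif -r -n userRoot -a /tmp/userRoot-replica.ldif

# copy the file to ldap1, then on ldap1 with the instance stopped:
ldif2db -n userRoot -i /tmp/userRoot-replica.ldif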

Or, if you don't mind losing the replication information such as tombstones and state info, you could export the contents without "-r" using db2ldif on ldap2, import the LDIF file into ldap1, and then re-initialize ldap2 from ldap1.
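
A sketch of that variant under the same assumptions (userRoot backend, placeholder paths); without -r the export is plain data, which is why ldap2 has to be re-initialized from ldap1 afterwards:

# on ldap2: plain export without replication metadata
db2ldif -n userRoot -a /tmp/userRoot-plain.ldif

# on ldap1 with the instance stopped: import the data
ldif2db -n userRoot -i /tmp/userRoot-plain.ldif

# then re-initialize ldap2 from ldap1, e.g. with nsds5BeginReplicaRefresh as in the first sketch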

Hopefully one of these three approaches fixes the problem for you.
--noriko

 Elizabeth Jones wrote:
As an added bonus, I now see this when I try to ldapsearch for this ou in
my first ldap and in its local consumer --

ldap1:
dn: nsuniqueid=dde5bb01-ca5811e3-af3cad6b-9c050417,ou=CDC,ou=Service Accts,ou=People,dc=mycompany,dc=com

ldap2 (local consumer):
dn: ou=CDC,ou=Service Accts,ou=People,dc=mycompany,dc=com


I have all kinds of borkage in my ldap today.

I created a new ou in one of my data centers,

ou=cdc,ou=service accts,ou=staff,ou=people,dc=mycompany,dc=com

under this I added 2 users. About 5 minutes later I got an alarm from my
monitoring system saying that replication had failed, and I discovered
that replication from this data center to my second data center had
failed, and more specifically this ou --

[22/Apr/2014:15:28:03 -0500] - Retry count exceeded in add
[22/Apr/2014:15:28:03 -0500] NSMMReplicationPlugin - conn=437731 op=4 csn=5356cc22000000010000: Can't created glue entry ou=CDC,ou=Service Accts,ou=People,dc=mycompany,dc=com uniqueid=dde5bb01-ca5811e3-af3cad6b-9c050417, error 51

So I thought there was something wrong in the new ou I'd created so I went
back and deleted the two children, then tried to delete the ou.  But my
ldap thinks that the children still exist and won't let me delete the ou
--

[22/Apr/2014:15:45:17 -0500] entryrdn-index - _entryrdn_delete_key: Failed to remove ou=cdc; has children
[22/Apr/2014:15:45:17 -0500] - database index operation failed BAD 1031, err=-1 Unknown error: -1


Any thoughts on how to proceed with this?  I'm afraid to do anything else
on the first server now that I've managed to get it into this state.

thanks -
EJ


--
389 users mailing list
389-users@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/389-users
