[389-users] 389-Directory/1.3.1.6 cannot setup replica

Ivanov Andrey (M.) andrey.ivanov at polytechnique.fr
Wed Nov 5 19:53:58 UTC 2014


Hi, 

i'm having the same problem. I'm in the process of migration from our 389DS v1.2.10.25/CentOS5 to 389DS on CentOS 7. Everything is working fine on standalone servers but the replication (especially online initialization). It stucks _each time_ during the online initialization with SSL/TLS (and sometimes without SSL/TLS). And exactly with the same error messages as you describe. 
The network problems in my case are excluded - i used both virtual machines on the same ESXi in the same network and/or physical servers, the results are the same. 
I've tried compiling all the branches latest available (tags 1.3.2.24, 1.3.1.22, 1.3.3.5). In all the cases the result was the same. The server pushing the updates just gets stuck at some random number of entries sent to consumer (we have ~30000 entries, it gets stuck at random somewhere from 1200 to 25000 entries, the entries stuck have nothing particular in size, it's completely random). 
1.2.10.24 compiled with compat-db-4.7 on CentOS 7 has the least of these problems (and the initial replication is 10 times faster - it takes 8 seconds instead of 80 for 1.3.x!). I've been using 1.2.10.24 on CentOS5 compiled with mozilla ldap labraries and 1.2.10.23 obn CentOS 7 compiled with opendlap librarires. The first one had no problems at all to push the initial replication, the second one had intermittent problems, but much less than v1.3.x 

I've noticed that this problem is getting worse (or simply appears) if : 
* the replica is be of type 3 (multi-master), with replication agreements in both directions 
* our schema has several additional attribuites, it may be also important 
* if the virtual machine has only one CPU. Adding a second CPU increases the number of transferred entries before the initialization gets stuck. So it may me some thread/transaction contention or deadlock. 
* if the replication agreement uses SSL(port 636) or TLS(port389). Using port 389 with LDAP protocol instead of TLS/SSL increases the number of transferred entries before the initialization gets stuck. Sometimes the initialization even ends successfully in this case. 
* decreasing nsslapd-db-checkpoint-interval (say, to 5 seconds) also gets the problem worse 

When the on-line intialization is finished (if it finishes), there are no problems. I think it is related to the volume of data transferred, so small incremental updates do not generate any problem.... 
If necessary, i will make any debugs/tests - it is a critical element of our infrastructure, so i'd like this problem to be resolved... 

Regards, 
Andrey IVANOV 

----- Mail original -----

> De: "陳含林" <laneovcc at gmail.com>
> À: 389-users at lists.fedoraproject.org
> Envoyé: Mercredi 5 Novembre 2014 18:01:37
> Objet: [389-users] 389-Directory/1.3.1.6 cannot setup replica

> hello all,

> I have setup a IDM/freeipa master using CentOS7 , and import about 5000
> hosts.

> then i try to setup a IDM/freeipa replication server by using
> ipa-replica-install.

> It seems the total update on replication server hangs after about 1000+
> entries imported.

> I try to trigger a total update by setting nsds5beginreplicarefresh, but the
> result was the same.

> Any one help me ? Thanks!

> idm1 is the master, idm2 is the replication server.
> master server logs:

> [06/Nov/2014:00:21:48 +0800] - 389-Directory/ 1.3.1.6 B2014.219.1825 starting
> up
> [06/Nov/2014:00:21:48 +0800] schema-compat-plugin - warning: no entries set
> up under cn=computers, cn=compat,dc=idc
> [06/Nov/2014:00:21:51 +0800] - Skipping CoS Definition cn=Password
> Policy,cn=accounts,dc=idc--no CoS Templates found, which should be added
> before the CoS Definition.
> [06/Nov/2014:00:21:51 +0800] - Skipping CoS Definition cn=Password
> Policy,cn=accounts,dc=idc--no CoS Templates found, which should be added
> before the CoS Definition.
> [06/Nov/2014:00:21:51 +0800] - slapd started. Listening on All Interfaces
> port 389 for LDAP requests
> [06/Nov/2014:00:21:51 +0800] - Listening on All Interfaces port 636 for LDAPS
> requests
> [06/Nov/2014:00:21:51 +0800] - Listening on /var/run/slapd-IDC.socket for
> LDAPI requests
> [06/Nov/2014:00:21:51 +0800] - Entry "uid=admin,ou=people,o=ipaca" --
> attribute "krbExtraData" not allowed
> [06/Nov/2014:00:40:26 +0800] NSMMReplicationPlugin -
> agmt="cn=meToidm2.ra.cn.idc" (idm2:389): The remote replica has a different
> database generation ID than the local database. You may have to reinitialize
> the remote replica, or the local replica.
> [06/Nov/2014:00:40:26 +0800] NSMMReplicationPlugin - Beginning total update
> of replica "agmt="cn=meToidm2.ra.cn.idc" (idm2:389)".

> replication server logs:
> [06/Nov/2014:00:40:18 +0800] - 389-Directory/ 1.3.1.6 B2014.219.1825 starting
> up
> [06/Nov/2014:00:40:18 +0800] ipalockout_get_global_config - [file
> ipa_lockout.c, line 185]: Failed to get default realm (-1765328160)
> [06/Nov/2014:00:40:18 +0800] ipaenrollment_start - [file ipa_enrollment.c,
> line 393]: Failed to get default realm?!
> [06/Nov/2014:00:40:18 +0800] - slapd started. Listening on All Interfaces
> port 389 for LDAP requests
> [06/Nov/2014:00:40:18 +0800] - Listening on All Interfaces port 636 for LDAPS
> requests
> [06/Nov/2014:00:40:18 +0800] - Listening on /var/run/slapd-IDC.socket for
> LDAPI requests
> [06/Nov/2014:00:40:19 +0800] NSMMReplicationPlugin -
> agmt="cn=meToidm1.ra.cn.idc" (idm1:389): The remote replica has a different
> database generation ID than the local database. You may have to reinitialize
> the remote replica, or the local replica.
> [06/Nov/2014:00:40:20 +0800] NSMMReplicationPlugin -
> multimaster_be_state_change: replica dc=idc is going offline; disabling
> replication
> [06/Nov/2014:00:40:20 +0800] - WARNING: Import is running with
> nsslapd-db-private-import-mem on; No other process is allowed to access the
> database
> [06/Nov/2014:00:40:40 +0800] - import userRoot: Processed 1581 entries --
> average rate 79.0/sec, recent rate 79.0/sec, hit ratio 0%
> [06/Nov/2014:00:41:00 +0800] - import userRoot: Processed 1581 entries --
> average rate 39.5/sec, recent rate 39.5/sec, hit ratio 0%
> [06/Nov/2014:00:41:20 +0800] - import userRoot: Processed 1581 entries --
> average rate 26.4/sec, recent rate 0.0/sec, hit ratio 0%
> [06/Nov/2014:00:41:41 +0800] - import userRoot: Processed 1581 entries --
> average rate 19.8/sec, recent rate 0.0/sec, hit ratio 0%
> [06/Nov/2014:00:42:01 +0800] - import userRoot: Processed 1581 entries --
> average rate 15.8/sec, recent rate 0.0/sec, hit ratio 0%
> [06/Nov/2014:00:42:21 +0800] - import userRoot: Processed 1581 entries --
> average rate 13.2/sec, recent rate 0.0/sec, hit ratio 0%
> [06/Nov/2014:00:42:41 +0800] - import userRoot: Processed 1581 entries --
> average rate 11.3/sec, recent rate 0.0/sec, hit ratio 0%
> [06/Nov/2014:00:43:01 +0800] - import userRoot: Processed 1581 entries --
> average rate 9.9/sec, recent rate 0.0/sec, hit ratio 0%
> [06/Nov/2014:00:43:21 +0800] - import userRoot: Processed 1581 entries --
> average rate 8.8/sec, recent rate 0.0/sec, hit ratio 0%
> [06/Nov/2014:00:43:41 +0800] - import userRoot: Processed 1581 entries --
> average rate 7.9/sec, recent rate 0.0/sec, hit ratio 0%
> [06/Nov/2014:00:44:01 +0800] - import userRoot: Processed 1581 entries --
> average rate 7.2/sec, recent rate 0.0/sec, hit ratio 0%
> [06/Nov/2014:00:44:21 +0800] - import userRoot: Processed 1581 entries --
> average rate 6.6/sec, recent rate 0.0/sec, hit ratio 0%
> [06/Nov/2014:00:44:41 +0800] - import userRoot: Processed 1581 entries --
> average rate 6.1/sec, recent rate 0.0/sec, hit ratio 0%
> [06/Nov/2014:00:45:01 +0800] - import userRoot: Processed 1581 entries --
> average rate 5.6/sec, recent rate 0.0/sec, hit ratio 0%
> [06/Nov/2014:00:45:21 +0800] - import userRoot: Processed 1581 entries --
> average rate 5.3/sec, recent rate 0.0/sec, hit ratio 0%
> [06/Nov/2014:00:45:21 +0800] - import userRoot: Decided to end this pass
> because the progress rate has dropped below the 50% threshold.
> [06/Nov/2014:00:45:21 +0800] - import userRoot: Ending pass number 1 ...
> [06/Nov/2014:00:45:22 +0800] - import userRoot: Foreman is done; waiting for
> workers to finish...
> [06/Nov/2014:00:45:22 +0800] - import userRoot: Workers finished; cleaning
> up...
> [06/Nov/2014:00:45:22 +0800] - import userRoot: Workers cleaned up.
> [06/Nov/2014:00:45:22 +0800] - import userRoot: Sweeping files for merging
> later...
> [06/Nov/2014:00:45:22 +0800] - WARNING: Import is running with
> nsslapd-db-private-import-mem on; No other process is allowed to access the
> database
> [06/Nov/2014:00:45:22 +0800] - import userRoot: Sweep done.
> [06/Nov/2014:00:45:22 +0800] - import userRoot: Beginning pass number 2
> [06/Nov/2014:00:45:42 +0800] - import userRoot: Processed 1581 entries (pass
> 2) -- average rate 214748285.8/sec, recent rate 0.0/sec, hit ratio 0%
> [06/Nov/2014:00:46:02 +0800] - import userRoot: Processed 1581 entries (pass
> 2) -- average rate 107374142.9/sec, recent rate 0.0/sec, hit ratio 0%
> [06/Nov/2014:00:46:22 +0800] - import userRoot: Processed 1581 entries (pass
> 2) -- average rate 71582761.9/sec, recent rate 0.0/sec, hit ratio 0%
> [06/Nov/2014:00:46:42 +0800] - import userRoot: Processed 1581 entries (pass
> 2) -- average rate 53687071.4/sec, recent rate 0.0/sec, hit ratio 0%
> [06/Nov/2014:00:47:02 +0800] - import userRoot: Processed 1581 entries (pass
> 2) -- average rate 42949657.1/sec, recent rate 0.0/sec, hit ratio 0%
> [06/Nov/2014:00:47:22 +0800] - import userRoot: Processed 1581 entries (pass
> 2) -- average rate 35791381.0/sec, recent rate 0.0/sec, hit ratio 0%
> [06/Nov/2014:00:47:43 +0800] - import userRoot: Processed 1581 entries (pass
> 2) -- average rate 30678326.5/sec, recent rate 0.0/sec, hit ratio 0%
> [06/Nov/2014:00:48:03 +0800] - import userRoot: Processed 1581 entries (pass
> 2) -- average rate 26843535.7/sec, recent rate 0.0/sec, hit ratio 0%
> [06/Nov/2014:00:48:23 +0800] - import userRoot: Processed 1581 entries (pass
> 2) -- average rate 23860920.6/sec, recent rate 0.0/sec, hit ratio 0%
> [06/Nov/2014:00:48:43 +0800] - import userRoot: Processed 1581 entries (pass
> 2) -- average rate 21474828.6/sec, recent rate 0.0/sec, hit ratio 0%
> [06/Nov/2014:00:49:03 +0800] - import userRoot: Processed 1581 entries (pass
> 2) -- average rate 19522571.4/sec, recent rate 0.0/sec, hit ratio 0%
> [06/Nov/2014:00:49:23 +0800] - import userRoot: Processed 1581 entries (pass
> 2) -- average rate 17895690.5/sec, recent rate 0.0/sec, hit ratio 0%
> [06/Nov/2014:00:49:43 +0800] - import userRoot: Processed 1581 entries (pass
> 2) -- average rate 16519098.9/sec, recent rate 0.0/sec, hit ratio 0%
> [06/Nov/2014:00:50:03 +0800] - import userRoot: Processed 1581 entries (pass
> 2) -- average rate 15339163.3/sec, recent rate 0.0/sec, hit ratio 0%
> [06/Nov/2014:00:50:23 +0800] - import userRoot: Processed 1722 entries (pass
> 2) -- average rate 0.5/sec, recent rate 43.0/sec, hit ratio 0%
> [06/Nov/2014:00:50:26 +0800] - import userRoot: Workers finished; cleaning
> up...
> [06/Nov/2014:00:50:26 +0800] - import userRoot: Workers cleaned up.
> [06/Nov/2014:00:50:26 +0800] - import userRoot: Sweeping files for merging
> later...
> [06/Nov/2014:00:50:26 +0800] - WARNING: Import is running with
> nsslapd-db-private-import-mem on; No other process is allowed to access the
> database
> [06/Nov/2014:00:50:26 +0800] - import userRoot: Sweep done.
> [06/Nov/2014:00:50:26 +0800] - import userRoot: Beginning 2-way merge of up
> to 44 files...
> [06/Nov/2014:00:50:26 +0800] - import userRoot: No files to merge for "aci".
> [06/Nov/2014:00:50:26 +0800] - import userRoot: No files to merge for
> "automountkey".
> [06/Nov/2014:00:50:26 +0800] - import userRoot: No files to merge for
> "carLicense".
> [06/Nov/2014:00:50:26 +0800] - import userRoot: No files to merge for "cn".
> [06/Nov/2014:00:50:26 +0800] - import userRoot: No files to merge for
> "displayname".
> [06/Nov/2014:00:50:26 +0800] - import userRoot: No files to merge for
> "entryusn".
> [06/Nov/2014:00:50:26 +0800] - import userRoot: No files to merge for "fqdn".
> [06/Nov/2014:00:50:26 +0800] - import userRoot: No files to merge for
> "gidnumber".
> [06/Nov/2014:00:50:26 +0800] - import userRoot: No files to merge for
> "givenName".
> [06/Nov/2014:00:50:26 +0800] - import userRoot: No files to merge for
> "ipakrbprincipalalias".
> [06/Nov/2014:00:50:26 +0800] - import userRoot: No files to merge for
> "ipasudorunas".
> [06/Nov/2014:00:50:26 +0800] - import userRoot: No files to merge for
> "ipasudorunasgroup".
> [06/Nov/2014:00:50:26 +0800] - import userRoot: No files to merge for
> "ipauniqueid".
> [06/Nov/2014:00:50:26 +0800] - import userRoot: No files to merge for
> "krbPrincipalName".
> [06/Nov/2014:00:50:26 +0800] - import userRoot: No files to merge for
> "macAddress".
> [06/Nov/2014:00:50:26 +0800] - import userRoot: No files to merge for "mail".
> [06/Nov/2014:00:50:26 +0800] - import userRoot: No files to merge for
> "mailAlternateAddress".
> [06/Nov/2014:00:50:26 +0800] - import userRoot: No files to merge for
> "mailHost".
> [06/Nov/2014:00:50:26 +0800] - import userRoot: No files to merge for
> "managedby".
> [06/Nov/2014:00:50:26 +0800] - import userRoot: No files to merge for
> "manager".
> [06/Nov/2014:00:50:26 +0800] - import userRoot: No files to merge for
> "member".
> [06/Nov/2014:00:50:26 +0800] - import userRoot: No files to merge for
> "memberallowcmd".
> [06/Nov/2014:00:50:26 +0800] - import userRoot: No files to merge for
> "memberdenycmd".
> [06/Nov/2014:00:50:26 +0800] - import userRoot: No files to merge for
> "memberHost".
> [06/Nov/2014:00:50:27 +0800] - import userRoot: No files to merge for
> "memberOf".
> [06/Nov/2014:00:50:27 +0800] - import userRoot: No files to merge for
> "memberservice".
> [06/Nov/2014:00:50:27 +0800] - import userRoot: No files to merge for
> "memberUser".
> [06/Nov/2014:00:50:27 +0800] - import userRoot: No files to merge for
> "nscpEntryDN".
> [06/Nov/2014:00:50:27 +0800] - import userRoot: No files to merge for
> "nsds5ReplConflict".
> [06/Nov/2014:00:50:27 +0800] - import userRoot: No files to merge for
> "nsuniqueid".
> [06/Nov/2014:00:50:27 +0800] - import userRoot: No files to merge for
> "ntUniqueId".
> [06/Nov/2014:00:50:27 +0800] - import userRoot: No files to merge for
> "ntUserDomainId".
> [06/Nov/2014:00:50:27 +0800] - import userRoot: No files to merge for
> "objectclass".
> [06/Nov/2014:00:50:27 +0800] - import userRoot: No files to merge for "ou".
> [06/Nov/2014:00:50:27 +0800] - import userRoot: No files to merge for
> "owner".
> [06/Nov/2014:00:50:27 +0800] - import userRoot: No files to merge for
> "secretary".
> [06/Nov/2014:00:50:27 +0800] - import userRoot: No files to merge for
> "seeAlso".
> [06/Nov/2014:00:50:27 +0800] - import userRoot: No files to merge for "sn".
> [06/Nov/2014:00:50:27 +0800] - import userRoot: No files to merge for
> "sourcehost".
> [06/Nov/2014:00:50:27 +0800] - import userRoot: No files to merge for
> "telephoneNumber".
> [06/Nov/2014:00:50:27 +0800] - import userRoot: No files to merge for
> "title".
> [06/Nov/2014:00:50:27 +0800] - import userRoot: No files to merge for "uid".
> [06/Nov/2014:00:50:27 +0800] - import userRoot: No files to merge for
> "uidnumber".
> [06/Nov/2014:00:50:27 +0800] - import userRoot: No files to merge for
> "uniquemember".
> [06/Nov/2014:00:50:27 +0800] - import userRoot: Merging completed in 0
> seconds.
> [06/Nov/2014:00:50:27 +0800] - import userRoot: Indexing complete.
> Post-processing...
> [06/Nov/2014:00:50:27 +0800] - import userRoot: Generating numSubordinates
> complete.
> [06/Nov/2014:00:50:27 +0800] - import userRoot: Flushing caches...
> [06/Nov/2014:00:50:27 +0800] - import userRoot: Closing files...
> [06/Nov/2014:00:50:27 +0800] - import userRoot: Import complete. Processed
> 1722 entries in 607 seconds. (2.84 entries/sec)
> [06/Nov/2014:00:50:27 +0800] NSMMReplicationPlugin -
> multimaster_be_state_change: replica dc=idc is coming online; enabling
> replication
> [06/Nov/2014:00:50:27 +0800] - Skipping CoS Definition cn=Password
> Policy,cn=accounts,dc=idc--no CoS Templates found, which should be added
> before the CoS Definition.

> mapping tree
> # extended LDIF
> #
> # LDAPv3
> # base <cn=mapping tree,cn=config> with scope subtree
> # filter: (objectclass=*)
> # requesting: ALL
> #

> # mapping tree, config
> dn: cn=mapping tree,cn=config
> objectClass: top
> objectClass: extensibleObject
> cn: mapping tree

> # dc\3Didc, mapping tree, config
> dn: cn=dc\3Didc,cn=mapping tree,cn=config
> objectClass: top
> objectClass: extensibleObject
> objectClass: nsMappingTree
> cn: dc=idc
> cn: "dc=idc"
> nsslapd-state: backend
> nsslapd-backend: userRoot
> nsslapd-referral: ldap://idm1.ra.cn.idc:389/dc%3Didc

> # replica, dc\3Didc, mapping tree, config
> dn: cn=replica,cn=dc\3Didc,cn=mapping tree,cn=config
> cn: replica
> nsDS5Flags: 1
> objectClass: top
> objectClass: nsds5replica
> objectClass: extensibleobject
> nsDS5ReplicaType: 3
> nsDS5ReplicaRoot: dc=idc
> nsds5ReplicaLegacyConsumer: off
> nsDS5ReplicaId: 3
> nsDS5ReplicaBindDN: cn=replication manager,cn=config
> nsState:: AwAAAAAAAADzUlpUAAAAAAAAAAAAAAAAAAAAAAAAAAABAAAAAAAAAA==
> nsDS5ReplicaName: 6cf48589-650a11e4-90ecc46a-d11f6771
> nsds5ReplicaChangeCount: 0
> nsds5replicareapactive: 0

> # meToidm1.ra.cn.idc, replica, dc\3Didc, mapping tree, config
> dn: cn=meToidm1.ra.cn.idc,cn=replica,cn=dc\3Didc,cn=mapping tree,cn=config
> cn: meToidm1.ra.cn.idc
> objectClass: nsds5replicationagreement
> objectClass: top
> nsDS5ReplicaTransportInfo: TLS
> description: me to idm1.ra.cn.idc
> nsDS5ReplicaRoot: dc=idc
> nsDS5ReplicaHost: idm1.ra.cn.idc
> nsds5replicaTimeout: 600
> nsDS5ReplicaBindDN: cn=replication manager,cn=config
> nsDS5ReplicaPort: 389
> nsDS5ReplicaBindMethod: simple
> nsDS5ReplicatedAttributeList: (objectclass=*) $ EXCLUDE memberof
> idnssoaserial
> entryusn krblastsuccessfulauth krblastfailedauth krbloginfailedcount
> nsds5ReplicaStripAttrs: modifiersName modifyTimestamp internalModifiersName
> in
> ternalModifyTimestamp
> nsDS5ReplicaCredentials: {DES}
> nsDS5ReplicatedAttributeListTotal: (objectclass=*) $ EXCLUDE entryusn
> krblasts
> uccessfulauth krblastfailedauth krbloginfailedcount
> nsds5replicareapactive: 0
> nsds5replicaLastUpdateStart: 0
> nsds5replicaLastUpdateEnd: 0
> nsds5replicaChangesSentSinceStartup:
> nsds5replicaLastUpdateStatus: 402 Replication error acquiring replica:
> unknown
> error - Replica has different database generation ID, remote replica may nee
> d to be initialized
> nsds5replicaUpdateInProgress: FALSE
> nsds5replicaLastInitStart: 0
> nsds5replicaLastInitEnd: 0

> # search result
> search: 2
> result: 0 Success

> # numResponses: 5
> # numEntries: 4

> --
> 389 users mailing list
> 389-users at lists.fedoraproject.org
> https://admin.fedoraproject.org/mailman/listinfo/389-users
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.fedoraproject.org/pipermail/389-users/attachments/20141105/07892f28/attachment.html>


More information about the 389-users mailing list