Hi All,<br><br>I wanted to update this issue as I&#39;ve made some progress but replication is still not working as it should.  I&#39;ve removed the previous communication as it was getting very long and I began to receive &#39;message too large&#39; responses from the list server.  The history of this post can be read in the archives: <a href="http://lists.fedoraproject.org/pipermail/389-users/2012-April/thread.html">http://lists.fedoraproject.org/pipermail/389-users/2012-April/thread.html</a>.<br>

<br>So, I&#39;ve tried to simplify my efforts by removing the consumer replication agreements for now to focus on getting the multi-master replication working first.  To briefly review, I inherited two multi-master systems (A &amp; B) and A has been the only system running for many years.<br>

<br>To get replication working I&#39;ve done the following:<br><br>1.  Initialize master B data from a nightly backup from master A as:<br>     <br>     ./bak2db bak/directory -n &lt;my_suffix&gt;<br><br>     - I see this in the error log:<br>

<br>[20/Apr/2012:10:30:31 -0700] -   Add Attribute readonly Value off<br><br>[20/Apr/2012:10:30:31 -0700] -   Add Attribute nsslapd-directory Value /data/LDAP/slapd-&lt;master A server name&gt;/db/&lt;my_suffix&gt;<br>[20/Apr/2012:10:30:31 -0700] -   Del Attribute nsslapd-directory Value /data/LDAP/slapd-&lt;master B server name&gt;/db/&lt;my_suffix&gt;<br>

<br>[20/Apr/2012:10:30:31 -0700] - WARNING!!: current Instance Config is different from backed up configuration; The backup is restored.<br>[20/Apr/2012:10:30:31 -0700] - dblayer_restore: Removing staging area /opt/fedora-ds/slapd-&lt;master B server name&gt;/db/../fribak.<br>

<br><b>Is there any problem with the lines above, which change the &quot;nsslapd-directory&quot; attribute from its original, correct master B path to the path of master A as part of the initialization?  Or is it reset to the correct path for master B afterwards?  If I need to reset some attributes, how can I view the current nsslapd-directory attribute from the command line with ldapsearch?</b><br>
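<br>For reference, this is the kind of query I have in mind. It is only a sketch and I have not run it. I am assuming the backend settings live under &quot;cn=ldbm database,cn=plugins,cn=config&quot;; the host, port, bind DN and password below are placeholders, not my real values. The helper name is made up, and it only builds and prints the ldapsearch command so it can be reviewed before anyone runs it:<br>

```python
import shlex


def build_backend_dir_query(host, port, bind_dn, password):
    """Return an ldapsearch argument list that asks for nsslapd-directory.

    Assumption: backend configuration entries sit under
    cn=ldbm database,cn=plugins,cn=config. All connection values are
    supplied by the caller; nothing here contacts a server.
    """
    return [
        "ldapsearch",
        "-h", host,              # server host name
        "-p", str(port),         # LDAP port
        "-D", bind_dn,           # DN to bind as
        "-w", password,          # bind password
        "-b", "cn=ldbm database,cn=plugins,cn=config",  # assumed search base
        "-s", "sub",             # search the whole subtree
        "(nsslapd-directory=*)",  # any entry that carries the attribute
        "nsslapd-directory",      # return only this attribute
    ]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    argv = build_backend_dir_query(
        "localhost", 389, "cn=Directory Manager", "PASSWORD"
    )
    # Print a shell-quoted command line for review.
    print(" ".join(shlex.quote(arg) for arg in argv))
```

<br>If the base DN assumption is right, the values returned should show whether master B still points at the master A path.<br>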

<br>2. Start the slapd daemon on master B.<br><br>     From the error log:<br><br>[20/Apr/2012:10:30:40 -0700] - Fedora-Directory/7.1 B2005.146.2010 starting up<br>[20/Apr/2012:10:30:40 -0700] NSMMReplicationPlugin - replica_check_for_data_reload: Warning: data for replica o=&lt;my_suffix&gt; was reloaded and it no longer matches the data in the changelog (replica data &gt; changelog). Recreating the changelog file. This could affect replication with replica&#39;s consumers in which case the consumers should be reinitialized.<br>

<br>3. Create replication agreements between master A and B on both systems.<br>4. Run an initialization from the DS console on master B to master A.<br><br>Here is what I see from the logs:<br><br>error log on master B:<br>

<br>[20/Apr/2012:10:30:40 -0700] - slapd started.  Listening on All Interfaces port 389 for LDAP requests<br>[20/Apr/2012:10:31:05 -0700] NSMMReplicationPlugin - Beginning total update of replica &quot;agmt=&quot;cn=&lt;my_suffix&gt;_to_&lt;master_A&gt;&quot; (&lt;master_A&gt;:389)&quot;.<br>

[20/Apr/2012:10:31:11 -0700] NSMMReplicationPlugin - Finished total update of replica &quot;agmt=&quot;cn=&lt;my_suffix&gt;_to_&lt;master_A&gt;&quot; (&lt;master_A&gt;:389)&quot;. Sent 1718 entries.<br><br><b>The above appears to have sent 1718 entries to master A, and &quot;replication status&quot; on master B says &quot;incremental update succeeded&quot;.<br>

<br></b>error log on master A:<br><br>[20/Apr/2012:10:30:40 -0700] NSMMReplicationPlugin - conn=1578 op=3 repl=&quot;o=my_suffix&quot;: Begin incremental protocol<br>[20/Apr/2012:10:30:40 -0700] NSMMReplicationPlugin - conn=1578 op=3 repl=&quot;o=my_suffix&quot;: Acquired replica<br>

[20/Apr/2012:10:30:40 -0700] NSMMReplicationPlugin - conn=1578 op=3 repl=&quot;o=my_suffix&quot;: StartNSDS50ReplicationRequest: response=0 rc=0<br>[20/Apr/2012:10:30:40 -0700] NSMMReplicationPlugin - conn=1578 op=5 repl=&quot;o=my_suffix&quot;: Released replica<br>

[20/Apr/2012:10:31:03 -0700] NSMMReplicationPlugin - conn=1579 op=3 repl=&quot;o=my_suffix&quot;: Begin total protocol<br>[20/Apr/2012:10:31:03 -0700] NSMMReplicationPlugin - conn=1579 op=3 repl=&quot;o=my_suffix&quot;: Acquired replica<br>

[20/Apr/2012:10:31:03 -0700] NSMMReplicationPlugin - multimaster_be_state_change: replica o=my_suffix is going offline; disabling replication<br>[20/Apr/2012:10:31:04 -0700] NSMMReplicationPlugin - agmt=&quot;cn=my_suffix_to_master_B&quot; (master_B:389): State: backoff -&gt; backoff<br>

[20/Apr/2012:10:31:04 -0700] NSMMReplicationPlugin - agmt=&quot;cn=my_suffix_to_master_B&quot; (master_B:389): State: backoff -&gt; backoff<br>[20/Apr/2012:10:31:04 -0700] NSMMReplicationPlugin - agmt=&quot;cn=my_suffix_to_master_B&quot; (master_B:389): No linger to cancel on the connection<br>

[20/Apr/2012:10:31:04 -0700] NSMMReplicationPlugin - agmt=&quot;cn=my_suffix_to_master_B&quot; (master_B:389): Disconnected from the consumer<br>[20/Apr/2012:10:31:05 -0700] NSMMReplicationPlugin - agmt=&quot;cn=my_suffix_to_master_B&quot; (master_B:389): repl5_inc_stop: protocol stopped after 0 seconds<br>

[20/Apr/2012:10:31:05 -0700] NSMMReplicationPlugin - conn=0 op=0 repl=&quot;o=my_suffix&quot;: Replica in use locking_purl=conn=1579 id=3<br>[20/Apr/2012:10:31:05 -0700] NSMMReplicationPlugin - replica_disable_replication: replica o=my_suffix is acquired<br>

[20/Apr/2012:10:31:05 -0700] - WARNING: Import is running with nsslapd-db-private-import-mem on; No other process is allowed to access the database<br>[20/Apr/2012:10:31:05 -0700] NSMMReplicationPlugin - conn=1579 op=3 repl=&quot;o=my_suffix&quot;: StartNSDS50ReplicationRequest: response=0 rc=0<br>

[20/Apr/2012:10:31:09 -0700] - import my_suffix: Workers finished; cleaning up...<br>[20/Apr/2012:10:31:09 -0700] - import my_suffix: Workers cleaned up.<br>[20/Apr/2012:10:31:09 -0700] - import my_suffix: Indexing complete.  Post-processing...<br>

[20/Apr/2012:10:31:09 -0700] - import my_suffix: Flushing caches...<br>[20/Apr/2012:10:31:09 -0700] - import my_suffix: Closing files...<br>[20/Apr/2012:10:31:10 -0700] - import my_suffix: Import complete.  Processed 1718 entries in 5 seconds. (343.60 entries/sec)<br>

<br><b>The above log info looks as if it did &#39;acquire&#39; replication from master B and processed 1718 entries.</b><br><br>[20/Apr/2012:10:31:10 -0700] NSMMReplicationPlugin - multimaster_be_state_change: replica o=my_suffix is coming online; enabling replication<br>

[20/Apr/2012:10:31:10 -0700] NSMMReplicationPlugin - _replica_configure_ruv: No ruv tombstone found for replica o=my_suffix. Created a new one<br>[20/Apr/2012:10:31:10 -0700] NSMMReplicationPlugin - replica_reload_ruv: Warning: new data for replica o=my_suffix does not match the data in the changelog.<br>

 Recreating the changelog file. This could affect replication with replica&#39;s  consumers in which case the consumers should be reinitialized.<br>[20/Apr/2012:10:31:11 -0700] NSMMReplicationPlugin - conn=0 op=0 repl=&quot;o=my_suffix&quot;: Released replica<br>

[20/Apr/2012:10:31:11 -0700] NSMMReplicationPlugin - replica_enable_replication: replica o=my_suffix is relinquished<br>[20/Apr/2012:10:31:11 -0700] NSMMReplicationPlugin - agmt=&quot;cn=my_suffix_to_master_B&quot; (master_B:389): No linger to cancel on the connection<br>

[20/Apr/2012:10:31:11 -0700] NSMMReplicationPlugin - agmt=&quot;cn=my_suffix_to_master_B&quot; (master_B:389): Disconnected from the consumer<br>[20/Apr/2012:10:31:11 -0700] NSMMReplicationPlugin - agmt=&quot;cn=my_suffix_to_master_B&quot; (master_B:389): State: start -&gt; ready_to_acquire_replica<br>

[20/Apr/2012:10:31:11 -0700] NSMMReplicationPlugin - changelog program - cl5DeleteDBSync: file for replica at (o=my_suffix) not found<br>[20/Apr/2012:10:31:11 -0700] NSMMReplicationPlugin - agmt=&quot;cn=my_suffix_to_master_B&quot; (master_B:389): State: ready_to_acquire_replica -&gt; wait_for_changes<br>

[20/Apr/2012:10:31:11 -0700] NSMMReplicationPlugin - changelog program - _cl5NewDBFile: semaphore /opt/fedora-ds/slapd-&lt;master_A&gt;/changelogdb/1da9fe82-1dd211b2-80bc8f56-47cc0000.sema<br>[20/Apr/2012:10:31:11 -0700] NSMMReplicationPlugin - changelog program - _cl5NewDBFile: maxConcurrentWrites=2<br>

[20/Apr/2012:10:31:11 -0700] NSMMReplicationPlugin - changelog program - _cl5GetEntryCount: 0 changes for replica 1da9fe82-1dd211b2-80bc8f56-47cc0000<br>[20/Apr/2012:10:31:11 -0700] NSMMReplicationPlugin - conn=1579 op=1722 repl=&quot;o=my_suffix&quot;: Replica not in use<br>

<br><b>The above is the last of the log entries referring to this replication.  Is there anything odd?<br><br></b>The replication agreements are set to &#39;always keep directories in sync&#39;.  Since this manual initialization from the console, the logs have gone back to showing the following (every 5 minutes or so):<br>

<br>master A error log:<br><br>Unable to acquire replica: permission denied. The bind dn &quot;cn=replication manager,cn=config&quot; does not have permission to supply replication updates to the replica. Will retry later.<br>

<br>master B error log:<br><br>Unable to acquire replica: error: permission denied<br><br><b>It seems as if the attempt to sync between master A &amp; B is always from A to B.  Is this normal?  Could this have anything to do with the &quot;nsslapd-directory&quot; attribute?<br>

<br></b>As always any help is greatly appreciated.<br><br>Thanks in advance,<br><br>Herb<br><br>