Hi Flo,
On 26/03/2019 09:45, Florence Blanc-Renaud via FreeIPA-users wrote:
On 3/20/19 9:32 AM, Giulio Casella via FreeIPA-users wrote:
Hi everyone, I'm stuck with a broken replica. I had a setup with two IPA servers replicating each other (ipa-server-4.6.4 on CentOS 7.6), let's say "idc01" and "idc02".
Due to heavy load idc01 crashed many times, and was not working anymore.
So I tried to set up the replica again. At first I tried "ipa-replica-manage re-initialize", with no success.
Now I'm trying to redo the replica setup from scratch: on idc02 I removed the segments (ipa topologysegment-del, for both the ca and domain suffixes), on idc01 I removed everything (ipa-server-install --uninstall), then I joined the domain (ipa-client-install), and everything has worked so far.
When doing "ipa-replica-install" on idc01 I get:
[...]
  [28/41]: setting up initial replication
Starting replication, please wait until this has completed.
Update in progress, 22 seconds elapsed
[ldap://idc02.my.dom.ain:389] reports: Update failed! Status: [Error (-11) connection error: Unknown connection error (-11) - Total update aborted]
And on idc02 (the working server), in /var/log/dirsrv/slapd-MY-DOM-AIN/errors I find lines stating:
[20/Mar/2019:09:28:06.545187923 +0100] - INFO - NSMMReplicationPlugin - repl5_tot_run - Beginning total update of replica "agmt="cn=meToidc01.my.dom.ain" (idc01:389)".
[20/Mar/2019:09:28:26.528046160 +0100] - ERR - NSMMReplicationPlugin - perform_operation - agmt="cn=meToidc01.my.dom.ain" (idc01:389): Failed to send extended operation: LDAP error -1 (Can't contact LDAP server)
[20/Mar/2019:09:28:26.530763939 +0100] - ERR - NSMMReplicationPlugin - repl5_tot_log_operation_failure - agmt="cn=meToidc01.my.dom.ain" (idc01:389): Received error -1 (Can't contact LDAP server): for total update operation
[20/Mar/2019:09:28:26.532678072 +0100] - ERR - NSMMReplicationPlugin - release_replica - agmt="cn=meToidc01.my.dom.ain" (idc01:389): Unable to send endReplication extended operation (Can't contact LDAP server)
[20/Mar/2019:09:28:26.534307539 +0100] - ERR - NSMMReplicationPlugin - repl5_tot_run - Total update failed for replica "agmt="cn=meToidc01.my.dom.ain" (idc01:389)", error (-11)
[20/Mar/2019:09:28:26.561763168 +0100] - INFO - NSMMReplicationPlugin - bind_and_check_pwp - agmt="cn=meToidc01.my.dom.ain" (idc01:389): Replication bind with GSSAPI auth resumed
[20/Mar/2019:09:28:26.582389258 +0100] - WARN - NSMMReplicationPlugin - repl5_inc_run - agmt="cn=meToidc01.my.dom.ain" (idc01:389): The remote replica has a different database generation ID than the local database. You may have to reinitialize the remote replica, or the local replica.
It seems that idc02 remembers something about the old replica.
Any hint?
Hi,
In order to clean every reference to the old replica:
(on idc01)
$ ipa-server-install --uninstall -U
$ kdestroy -A
(on idc02)
$ ipa-replica-manage del idc01.my.dom.ain --clean --force
Then you should be able to reinstall idc01 as a replica.
No luck, same result: it hangs at "[28/41]: setting up initial replication" and stops after about 20 seconds. I also tried, on idc02, to clean all RUVs referring to idc01, with no luck.
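
To put a number on the "about 20 seconds", here is a minimal Python sketch that reads entries in the format of the errors log quoted above and reports how long the total update ran before the first ERR entry. The names parse_entries and seconds_to_first_error are my own, not part of any FreeIPA or 389-ds tool, and the regular expression assumes each entry sits on its own line and that month names are English abbreviations (C locale). The sample is limited to the first two entries from the log above.

```python
import re
from datetime import datetime

# Two entries copied from the errors log quoted earlier in this thread.
SAMPLE = """\
[20/Mar/2019:09:28:06.545187923 +0100] - INFO - NSMMReplicationPlugin - repl5_tot_run - Beginning total update of replica "agmt="cn=meToidc01.my.dom.ain" (idc01:389)".
[20/Mar/2019:09:28:26.528046160 +0100] - ERR - NSMMReplicationPlugin - perform_operation - agmt="cn=meToidc01.my.dom.ain" (idc01:389): Failed to send extended operation: LDAP error -1 (Can't contact LDAP server)
"""

# Assumed layout: [date:time.nanoseconds +zone] - LEVEL - rest of the entry.
ENTRY = re.compile(
    r"^\[(?P<ts>\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2})\.(?P<ns>\d{9}) (?P<tz>[+-]\d{4})\]"
    r" - (?P<level>\w+) - (?P<rest>.*)$"
)


def parse_entries(text):
    """Return (timestamp, level, message) tuples for lines matching ENTRY."""
    entries = []
    for line in text.splitlines():
        m = ENTRY.match(line)
        if m is None:
            continue  # skip anything that is not a log entry
        # strptime handles at most microseconds, so truncate the nanoseconds.
        stamp = datetime.strptime(
            f"{m['ts']}.{m['ns'][:6]} {m['tz']}", "%d/%b/%Y:%H:%M:%S.%f %z"
        )
        entries.append((stamp, m["level"], m["rest"]))
    return entries


def seconds_to_first_error(entries):
    """Seconds from 'Beginning total update' to the first ERR entry after it."""
    start = None
    for stamp, level, message in entries:
        if "Beginning total update" in message:
            start = stamp
        elif level == "ERR" and start is not None:
            return (stamp - start).total_seconds()
    return None


entries = parse_entries(SAMPLE)
print(seconds_to_first_error(entries))
```

On the two sample entries this gives just under 20 seconds, consistent with the installer giving up shortly after "22 seconds elapsed". To run it against the real file, pass the contents of /var/log/dirsrv/slapd-MY-DOM-AIN/errors to parse_entries instead of SAMPLE.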