On 4/10/19 4:59 PM, Rob Crittenden wrote:
Giulio Casella via FreeIPA-users wrote:
Hi, I managed to fix it! The solution was to increase a couple of parameters in the LDAP configuration. I passed "--dirsrv-config-file=custom.ldif" to ipa-replica-install, with custom.ldif containing:
dn: cn=config
changetype: modify
replace: nsslapd-maxsasliosize
nsslapd-maxsasliosize: 4194304
-
replace: nsslapd-sasl-max-buffer-size
nsslapd-sasl-max-buffer-size: 4194304
In brief, I doubled the SASL buffer size, because I noticed a log message saying "SASL encrypted packet length exceeds maximum allowed limit".
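For an already-installed server, the same two attributes can be raised at runtime with ldapmodify rather than at install time; a minimal sketch, assuming a Directory Manager bind and the same 4 MiB value used above:

# Raise the SASL I/O limits on a running 389-ds instance to 4 MiB,
# matching the custom.ldif above (sketch; adjust to your environment).
ldapmodify -x -D "cn=Directory Manager" -W <<EOF
dn: cn=config
changetype: modify
replace: nsslapd-maxsasliosize
nsslapd-maxsasliosize: 4194304
-
replace: nsslapd-sasl-max-buffer-size
nsslapd-sasl-max-buffer-size: 4194304
EOF

# Restart the instance so both settings are certain to take effect
systemctl restart dirsrv@MY-DOM-AIN.service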
But the behaviour of ipa-replica-install was quite strange: it crashed, and in a packet capture session I noticed some "TCP zero window" packets sent from the wannabe-replica to the existing IPA server. Maybe the developers would want to catch that error and revert the operation, just as is done with other kinds of errors.
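For anyone wanting to reproduce the observation, zero-window frames are easy to isolate with tcpdump/tshark; a sketch, where the interface name is an example and the hostnames are this thread's:

# Capture LDAP traffic between the two servers during the install
$ tcpdump -i eth0 -w replica-install.pcap host idc02.my.dom.ain and port 389

# Afterwards, show only the frames Wireshark flags as zero-window
$ tshark -r replica-install.pcap -Y "tcp.analysis.zero_window"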
Maybe one of the 389-ds devs has an idea. They're probably going to want to see logs and what your definition of crash is.
rob
TCP zero window makes me think of a client not reading fast enough. Is it transient/recoverable or not?
Rob is right: if a problem is detected at the 389-ds level, access/errors logs are appreciated, and also the ipa-replica-install backtrace from when it crashed.
regards thierry
Ciao, g
On 01/04/2019 15:28, Giulio Casella via FreeIPA-users wrote:
Hi, I'm still stuck on this. I tried to delete every reference to the old server, both with ipa commands ("ipa-replica-manage clean-ruv") and directly in LDAP (as described in https://access.redhat.com/solutions/136993).
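For reference, the direct-LDAP route in that article amounts to creating a CLEANALLRUV task in cn=config; a sketch using this thread's suffix, where replica ID 7 is only an example value:

# Create a CLEANALLRUV task to purge a stale replica ID (7 is an
# example; use the ID reported by list-ruv). Sketch only.
ldapmodify -x -D "cn=Directory Manager" -W <<EOF
dn: cn=clean 7, cn=cleanallruv, cn=tasks, cn=config
changetype: add
objectclass: extensibleObject
replica-base-dn: dc=my,dc=dom,dc=ain
replica-id: 7
cn: clean 7
EOF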
If I try to "ipa-replica-manage list-ruv" on idc02 I get:
Replica Update Vectors:
    idc02.my.dom.ain:389: 5
Certificate Server Replica Update Vectors:
    idc02.my.dom.ain:389: 91
(same result looking directly into ldap)
Is that correct? Does a server have a replica reference to itself?
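For what it's worth, yes: a master's RUV normally contains an element for itself too, so seeing idc02 in idc02's own list-ruv output is expected. The raw RUV can also be read straight from the database tombstone entry; a sketch, using this thread's suffix:

# The RUV lives on a well-known tombstone entry directly under the
# suffix; the fixed nsUniqueId below is its reserved identifier.
$ ldapsearch -x -D "cn=Directory Manager" -W -b "dc=my,dc=dom,dc=ain" \
    "(&(objectClass=nsTombstone)(nsUniqueId=ffffffff-ffffffff-ffffffff-ffffffff))" nsds50ruv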
I also tried to instantiate a new server, idc03.my.dom.ain, never known before (fresh CentOS install, ipa-client-install, ipa-replica-install). The setup (to my surprise) failed (details below).
At this point I suspect the problem is on idc02 (the only working server), unrelated to the previous server idc01.
For completeness this is what I did:
. Fresh install of a CentOS 7 box, updated, installed ipa software (name idc03.my.dom.ain)
. ipa-client-install --principal admin --domain=my.dom.ain --realm=MY.DOM.AIN --force-join
. ipa-replica-install --setup-dns --no-forwarders --setup-ca
The last command failed (in "[28/41]: setting up initial replication"), and in /var/log/ipareplica-install.log on idc03 I read:
[...]
2019-03-28T09:30:48Z DEBUG   [28/41]: setting up initial replication
2019-03-28T09:30:48Z DEBUG retrieving schema for SchemaCache url=ldapi://%2fvar%2frun%2fslapd-MY-DOM-AIN.socket conn=<ldap.ldapobject.SimpleLDAPObject instance at 0x7fb72af73050>
2019-03-28T09:30:48Z DEBUG Destroyed connection context.ldap2_140424739228880
2019-03-28T09:30:48Z DEBUG Starting external process
2019-03-28T09:30:48Z DEBUG args=/bin/systemctl --system daemon-reload
2019-03-28T09:30:48Z DEBUG Process finished, return code=0
2019-03-28T09:30:48Z DEBUG stdout=
2019-03-28T09:30:48Z DEBUG stderr=
2019-03-28T09:30:48Z DEBUG Starting external process
2019-03-28T09:30:48Z DEBUG args=/bin/systemctl restart dirsrv@MY-DOM-AIN.service
2019-03-28T09:30:54Z DEBUG Process finished, return code=0
2019-03-28T09:30:54Z DEBUG stdout=
2019-03-28T09:30:54Z DEBUG stderr=
2019-03-28T09:30:54Z DEBUG Restart of dirsrv@MY-DOM-AIN.service complete
2019-03-28T09:30:54Z DEBUG Created connection context.ldap2_140424739228880
2019-03-28T09:30:55Z DEBUG Fetching nsDS5ReplicaId from master [attempt 1/5]
2019-03-28T09:30:55Z DEBUG retrieving schema for SchemaCache url=ldap://idc02.my.dom.ain:389 conn=<ldap.ldapobject.SimpleLDAPObject instance at 0x7fb72bf8e128>
2019-03-28T09:30:55Z DEBUG Successfully updated nsDS5ReplicaId.
2019-03-28T09:30:55Z DEBUG Add or update replica config cn=replica,cn=dc=my,dc=dom,dc=ain,cn=mapping tree,cn=config
2019-03-28T09:30:55Z DEBUG Added replica config cn=replica,cn=dc=my,dc=dom,dc=ain,cn=mapping tree,cn=config
2019-03-28T09:30:55Z DEBUG Add or update replica config cn=replica,cn=dc=my,dc=dom,dc=ain,cn=mapping tree,cn=config
2019-03-28T09:30:55Z DEBUG No update to cn=replica,cn=dc=my,dc=dom,dc=ain,cn=mapping tree,cn=config necessary
2019-03-28T09:30:55Z DEBUG Waiting for replication (ldap://idc02.my.dom.ain:389) cn=meToidc03.my.dom.ain,cn=replica,cn=dc=my,dc=dom,dc=ain,cn=mapping tree,cn=config (objectclass=*)
2019-03-28T09:30:55Z DEBUG Entry found [LDAPEntry(ipapython.dn.DN('cn=meToidc03.my.dom.ain,cn=replica,cn=dc=my,dc=dom,dc=ain,cn=mapping tree,cn=config'), {u'nsds5replicaLastInitStart': ['19700101000000Z'], u'nsds5replicaUpdateInProgress': ['FALSE'], u'cn': ['meToidc03.my.dom.ain'], u'objectClass': ['nsds5replicationagreement', 'top'], u'nsds5replicaLastUpdateEnd': ['19700101000000Z'], u'nsDS5ReplicaRoot': ['dc=my,dc=dom,dc=ain'], u'nsDS5ReplicaHost': ['idc03.my.dom.ain'], u'nsds5replicaLastUpdateStatus': ['Error (0) No replication sessions started since server startup'], u'nsDS5ReplicaBindMethod': ['SASL/GSSAPI'], u'nsds5ReplicaStripAttrs': ['modifiersName modifyTimestamp internalModifiersName internalModifyTimestamp'], u'nsds5replicaLastUpdateStart': ['19700101000000Z'], u'nsDS5ReplicaPort': ['389'], u'nsDS5ReplicaTransportInfo': ['LDAP'], u'description': ['me to idc03.my.dom.ain'], u'nsds5replicareapactive': ['0'], u'nsds5replicaChangesSentSinceStartup': [''], u'nsds5replicaTimeout': ['120'], u'nsDS5ReplicatedAttributeList': ['(objectclass=*) $ EXCLUDE memberof idnssoaserial entryusn krblastsuccessfulauth krblastfailedauth krbloginfailedcount'], u'nsds5replicaLastInitEnd': ['19700101000000Z'], u'nsDS5ReplicatedAttributeListTotal': ['(objectclass=*) $ EXCLUDE entryusn krblastsuccessfulauth krblastfailedauth krbloginfailedcount']})]
2019-03-28T09:30:55Z DEBUG Entry found [LDAPEntry(ipapython.dn.DN('cn=meToidc02.my.dom.ain,cn=replica,cn=dc=my,dc=dom,dc=ain,cn=mapping tree,cn=config'), {u'nsds5replicaLastInitStart': ['19700101000000Z'], u'nsds5replicaUpdateInProgress': ['FALSE'], u'cn': ['meToidc02.my.dom.ain'], u'objectClass': ['nsds5replicationagreement', 'top'], u'nsds5replicaLastUpdateEnd': ['19700101000000Z'], u'nsDS5ReplicaRoot': ['dc=my,dc=dom,dc=ain'], u'nsDS5ReplicaHost': ['idc02.my.dom.ain'], u'nsds5replicaLastUpdateStatus': ['Error (0) No replication sessions started since server startup'], u'nsDS5ReplicaBindMethod': ['SASL/GSSAPI'], u'nsds5ReplicaStripAttrs': ['modifiersName modifyTimestamp internalModifiersName internalModifyTimestamp'], u'nsds5replicaLastUpdateStart': ['19700101000000Z'], u'nsDS5ReplicaPort': ['389'], u'nsDS5ReplicaTransportInfo': ['LDAP'], u'description': ['me to idc02.my.dom.ain'], u'nsds5replicareapactive': ['0'], u'nsds5replicaChangesSentSinceStartup': [''], u'nsds5replicaTimeout': ['120'], u'nsDS5ReplicatedAttributeList': ['(objectclass=*) $ EXCLUDE memberof idnssoaserial entryusn krblastsuccessfulauth krblastfailedauth krbloginfailedcount'], u'nsds5replicaLastInitEnd': ['19700101000000Z'], u'nsDS5ReplicatedAttributeListTotal': ['(objectclass=*) $ EXCLUDE entryusn krblastsuccessfulauth krblastfailedauth krbloginfailedcount']})]
2019-03-28T09:31:15Z DEBUG Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ipaserver/install/service.py", line 570, in start_creation
    run_step(full_msg, method)
  File "/usr/lib/python2.7/site-packages/ipaserver/install/service.py", line 560, in run_step
    method()
  File "/usr/lib/python2.7/site-packages/ipaserver/install/dsinstance.py", line 456, in __setup_replica
    cacert=self.ca_file
  File "/usr/lib/python2.7/site-packages/ipaserver/install/replication.py", line 1817, in setup_promote_replication
    raise RuntimeError("Failed to start replication")
RuntimeError: Failed to start replication
[...]
while in /var/log/dirsrv/slapd-MY-DOM-AIN/errors of idc02 I can find:
[...]
[28/Mar/2019:10:30:56.602197981 +0100] - INFO - NSMMReplicationPlugin - repl5_tot_run - Beginning total update of replica "agmt="cn=meToidc03.my.dom.ain" (idc03:389)".
[28/Mar/2019:10:31:15.787867217 +0100] - ERR - NSMMReplicationPlugin - repl5_tot_log_operation_failure - agmt="cn=meToidc03.my.dom.ain" (idc03:389): Received error -1 (Can't contact LDAP server): for total update operation
[28/Mar/2019:10:31:15.789885458 +0100] - ERR - NSMMReplicationPlugin - release_replica - agmt="cn=meToidc03.my.dom.ain" (idc03:389): Unable to send endReplication extended operation (Can't contact LDAP server)
[28/Mar/2019:10:31:15.791374133 +0100] - ERR - NSMMReplicationPlugin - repl5_tot_run - Total update failed for replica "agmt="cn=meToidc03.my.dom.ain" (idc03:389)", error (-11)
[28/Mar/2019:10:31:15.823809612 +0100] - INFO - NSMMReplicationPlugin - bind_and_check_pwp - agmt="cn=meToidc03.my.dom.ain" (idc03:389): Replication bind with GSSAPI auth resumed
[28/Mar/2019:10:31:16.221049084 +0100] - WARN - NSMMReplicationPlugin - repl5_inc_run - agmt="cn=meToidc03.my.dom.ain" (idc03:389): The remote replica has a different database generation ID than the local database. You may have to reinitialize the remote replica, or the local replica.
[28/Mar/2019:10:31:19.234198978 +0100] - WARN - NSMMReplicationPlugin - repl5_inc_run - agmt="cn=meToidc03.my.dom.ain" (idc03:389): The remote replica has a different database generation ID than the local database. You may have to reinitialize the remote replica, or the local replica.
[28/Mar/2019:10:31:22.247206811 +0100] - WARN - NSMMReplicationPlugin - repl5_inc_run - agmt="cn=meToidc03.my.dom.ain" (idc03:389): The remote replica has a different database generation ID than the local database. You may have to reinitialize the remote replica, or the local replica.
The last message keeps repeating until I uninstall the replica on idc03.
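For context, the "different database generation ID" warning generally means the consumer never received a successful total initialization; once the underlying connection problem is solved, a re-init from the healthy master should clear it. A sketch with this thread's hostnames:

# Run on the broken replica (idc03) to pull a fresh total
# initialization from the working master idc02
$ ipa-replica-manage re-initialize --from=idc02.my.dom.ain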
How can I restore a scenario with a redundant setup (more than one ipa server)?
Thanks in advance, Giulio Casella
On 26/03/2019 11:08, Giulio Casella via FreeIPA-users wrote:
Hi Flo,
On 26/03/2019 09:45, Florence Blanc-Renaud via FreeIPA-users wrote:
On 3/20/19 9:32 AM, Giulio Casella via FreeIPA-users wrote:
Hi everyone, I'm stuck with a broken replica. I had a setup with two IPA servers in replication (ipa-server-4.6.4 on CentOS 7.6), let's say "idc01" and "idc02".
Due to heavy load idc01 crashed many times, and was not working anymore.
So I tried to set up the replica again. At first I tried "ipa-replica-manage re-initialize", with no success.
Now I'm trying to redo the replica setup from scratch: on idc02 I removed the segments (ipa topologysegment-del, for both the ca and domain suffixes); on idc01 I removed everything (ipa-server-install --uninstall) and then joined the domain again (ipa-client-install). Everything is working so far.
When doing "ipa-replica-install" on idc01 I get:
[...]
  [28/41]: setting up initial replication
Starting replication, please wait until this has completed.
Update in progress, 22 seconds elapsed
[ldap://idc02.my.dom.ain:389] reports: Update failed! Status: [Error (-11) connection error: Unknown connection error (-11) - Total update aborted]
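The status above is reported as a plain connection error, so before digging into replication itself it is worth ruling out basic reachability; a sketch to run from idc02, with this thread's hostnames:

# Is idc01 listening on the LDAP port at all?
$ nc -zv idc01.my.dom.ain 389

# Anonymous rootDSE read; works even with replication unconfigured
$ ldapsearch -x -H ldap://idc01.my.dom.ain -b "" -s base namingContexts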
And on idc02 (the working server), in /var/log/dirsrv/slapd-MY-DOM-AIN/errors I find lines stating:
[20/Mar/2019:09:28:06.545187923 +0100] - INFO - NSMMReplicationPlugin - repl5_tot_run - Beginning total update of replica "agmt="cn=meToidc01.my.dom.ain" (idc01:389)".
[20/Mar/2019:09:28:26.528046160 +0100] - ERR - NSMMReplicationPlugin - perform_operation - agmt="cn=meToidc01.my.dom.ain" (idc01:389): Failed to send extended operation: LDAP error -1 (Can't contact LDAP server)
[20/Mar/2019:09:28:26.530763939 +0100] - ERR - NSMMReplicationPlugin - repl5_tot_log_operation_failure - agmt="cn=meToidc01.my.dom.ain" (idc01:389): Received error -1 (Can't contact LDAP server): for total update operation
[20/Mar/2019:09:28:26.532678072 +0100] - ERR - NSMMReplicationPlugin - release_replica - agmt="cn=meToidc01.my.dom.ain" (idc01:389): Unable to send endReplication extended operation (Can't contact LDAP server)
[20/Mar/2019:09:28:26.534307539 +0100] - ERR - NSMMReplicationPlugin - repl5_tot_run - Total update failed for replica "agmt="cn=meToidc01.my.dom.ain" (idc01:389)", error (-11)
[20/Mar/2019:09:28:26.561763168 +0100] - INFO - NSMMReplicationPlugin - bind_and_check_pwp - agmt="cn=meToidc01.my.dom.ain" (idc01:389): Replication bind with GSSAPI auth resumed
[20/Mar/2019:09:28:26.582389258 +0100] - WARN - NSMMReplicationPlugin - repl5_inc_run - agmt="cn=meToidc01.my.dom.ain" (idc01:389): The remote replica has a different database generation ID than the local database. You may have to reinitialize the remote replica, or the local replica.
It seems that idc02 remembers something about the old replica.
Any hint?
Hi,
In order to clean every reference to the old replica:

(on idc01)
$ ipa-server-install --uninstall -U
$ kdestroy -A

(on idc02)
$ ipa-replica-manage del idc01.my.dom.ain --clean --force
Then you should be able to reinstall idc01 as a replica.
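If the del leaves stale RUVs behind, they can be listed and removed individually; a sketch, where replica ID 7 is only an example value:

# On idc02: inspect what is left after the del --clean
$ ipa-replica-manage list-ruv

# Remove a leftover entry by its replica ID
$ ipa-replica-manage clean-ruv 7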
No way, same result: it hangs in "[28/41]: setting up initial replication" after about 20 secs. I also tried, on idc02, to clean all RUVs referring to idc01, with no luck.