Hello Everyone,

Regarding cleaning up, I'm unable to run "ipa topologysegment-del ca ns01.dev.example.net-to-ns02.dev.example.net"; it fails with the message "ipa: ERROR: Server is unwilling to perform: Removal of Segment disconnects topology.Deletion not allowed."

Looking in LDAP manually, I find this DN: "cn=masterAgreement1-ns02.dev.example.net-pki-tomcat,cn=replica,cn=o\3Dipaca,cn=mapping tree,cn=config"

According to "ipa-replica-manage list-ruv", I have certificate server RUVs 96 and 12. The entry at the DN above does not have any reference to RUV 12, instead listing a RUV 86. Yet "ipa-replica-manage clean-dangling-ruv" says no dangling RUVs were found.
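
For anyone wanting to repeat that comparison, here is a rough sketch of how the replica IDs could be pulled out of the agreement entry so they can be set against the list-ruv output. The function name ruv_ids is made up for illustration, and it assumes unwrapped LDIF in which each nsds50ruv value looks like "{replica <id> ldap://host:port} ...".

```shell
# Hypothetical helper: print the replica IDs found in nsds50ruv values
# read as unwrapped LDIF on stdin, one per line, sorted numerically.
# Assumes values shaped like "{replica <id> ldap://host:port} ...";
# the "{replicageneration}" value is skipped because it has no ID.
ruv_ids() {
    sed -n 's/^nsds50ruv: {replica \([0-9][0-9]*\) .*/\1/p' | sort -un
}

# Possible usage against the agreement entry (prompts for the password):
#   ldapsearch -xLLL -o ldif-wrap=no -h ns01.dev.example.net \
#     -D "cn=directory manager" -W -s base \
#     -b 'cn=masterAgreement1-ns02.dev.example.net-pki-tomcat,cn=replica,cn=o\3Dipaca,cn=mapping tree,cn=config' \
#     nsds50ruv | ruv_ids
```

Any ID printed here that is missing from the list-ruv output (or the other way round) would be the mismatch I'm describing.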

Is there a test I can do to see whether ns02 pki-tomcat is correctly replicating ns01?
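
The only check I can think of myself is to read the status attribute on each replication agreement, along the lines of the sketch below. The function name failing_agreements is hypothetical, and the filter rests on my assumption that a healthy agreement reports a status beginning with "Error (0)" or "0 "; I don't know whether that alone proves the data is actually flowing.

```shell
# Hypothetical helper: read replication agreement entries as unwrapped LDIF
# on stdin and print "<dn>: <status>" for every agreement whose
# nsds5replicaLastUpdateStatus does not report code 0.
# Assumption: a healthy status begins with "Error (0)" or "0 ".
failing_agreements() {
    awk '
        /^dn: / { dn = substr($0, 5) }
        /^nsds5replicaLastUpdateStatus: / {
            status = $0
            sub(/^nsds5replicaLastUpdateStatus: /, "", status)
            if (status !~ /^(Error \(0\)|0 )/) print dn ": " status
        }
    '
}

# Possible usage on either server (prompts for the password):
#   ldapsearch -xLLL -o ldif-wrap=no -h ns02.dev.example.net \
#     -D "cn=directory manager" -W -b "cn=mapping tree,cn=config" \
#     "(objectClass=nsds5replicationagreement)" \
#     nsds5replicaLastUpdateStatus | failing_agreements
```

No output would mean every agreement on that server last reported code 0.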

Thank you!

Anthony

On Mon, Apr 8, 2019 at 3:57 PM Anthony Jarvis-Clark <anthonyclarka2@gmail.com> wrote:
Hi Rob,

Thank you for the pointer. The ipa topologysegment output showed only one agreement, and as far as I can tell there aren't any dangling entries that the ipa command-line tool can find.

Searching LDAP on ns01 shows multiple nsds50ruv values in the entry cn=masterAgreement1-ns02.dev.example.net-pki-tomcat,cn=replica,cn=o\3Dipaca,cn=mapping tree,cn=config. That entry has an nsds5replicaLastUpdateStatus of "Error (32) Problem connecting to replica - LDAP error: No such object (connection error)"

Should I use "ipa topologysegment-del" to clean up the CA replication? I'm thinking "del" because the agreement has old stuff hanging around in it, so removing it then re-adding it should clear everything out correctly.

Thank you,

Anthony Clark

On Mon, Apr 8, 2019 at 10:36 AM Rob Crittenden <rcritten@redhat.com> wrote:
Anthony Jarvis-Clark via FreeIPA-users wrote:
> Hello Everyone,
>
> Over the weekend we lost a replica during an upgrade and had to rebuild
> it. The OS (CentOS 7.6) was reinstalled from scratch, the host then
> added to the IPA domain, and then turned into a replica.
>
> Sequence of events:
> 1) ns01 upgraded from FreeIPA 4.4.0-14 to 4.6.4-10
> 2) ns02 corrupted during upgrade process
> 3) on ns01, "ipa-replica-manage del ns02" was run.
> 4) ns02 rebuilt from scratch with latest CentOS 7.6 packages.
> 5) ns02 added to IPA domain
> 6) ns02 added as replica
>
> The process went well, no errors during the "ipa-replica-install
> --setup-ca --setup-kra --setup-dns --forwarder=x.x.x.x" process.
>
> However, on ns01, I'm getting the following message in /var/log/messages:
>
> Apr  8 13:54:36 ns01 ns-slapd: [08/Apr/2019:13:54:36.294135188 +0000] -
> ERR - slapi_ldap_bind - Error: could not bind id [cn=Replication Manager
> cloneAgreement1-ns02.dev.example.net-pki-tomcat,ou=csusers,cn=config]
> authentication mechanism [SIMPLE]: error 32 (No such object)
> ...
> Apr  8 13:59:36 ns01 ns-slapd: [08/Apr/2019:13:59:36.547881587 +0000] -
> ERR - slapi_ldap_bind - Error: could not bind id [cn=Replication Manager
> cloneAgreement1-ns02.dev.example.net-pki-tomcat,ou=csusers,cn=config]
> authentication mechanism [SIMPLE]: error 32 (No such object)
>
> If I run a search in ns01's LDAP I get this result:
>
> [root@ns01 ~]# ldapsearch -xLLL -h ns01.dev.example.net -D "cn=directory manager" -W -b "ou=csusers,cn=config"
> Enter LDAP Password:
> dn: ou=csusers,cn=config
> objectClass: top
> objectClass: organizationalUnit
> ou: csusers
>
> dn: cn=Replication Manager masterAgreement1-ns02.dev.example.net-pki-tomca
>  t,ou=csusers,cn=config
> cn: Replication Manager masterAgreement1-ns02.dev.example.net-pki-tomcat
> objectClass: top
> objectClass: person
> sn: manager
> userPassword:: redacted!!!
>
> So there's a "masterAgreement1" but no "cloneAgreement1".
>
> Is that something hanging around from the previous replica agreement? If
> so, how do I fix whatever is running that query every 5 minutes?
>
> Or is it indicative of something else that is wrong? I ran the tool
> from https://github.com/peterpakos/checkipaconsistency and it reports
> everything is fine (except that ns01 has a dangling AD trust that was
> supposed to be removed, but that's for another post I guess).
>
> How do I identify the process that is running that query that causes the
> error message in /var/log/messages?

Sounds like there is still a CA replication agreement. You can try using
the topologysegment command(s) to list and hopefully remove this
dangling agreement.

rob