Hello Everyone,
Over the weekend we lost a replica during an upgrade and had to rebuild it. The OS (CentOS 7.6) was reinstalled from scratch, the host was then added to the IPA domain, and finally it was turned into a replica.
Sequence of events:
1) ns01 upgraded from FreeIPA 4.4.0-14 to 4.6.4-10
2) ns02 corrupted during upgrade process
3) on ns01, "ipa-replica-manage del ns02" ran.
4) ns02 rebuilt from scratch with latest CentOS 7.6 packages.
5) ns02 added to IPA domain
6) ns02 added as replica
The process went well, with no errors during the "ipa-replica-install --setup-ca --setup-kra --setup-dns --forwarder=x.x.x.x" run.
However, on ns01, I'm getting the following message in /var/log/messages:
Apr 8 13:54:36 ns01 ns-slapd: [08/Apr/2019:13:54:36.294135188 +0000] - ERR - slapi_ldap_bind - Error: could not bind id [cn=Replication Manager cloneAgreement1-ns02.dev.example.net-pki-tomcat,ou=csusers,cn=config] authentication mechanism [SIMPLE]: error 32 (No such object)
...
Apr 8 13:59:36 ns01 ns-slapd: [08/Apr/2019:13:59:36.547881587 +0000] - ERR - slapi_ldap_bind - Error: could not bind id [cn=Replication Manager cloneAgreement1-ns02.dev.example.net-pki-tomcat,ou=csusers,cn=config] authentication mechanism [SIMPLE]: error 32 (No such object)
If I run a search in ns01's LDAP I get this result:
[root@ns01 ~]# ldapsearch -xLLL -h ns01.dev.example.net -D "cn=directory manager" -W -b "ou=csusers,cn=config"
Enter LDAP Password:
dn: ou=csusers,cn=config
objectClass: top
objectClass: organizationalUnit
ou: csusers

dn: cn=Replication Manager masterAgreement1-ns02.dev.example.net-pki-tomca
 t,ou=csusers,cn=config
cn: Replication Manager masterAgreement1-ns02.dev.example.net-pki-tomcat
objectClass: top
objectClass: person
sn: manager
userPassword:: redacted!!!
So there's a "masterAgreement1" but no "cloneAgreement1".
Is that something hanging around from the previous replica agreement? If so, how do I fix whatever is running that query every 5 minutes?
Or is it indicative of something else that is wrong? I ran the tool from https://github.com/peterpakos/checkipaconsistency and it reports everything is fine (except that ns01 has a dangling AD trust that was supposed to be removed, but that's for another post I guess).
How do I identify the process that is running that query that causes the error message in /var/log/messages?
Many, many thanks,
Anthony Clark
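For anyone following along, here is a minimal sketch for tallying the bind failures quoted above by bind DN and LDAP error code, so it is easy to see whether a single identity is failing or several. The function name summarize_bind_errors is made up for illustration, and the pattern assumes one log entry per line in the exact format shown in this message.

```shell
# summarize_bind_errors (hypothetical helper): read syslog lines on stdin
# and print "<count> <ldap error code> <bind DN>" for each distinct
# slapi_ldap_bind failure. Assumes the message format quoted in this thread.
summarize_bind_errors() {
    sed -n 's/.*slapi_ldap_bind - Error: could not bind id \[\(.*\)\] authentication mechanism \[[^]]*\]: error \([0-9-]*\) .*/\2 \1/p' |
        sort | uniq -c | sed 's/^ *//'
}

# Usage on the affected server (not run here):
#   grep ns-slapd /var/log/messages | summarize_bind_errors
```

This only summarizes what is already in the log. It does not say which component initiates the bind.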
Sounds like there is still a CA replication agreement. You can try using the topologysegment command(s) to list and hopefully remove this dangling agreement.
rob
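A rough sketch of how the segment listing suggested above could be condensed to one line per segment. "ipa topologysegment-find" is the real command; the list_segments helper is hypothetical, and it assumes the usual "Left node / Right node / Connectivity" fields appear in that order in the command's output.

```shell
# list_segments (hypothetical helper): reduce the output of
# `ipa topologysegment-find <suffix>` to
# "<left node> <right node> <connectivity>", one line per segment.
# Assumes Left node, Right node and Connectivity are printed in that order.
list_segments() {
    awk -F': ' '
        /^ *Left node:/    { left = $2 }
        /^ *Right node:/   { right = $2 }
        /^ *Connectivity:/ { print left, right, $2 }
    '
}

# Usage on an IPA server with a valid admin Kerberos ticket (not run here):
#   ipa topologysegment-find ca     | list_segments
#   ipa topologysegment-find domain | list_segments
```

Comparing the "ca" and "domain" listings side by side may make a leftover segment easier to spot.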
Hi Rob,
Thank you for the pointer. The ipa topologysegment output showed only one agreement. As far as I can tell, there aren't any dangling entries that the ipa command line tool can find.
Searching LDAP on ns01 shows multiple nsds50ruv entries in the cn=masterAgreement1-ns02.dev.example.net-pki-tomcat,cn=replica,cn=o\3Dipaca,cn=mapping tree,cn=config DN. That DN has a nsds5replicaLastUpdateStatus of "Error (32) Problem connecting to replica - LDAP error: No such object (connection error)"
Should I use "ipa topologysegment-del" to clean up the CA replication? I'm thinking "del" because the agreement has old stuff hanging around in it, so removing it then re-adding it should clear everything out correctly.
Thank you,
Anthony Clark
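A minimal sketch of the LDAP check described above: list every replication agreement under cn=mapping tree,cn=config together with the identity it binds as and its last update status. The agreement_status helper name is invented for illustration. The search in the usage comment is an assumption about how one might query it, reusing the host and bind DN from earlier in the thread, with -o ldif-wrap=no so long DNs are not folded.

```shell
# agreement_status (hypothetical helper): read unwrapped LDIF on stdin and
# print "<cn> | <bind DN> | <last update status>" for each entry.
agreement_status() {
    awk '
        function emit() {
            if (cn != "") print cn " | " binddn " | " status
            cn = ""; binddn = ""; status = ""
        }
        {
            # Split "attr: value" at the first colon; attribute names are
            # compared case-insensitively.
            key = tolower($0); sub(/:.*/, "", key)
            val = $0;          sub(/^[^:]*: */, "", val)
        }
        key == "cn"                           { cn = val }
        key == "nsds5replicabinddn"           { binddn = val }
        key == "nsds5replicalastupdatestatus" { status = val }
        /^$/                                  { emit() }   # blank line ends an entry
        END                                   { emit() }
    '
}

# Usage (not run here):
#   ldapsearch -xLLL -o ldif-wrap=no -h ns01.dev.example.net \
#       -D "cn=directory manager" -W -b "cn=mapping tree,cn=config" \
#       "(objectClass=nsds5replicationagreement)" \
#       cn nsDS5ReplicaBindDN nsds5replicaLastUpdateStatus | agreement_status
```

If the bind DN printed for an agreement matches the one in the slapi_ldap_bind errors, that agreement is a likely source of the repeated failures.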
On Mon, Apr 8, 2019 at 10:36 AM Rob Crittenden rcritten@redhat.com wrote:
Sounds like there is still a CA replication agreement. You can try using the topologysegment command(s) to list and hopefully remove this dangling agreement.
rob
Hello Everyone,
Regarding cleaning up, I'm unable to run "ipa topologysegment-del ca ns01.dev.example.net-to-ns02.dev.example.net" due to a message "ipa: ERROR: Server is unwilling to perform: Removal of Segment disconnects topology.Deletion not allowed."
So if I go into LDAP manually, I find this DN: "DN: cn=masterAgreement1-ns02.dev.example.net-pki-tomcat,cn=replica,cn=o\3Dipaca,cn=mapping tree,cn=config"
According to ipa-replica-manage list-ruv I have certificate server RUVs 96 and 12. The DN above does not have any reference to RUV 12, instead listing a RUV 86. Yet "ipa-replica-manage clean-dangling-ruv" says no dangling RUVs found.
Is there a test I can do to see whether ns02 pki-tomcat is correctly replicating ns01?
Thank you!
Anthony
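One possible way to approach the replication question above is to read the RUV entry for o=ipaca on each server and compare the maximum CSN recorded per replica ID. This is only a sketch: the ruv_maxcsn helper is hypothetical, the search in the usage comment is an assumption about how to fetch the RUV tombstone, and matching values only suggest that both servers have seen the same changes.

```shell
# ruv_maxcsn (hypothetical helper): read LDIF on stdin and print
# "<replica id> <max CSN>" for each nsds50ruv value, sorted by replica ID.
# A replica with no CSNs recorded is printed with "none".
ruv_maxcsn() {
    awk 'tolower($1) == "nsds50ruv:" && $2 == "{replica" {
             print $3, (NF >= 6 ? $6 : "none")
         }' | sort -n
}

# Usage (not run here); run once per server, then compare:
#   for h in ns01 ns02; do
#       ldapsearch -xLLL -o ldif-wrap=no -h "$h.dev.example.net" \
#           -D "cn=directory manager" -W -b "o=ipaca" \
#           "(&(objectclass=nstombstone)(nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff))" \
#           nsds50ruv | ruv_maxcsn > "/tmp/ruv.$h"
#   done
#   diff /tmp/ruv.ns01 /tmp/ruv.ns02 && echo "CA RUVs match"
```

A small difference right after a change is normal. A replica ID whose max CSN never advances on one side is the thing to look for.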
On Mon, Apr 8, 2019 at 3:57 PM Anthony Jarvis-Clark < anthonyclarka2@gmail.com> wrote:
Hi Rob,
Thank you for the pointer. The ipa topologysegment output showed only one agreement. As far as I can tell, there aren't any dangling entries that the ipa command line tool can find.
Searching LDAP on ns01 shows multiple nsds50ruv entries in the cn=masterAgreement1-ns02.dev.example.net-pki-tomcat,cn=replica,cn=o\3Dipaca,cn=mapping tree,cn=config DN. That DN has a nsds5replicaLastUpdateStatus of "Error (32) Problem connecting to replica - LDAP error: No such object (connection error)"
Should I use "ipa topologysegment-del" to clean up the CA replication? I'm thinking "del" because the agreement has old stuff hanging around in it, so removing it then re-adding it should clear everything out correctly.
Thank you,
Anthony Clark
freeipa-users@lists.fedorahosted.org