Hi guys,
I ran into a rather significant problem. I needed to rebuild two nodes in my topology and re-include them under the same hostnames. What I’m seeing now is that the replication to these new nodes is broken. Replication from them seems to work. I suspect that we have some stale metadata somewhere in the topology whereby the old nodes are still present somewhere in the agreements under other ids?
What’s the best way to troubleshoot this?
Thanks again, Sergei
The error I’m basically getting is:
[23/Mar/2018:03:23:29.461074995 +0000] - ERR - NSMMReplicationPlugin - bind_and_check_pwp - agmt="cn=HOST1-to-HOST2" (ipa203:389) - Replication bind with GSSAPI auth failed: LDAP error 49 (Invalid credentials) ()
Any ideas?
On Mar 22, 2018, at 5:05 PM, Sergei Gerasenko gerases@gmail.com wrote:
Hi guys,
I ran into a rather significant problem. I needed to rebuild two nodes in my topology and re-include them under the same hostnames. What I’m seeing now is that the replication to these new nodes is broken. Replication from them seems to work. I suspect that we have some stale metadata somewhere in the topology whereby the old nodes are still present somewhere in the agreements under other ids?
What’s the best way to troubleshoot this?
Thanks again, Sergei
On 03/23/2018 12:07 AM, Sergei Gerasenko wrote:
The error I’m basically getting is:
[23/Mar/2018:03:23:29.461074995 +0000] - ERR - NSMMReplicationPlugin - bind_and_check_pwp - agmt="cn=HOST1-to-HOST2" (ipa203:389) - Replication bind with GSSAPI auth failed: LDAP error 49 (Invalid credentials) ()
Any ideas?
GSSAPI authentication is failing. Wrong principal name in the agreement? KDC issue? I don't know, but that's what the error means. It could also be a red herring, as it typically does recover (it logs something like "auth resumed"). We need to see more logging from the errors log.
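One rough way to tell a transient bind failure from a persistent one is to scan the errors log per agreement and see whether each failure is later followed by a recovery line. The Python sketch below is only an illustration: the function name is hypothetical, "auth failed" is taken from the error quoted above, and "auth resumed" is just the approximate wording mentioned in this reply, so both markers are assumptions to adjust against the real log.

```python
import re

# Matches the agreement name as it appears in replication log lines,
# e.g. agmt="cn=ipa204-to-ipa203"
AGMT_RE = re.compile(r'agmt="(?P<agmt>[^"]+)"')


def unresolved_bind_failures(lines,
                             failed_marker="auth failed",
                             resumed_marker="auth resumed"):
    """Return {agreement: last failure line} for agreements whose most
    recent bind failure was not followed by a recovery line.

    Both marker strings are assumptions; check them against the
    wording your server actually logs.
    """
    state = {}
    for line in lines:
        match = AGMT_RE.search(line)
        if match is None:
            continue  # not a line that names a replication agreement
        agmt = match.group("agmt")
        if failed_marker in line:
            state[agmt] = line.strip()
        elif resumed_marker in line:
            state.pop(agmt, None)  # failure was transient
    return state
```

Feeding it the lines of the errors log (for example from an open file object) yields a dictionary; an empty result would suggest every logged failure recovered on its own.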
389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org
The only other message before that is suspicious:
set_krb5_creds - Could not get initial credentials for principal ... in keytab [FILE:/etc/dirsrv/ds.keytab]: -1765328203 (Key table entry not found)
I might get more when I get to work, but I think that’s all the errors I found. The resume message is not there. I saw your commit 5 years ago on this issue.
So here’s a more complete snippet from the host (ipa204) that can’t push to its partner (ipa203):
[23/Mar/2018:04:09:43.460073218 +0000] - ERR - NSACLPlugin - acl_parse - The ACL target cn=vaults,cn=kra,dc=XXX,dc=net does not exist
[23/Mar/2018:04:09:43.460238115 +0000] - ERR - NSACLPlugin - acl_parse - The ACL target cn=vaults,cn=kra,dc=XXX,dc=net does not exist
[23/Mar/2018:04:09:43.460483444 +0000] - ERR - NSACLPlugin - acl_parse - The ACL target cn=vaults,cn=kra,dc=XXX,dc=net does not exist
[23/Mar/2018:04:09:43.460620709 +0000] - ERR - NSACLPlugin - acl_parse - The ACL target cn=vaults,cn=kra,dc=XXX,dc=net does not exist
[23/Mar/2018:04:09:43.460793082 +0000] - ERR - NSACLPlugin - acl_parse - The ACL target cn=vaults,cn=kra,dc=XXX,dc=net does not exist
[23/Mar/2018:04:09:43.460998306 +0000] - ERR - NSACLPlugin - acl_parse - The ACL target cn=vaults,cn=kra,dc=XXX,dc=net does not exist
[23/Mar/2018:04:09:43.461171061 +0000] - ERR - NSACLPlugin - acl_parse - The ACL target cn=vaults,cn=kra,dc=XXX,dc=net does not exist
[23/Mar/2018:04:09:43.461370548 +0000] - ERR - NSACLPlugin - acl_parse - The ACL target cn=vaults,cn=kra,dc=XXX,dc=net does not exist
[23/Mar/2018:04:09:43.461554598 +0000] - ERR - NSACLPlugin - acl_parse - The ACL target cn=dns,dc=XXX,dc=net does not exist
[23/Mar/2018:04:09:43.462223077 +0000] - ERR - NSACLPlugin - acl_parse - The ACL target cn=ad,cn=etc,dc=XXX,dc=net does not exist
[23/Mar/2018:04:09:43.469236418 +0000] - ERR - NSACLPlugin - acl_parse - The ACL target cn=casigningcert cert-pki-ca,cn=ca_renewal,cn=ipa,cn=etc,dc=XXX,dc=net does not exist
[23/Mar/2018:04:09:43.469549785 +0000] - ERR - NSACLPlugin - acl_parse - The ACL target cn=casigningcert cert-pki-ca,cn=ca_renewal,cn=ipa,cn=etc,dc=XXX,dc=net does not exist
[23/Mar/2018:04:09:43.526348986 +0000] - ERR - NSACLPlugin - acl_parse - The ACL target cn=automember rebuild membership,cn=tasks,cn=config does not exist
[23/Mar/2018:04:09:43.543200030 +0000] - ERR - schema-compat-plugin - schema-compat-plugin tree scan will start in about 5 seconds!
[23/Mar/2018:04:09:43.543437243 +0000] - ERR - set_krb5_creds - Could not get initial credentials for principal [ldap/ipa204.iad.auth.core.XXX.net@CNVR.NET] in keytab [FILE:/etc/dirsrv/ds.keytab]: -1765328324 (Generic error (see e-text))
[23/Mar/2018:04:09:43.550151255 +0000] - INFO - slapd_daemon - slapd started. Listening on All Interfaces port 389 for LDAP requests
[23/Mar/2018:04:09:43.550328761 +0000] - INFO - slapd_daemon - Listening on All Interfaces port 636 for LDAPS requests
[23/Mar/2018:04:09:43.550525479 +0000] - INFO - slapd_daemon - Listening on /var/run/slapd-XXX-NET.socket for LDAPI requests
[23/Mar/2018:04:09:47.905572763 +0000] - ERR - NSMMReplicationPlugin - bind_and_check_pwp - agmt="cn=ipa204-to-ipa203" (ipa203:389) - Replication bind with GSSAPI auth failed: LDAP error 49 (Invalid credentials) ()
[23/Mar/2018:04:09:50.611808868 +0000] - ERR - schema-compat-plugin - warning: no entries set up under cn=computers, cn=compat,dc=XXX,dc=net
[23/Mar/2018:04:09:50.612281544 +0000] - ERR - schema-compat-plugin - Finished plugin initialization.
The 204 has this message:
[23/Mar/2018:01:52:48.624405992 +0000] - ERR - NSMMReplicationPlugin - acquire_replica - agmt="cn=meToipa101.XXXX.net" (ipa101:389): Unable to acquire replica: permission denied. The bind dn "" does not have permission to supply replication updates to the replica. Will retry later.
Ipa101 is another host in the topology from which 204 was made.
Let me know if I can supply more info.
Thank you! Sergei
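With this many lines, the Kerberos and replication errors are easy to lose among the repeated ACL messages. The following Python sketch shows one possible way to group errors-log entries by severity and subsystem and pull out the less common errors. The regular expression is inferred only from the lines quoted in this thread, the function names are hypothetical, and treating NSACLPlugin and schema-compat-plugin as noise is an assumption made for illustration, not a statement that those messages are harmless.

```python
import re
from collections import Counter

# Shape inferred from the lines quoted in this thread:
# [timestamp] - SEVERITY - subsystem - message
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] - (?P<severity>[A-Z]+) - "
    r"(?P<subsystem>[\w-]+) - (?P<message>.*)$"
)


def parse_line(line):
    """Return a dict of fields, or None if the line has another shape."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def summarize(lines, noisy=("NSACLPlugin", "schema-compat-plugin")):
    """Count entries per (severity, subsystem) and collect ERR entries
    from subsystems that are not in the (assumed) noisy list."""
    counts = Counter()
    interesting = []
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue
        counts[(entry["severity"], entry["subsystem"])] += 1
        if entry["severity"] == "ERR" and entry["subsystem"] not in noisy:
            interesting.append(entry)
    return counts, interesting


# Four lines copied from the snippet above.
SAMPLE = [
    '[23/Mar/2018:04:09:43.460073218 +0000] - ERR - NSACLPlugin - acl_parse - The ACL target cn=vaults,cn=kra,dc=XXX,dc=net does not exist',
    '[23/Mar/2018:04:09:43.543437243 +0000] - ERR - set_krb5_creds - Could not get initial credentials for principal [ldap/ipa204.iad.auth.core.XXX.net@CNVR.NET] in keytab [FILE:/etc/dirsrv/ds.keytab]: -1765328324 (Generic error (see e-text))',
    '[23/Mar/2018:04:09:43.550151255 +0000] - INFO - slapd_daemon - slapd started. Listening on All Interfaces port 389 for LDAP requests',
    '[23/Mar/2018:04:09:47.905572763 +0000] - ERR - NSMMReplicationPlugin - bind_and_check_pwp - agmt="cn=ipa204-to-ipa203" (ipa203:389) - Replication bind with GSSAPI auth failed: LDAP error 49 (Invalid credentials) ()',
]

if __name__ == "__main__":
    counts, interesting = summarize(SAMPLE)
    for entry in interesting:
        print(entry["ts"], entry["subsystem"], entry["message"])
```

On the sample, the entries that remain are the set_krb5_creds failure and the GSSAPI bind failure, which are the two messages the rest of this thread concentrates on.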
On 03/23/2018 09:25 AM, Sergei Gerasenko wrote:
The 204 has this message:
[23/Mar/2018:01:52:48.624405992 +0000] - ERR - NSMMReplicationPlugin - acquire_replica - agmt="cn=meToipa101.XXXX.net" (ipa101:389): Unable to acquire replica: permission denied. The bind dn "" does not have permission to supply replication updates to the replica. Will retry later.
This is because it's not finding your Kerberos credentials. It's something with your env/setup or keytab file.
Can you do:
# kinit -k -t /etc/dirsrv/ds.keytab ldap/ipa204.iad.auth.core.XXX.net@CNVR.NET
and
# klist
On Mar 23, 2018, at 8:58 AM, Mark Reynolds mreynolds@redhat.com wrote:
kinit -k -t /etc/dirsrv/ds.keytab
kinit: Keytab contains no suitable keys for host/ipa204.iad.cnvr.net@CNVR.NET while getting initial credentials
On 03/23/2018 10:01 AM, Sergei Gerasenko wrote:
On Mar 23, 2018, at 8:58 AM, Mark Reynolds <mreynolds@redhat.com> wrote:
kinit -k -t /etc/dirsrv/ds.keytab
kinit: Keytab contains no suitable keys for host/ipa204.iad.cnvr.net@CNVR.NET while getting initial credentials
That's the problem... Is there anything in the keytab file? Looks like you might need to set up/get your keytab again...
Yes, there’s something there. Should I follow this and everything should be ok?
http://directory.fedoraproject.org/docs/389ds/howto/howto-kerberos.html
Also, and I don’t know if it’s strange, but I get that kinit error on any IPA host. I have a 2-master VM environment where running kinit -k -t /etc/dirsrv/ds.keytab gives the same error, yet those masters replicate without issues.
I must admit I don't know too much about troubleshooting Kerberos; I just know that in your case it's broken. Perhaps ask for help on the FreeIPA users list, as they are much more familiar with this than I am:
freeipa-users@lists.fedorahosted.org
I think the real command is:
kinit -k -t /etc/dirsrv/ds.keytab ldap/HOST@CNVR.NET
That does work.
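The difference between the two attempts comes down to which principal kinit was asked for: the earlier error names a host/ principal, while the errors log and the working command use an ldap/ principal. As a small illustration, the Python sketch below pulls the principal out of the kinit error text quoted earlier and splits it into service, host, and realm, so the mismatch is visible at a glance. The function names are hypothetical, and the regular expression is based only on the single error message seen in this thread.

```python
import re

# Based solely on the kinit error quoted in this thread.
ERROR_RE = re.compile(
    r"no suitable keys for (?P<principal>\S+) while getting initial credentials"
)


def split_principal(principal):
    """Split 'service/host@REALM' into (service, host, realm)."""
    if "@" in principal:
        name, realm = principal.rsplit("@", 1)
    else:
        name, realm = principal, ""
    service, _, host = name.partition("/")
    return service, host, realm


def principal_in_error(kinit_error):
    """Return the principal kinit tried, or None if the text does not match."""
    match = ERROR_RE.search(kinit_error)
    return match.group("principal") if match else None


def kinit_argv(keytab, principal):
    """Build the argument list for an explicit-principal keytab login."""
    return ["kinit", "-k", "-t", keytab, principal]


if __name__ == "__main__":
    error = ("kinit: Keytab contains no suitable keys for "
             "host/ipa204.iad.cnvr.net@CNVR.NET while getting initial credentials")
    tried = principal_in_error(error)
    print("kinit tried service:", split_principal(tried)[0])
    print("explicit form:", " ".join(
        kinit_argv("/etc/dirsrv/ds.keytab", "ldap/HOST@CNVR.NET")))
```

Naming the principal explicitly, as in the command above, avoids depending on whatever default kinit chooses when none is given.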
The problem was caused by too much traffic going to the server. The neighboring machines couldn’t push their updates to it. Once the traffic was spread across more servers, everything normalized.