On Thu, Aug 24, 2017 at 10:12:55AM +0200, Michael Gusek via FreeIPA-users wrote:
Hello Jakub,
here the first lines of ldap_child.log
|(Wed Aug 23 16:07:11 2017) [[sssd[ldap_child[2104]]]] [main] (0x0400): ldap_child started. (Wed Aug 23 16:07:11 2017) [[sssd[ldap_child[2104]]]] [main] (0x2000): context initialized (Wed Aug 23 16:07:11 2017) [[sssd[ldap_child[2104]]]] [unpack_buffer] (0x1000): total buffer size: 81 (Wed Aug 23 16:07:11 2017) [[sssd[ldap_child[2104]]]] [unpack_buffer] (0x1000): realm_str size: 16 (Wed Aug 23 16:07:11 2017) [[sssd[ldap_child[2104]]]] [unpack_buffer] (0x1000): got realm_str: IPA.EXAMPLE.COM (Wed Aug 23 16:07:11 2017) [[sssd[ldap_child[2104]]]] [unpack_buffer] (0x1000): princ_str size: 41 (Wed Aug 23 16:07:11 2017) [[sssd[ldap_child[2104]]]] [unpack_buffer] (0x1000): got princ_str: host/ipa-lx-test-debian9.ípa.example.com (Wed Aug 23 16:07:11 2017) [[sssd[ldap_child[2104]]]] [unpack_buffer] (0x1000): keytab_name size: 0 (Wed Aug 23 16:07:11 2017) [[sssd[ldap_child[2104]]]] [unpack_buffer] (0x1000): lifetime: 86400 (Wed Aug 23 16:07:11 2017) [[sssd[ldap_child[2104]]]] [unpack_buffer] (0x0200): Will run as [0][0]. (Wed Aug 23 16:07:11 2017) [[sssd[ldap_child[2104]]]] [privileged_krb5_setup] (0x2000): Kerberos context initialized (Wed Aug 23 16:07:11 2017) [[sssd[ldap_child[2104]]]] [main] (0x2000): Kerberos context initialized (Wed Aug 23 16:07:11 2017) [[sssd[ldap_child[2104]]]] [become_user] (0x0200): Trying to become user [0][0]. (Wed Aug 23 16:07:11 2017) [[sssd[ldap_child[2104]]]] [become_user] (0x0200): Already user [0]. (Wed Aug 23 16:07:11 2017) [[sssd[ldap_child[2104]]]] [main] (0x2000): Running as [0][0]. (Wed Aug 23 16:07:11 2017) [[sssd[ldap_child[2104]]]] [main] (0x2000): getting TGT sync (Wed Aug 23 16:07:11 2017) [[sssd[ldap_child[2104]]]] [ldap_child_get_tgt_sync] (0x2000): got realm_name: [IPA.EXAMPLE.COM] (Wed Aug 23 16:07:11 2017) [[sssd[ldap_child[2104]]]] [ldap_child_get_tgt_sync] (0x0100): Principal name is: [host/ipa-lx-test-debian9.ípa.example.com@IPA.EXAMPLE.COM] (Wed Aug 23 16:07:11 2017) [[sssd[ldap_child[2104]]]] [ldap_child_get_tgt_sync] (0x0100): Using keytab [MEMORY:/etc/krb5.keytab] (Wed Aug 23 16:07:11 2017) [[sssd[ldap_child[2104]]]] [ldap_child_get_tgt_sync] (0x0100): Will canonicalize principals (Wed Aug 23 16:07:11 2017) [[sssd[ldap_child[2104]]]] [sss_child_krb5_trace_cb] (0x4000): [2104] 1503497231.122092: Getting initial credentials for host/ipa-lx-test-debian9.ípa.example.com@IPA.EXAMPLE.COM (Wed Aug 23 16:07:11 2017) [[sssd[ldap_child[2104]]]] [sss_child_krb5_trace_cb] (0x4000): [2104] 1503497231.122313: Looked up etypes in keytab: aes256-cts, aes128-cts, aes256-sha2, aes128-sha2, des3-cbc-sha1, rc4-hmac (Wed Aug 23 16:07:11 2017) [[sssd[ldap_child[2104]]]] [sss_child_krb5_trace_cb] (0x4000): [2104] 1503497231.122451: Sending request (218 bytes) to IPA.EXAMPLE.COM (Wed Aug 23 16:07:17 2017) [[sssd[ldap_child[2104]]]] [sig_term_handler] (0x0010): Received signal [Terminated] [15], shutting down (Wed Aug 23 16:07:17 2017) [[sssd[ldap_child[2104]]]] [sig_term_handler] (0x0010): Unlink file [/var/lib/sss/db/ccache_IPA.EXAMPLE.COM_TmKHkD] (Wed Aug 23 16:07:17 2017) [[sssd[ldap_child[2105]]]] [main] (0x0400): ldap_child started. (Wed Aug 23 16:07:17 2017) [[sssd[ldap_child[2105]]]] [main] (0x2000): context initialized |
We connect to IPA.EXAMPLE.COM, this is not helpfull. You can see, there is a delay of 5 seconds. Later in this file, you can see, we try to connect to second server ipa-lx-test-02.ípa.example.com.
Right, sssd says it does, but I really wonder why the ldap_child timeouts even in that case. Are there any log entries in the ipa-lx-test-02.ípa.example.com's log files around the time sssd connects to it?
And also -- could you run a simple tcpdump (tcpdump -i eth1 -x "port 88 or port 53") to see what hosts does sssd talk to and what hosts does it discover?
What I'm wondering is -- does SSSD really talk to the right server at that time or does it keep trying the bad one even if it should be trying the one that is up?
By the way, I'm suprised that the failover doesn't work for you. This is a quite basic feature that had been developed years ago. Can you also specify exactly how you bring one of the IDM servers down? Do you power it off, run ipactl stop, ...?
|(Wed Aug 23 16:10:14 2017) [[sssd[ldap_child[2129]]]] [sss_child_krb5_trace_cb] (0x4000): [2129] 1503497414.357486: Getting initial credentials for host/ipa-lx-test-debian9.ípa.example.com@IPA.EXAMPLE.COM (Wed Aug 23 16:10:14 2017) [[sssd[ldap_child[2129]]]] [sss_child_krb5_trace_cb] (0x4000): [2129] 1503497414.357715: Looked up etypes in keytab: aes256-cts, aes128-cts, aes256-sha2, aes128-sha2, des3-cbc-sha1, rc4-hmac (Wed Aug 23 16:10:14 2017) [[sssd[ldap_child[2129]]]] [sss_child_krb5_trace_cb] (0x4000): [2129] 1503497414.357842: Sending request (218 bytes) to IPA.EXAMPLE.COM (Wed Aug 23 16:10:14 2017) [[sssd[ldap_child[2129]]]] [sss_child_krb5_trace_cb] (0x4000): [2129] 1503497414.360574: Resolving hostname ipa-lx-test-02.ípa.example.com. (Wed Aug 23 16:10:14 2017) [[sssd[ldap_child[2129]]]] [sss_child_krb5_trace_cb] (0x4000): [2129] 1503497414.362141: Sending initial UDP request to dgram <ip_of_ipa-lx-test-02>:88 (Wed Aug 23 16:10:14 2017) [[sssd[ldap_child[2129]]]] [sss_child_krb5_trace_cb] (0x4000): [2129] 1503497414.367725: Received answer (346 bytes) from dgram <ip_of_ipa-lx-test-02>:88 (Wed Aug 23 16:10:19 2017) [[sssd[ldap_child[2129]]]] [sss_child_krb5_trace_cb] (0x4000): [2129] 1503497419.374601: Response was from master KDC |
I think this was my test with ‘kinit admin@IPA.EXAMPLE.COM’. You can find whole log file attached. From my point of view, failover to second KDC is not fast enough and we should set some timeouts in sssd.conf or krb5.conf ?
Well, kinit doesn't talk to sssd..
sssd only has interfaces towards the NSS and PAM subsystems, so kinit talks straight to libkrb5. What /KDCs/ does kinit talk to depends on libkrb5 configuration. In your case, it doesn't contain any KDCs, so it's either what SSSD tells libkrb5 to look at (through a kdcinfo file interface) DNS (because dns_lookup_kdc = true) in this order. And the kdcinfo files should not be generated unless sssd can switch to the 'online' mode here.
krb5.conf from ipa-lx-test-debian9:
|#File modified by ipa-client-install includedir /var/lib/sss/pubconf/krb5.include.d/ [libdefaults] default_realm = IPA.EXAMPLE.COM dns_lookup_realm = true dns_lookup_kdc = true rdns = false ticket_lifetime = 24h forwardable = yes default_ccache_name = KEYRING:persistent:%{uid} [realms] IPA.EXAMPLE.COM = { pkinit_anchors = FILE:/etc/ipa/ca.crt } [domain_realm] .ipa.example.com = IPA.EXAMPLE.COM ipa.example.com = IPA.EXAMPLE.COM |
Regards,
Michael
Am 23.08.2017 um 22:20 schrieb Jakub Hrozek via FreeIPA-users:
On Wed, Aug 23, 2017 at 05:13:13PM +0200, Michael Gusek via FreeIPA-users wrote:
Hi,
we are testing a FreeIPA trust to an Active Directory. Trust itself works, we are happy. Now we tested a failure on FreeIPA site. We have two instances, both with same roles. If we poweroff first installed server, and clean sssd caches with restart of sssd on client side , sssd service can’t establish a connection to second instance.
ipa-lx-test-01.ipa.example.com is the first installed FreeIPA with ipa-server-4.4.0-14.el7.centos.7.x86_64 on latest CentOS7 ipa-lx-test-02.ipa.example.com is the second installed FreeIPA with ipa-server-4.4.0-14.el7.centos.7.x86_64 on latest CentOS7 ipa-lx-test-debian9.ipa.example.com is a latest Debian 9.1 with sssd 1.15.0-3
For deeper inspection full log is attached. In logs we found something like this:
OK, so as you say communication with the KDC failed: (Wed Aug 23 16:07:11 2017) [sssd[be[ipa.example.com]]] [be_resolve_server_process] (0x0200): Found address for server ipa-lx-test-02.ipa.example.com: [x.x.x.x] TTL 1200 (Wed Aug 23 16:07:11 2017) [sssd[be[ipa.example.com]]] [sdap_kinit_kdc_resolved] (0x1000): KDC resolved, attempting to get TGT... (Wed Aug 23 16:07:11 2017) [sssd[be[ipa.example.com]]] [create_tgt_req_send_buffer] (0x0400): buffer size: 81 (Wed Aug 23 16:07:11 2017) [sssd[be[ipa.example.com]]] [child_handler_setup] (0x2000): Setting up signal handler up for pid [2104] (Wed Aug 23 16:07:11 2017) [sssd[be[ipa.example.com]]] [child_handler_setup] (0x2000): Signal handler set up for pid [2104] (Wed Aug 23 16:07:11 2017) [sssd[be[ipa.example.com]]] [set_tgt_child_timeout] (0x0400): Setting 6 seconds timeout for tgt child (Wed Aug 23 16:07:11 2017) [sssd[be[ipa.example.com]]] [sdap_process_result] (0x2000): Trace: sh[0x558252c35750], connected[1], ops[(nil)], ldap[0x558252c130c0] (Wed Aug 23 16:07:11 2017) [sssd[be[ipa.example.com]]] [sdap_process_result] (0x2000): Trace: end of ldap_result list (Wed Aug 23 16:07:11 2017) [sssd[be[ipa.example.com]]] [write_pipe_handler] (0x0400): All data has been sent! (Wed Aug 23 16:07:17 2017) [sssd[be[ipa.example.com]]] [get_tgt_timeout_handler] (0x4000): timeout for sending SIGTERM to tgt child [2104] reached. (Wed Aug 23 16:07:17 2017) [sssd[be[ipa.example.com]]] [get_tgt_timeout_handler] (0x0400): Setting 2 seconds timeout for sending SIGKILL to tgt child (Wed Aug 23 16:07:17 2017) [sssd[be[ipa.example.com]]] [read_pipe_handler] (0x0400): EOF received, client finished (Wed Aug 23 16:07:17 2017) [sssd[be[ipa.example.com]]] [child_sig_handler] (0x1000): Waiting for child [2104]. (Wed Aug 23 16:07:17 2017) [sssd[be[ipa.example.com]]] [child_sig_handler] (0x0020): child [2104] failed with status [7]. (Wed Aug 23 16:07:17 2017) [sssd[be[ipa.example.com]]] [child_callback] (0x0020): LDAP child was terminated due to timeout (Wed Aug 23 16:07:17 2017) [sssd[be[ipa.example.com]]] [sdap_kinit_done] (0x0080): Communication with KDC timed out, trying the next one (Wed Aug 23 16:07:17 2017) [sssd[be[ipa.example.com]]] [_be_fo_set_port_status] (0x8000): Setting status: PORT_NOT_WORKING. Called from: ../src/providers/ldap/sdap_async_connection.c: sdap_kinit_done: 1207 (Wed Aug 23 16:07:17 2017) [sssd[be[ipa.example.com]]] [fo_set_port_status] (0x0100): Marking port 389 of server 'ipa-lx-test-02.ipa.example.com' as 'not working' (Wed Aug 23 16:07:17 2017) [sssd[be[ipa.example.com]]] [fo_set_port_status] (0x0400): Marking port 389 of duplicate server 'ipa-lx-test-02.ipa.example.com' as 'not working'
Could you check in the ldap_child.log which KDC did SSSD try to talk to and what takes so long? _______________________________________________ FreeIPA-users mailing list -- freeipa-users@lists.fedorahosted.org To unsubscribe send an email to freeipa-users-leave@lists.fedorahosted.org
*Michael**Gusek*| System Administrator| Webtrekk GmbH | *t*+49 30 755 415 302| *f *+49 30 755 415 100 | *w *www.webtrekk.com https://www.webtrekk.com/?wt_mc=signature.-.-.-.homepageURL Amtsgericht/Local Court Berlin, HRB 93435 B | Geschäftsführer/CEO Christian Sauer und Wolf Lichtenstein
FreeIPA-users mailing list -- freeipa-users@lists.fedorahosted.org To unsubscribe send an email to freeipa-users-leave@lists.fedorahosted.org