Hi, folks,
I've got a double-barrelled problem.
We're an RHEL shop with mostly RHEL 7 machines authenticating via sssd against LDAP, and for a few weeks now logons have been very slow. Now we are also seeing incomplete information being returned from id -G. I've trimmed down a test case to eliminate every other variable I can find, and I'm left with sssd as my focal point.
Any thoughts on where to start with this very puzzling, very annoying problem?
Thanks,
John A
Do you have a lot of nested groups? Patches all applied?
On Wed, Sep 20, 2023, 1:34 PM Johnnie W Adams jxadams@ualr.edu wrote:
Hi, folks,
I've got a double-barrelled problem. We're an RHEL shop with mostly RHEL 7 machines authenticating via sssd against LDAP, and for a few weeks now logons have been very slow. Now we are also seeing incomplete information being returned from id -G. I've trimmed down a test case to eliminate every other variable I can find, and I'm left with sssd as my focal point.
Any thoughts on where to start with this very puzzling, very annoying problem?
Thanks,
John A
--
John Adams
Senior Linux/Middleware Administrator | Information Technology Services
+1-501-916-3010 | jxadams@ualr.edu | http://ualr.edu/itservices
*UA Little Rock*
Reminder: IT Services will never ask for your password over the phone or in an email. Always be suspicious of requests for personal information that come via email, even from known contacts. For more information or to report suspicious email, visit IT Security http://ualr.edu/itservices/security/.
_______________________________________________
sssd-users mailing list -- sssd-users@lists.fedorahosted.org
To unsubscribe send an email to sssd-users-leave@lists.fedorahosted.org
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedorahosted.org/archives/list/sssd-users@lists.fedorahosted.o...
Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue
I'm running (on this test box here) sssd 1.16.2 and the box is completely patched. I don't believe I have a nested group in the directory.
On Wed, Sep 20, 2023 at 3:37 PM solarflow99 solarflow99@gmail.com wrote:
Do you have a lot of nested groups? Patches all applied?
Is there a way you can share some SSSD logs showcasing the slow logins and incomplete id results?
1. Edit /etc/sssd/sssd.conf and add "debug_level = 9" to every bracketed section ([domain/...], [sssd], [pam], etc.).
2. Restart SSSD and clear cache and logs:
# systemctl stop sssd ; rm -rf /var/log/sssd/* /var/lib/sss/{db,mc}/* ; systemctl start sssd
3. Reproduce the error.
4. Archive the relevant logs, cache, and configs:
# tar czvpf /tmp/sssd-debug_$(hostname -s)_$(date +%F_%H%M%S).tar.gz /var/lib/sss /var/log/{sssd,secure,messages,samba} /etc/{ssh/sshd_config,pam.d,nsswitch.conf,krb5.c*,openldap,authselect,hosts,resolv.conf,sssd}
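Step 1 can be tedious on a config with many sections. The sketch below is one possible way to do it and is not part of the original instructions: a hypothetical helper, add_debug_level, prints a copy of an sssd.conf-style file with "debug_level = 9" placed under every [section] header and any older debug_level lines dropped. It only writes to standard output, so the real file stays untouched until the result has been reviewed.

```shell
# Hypothetical helper (not an sssd tool): print the given sssd.conf-style file
# with "debug_level = 9" added under every [section] header.
# Existing debug_level lines are dropped so each section ends up with one.
add_debug_level() {
    awk '
        /^[ \t]*debug_level[ \t]*=/ { next }             # skip any old setting
        { print }                                        # keep every other line
        /^\[[^]]+\][ \t]*$/ { print "debug_level = 9" }  # add after a header
    ' "$1"
}
```

For example, "add_debug_level /etc/sssd/sssd.conf > /root/sssd.conf.debug" (the output path is only an illustration). sssd expects its config to be owned by root with mode 0600, so keep those permissions when putting the new file in place.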
-----------------
Otherwise, I'd try a few tuning parameters to see if it helps to speed things up:
[domain/example.com]
ignore_group_members = true
ldap_deref_threshold = 0

[pam]
pam_id_timeout = 30

[nss]
entry_cache_timeout = 10800
------------------
You might also consider mounting the SSSD cache in tmpfs to make SSSD more responsive on busy machines:
Add the following entry to the /etc/fstab file as a single line:
tmpfs /var/lib/sss/db/ tmpfs size=300M,mode=0700,uid=sssd,gid=sssd,rootcontext=system_u:object_r:sssd_var_lib_t:s0 0 0
Restart SSSD afterwards:
# systemctl stop sssd ; rm -rf /var/lib/sss/db/* ; mount /var/lib/sss/db/ ; systemctl start sssd
I've attached a tar file of the /var/log/sssd directory, which I hope will carry over. The logon delay is >10 seconds.
On Wed, Sep 20, 2023 at 3:45 PM Striker Leggette striker@terranforge.com wrote:
Is there a way you can share some SSSD logs showcasing the slow logins and incomplete id results?
Not digging too far into this yet, but:
(Wed Sep 20 15:48:08 2023) [sssd[be[default]]] [dp_get_account_info_handler] (0x0200): Got request for [0x1][BE_REQ_USER][name=pms1@default]
...
(Wed Sep 20 15:48:08 2023) [sssd[be[default]]] [sdap_print_server] (0x2000): Searching 144.167.6.61:1389
...
(Wed Sep 20 15:48:14 2023) [sssd[be[default]]] [generic_ext_search_handler] (0x0040): sdap_get_generic_ext_recv failed [110]: Connection timed out
...
(Wed Sep 20 15:48:14 2023) [sssd[be[default]]] [fo_resolve_service_send] (0x0020): No available servers for service 'LDAP'
(Wed Sep 20 15:48:14 2023) [sssd[be[default]]] [be_resolve_server_done] (0x1000): Server resolution failed: [5]: Input/output error
(Wed Sep 20 15:48:14 2023) [sssd[be[default]]] [sdap_id_op_connect_done] (0x0020): Failed to connect, going offline (5 [Input/output error])
(Wed Sep 20 15:48:14 2023) [sssd[be[default]]] [be_mark_offline] (0x2000): Going offline!
It looks like you might have some LDAP servers which aren't responding to requests. Are non-working LDAP servers still listed in your DNS? Check:
# dig -t srv _ldap._tcp.net.ualr.edu
The IP Address which isn't responding to requests seems to be tied to the host opendj2a.net.ualr.edu.
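To find the same pattern across a longer log, something like the following may help. It is only a sketch: find_ldap_failures is a hypothetical helper, and the three search strings are simply the failure messages visible in the excerpt above, so other failure modes will not be caught. The log path in the usage note assumes the domain is named "default", as the [sssd[be[default]]] tag suggests.

```shell
# Hypothetical helper: print (with line numbers) the lines of an SSSD domain
# log that match the failure messages seen in the excerpt above.
find_ldap_failures() {
    grep -nE 'Connection timed out|No available servers for service|Going offline' "$1"
}
```

For example, "find_ldap_failures /var/log/sssd/sssd_default.log". Comparing the timestamp of a "Searching ..." line with the next timeout line shows how long each request waited; in the excerpt that gap is six seconds.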
On 9/20/23 16:56, Johnnie W Adams wrote:
I've attached a tar file of the /var/log/sssd directory, which I hope will carry over. The logon delay is >10 seconds.
The DNS query returns nothing. The node I'm querying is indeed opendj2a. I wonder if this could be an IPv6 issue?
On Wed, Sep 20, 2023 at 4:05 PM Striker Leggette striker@terranforge.com wrote:
Not digging too far into this yet, but:
It looks like you might have some LDAP servers which aren't responding to requests. Are non-working LDAP servers still listed in your DNS? Check:
# dig -t srv _ldap._tcp.net.ualr.edu
The IP Address which isn't responding to requests seems to be tied to the host opendj2a.net.ualr.edu.
We are not using IPv6, which is why I wondered if that could be the issue. These are OpenLDAP servers.
On Wed, Sep 20, 2023 at 4:12 PM Striker Leggette striker@terranforge.com wrote:
Do you only use IPv6? SSSD is trying to connect using IPv4.
What do you use as your id_provider? Are these AD servers?
On 9/20/23 17:08, Johnnie W Adams wrote:
The DNS query returns nothing. The node I'm querying is indeed opendj2a. I wonder if this could be an IPv6 issue?
It doesn't look like an IPv6 issue since SSSD is trying to connect using IPv4.
If SSSD isn't getting a response from OpenLDAP, you may want to check your server logs to find out why.
At the moment, this doesn't seem like an SSSD issue.
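Before going through the server logs, it can be worth confirming from the client whether the LDAP port accepts connections at all. The sketch below is an assumption-laden illustration, not something from the thread: ldap_port_open is a hypothetical helper that relies on bash's /dev/tcp redirection and the coreutils timeout command, and takes a host, a port, and an optional wait in seconds (default 5).

```shell
# Hypothetical helper: succeed if a TCP connection to host ($1) and port ($2)
# opens within $3 seconds (default 5). Needs bash and coreutils "timeout".
ldap_port_open() {
    timeout "${3:-5}" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}
```

For example, with the address and port from the log above: "ldap_port_open 144.167.6.61 1389 && echo reachable || echo 'no answer'". A port that opens only shows that something is listening; it does not prove that searches are answered in time, so a failing or slow search still points back to the server side.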
On 9/20/23 17:17, Johnnie W Adams wrote:
We are not using IPv6, which is why I wondered if that could be the issue. These are OpenLDAP servers.
sssd-users@lists.fedorahosted.org