Hi,
I have an issue with sssd 1.15.0-3 on Debian 9. My server is a GitLab server, and after a few hours authentication stops working. I'm using sssd to authenticate users via LDAP against Active Directory.
By setting sss_debuglevel 6, I was able to identify that sssd_pam had too many open files:
(Sun Mar 29 18:06:10 2020) [sssd[pam]] [accept_fd_handler] (0x0020): Accept failed [Too many open files]
When this happens, lsof reports that sssd_pam has thousands of open files, like this one:
sssd_pam 27277 root 2006u unix 0xffff90fa7b935000 0t0 3395594982 /var/lib/sss/pipes/pam type=STREAM
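As a lighter alternative to reading the full lsof output, the open descriptors of a process can be tallied from /proc. The following is only a minimal sketch, assuming a Linux /proc filesystem and enough privileges to read the descriptors of sssd_pam; the function name count_fd_targets is made up for illustration and is not part of sssd.

```python
import os
from collections import Counter

# Prefixes the Linux kernel uses for descriptors that are not regular paths.
SPECIAL_PREFIXES = ("socket:", "pipe:", "anon_inode:")


def count_fd_targets(fd_dir):
    """Count what the descriptors listed in fd_dir (e.g. /proc/<pid>/fd) point to.

    Sockets, pipes and anonymous inodes are grouped by kind; everything
    else is counted under its full target path.
    """
    counts = Counter()
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        if target.startswith(SPECIAL_PREFIXES):
            # 'socket:[3395594982]' becomes 'socket'.
            target = target.split(":", 1)[0]
        counts[target] += 1
    return counts
```

Calling count_fd_targets("/proc/27277/fd") as root (27277 being the sssd_pam PID from the lsof line above) at regular intervals would show whether the socket count keeps growing between restarts.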
I set the fd_limit parameter in sssd.conf so that the limit is not reached as quickly.
Restarting sssd clears the issue.
For reference, here is my sssd.conf file:
[sssd]
domains = sub.domain.net
config_file_version = 2
services = nss, pam
[domain/sub.domain.net]
ad_domain = sub.domain.net
ldap_uri = ldap://ad1.sub.domain.net, ldap://ad2.sub.domain.net
id_provider = ldap
ldap_access_order = expire
ldap_tls_reqcert = never
ldap_schema = rfc2307bis
ldap_referrals = false
ldap_force_upper_case_realm = true
ldap_search_base = DC=sub,DC=domain,DC=net
ldap_group_search_base = DC=sub,DC=domain,DC=net
ldap_group_object_class = group
ldap_group_name = sAMAccountName
ldap_user_object_class = User
ldap_user_name = sAMAccountName
ldap_user_fullname = displayName
ldap_user_home_directory = unixHomeDirectory
ldap_user_principal = userPrincipalName
ldap_default_bind_dn = CN=user,OU=OU,DC=sub,DC=domain,DC=net
ldap_default_authtok = **********
cache_credentials = true
access_provider = simple
simple_allow_groups = group1, group2
auth_provider = ldap
use_fully_qualified_names = false
dns_discovery_domain = sub.domain.net
default_shell = /bin/bash
override_shell = /bin/bash
fallback_homedir = /home/%d/%u
enumerate = false
ldap_user_objectsid = objectSid
ldap_group_objectsid = objectSid
ldap_user_primary_group = primaryGroupID
case_sensitive = False
ldap_id_mapping = true
[nss]
filter_users = git, root, monitoring
[pam]
fd_limit = 10000
client_idle_timeout = 10
Do you have any idea what could cause sssd_pam not to close those files?
Best regards,
Hugo
On (17/04/20 05:42), Hugo Deprez wrote:
> Hi,
> I have an issue with sssd 1.15.0-3 on Debian 9.
There were many changes between 1.15.0 and 1.16.x. Could you test sssd 1.16.3-3.1 from Debian buster?
> Do you have any idea what could cause sssd_pam not to close those files?
If upgrading to 1.16.3-3.1 does not help, then I would guess that some application does not use PAM correctly and thus its clients do not close their connections. Based on your settings, sssd will close an idle connection after 10 seconds, but clients might open connections much faster than sssd is able to close unused ones.
Changing fd_limit and client_idle_timeout is not a solution if a client application is broken.
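To put rough numbers on that race between clients opening connections and sssd closing idle ones, here is a back-of-the-envelope sketch. It assumes a deliberately simplified model that is not taken from the sssd sources: every abandoned connection stays open for exactly client_idle_timeout seconds and new connections arrive at a constant rate. Both function names are hypothetical.

```python
def lingering_connections(new_connections_per_second, client_idle_timeout):
    """Approximate number of abandoned connections open at any moment.

    Simplified model (an assumption): each connection is left open by the
    client and closed by the server exactly client_idle_timeout seconds later.
    """
    return new_connections_per_second * client_idle_timeout


def rate_needed_to_exhaust(fd_limit, client_idle_timeout):
    """Connection rate at which lingering connections alone reach fd_limit."""
    return fd_limit / client_idle_timeout


# Values from the [pam] section quoted in this thread.
FD_LIMIT = 10000
CLIENT_IDLE_TIMEOUT = 10

rate = rate_needed_to_exhaust(FD_LIMIT, CLIENT_IDLE_TIMEOUT)
print(f"about {rate:.0f} abandoned connections per second would be needed")
```

Under this model, raising fd_limit or lowering client_idle_timeout only shifts the rate at which the limit is reached; it does not change the fact that a client which never closes its connections keeps consuming descriptors.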
LS
Hi Lukas,
thank you,
Sadly, I'm not sure I'll be able to backport sssd 1.16 to Debian 9, as it is not part of the official backports.
Looking at the logs, even when no pubkey authentication happens for more than 10 seconds, no file descriptors are released.
Is there a way to identify the broken client?
Best regards,
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐ On Friday, 17 April 2020 10:52, Lukas Slebodnik lslebodn@redhat.com wrote:
> If upgrading to 1.16.3-3.1 does not help, then I would guess that some application does not use PAM correctly and thus its clients do not close their connections. Based on your settings, sssd will close an idle connection after 10 seconds, but clients might open connections much faster than sssd is able to close unused ones.
> Changing fd_limit and client_idle_timeout is not a solution if a client application is broken.
> LS
On (17/04/20 12:01), Hugo Deprez wrote:
> Is there a way to identify the broken client?
You can increase debug_level in the '[pam]' section of sssd.conf, and you should then be able to see the connecting clients in the log.
Here is an example message from sssd master:
(Fri Apr 24 10:51:45 2020) [sssd[pam]] [get_client_cred] (0x4000): Client [0x5629dd1be520][19] creds: euid[0] egid[0] pid[17233] cmd_line['su'].
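Once such messages are in the log, they can be tallied per command to see which client opens the most connections. This is only a sketch: it assumes the line format shown above (taken from sssd master, so 1.15.0 may log something different or nothing at all), and the regular expression and the function name count_clients are illustrative, not part of sssd. The message is logged at level 0x4000, so debug_level most likely has to be raised to the highest setting (9 according to the sssd.conf man page) for it to appear.

```python
import re
from collections import Counter

# Matches the get_client_cred message format quoted in this thread.
CLIENT_CRED_RE = re.compile(
    r"\[get_client_cred\].*?"
    r"euid\[(?P<euid>\d+)\]\s+egid\[(?P<egid>\d+)\]\s+"
    r"pid\[(?P<pid>\d+)\]\s+cmd_line\['(?P<cmd>[^']*)'\]"
)


def count_clients(lines):
    """Return a Counter mapping each client cmd_line to its number of connections."""
    counts = Counter()
    for line in lines:
        match = CLIENT_CRED_RE.search(line)
        if match:
            counts[match.group("cmd")] += 1
    return counts


sample = (
    "(Fri Apr 24 10:51:45 2020) [sssd[pam]] [get_client_cred] (0x4000): "
    "Client [0x5629dd1be520][19] creds: euid[0] egid[0] pid[17233] cmd_line['su']."
)
print(count_clients([sample]).most_common())
```

In practice the lines would come from the PAM responder log (usually /var/log/sssd/sssd_pam.log), for example by passing an open file object to count_clients.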
BTW, I would still recommend testing the latest sssd 1.16, maybe on Debian 10 if you do not want to backport it yourself.
LS
sssd-users@lists.fedorahosted.org