sssd performance on large domains
by zfnoctis@gmail.com
Hi,
I'm wondering if there are any plans to improve sssd performance on large Active Directory domains (100k+ users, 40k+ groups), or if there are settings I am not aware of that can greatly improve performance, specifically for workstation use cases.
Currently if I do not set "ignore_group_members = True" in sssd.conf, logins can take upwards of 6 minutes and "sssd_be" will max the CPU for up to 20 minutes after logon, which makes it a non-starter. The reason I want to allow group members to be seen is that I want certain domain groups to be able to perform elevated actions using polkit. If I ignore group members, polkit reports that the group is empty and so no one can elevate in the graphical environment.
Ultimately this means that Linux workstations are at a severe disadvantage since they cannot be bound to the domain and have the normal set of access features users and IT expect from macOS or Windows.
Distributions used: Ubuntu 16.04 (sssd 1.13.4-1ubuntu1.1), Ubuntu 16.10 (sssd 1.13.4-3) and Fedora 24 (sssd-1.13.4-3.fc24). All exhibit the same problems.
I've also tried "ldap_group_nesting_level = 1" without any noticeable performance improvement. Putting the database on /tmp isn't viable as these are workstations that will reboot semi-frequently, and I don't believe this is an I/O-bound performance issue anyway.
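For reference, the two options mentioned above live in the per-domain section of sssd.conf. The following is a minimal sketch, not part of the original report, that reads an sssd.conf-style text and reports both settings per domain. The domain name "example.com", the sample file contents, and the helper name group_tuning are assumptions made for illustration only.

```python
import configparser

# Hypothetical sssd.conf content; the domain name "example.com" and the
# surrounding options are assumptions, not taken from the original post.
SAMPLE_CONF = """
[sssd]
services = nss, pam
domains = example.com

[domain/example.com]
id_provider = ad
ignore_group_members = True
ldap_group_nesting_level = 1
"""


def group_tuning(conf_text):
    """Return {domain: (ignore_group_members, ldap_group_nesting_level)}."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    result = {}
    for section in parser.sections():
        if not section.startswith("domain/"):
            continue
        domain = section[len("domain/"):]
        # None means "not set in this file"; it is not the sssd default value.
        ignore = parser.getboolean(section, "ignore_group_members", fallback=None)
        nesting = parser.getint(section, "ldap_group_nesting_level", fallback=None)
        result[domain] = (ignore, nesting)
    return result


print(group_tuning(SAMPLE_CONF))  # {'example.com': (True, 1)}
```

Running the same function against the real /etc/sssd/sssd.conf on each workstation would be one way to confirm which hosts actually carry the workaround.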
Thanks for your time.
1 year, 7 months
Trouble resolving an AD group on one machine
by Orion Poplawski
We have a particular machine that is having trouble resolving an AD group -
"domain admins". The relevant log entries seem to be:
(2021-12-29 13:40:17): [nss] [cache_req_search_cache] (0x0400): CR #152:
Looking up [domain admins(a)ad.nwra.com] in cache
(2021-12-29 13:40:17): [nss] [sysdb_search_override_by_name] (0x0400): No user
override found for name [domain admins(a)ad.nwra.com].
(2021-12-29 13:40:17): [nss] [sysdb_getgrnam_with_views] (0x4000): Group
object [name=domain admins(a)ad.nwra.com,cn=groups,cn=ad.nwra.com,cn=sysdb],
contains ghost entries which must be resolved before overrides can be applied.
(2021-12-29 13:40:17): [nss] [sysdb_getgrnam_with_views] (0x4000): Returning
empty result.
(2021-12-29 13:40:17): [nss] [cache_req_search_cache] (0x0400): CR #152:
Object [domain admins(a)ad.nwra.com] was not found in cache
(2021-12-29 13:40:17): [nss] [cache_req_search_ncache_add_to_domain] (0x0400):
CR #152: Adding [domain admins(a)ad.nwra.com] to negative cache
(2021-12-29 13:40:17): [nss] [sss_ncache_set_str] (0x0400): Adding
[NCE/GROUP/ad.nwra.com/domain admins(a)ad.nwra.com] to negative cache
(2021-12-29 13:40:17): [nss] [cache_req_process_result] (0x0400): CR #152:
Finished: Not found
(2021-12-29 13:40:17): [nss] [sss_domain_get_state] (0x1000): Domain
ad.nwra.com is Active
(2021-12-29 13:40:17): [nss] [nss_protocol_done] (0x4000): Sending reply: not
found
On working systems we don't have the sysdb_getgrnam_with_views message. I'd
rather not clear the sssd database. Is there anything else that can be done?
'sss_cache -g "domain admins"' does not help.
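To check whether other groups on the affected machine hit the same condition, the NSS log can be scanned for the "contains ghost entries" message. This is a rough sketch that assumes the message format shown in the excerpt above; the helper name groups_with_ghosts is hypothetical, and the sample text is copied from the post (the archive renders "@" as "(a)").

```python
import re

# Excerpt copied from the log above; the archive renders "@" as "(a)".
SAMPLE_LOG = """\
(2021-12-29 13:40:17): [nss] [sysdb_getgrnam_with_views] (0x4000): Group
object [name=domain admins(a)ad.nwra.com,cn=groups,cn=ad.nwra.com,cn=sysdb],
contains ghost entries which must be resolved before overrides can be applied.
(2021-12-29 13:40:17): [nss] [sysdb_getgrnam_with_views] (0x4000): Returning
empty result.
"""

# Captures the group name from the "name=" component of the sysdb DN.
GHOST_RE = re.compile(
    r"Group object \[name=([^,\]]+),[^\]]*\], contains ghost entries"
)


def groups_with_ghosts(log_text):
    """Return sorted, de-duplicated group names flagged with ghost entries."""
    # Collapse all whitespace so that mail-wrapped log lines still match.
    flat = " ".join(log_text.split())
    return sorted(set(GHOST_RE.findall(flat)))


print(groups_with_ghosts(SAMPLE_LOG))  # ['domain admins(a)ad.nwra.com']
```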
We're using an IPA <-> AD trust.
Thanks
--
Orion Poplawski
IT Systems Manager 720-772-5637
NWRA, Boulder/CoRA Office FAX: 303-415-9702
3380 Mitchell Lane orion(a)nwra.com
Boulder, CO 80301 https://www.nwra.com/
1 year, 8 months
Having trouble getting GSSAPI to work
by Aram Akhavan
Hi all,
I'm new to sssd and am working on deploying it in my homelab on a test VM.
So far, I've successfully joined my host to my very basic/vanilla Active
Directory domain using *realm join*. I can log in via console and ssh
using AD credentials, and sudo works great too.
I can't for the life of me get GSSAPI to work on ssh, though. My
relevant sshd_config options are:
# GSSAPI options
GSSAPIAuthentication yes
#GSSAPICleanupCredentials yes
#GSSAPIStrictAcceptorCheck yes
GSSAPIKeyExchange yes
I turned on debug logging on the ssh server and client, and the only
thing I can see that would suggest any issue is:
Dec 16 23:09:55 test sshd[6068]: debug3: userauth_finish: failure
partial=0 next methods="publickey,gssapi-keyex,gssapi-with-mic,password"
[preauth]
I do see this in the syslog when sssd is restarted, though everything
else does still work:
Dec 16 23:10:20 test sssd[6102]: tkey query failed: GSSAPI error: Major
= Unspecified GSS failure. Minor code may provide more information,
Minor = Server not found in Kerberos database.
In my sssd_nub.lan.log file I have a few errors but from what I can tell
they're all related to dynamic dns updates:
(2021-12-16 23:10:10): [be[nub.lan]] [ad_disable_gc] (0x0040): POSIX
attributes were requested but are not present on the server side. Global
Catalog lookups will be disabled
(2021-12-16 23:10:20): [be[nub.lan]] [child_sig_handler] (0x0020): child
[6102] failed with status [2].
(2021-12-16 23:10:20): [be[nub.lan]] [nsupdate_child_handler] (0x0040):
Dynamic DNS child failed with status [512]
(2021-12-16 23:10:20): [be[nub.lan]] [be_nsupdate_done] (0x0040):
nsupdate child execution failed [1432158240]: Dynamic DNS update failed
(2021-12-16 23:10:20): [be[nub.lan]] [child_sig_handler] (0x0020): child
[6106] failed with status [2].
(2021-12-16 23:10:20): [be[nub.lan]] [nsupdate_child_handler] (0x0040):
Dynamic DNS child failed with status [512]
(2021-12-16 23:10:20): [be[nub.lan]] [be_nsupdate_done] (0x0040):
nsupdate child execution failed [1432158240]: Dynamic DNS update failed
(2021-12-16 23:10:20): [be[nub.lan]] [ad_dyndns_sdap_update_done]
(0x0040): Dynamic DNS update failed [1432158240]: Dynamic DNS update failed
(2021-12-16 23:10:20): [be[nub.lan]] [be_ptask_done] (0x0040): Task
[Dyndns update]: failed with [1432158240]: Dynamic DNS update failed
(2021-12-16 23:25:20): [be[nub.lan]] [sss_ldap_init_sys_connect_done]
(0x0020): ldap_init_fd failed: Bad parameter to an ldap routine.
[23][cldap://arbiter.nub.lan:389]
(2021-12-16 23:25:20): [be[nub.lan]] [sdap_sys_connect_done] (0x0020):
sdap_async_connect_call request failed: [5]: Input/output error.
(2021-12-16 23:25:20): [be[nub.lan]] [sss_ldap_init_sys_connect_done]
(0x0020): ldap_init_fd failed: Bad parameter to an ldap routine.
[24][cldap://ARBITER.nub.lan:389]
(2021-12-16 23:25:20): [be[nub.lan]] [sdap_sys_connect_done] (0x0020):
sdap_async_connect_call request failed: [5]: Input/output error.
(2021-12-16 23:25:20): [be[nub.lan]] [ad_cldap_ping_done] (0x0040):
Unable to get site and forest information [2]: No such file or directory
I noticed the sssd troubleshooting basics mention using *kinit* for
debugging, which I did, and *klist* shows:
Ticket cache: FILE:/tmp/krb5cc_7000_MM3M16
Default principal: aram(a)NUB.LAN
Valid starting Expires Service principal
12/16/2021 23:28:30 12/17/2021 09:28:30 krbtgt/NUB.LAN(a)NUB.LAN
renew until 12/17/2021 23:28:27
I'm guessing my issue may be related to the service principal name used
for sshd, but despite my best searching efforts, I couldn't find
anything that tells me what it should be or how I might add it to AD.
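For context, sshd normally accepts GSSAPI logins under a host-based service principal of the form host/<fqdn>@REALM, so after a GSSAPI attempt the client's cache would usually hold a host/ ticket next to the krbtgt. The sketch below is an illustration under that assumption: it parses klist output and reports whether such a ticket is present. The helper names and the MM/DD/YYYY date layout are assumptions; the sample text is the klist output quoted above (the archive renders "@" as "(a)").

```python
import re

# klist output copied from the post above; the archive renders "@" as "(a)".
SAMPLE_KLIST = """\
Ticket cache: FILE:/tmp/krb5cc_7000_MM3M16
Default principal: aram(a)NUB.LAN
Valid starting Expires Service principal
12/16/2021 23:28:30 12/17/2021 09:28:30 krbtgt/NUB.LAN(a)NUB.LAN
renew until 12/17/2021 23:28:27
"""

# A ticket row: start date/time, expiry date/time, then the service principal.
TICKET_RE = re.compile(
    r"^\s*\d{2}/\d{2}/\d{2,4}\s+\d{2}:\d{2}:\d{2}\s+"
    r"\d{2}/\d{2}/\d{2,4}\s+\d{2}:\d{2}:\d{2}\s+(\S+)\s*$"
)


def service_principals(klist_text):
    """Return the service principals listed in klist output, in order."""
    found = []
    for line in klist_text.splitlines():
        match = TICKET_RE.match(line)
        if match:
            found.append(match.group(1))
    return found


def has_host_ticket(klist_text):
    """True if any listed ticket is for a host/... service principal."""
    return any(p.startswith("host/") for p in service_principals(klist_text))


print(service_principals(SAMPLE_KLIST))  # ['krbtgt/NUB.LAN(a)NUB.LAN']
print(has_host_ticket(SAMPLE_KLIST))     # False
```

In the output quoted above only the krbtgt is present, which fits with the client never having obtained a service ticket for the ssh server.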
I'm stuck! Any pointers or guidance would be greatly appreciated.
Thanks,
Aram
1 year, 9 months
kcm, gssproxy and klist
by Winberg Adam
With KCM and gssproxy we often see a long list of credentials when doing a 'klist':
[user.u@lxserv2114 ~]$ klist
Ticket cache: KCM:17098:66803
Default principal: user.u@AD
Valid starting Expires Service principal
01/01/1970 00:00:00 01/01/1970 00:00:00 Encrypted/Credentials/v1@X-GSSPROXY:
01/01/1970 00:00:00 01/01/1970 00:00:00 Encrypted/Credentials/v1@X-GSSPROXY:
01/01/1970 00:00:00 01/01/1970 00:00:00 Encrypted/Credentials/v1@X-GSSPROXY:
01/01/1970 00:00:00 01/01/1970 00:00:00 Encrypted/Credentials/v1@X-GSSPROXY:
01/01/1970 00:00:00 01/01/1970 00:00:00 Encrypted/Credentials/v1@X-GSSPROXY:
01/01/1970 00:00:00 01/01/1970 00:00:00 Encrypted/Credentials/v1@X-GSSPROXY:
01/01/1970 00:00:00 01/01/1970 00:00:00 Encrypted/Credentials/v1@X-GSSPROXY:
and so on...
The actual gssproxy credentials at /var/lib/gssproxy/clients/ do not correspond with this output; they only contain what could be expected: a TGT and maybe some service tickets.
The ever-growing 'klist' list of credentials is a problem. After a while the user can no longer get any new credentials and therefore has no access to their NFS homedir (sec=krb5). I'm guessing it's the 'max_uid_ccaches' option in sssd-kcm that blocks the new credentials.
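To track how fast the list grows, the klist rows can be tallied per service principal. This is only a sketch, assuming the row layout shown in the output above (two date/time pairs followed by the principal); the helper name count_by_principal is hypothetical and the sample is a shortened copy of that output.

```python
from collections import Counter

# Shortened copy of the klist output quoted above.
SAMPLE_KLIST = """\
Ticket cache: KCM:17098:66803
Default principal: user.u@AD
Valid starting Expires Service principal
01/01/1970 00:00:00 01/01/1970 00:00:00 Encrypted/Credentials/v1@X-GSSPROXY:
01/01/1970 00:00:00 01/01/1970 00:00:00 Encrypted/Credentials/v1@X-GSSPROXY:
01/01/1970 00:00:00 01/01/1970 00:00:00 Encrypted/Credentials/v1@X-GSSPROXY:
"""


def count_by_principal(klist_text):
    """Count klist ticket rows per service principal."""
    counts = Counter()
    for line in klist_text.splitlines():
        fields = line.split()
        # Ticket rows: start date, start time, end date, end time, principal.
        if (len(fields) == 5
                and fields[0].count("/") == 2
                and fields[1].count(":") == 2):
            counts[fields[4]] += 1
    return counts


for principal, n in count_by_principal(SAMPLE_KLIST).items():
    print(n, principal)
```

Feeding it the output of 'klist' at intervals would show whether the Encrypted/Credentials/v1@X-GSSPROXY: count rises with each NFS access or only at login.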
What is going on here - have we configured gssproxy/kcm wrong or is this a bug?
Regards
Adam
1 year, 9 months
SSSD as a backend to FreeRadius
by Ned Wilson
In our organization, we have an Active Directory domain, and a CentOS IdM subdomain at a remote site that has a two-way trust relationship with the master ID domain. Since this remote site is using a less-than-reliable internet connection, it was built this way so that we can ensure, with use of cached credentials, that authentication will be speedy to the end users. Furthermore, if and when the connection goes down, end users will experience no loss of functionality until the cached credentials expire.
The IPA master is running CentOS Stream 9, and the trust relationship has been configured as follows:
yum install ipa-server-trust-ad
ipa-adtrust-install --netbios-name=CENTOSIDM --admin-name=admin --add-sids --add-agents --enable-compat
ipa trust-add AD.MASTER --type=ad --admin=Administrator --server=pdc.ad.master --range-type=ipa-ad-trust-posix --all --raw --two-way=true
There are file servers at this remote site that are using Samba. Users are able to authenticate to the Samba servers with either AD or IPA credentials. On the client side, this is accomplished using first ipa-client-install, followed by ipa-client-samba.
The one requirement that has yet to be satisfied here is VPN access for those users at the remote site. In my read of the FreeRADIUS documentation, PEAP-MSCHAP-v2 authentication will only work if you either use the ntlm_auth binary, or have a version of FreeRADIUS that was built with support for direct linking to the winbind libraries.
Since these machines are all either IPA masters or IPA clients, I have not messed with the Samba configs much. Everything is running Samba 4.14, which mandates the use of winbind, but they are all using sss as a backend for idmap. This works well enough for file sharing, but FreeRADIUS just will not have it.
I should confess that I'm not too familiar with the eccentricities of winbind. In this instance, I'm just not sure how to configure it (or not) in such a way as to get FreeRADIUS to successfully authenticate a user from a trusted domain with ntlm_auth.
It also seems, from reading the FreeRADIUS documentation, that SSSD is just not supported as a backend, or at least, not directly. I was able to get both krb5 and pam to work, but these require passwords to be sent in clear text. I need some way to deal with MSCHAP-v2 authentication.
It had occurred to me that I could find an older source RPM for sssd-libwinbind and sssd-libwinbind-devel, compile those, and then build a version of FreeRADIUS from source that is linked against those libraries. However, since sssd-libwinbind was removed from the sssd GitHub project, and support has been removed from RHEL8 at this point, I was a little worried about going forward with this.
Any ideas?
1 year, 9 months
Building sssd RPMs from source for RHEL8....
by Spike White
All,
I have reviewed:
https://github.com/SSSD/sssd
https://sssd.io/
And most especially:
https://sssd.io/contrib/building-sssd.html
I reviewed these in an attempt to build RHEL8 sssd RPMs from github.com:SSSD/sssd.git.
In the past, I have attempted to build RHEL8 RPMs on RHEL8. That is a
fool's errand! I realize that now, because RHEL is not self-hosting.
(I have loads of test RHEL & OL 7 & 8 servers available, no Fedora.)
So I have to stand up a Fedora server to build the RHEL 8 RPMs. Currently,
Fedora is at Fedora 35 and RHEL8 is based on Fed 28. Is that ok, or do I
have to find Fed 28?
I've reviewed the RHEL9 beta (which is based on Fed 34). It doesn't seem that
different from RHEL8.
Spike
1 year, 9 months