I'm prepping a deployment of sssd in an environment with ~60,000 accounts and ~4,500 groups; the backend is AD. Some accounts are members of ~200 groups, whose combined membership might exceed 35,000 members. None of this is ideal, and frankly most of my issues can be attributed to poor historic decisions around managing identity in this decades-old environment.
With "ignore_group_members = false", if a single account (one that is a member of 200 groups, some of which have 35,000 members) runs "id", it can take minutes to complete on an uncached sssd client. With this option set to true, the operation completes in a few seconds on an uncached sssd client. This is great; however, the accounts in this environment are fond of running getent group <some_group> to get <some_group>'s member list, which is disabled with "ignore_group_members = true".
I was wondering if I missed a configuration item that might allow both "quick" id <account_with_many_large_groups_as_member> operations AND getent group <some_group>?
Assuming there is no configuration item to address this, is it conceivable that sssd could foreground "id"-type operations when all that is being requested is the list of group ids and group names for a single account, while deferring or backgrounding the group member enumeration that happens on the backend when "id <account>" is run? If this is conceivable, perhaps a pointer to where I might look in the code to see about this?
Perhaps I'm barking up the wrong tree, and it's simpler to write a wrapper for getent group that caches the equivalent ldapsearch?
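For illustration only, such a wrapper could be sketched roughly as below. This is a minimal sketch under stated assumptions, not a tested tool: the LDAP URI, base DN, cache directory and TTL are placeholders, the function names are made up for this example, and it assumes an OpenLDAP `ldapsearch` that can already authenticate (for example with a Kerberos ticket). AD returns large `member` lists in ranges, so a real wrapper would also have to handle range retrieval, which this sketch does not.

```python
import base64
import json
import os
import subprocess
import tempfile
import time
from urllib.parse import quote

# All of these values are placeholders chosen for the example.
LDAP_URI = "ldap://ad.example.com"
BASE_DN = "dc=example,dc=com"
CACHE_DIR = os.path.join(tempfile.gettempdir(), "group-member-cache")
CACHE_TTL = 3600  # seconds


def ldapsearch_members(group):
    """Return the member DNs of one group by shelling out to ldapsearch.

    The group name is used unescaped in the filter, so this is only safe
    for names without LDAP filter metacharacters.
    """
    out = subprocess.run(
        ["ldapsearch", "-LLL", "-o", "ldif-wrap=no", "-H", LDAP_URI,
         "-b", BASE_DN, f"(&(objectClass=group)(cn={group}))", "member"],
        capture_output=True, text=True, check=True,
    ).stdout
    members = []
    for line in out.splitlines():
        if line.startswith("member:: "):  # base64-encoded value
            members.append(base64.b64decode(line[9:]).decode("utf-8"))
        elif line.startswith("member: "):
            members.append(line[8:])
    return members


def cached_members(group, fetch=ldapsearch_members,
                   cache_dir=CACHE_DIR, ttl=CACHE_TTL):
    """Return the member list from a per-group JSON file if it is fresh,
    otherwise call fetch() and rewrite the file atomically."""
    os.makedirs(cache_dir, exist_ok=True)
    path = os.path.join(cache_dir, quote(group, safe="") + ".json")
    try:
        if time.time() - os.path.getmtime(path) < ttl:
            with open(path) as fh:
                return json.load(fh)
    except OSError:
        pass  # no cache file yet, or it is unreadable: fall through to fetch
    members = fetch(group)
    tmp = path + ".tmp"
    with open(tmp, "w") as fh:
        json.dump(members, fh)
    os.replace(tmp, path)  # atomic rename, so readers never see a partial file
    return members
```

The lookup function is passed in as a parameter so the caching logic can be exercised without a directory server. Note that the output is a list of DNs, not the account names `getent group` prints, so a mapping step would still be needed.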
Thank you for your consideration and development of this software.
Hi.
I'll take a stab at providing some info, but please don't take this as a definitive answer.
If you take a glance at https://sssd.io/_images/architecture.svg you'll see that SSSD is built around its cache (/var/lib/sss/db/*)
The problem of a "slow `id` of a user that is a member of a bunch of big groups" is a very prominent SSSD problem in large environments, and it is at its worst in the IPA-AD trust scenario. It is also quite long-standing: https://jhrozek.wordpress.com/2015/08/19/performance-tuning-sssd-for-large-i... So far, 'ignore_group_members = true' is by far the best response available.
At a high level, the problem on the client side is two-fold:
(1) slow cache write operations (by "backends", the 'sssd_be' process)
(2) slow cache read operations (by "responders", 'sssd_nss' in your case)
(2) is being addressed to some extent: I currently have patches posted for review - https://github.com/SSSD/sssd/pull/7841 - that show some promise. Depending on the specifics of your setup and workflow, those patches might or might not provide you some relief. The typical scenario where pronounced benefits are expected is a busy server with a hot and *huge* cache that performs tons of identity operations. If you think it is worthwhile and could give those patches a try and then provide feedback, that would be great.
(1) is trickier. We have profiling results that show that most of the CPU time is consumed in:
- https://github.com/SSSD/sssd/blob/master/src/ldb_modules/memberof.c
  This is a plugin for a third-party library, `libldb`, that adds 'memberof: group-dn' attributes on the fly to user objects being written to the cache.
- Otherwise, CPU consumption really depends on the backend being used: IPA, AD, LDAP with or without nested groups, etc. There is no single bottleneck.
Now getting to your ideas, if I understood it correctly.
What you describe is more or less what already happens when 'id_provider = ldap' is used. When one does `getent -s sss group $group` with 'ignore_group_members = false', it will return all group members. But inspection of /var/lib/sss/db/cache_$domain.ldb will show only the group object being cached, containing all members as "ghost" and "orig_member" attributes.
With IPA it doesn't work this way. If I understand correctly, to properly support IPA views (server-side overrides), user objects need to be resolved so that the group can return overridden members properly.
Honestly, I don't remember right now how exactly it works with "id_provider = ad". If you are curious, you can stop SSSD, wipe the cache, start SSSD, resolve a single group (`getent group ...`), stop SSSD and inspect the cache content using the 'ldbsearch' tool. I'm talking about `getent group` because this, i.e. `getgrgid()`, is what takes time when you call `id`. `id` first resolves the user (fast), then the list of groups the user is a member of (fast* using tokenGroups), and then it needs to convert every GID to a group name using `getgrgid()`. This loop is what typically hammers SSSD.
*) Well, tokenGroups returns a list of SIDs, and SSSD still needs to loop over and resolve every SID. This is definitely fast with 'ignore_group_members = true', but I don't remember right now what happens here otherwise; maybe all group members get resolved already at this point (in which case `id` then loops over the cache).
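To see where the time goes on a given client, the three steps above can be timed separately. This is a rough sketch that mirrors the sequence as described here, not the actual `id` source; the function name is made up. Python's `pwd`, `grp` and `os.getgrouplist` go through the C library, so on a client with `sss` in nsswitch.conf they should exercise the same lookups.

```python
import grp
import os
import pwd
import time


def time_id_steps(user, getpwnam=pwd.getpwnam,
                  getgrouplist=os.getgrouplist, getgrgid=grp.getgrgid):
    """Time the user lookup, the group-list lookup and each GID-to-name
    lookup separately. The lookup functions are parameters so they can
    be replaced when experimenting."""
    timings = {}

    t0 = time.perf_counter()
    pw = getpwnam(user)
    timings["getpwnam"] = time.perf_counter() - t0

    t0 = time.perf_counter()
    gids = getgrouplist(user, pw.pw_gid)
    timings["getgrouplist"] = time.perf_counter() - t0

    per_group = []
    for gid in gids:
        t0 = time.perf_counter()
        try:
            name = getgrgid(gid).gr_name
        except KeyError:
            name = None  # GID that cannot be resolved to a name
        per_group.append((gid, name, time.perf_counter() - t0))
    timings["getgrgid_total"] = sum(t for _, _, t in per_group)
    return timings, per_group
```

Sorting `per_group` by the elapsed time shows which groups dominate, which may help decide whether a few huge groups or the sheer number of groups is the problem.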
Hope this helps.
On Wed, Feb 26, 2025 at 7:23 PM Bob Green via sssd-devel sssd-devel@lists.fedorahosted.org wrote: