Not too long back, we noticed a mention in a RHEL6 release note that sssd supported autofs now. We currently run autofs directly against ldap, but just for fun we thought we'd see how well it played with sssd, which is using the same ldap system for users/groups.
Unfortunately, our initial try was unsuccessful. Based on reviewing the code, it looks like sssd wants to download the *entire* autofs map, including all entries, from ldap in order to serve it to autofs? We have something like 120000 users/groups, so that's not really feasible or desirable 8-/.
Am I understanding the operation correctly? Or is there some way to get it to look up map entries as necessary like it does for users/groups? It doesn't try to suck all of the users or all of the groups out of ldap at initialization, I'm not sure why it wants to do so for autofs maps.
Thanks...
On Sun, Mar 15, 2015 at 05:05:18PM -0700, Paul B. Henson wrote:
Not too long back, we noticed a mention in a RHEL6 release note that sssd supported autofs now. We currently run autofs directly against ldap, but just for fun we thought we'd see how well it played with sssd, which is using the same ldap system for users/groups.
Unfortunately, our initial try was unsuccessful. Based on reviewing the code, it looks like sssd wants to download the *entire* autofs map, including all entries, from ldap in order to serve it to autofs? We have something like 120000 users/groups, so that's not really feasible or desirable 8-/.
Am I understanding the operation correctly? Or is there some way to get it to look up map entries as necessary like it does for users/groups? It doesn't try to suck all of the users or all of the groups out of ldap at initialization, I'm not sure why it wants to do so for autofs maps.
SSSD would download whatever autofs would tell it to. If you increase debugging in the autofs responder and the domain section, you'd see the queries coming from the automounter to the autofs responder in the responder log and the LDAP searches in the domain log.
HTH
On Mon, Mar 16, 2015 at 09:14:26AM +0100, Jakub Hrozek wrote:
SSSD would download whatever autofs would tell it to. If you increase debugging in the autofs responder and the domain section, you'd see the queries coming from the automounter to the autofs responder in the responder log and the LDAP searches in the domain log.
Hmm, I'm not sure how to interpret that. While we haven't tested with the latest sssd, only the one shipping with RHEL6, I don't see any commits or code changes that would result in different behavior.
Specifically, the behavior I see is that on startup, sssd tries to download *every* entry in a map. autofs is configured to look at files, then sssd, and the file /etc/auto.master contains:
/user auto_user -fstype=nfs4,sec=krb5p,noresvport
Here are the queries sssd makes:
----- Mar 13 19:04:16 shelley slapd[22757]: conn=1333352 op=2 SRCH base="ou=automount, ou=service,dc=csupomona,dc=edu" scope=2 deref=3 filter="(&(ou=auto_user)(objectClass=organizationalUnit))"
Mar 13 19:04:16 shelley slapd[22757]: conn=1333352 op=2 SEARCH RESULT tag=101 err=0 nentries=1 text= -----
First, it looks for, and finds, the auto_user map in ldap.
----- Mar 13 19:04:16 shelley slapd[22757]: conn=1333352 op=3 SRCH base="ou=auto_user, ou=automount,ou=service,dc=csupomona,dc=edu" scope=2 deref=3 filter="(&(uid=*)(objectClass=top))"
Mar 13 19:04:16 shelley slapd[22757]: conn=1333352 op=3 SEARCH RESULT tag=101 err=4 nentries=100 text= -----
Next, it tries to download *every* entry in that map. The connection from sssd to ldap is resource restricted, so it only gets the first 100 (of the 120000) entries. This is further demonstrated by the sssd log:
(Thu Mar 12 18:12:42 2015) [sssd[autofs]] [sysdb_autofs_entries_by_map] (0x2000): found 100 entries for map auto_user
At this point, trying to access /user/<username>, where <username> was one of the 100 results returned, works fine. Any <username> that was not among the first 100 fails utterly with not found errors.
Am I missing something? The behavior I would expect is that a given automount key would be searched for when access was attempted to that directory, ie on demand. That is the behavior autofs exhibits when pointed directly at ldap rather than at sssd. If I try to access /user/fred, an ldap search is made for uid=fred. /user/bob, a search for uid=bob. autofs pointed directly at ldap *never* does a search for uid=*.
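To make the difference between the two search styles concrete, here is a minimal Python sketch. The filter shape is copied from the slapd log above, with the key substituted for the wildcard; the function names are hypothetical, and the exact filter that autofs builds when it talks to LDAP directly may differ.

```python
def escape_filter_value(value: str) -> str:
    """Escape the characters that RFC 4515 reserves inside a filter value."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\0": r"\00",
    }
    return "".join(replacements.get(ch, ch) for ch in value)


def key_filter(key: str, attr: str = "uid", object_class: str = "top") -> str:
    """On-demand style: match exactly one automount key."""
    return f"(&({attr}={escape_filter_value(key)})(objectClass={object_class}))"


def enumerate_filter(attr: str = "uid", object_class: str = "top") -> str:
    """Enumeration style: match every key in the map."""
    return f"(&({attr}=*)(objectClass={object_class}))"


if __name__ == "__main__":
    print(key_filter("fred"))
    print(enumerate_filter())
```

With a size limit on the connection, only the first style is guaranteed to find an arbitrary key, because each search returns at most one entry.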
Thanks...
On Mon, Mar 16, 2015 at 06:15:07PM -0700, Paul B. Henson wrote:
On Mon, Mar 16, 2015 at 09:14:26AM +0100, Jakub Hrozek wrote:
SSSD would download whatever autofs would tell it to. If you increase debugging in the autofs responder and the domain section, you'd see the queries coming from the automounter to the autofs responder in the responder log and the LDAP searches in the domain log.
Hmm, I'm not sure how to interpret that.
What I meant is that sssd only performs LDAP lookups based on what queries it receives through the automounter daemon.
While we haven't tested with the latest sssd, only the one shipping with RHEL6, I don't see any commits or code changes that would result in different behavior.
Right, the autofs code is mostly stable.
Am I missing something? The behavior I would expect is that a given automount key would be searched for when access was attempted to that directory, ie on demand. That is the behavior autofs exhibits when pointed directly at ldap rather than at sssd. If I try to access /user/fred, an ldap search is made for uid=fred. /user/bob, a search for uid=bob. autofs pointed directly at ldap *never* does a search for uid=*.
Sorry, I had to go and re-read the autofs LDAP code since we haven't touched it in years. Indeed it seems that we download all the keys at once. I don't remember why we did that and I didn't realize the plain automounter driver does this differently. Maybe we were trying to reduce the number of round-trips, I no longer remember that.
Ian, do you think it would make sense to only fetch keys as they are requested by either _sss_getautomntent_r() or _sss_getautomntbyname_r() as opposed to fetching them all when _sss_setautomntent() is called?
Is that how the 'native' ldap integration works?
On Tue, Mar 17, 2015 at 02:02:08PM +0100, Jakub Hrozek wrote:
Sorry, I meant to also CC Ian.
On Tue, 2015-03-17 at 14:06 +0100, Jakub Hrozek wrote:
Ian, do you think it would make sense to only fetch keys as they are requested by either _sss_getautomntent_r() or _sss_getautomntbyname_r() as opposed to fetching them all when _sss_setautomntent() is called?
Is that how the 'native' ldap integration works?
Not quite.
If an indirect map has the "browse" (or ghost) option the entire map must be read so the keys are known and can be used to pre-create the mount point directories. Otherwise autofs tries to only read the map entry upon lookup and update the entry if needed.
Direct maps must always be read completely because the direct mount triggers must be created for each map entry at startup and refresh. Direct maps are only updated upon receiving a HUP signal.
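The two rules above can be condensed into a small decision function. This is only an illustrative Python sketch: the function name and the "direct"/"indirect" strings are hypothetical, not part of autofs.

```python
def needs_full_map(map_type: str, browse: bool = False) -> bool:
    """Return True when the automounter has to read every entry of a map.

    map_type is "direct" or "indirect" (labels assumed for this sketch);
    browse reflects the "browse" (ghost) option on an indirect map.
    """
    if map_type == "direct":
        # Mount triggers must exist for every entry at startup and refresh.
        return True
    if map_type == "indirect":
        # Only browse mode needs all keys, to pre-create the directories.
        return browse
    raise ValueError(f"unknown map type: {map_type!r}")


if __name__ == "__main__":
    # An indirect map without browse, as in the /user example in this thread.
    print(needs_full_map("indirect", browse=False))
```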
What we do now to read an sss autofs map is call setautomntent() and repeatedly call getautomntent_r() until we get an ENOENT to signify we have the entire map.
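As a rough illustration of that read loop, the Python sketch below iterates until ENOENT. FakeSssMap is a hypothetical in-memory stand-in; the real calls are C functions in the sss client library, and their signatures are not reproduced here.

```python
import errno


class FakeSssMap:
    """Hypothetical in-memory stand-in for the sss autofs client calls."""

    def __init__(self, entries):
        self._entries = list(entries)
        self._cursor = None

    def setautomntent(self):
        self._cursor = 0

    def getautomntent_r(self):
        if self._cursor is None:
            raise RuntimeError("setautomntent() was not called")
        if self._cursor >= len(self._entries):
            # ENOENT signals that the whole map has been returned.
            raise OSError(errno.ENOENT, "no more entries")
        entry = self._entries[self._cursor]
        self._cursor += 1
        return entry

    def endautomntent(self):
        self._cursor = None


def read_whole_map(source):
    """Call getautomntent_r() repeatedly until ENOENT, collecting entries."""
    source.setautomntent()
    entries = {}
    try:
        while True:
            try:
                key, value = source.getautomntent_r()
            except OSError as exc:
                if exc.errno == errno.ENOENT:
                    break
                raise
            entries[key] = value
    finally:
        source.endautomntent()
    return entries
```

The cost of this loop grows with the size of the map, no matter how many keys are ever mounted.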
I think we discussed this at the time and, given the cases, decided it was best for sss to read the entire map and cache it since it might need to be able to supply the entire map and can't know if that will be the case.
Ian
From: Ian Kent Sent: Tuesday, March 17, 2015 6:37 AM
If an indirect map has the "browse" (or ghost) option the entire map
[...]
Direct maps must always be read completely because the direct mount
Ah, yes, we do not use the ghost option or direct maps, so I did not consider that use case.
I think we discussed this at the time and, given the cases, decided it was best for sss to read the entire map and cache it since it might need to be able to supply the entire map and can't know if that will be the case.
On the other hand, there are cases such as mine where not only is reading the entire map unnecessary but inadvisable and practically infeasible 8-/. For my deployment, using sssd to access autofs data in ldap rather than autofs accessing it directly would drastically increase the load on the LDAP server, which seems completely cross purpose to the existence of sssd. Currently there is sporadic load as individual entries (probably no more than 10-1000 or so in a given day, depending on the server) are looked up. With sssd, all 120000 entries would be pulled over on an ongoing basis throughout the day (I'm not sure how often it refreshes its local cache).
That's okay though, I don't really have any strong need to use sssd for autofs. I just thought it would be interesting to look into given the new RHEL6 support from the perspective of centralizing all LDAP lookups into one place. Assuming there are no planned changes to the native autofs ldap support, that meets our needs fine.
Thanks…
On 03/17/2015 04:04 PM, Paul B. Henson wrote:
That's okay though, I don't really have any strong need to use sssd for autofs. I just thought it would be interesting to look into given the new RHEL6 support from the perspective of centralizing all LDAP lookups into one place. Assuming there are no planned changes to the native autofs ldap support, that meets our needs fine.
I think we should still file an RFE for autofs support in SSSD to handle your case in the future. It is a valid case, and there is a reason why SSSD is better than the direct case (mainly caching). I am not saying we will fix it quickly, but if we see more cases like yours it will bump the priority.
Thanks…
sssd-users mailing list sssd-users@lists.fedorahosted.org https://lists.fedorahosted.org/mailman/listinfo/sssd-users
From: Dmitri Pal Sent: Tuesday, March 17, 2015 3:00 PM
I think we still should file an RFE for autofs support in SSSD to handle your case in future.
Sure; do you want me to open a Red Hat support case to trigger that? Or should I just create a ticket in the open source tracking system at https://fedorahosted.org/sssd/newticket? Or do you mean you're just going to make a note of it yourself and I don't need to do anything in particular?
Thanks…
On 03/17/2015 08:49 PM, Paul B. Henson wrote:
From: Dmitri Pal Sent: Tuesday, March 17, 2015 3:00 PM
I think we still should file an RFE for autofs support in SSSD to handle your case in future.
Sure; do you want me to open a Red Hat support case to trigger that? Or should I just create a ticket in the open source tracking system at https://fedorahosted.org/sssd/newticket?
Yes a ticket would be nice.
Or do you mean you're just going to make a note of it yourself and I don't need to do anything in particular?
Thanks…
From: Dmitri Pal Sent: Wednesday, March 18, 2015 11:50 AM
Yes a ticket would be nice.
Okay, ticket created:
https://fedorahosted.org/sssd/ticket/2607
I also opened Red Hat support case 1386246 referencing the ticket, requesting that it be fixed and backported to RHEL6, for whatever that is worth :).
Thanks…
On Tue, 2015-03-17 at 13:04 -0700, Paul B. Henson wrote:
Ah, yes, we do not use the ghost option or direct maps, so I did not consider that use case.
LOL, and understandably with that many map entries.
I think we discussed this at the time and, given the cases, decided it was best for sss to read the entire map and cache it since it might need to be able to supply the entire map and can't know if that will be the case.
On the other hand, there are cases such as mine where not only is reading the entire map unnecessary but inadvisable and practically infeasible 8-/. For my deployment, using sssd to access autofs data in ldap rather than autofs accessing it directly would drastically increase the load on the LDAP server, which seems completely cross purpose to the existence of sssd. Currently there is sporadic load as individual entries (probably no more than 10-1000 or so in a given day, depending on the server) are looked up. With sssd, all 120000 entries would be pulled over on an ongoing basis throughout the day (I'm not sure how often it refreshes its local cache).
Indeed, IIRC my large map testing showed that somewhere less than 30000 is about the practical limit for reading the entire map. There would be other side effects in autofs spending that much time reading the entire map too so I don't think it will ever be possible in your case.
That's okay though, I don't really have any strong need to use sssd for autofs. I just thought it would be interesting to look into given the new RHEL6 support from the perspective of centralizing all LDAP lookups into one place. Assuming there are no planned changes to the native autofs ldap support, that meets our needs fine.
I can't change the way the lookup and read map works very much specifically because of the problems really large maps bring. If there are any changes made they must also cover the large map requirement.
But we haven't heard from Jakub yet, it may be possible to change sss to lookup and read the map on demand rather than all at once. Not sure though given I'm not familiar with the wider design of sss.
Ian
On Wed, Mar 18, 2015 at 09:45:29AM +0800, Ian Kent wrote:
But we haven't heard from Jakub yet, it may be possible to change sss to lookup and read the map on demand rather than all at once. Not sure though given I'm not familiar with the wider design of sss.
Sorry, the discussion fell out of my timezone :-)
Currently we implement:
    setautomntent(map_name)
    endautomntent(map_name)
which select and de-select a map to work with, then also:
    getautomntent(map_name, cursor, max_entries)
which is an iterator-like interface that returns up to max_entries keys starting from cursor, and finally:
    getautomntbyname(map_name, key_name)
which returns the data of key_name inside map_name.
I'm not sure if we can optimize getautomntent() to read the next entries on demand, I think we need to fetch the whole map there. But it would be entirely doable to add another LDAP request that would be invoked by getautomntbyname() and would fetch the single key.
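A minimal Python sketch of that idea follows: enumeration still fetches the whole map, while a by-name lookup fetches a single key and caches it. FakeDirectory and MapCache are hypothetical names for illustration, and nothing here reflects how the sssd back end is actually structured.

```python
class FakeDirectory:
    """Hypothetical stand-in for the LDAP server; counts entries sent."""

    def __init__(self, entries):
        self._entries = dict(entries)
        self.entries_returned = 0

    def search_all(self):
        self.entries_returned += len(self._entries)
        return dict(self._entries)

    def search_key(self, key):
        value = self._entries.get(key)
        if value is not None:
            self.entries_returned += 1
        return value


class MapCache:
    """Caches one map; enumerates only when iteration is requested."""

    def __init__(self, directory):
        self._directory = directory
        self._cache = {}
        self._complete = False

    def getautomntent(self, cursor, max_entries):
        # Iteration needs every key, so fetch the whole map once.
        if not self._complete:
            self._cache = self._directory.search_all()
            self._complete = True
        keys = sorted(self._cache)[cursor:cursor + max_entries]
        return [(k, self._cache[k]) for k in keys]

    def getautomntbyname(self, key):
        if key in self._cache:
            return self._cache[key]
        if self._complete:
            return None  # the whole map is cached and the key is absent
        # Single-key request: one entry crosses the wire, not the map.
        value = self._directory.search_key(key)
        if value is not None:
            self._cache[key] = value
        return value
```

In this sketch a by-name lookup on a map with 1000 entries transfers one entry, and the full fetch happens only if getautomntent() is ever called.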
Would that provide more efficient API towards automounter?
From: Jakub Hrozek Sent: Wednesday, March 18, 2015 12:29 AM
I'm not sure if we can optimize getautomntent() to read the next entries on demand, I think we need to fetch the whole map there.
Theoretically you could use the pagedResultsControl:
https://www.ietf.org/rfc/rfc2696.txt
I don't know if it would be worth the effort though. If someone has a use case that requires reading the entire map, they probably just need to keep the size of the map small enough that it's not an issue to read it all.
But it would be entirely doable to add another LDAP request that would be invoked by getautomntbyname() and would fetch the single key.
Would that provide more efficient API towards automounter?
Absolutely. My suggestion, as in my reply to Ian, would be to treat autofs maps the same way as users/groups are treated. Do not enumerate by default, and lookup entries as necessary. Also, modify the current enumerate config item to allow enumeration to be enabled/disabled separately for users, groups, and autofs maps. Perhaps extended to allow a list of backends in addition to true/false, where true means enable everything, false means enable nothing, and a specific list "passwd,group,autofs" or "autofs" means enable enumeration only for those data sources.
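To show what such an extended setting could look like, here is a hypothetical Python parser for the proposed values. This syntax is only the suggestion made above, not an existing sssd option format; the source names "passwd", "group" and "autofs" are the ones used in that suggestion.

```python
KNOWN_SOURCES = frozenset({"passwd", "group", "autofs"})


def parse_enumerate(value: str) -> frozenset:
    """Return the set of data sources for which enumeration is enabled.

    "true" enables everything, "false" enables nothing, and a
    comma-separated list enables only the named sources.
    """
    text = value.strip().lower()
    if text == "true":
        return KNOWN_SOURCES
    if text == "false":
        return frozenset()
    names = frozenset(part.strip() for part in text.split(",") if part.strip())
    if not names:
        raise ValueError("empty enumerate value")
    unknown = names - KNOWN_SOURCES
    if unknown:
        raise ValueError(f"unknown data source(s): {sorted(unknown)}")
    return names


if __name__ == "__main__":
    print(sorted(parse_enumerate("passwd,group")))
```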
Hmm. Or maybe that's too complicated? Unlike user/groups, where any random user can run setXXent/getXXent and result in enumeration, are autofs map lookups restricted to autofs? Presumably autofs only calls getautomntent when it actually needs the entire map, so maybe an explicit enumeration enable/disable in sssd isn't required. If someone uses an autofs feature that requires the entire map, the entire map will be downloaded, otherwise it won't be. On further thought, an explicit flag might be confusing when someone tries to use direct maps or browse mode and they just mysteriously don't work because sssd has enumeration disabled 8-/.
I will open a ticket for an RFE as Dmitri requested for this issue.
Thanks.
On Wed, Mar 18, 2015 at 12:14:32PM -0700, Paul B. Henson wrote:
From: Jakub Hrozek Sent: Wednesday, March 18, 2015 12:29 AM
I'm not sure if we can optimize getautomntent() to read the next entries on demand, I think we need to fetch the whole map there.
Theoretically you could use the pagedResultsControl:
https://www.ietf.org/rfc/rfc2696.txt
I don't know if it would be worth the effort though. If someone has a use case that requires reading the entire map, they probably just need to keep the size of the map small enough that it's not an issue to read it all.
Yes, that's what we use for retrieving large groups. But for automounter maps it's not really practically usable I think because we would have to keep the cookie across requests to the back end.
With large group lookups, we simply do several small internal lookups as part of a single large group lookup, so maintaining the cookie is trivial.
But it would be entirely doable to add another LDAP request that would be invoked by getautomntbyname() and would fetch the single key.
Would that provide more efficient API towards automounter?
Absolutely. My suggestion, as in my reply to Ian, would be to treat autofs maps the same way as users/groups are treated. Do not enumerate by default, and lookup entries as necessary. Also, modify the current enumerate config item to allow enumeration to be enabled/disabled separately for users, groups, and autofs maps. Perhaps extended to allow a list of backends in addition to true/false, where true means enable everything, false means enable nothing, and a specific list "passwd,group,autofs" or "autofs" means enable enumeration only for those data sources.
Hmm. Or maybe that's too complicated? Unlike user/groups, where any random user can run setXXent/getXXent and result in enumeration, are autofs map lookups restricted to autofs? Presumably autofs only calls getautomntent when it actually needs the entire map, so maybe an explicit enumeration enable/disable in sssd isn't required. If someone uses an autofs feature that requires the entire map, the entire map will be downloaded, otherwise it won't be. On further thought, an explicit flag might be confusing when someone tries to use direct maps or browse mode and they just mysteriously don't work because sssd has enumeration disabled 8-/.
The problem is whether the /client/ which is the automounter in this case can/should be modified to leverage the new way of doing things.
Changing the SSSD is one part, but automounter needs to be changed as well if I understood Ian correctly as right now the client only iterates over the map.
I will open a ticket for an RFE as Dmitri requested for this issue.
OK
From: Jakub Hrozek Sent: Wednesday, March 18, 2015 12:55 PM
I don't know if it would be worth the effort though. If someone has a use case that requires reading the entire map, they probably just need to keep the size of the map small enough that it's not an issue to read it all.
Yes, that's what we use for retrieving large groups. But for automounter maps it's not really practically usable I think because we would have to keep the cookie across requests to the back end.
Hmm, are you talking about retrieving the members of a group, or multiple groups? The former is actually ranges, it is the latter that would be the page control. Also, unlike user/groups, where any random number of people could be calling getXXent, for automount maps, you have one and exactly one client. Just store the cookie statically somewhere when setautomountent is called, and use it whenever getautomountent is called and you are out of entries to fetch the next set. Clear the cookie when endautomountent is called...
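The cookie handling described here can be sketched in Python as follows. FakePagedServer only imitates the RFC 2696 convention that an empty cookie starts a search and an empty cookie in the response marks the last page; both class names are hypothetical and this is not sssd code.

```python
class FakePagedServer:
    """Hypothetical server that returns results one page at a time."""

    def __init__(self, entries, page_size):
        self._entries = list(entries)
        self._page_size = page_size

    def search_page(self, cookie):
        # An empty cookie means "start from the beginning".
        start = int(cookie.decode()) if cookie else 0
        end = start + self._page_size
        page = self._entries[start:end]
        # An empty cookie in the response means "no more pages".
        next_cookie = str(end).encode() if end < len(self._entries) else b""
        return page, next_cookie


class PagedMapReader:
    """Keeps the paging cookie between set/get/end calls."""

    def __init__(self, server):
        self._server = server
        self._cookie = None
        self._buffer = []
        self._done = True

    def setautomntent(self):
        self._cookie = b""
        self._buffer = []
        self._done = False

    def getautomntent(self):
        # Fetch the next page only when the local buffer has run dry.
        if not self._buffer and not self._done:
            page, self._cookie = self._server.search_page(self._cookie)
            self._buffer = list(page)
            self._done = not self._cookie
        if not self._buffer:
            return None
        return self._buffer.pop(0)

    def endautomntent(self):
        # Clear the cookie so the next setautomntent() starts over.
        self._cookie = None
        self._buffer = []
        self._done = True
```

This relies on there being a single client per map, as argued above; with several concurrent iterators the cookie would have to be kept per caller.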
The problem is whether the /client/ which is the automounter in this case can/should be modified to leverage the new way of doing things.
Changing the SSSD is one part, but automounter needs to be changed as well if I understood Ian correctly as right now the client only iterates over the map.
My understanding is that automounter has plug-ins for accessing each backend, specifically on RHEL6 there is
/usr/lib64/autofs/lookup_sss.so
for using sssd. However, I believe that backend only iterates because sssd only provided iteration 8-/. Other backends, such as the LDAP backend, are more efficient and look up entries as needed. So yes, I guess the automounter sss lookup backend would need to be updated to match the new feature set of the sssd provider, but I can't imagine that would be particularly complicated or difficult as it would simply be duplicating existing functionality in other backends.
Of course, that does take the probability of me ever actually seeing this feature in RHEL 6 from 1/infinity to 1/infinity+1 given there would actually be two RFE's to backport 8-/.
Ian, I don't think there is a separate bug tracker for autofs? Once I open the sssd ticket, should I just create one on bugzilla.redhat.com referencing it with the request to enhance autofs to support the new functionality once it is implemented? Or would it be better to open a Red Hat support case and have somebody there open the bug?
Thanks.
On Wed, Mar 18, 2015 at 01:18:21PM -0700, Paul B. Henson wrote:
From: Jakub Hrozek Sent: Wednesday, March 18, 2015 12:55 PM
I don't know if it would be worth the effort though. If someone has a use case that requires reading the entire map, they probably just need to keep the size of the map small enough that it's not an issue to read it all.
Yes, that's what we use for retrieving large groups. But for automounter maps it's not really practically usable I think because we would have to keep the cookie across requests to the back end.
Hmm, are you talking about retrieving the members of a group, or multiple groups? The former is actually ranges, it is the latter that would be the page control. Also, unlike user/groups, where any random number of people could be calling getXXent, for automount maps, you have one and exactly one client. Just store the cookie statically somewhere when setautomountent is called, and use it whenever getautomountent is called and you are out of entries to fetch the next set. Clear the cookie when endautomountent is called...
You're right, we only use the cookie when we expect multiple entries.
The problem is whether the /client/ which is the automounter in this case can/should be modified to leverage the new way of doing things.
Changing the SSSD is one part, but automounter needs to be changed as well if I understood Ian correctly as right now the client only iterates over the map.
My understanding is that automounter has plug-ins for accessing each backend, specifically on RHEL6 there is
/usr/lib64/autofs/lookup_sss.so
for using sssd. However, I believe that backend only iterates because sssd only provided iteration 8-/.
yes, that's what I meant.
Other backends, such as the LDAP backend, are more efficient and look up entries as needed. So yes, I guess the automounter sss lookup backend would need to be updated to match the new feature set of the sssd provider, but I can't imagine that would be particularly complicated or difficult as it would simply be duplicating existing functionality in other backends.
Of course, that does take the probability of me ever actually seeing this feature in RHEL 6 from 1/infinity to 1/infinity+1 given there would actually be two RFE's to backport 8-/.
Ian, I don't think there is a separate bug tracker for autofs? Once I open the sssd ticket, should I just create one on bugzilla.redhat.com referencing it with the request to enhance autofs to support the new functionality once it is implemented? Or would it be better to open a Red Hat support case and have somebody there open the bug?
Thanks.
On Wed, 2015-03-18 at 13:18 -0700, Paul B. Henson wrote:
Of course, that does take the probability of me ever actually seeing this feature in RHEL 6 from 1/infinity to 1/infinity+1 given there would actually be two RFE's to backport 8-/.
Ian, I don't think there is a separate bug tracker for autofs? Once I open the sssd ticket, should I just create one on bugzilla.redhat.com referencing it with the request to enhance autofs to support the new functionality once it is implemented? Or would it be better to open a Red Hat support case and have somebody there open the bug?
There isn't, the mailing list is used for upstream requests.
But the truth is that RHEL users have far more demanding environments so RHEL tends to drive autofs development.
As far as logging a case against RHEL goes, either way will work, but if you can navigate front line support and get them to escalate it without too much effort, that is the recommended process.
The risk with logging a bug directly in Bugzilla is that it is necessarily the lowest priority for me, because people with subscriptions are meant to log cases with support, and those bugs and things that come in via other supported avenues have to be given priority.
Ian
On Wed, 2015-03-18 at 20:54 +0100, Jakub Hrozek wrote:
On Wed, Mar 18, 2015 at 12:14:32PM -0700, Paul B. Henson wrote:
From: Jakub Hrozek Sent: Wednesday, March 18, 2015 12:29 AM
I'm not sure if we can optimize getautomntent() to read the next entries on demand, I think we need to fetch the whole map there.
Theoretically you could use the pagedResultsControl:
https://www.ietf.org/rfc/rfc2696.txt
I don't know if it would be worth the effort though. If someone has a use case that requires reading the entire map, they probably just need to keep the size of the map small enough that it's not an issue to read it all.
Yes, that's what we use for retrieving large groups. But for automounter maps it's not really practically usable I think because we would have to keep the cookie across requests to the back end.
With large group lookups, we simply do several small internal lookups as part of a single large group lookup, so maintaining the cookie is trivial.
But it would be entirely doable to add another LDAP request that would be invoked by getautomntbyname() and would fetch the single key.
Would that provide more efficient API towards automounter?
Absolutely. My suggestion, as in my reply to Ian, would be to treat autofs maps the same way as users/groups are treated. Do not enumerate by default, and lookup entries as necessary. Also, modify the current enumerate config item to allow enumeration to be enabled/disabled separately for users, groups, and autofs maps. Perhaps extended to allow a list of backends in addition to true/false, where true means enable everything, false means enable nothing, and a specific list "passwd,group,autofs" or "autofs" means enable enumeration only for those data sources.
Hmm. Or maybe that's too complicated? Unlike user/groups, where any random user can run setXXent/getXXent and result in enumeration, are autofs map lookups restricted to autofs? Presumably autofs only calls getautomntent when it actually needs the entire map, so maybe an explicit enumeration enable/disable in sssd isn't required. If someone uses an autofs feature that requires the entire map, the entire map will be downloaded, otherwise it won't be. On further thought, an explicit flag might be confusing when someone tries to use direct maps or browse mode and they just mysteriously don't work because sssd has enumeration disabled 8-/.
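For what the extended enumerate setting floated above might look like, here is a rough Python sketch of parsing such a value. This is purely hypothetical: the thread describes the existing item as a plain true/false switch, and `parse_enumerate` and `KNOWN_SOURCES` are made-up names, not anything in sssd.

```python
from typing import FrozenSet

# Data sources named in the proposal above.
KNOWN_SOURCES: FrozenSet[str] = frozenset({"passwd", "group", "autofs"})


def parse_enumerate(value: str) -> FrozenSet[str]:
    """Return the set of data sources for which enumeration is enabled."""
    text = value.strip().lower()
    if text == "true":
        return KNOWN_SOURCES        # true: enable everything
    if text == "false":
        return frozenset()          # false: enable nothing
    # Otherwise treat the value as a comma-separated list of sources.
    sources = frozenset(part.strip() for part in text.split(",") if part.strip())
    if not sources:
        raise ValueError("empty enumerate value")
    unknown = sources - KNOWN_SOURCES
    if unknown:
        raise ValueError(f"unknown data source(s): {sorted(unknown)}")
    return sources
```

A value of "autofs" would then enable enumeration for automounter maps only, leaving users and groups to be looked up on demand.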
The problem is whether the /client/ which is the automounter in this case can/should be modified to leverage the new way of doing things.
Changing the SSSD is one part, but automounter needs to be changed as well if I understood Ian correctly as right now the client only iterates over the map.
Not sure I understand what the new way of doing things would be.
First, only automount(8) should be doing lookups.
The way it is now autofs must always call setautomntent() to pass the map name to sss and then an endautomntent() to finish.
When an entire map read is needed autofs will always use getautomntent() to iterate over the map.
When a key lookup is done autofs will always call getautomntbyname().
So there's definite distinction between the two cases.
Clearly if there are changes to return codes then I can deal with that. If you want to change the semantics I can deal with that too but it will introduce version dependencies I'd rather avoid.
Ian
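The call ordering Ian describes (always setautomntent() first, then either getautomntent() iteration or getautomntbyname() key lookups, then endautomntent()) can be modelled with a small Python sketch. This is only an illustration of the ordering, not the real lookup_sss module or the sss client library; `AutomountSession` and the in-memory `maps` dictionary are hypothetical.

```python
from typing import Dict, Iterator, List, Optional, Tuple


class AutomountSession:
    """Toy model enforcing set -> (iterate | key lookup)* -> end."""

    def __init__(self, maps: Dict[str, Dict[str, str]]) -> None:
        self._maps = maps
        self._current: Optional[str] = None
        self._iterator: Optional[Iterator[Tuple[str, str]]] = None
        self.calls: List[str] = []  # records which entry points were used

    def _require_open(self) -> None:
        if self._current is None:
            raise RuntimeError("setautomntent() must be called first")

    def setautomntent(self, map_name: str) -> None:
        # Passes the map name in; nothing is read yet in this model.
        if map_name not in self._maps:
            raise KeyError(map_name)
        self._current = map_name
        self._iterator = None
        self.calls.append("set")

    def getautomntent(self) -> Optional[Tuple[str, str]]:
        # Whole-map read: iterate over every entry, None at the end.
        self._require_open()
        if self._iterator is None:
            self._iterator = iter(sorted(self._maps[self._current].items()))
        self.calls.append("getent")
        return next(self._iterator, None)

    def getautomntbyname(self, key: str) -> Optional[str]:
        # Key lookup: a single entry, no iteration involved.
        self._require_open()
        self.calls.append("byname")
        return self._maps[self._current].get(key)

    def endautomntent(self) -> None:
        self._require_open()
        self._current = None
        self._iterator = None
        self.calls.append("end")
```

A key lookup therefore shows up as the sequence set, byname, end, with no getent call at all, which is the distinction between the two cases.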
On Thu, Mar 19, 2015 at 11:05:15AM +0800, Ian Kent wrote:
On Wed, 2015-03-18 at 20:54 +0100, Jakub Hrozek wrote:
On Wed, Mar 18, 2015 at 12:14:32PM -0700, Paul B. Henson wrote:
From: Jakub Hrozek Sent: Wednesday, March 18, 2015 12:29 AM
I'm not sure if we can optimize getautomntent() to read the next entries on demand, I think we need to fetch the whole map there.
Theoretically you could use the pagedResultsControl:
https://www.ietf.org/rfc/rfc2696.txt
I don't know if it would be worth the effort though. If someone has a use case that requires reading the entire map, they probably just need to keep the size of the map small enough that it's not an issue to read it all.
Yes, that's what we use for retrieving large groups. But for automounter maps it's not really practically usable I think because we would have to keep the cookie across requests to the back end.
With large group lookups, we simply do several small internal lookups as part of a single large group lookup, so maintaining the cookie is trivial.
But it would be entirely doable to add another LDAP request that would be invoked by getautomntbyname() and would fetch the single key.
Would that provide more efficient API towards automounter?
Absolutely. My suggestion, as in my reply to Ian, would be to treat autofs maps the same way as users/groups are treated. Do not enumerate by default, and lookup entries as necessary. Also, modify the current enumerate config item to allow enumeration to be enabled/disabled separately for users, groups, and autofs maps. Perhaps extended to allow a list of backends in addition to true/false, where true means enable everything, false means enable nothing, and a specific list "passwd,group,autofs" or "autofs" means enable enumeration only for those data sources.
Hmm. Or maybe that's too complicated? Unlike user/groups, where any random user can run setXXent/getXXent and result in enumeration, are autofs map lookups restricted to autofs? Presumably autofs only calls getautomntent when it actually needs the entire map, so maybe an explicit enumeration enable/disable in sssd isn't required. If someone uses an autofs feature that requires the entire map, the entire map will be downloaded, otherwise it won't be. On further thought, an explicit flag might be confusing when someone tries to use direct maps or browse mode and they just mysteriously don't work because sssd has enumeration disabled 8-/.
The problem is whether the /client/ which is the automounter in this case can/should be modified to leverage the new way of doing things.
Changing the SSSD is one part, but automounter needs to be changed as well if I understood Ian correctly as right now the client only iterates over the map.
Not sure I understand what the new way of doing things would be.
First, only automount(8) should be doing lookups.
The way it is now autofs must always call setautomntent() to pass the map name to sss and then an endautomntent() to finish.
When an entire map read is needed autofs will always use getautomntent() to iterate over the map.
When a key lookup is done autofs will always call getautomntbyname().
OK, so the autofs client module already calls getautomntbyname as appropriate and if we switched sssd to only read the requested key on receiving getautomntbyname, you wouldn't need to do any changes to SSSD?
That's the part that wasn't clear to me..and if that's the case, we should do the SSSD changes to make SSSD perform better in large environments..
Paul, can you file the SSSD bug either way?
So there's definite distinction between the two cases.
Clearly if there are changes to return codes then I can deal with that. If you want to change the semantics I can deal with that too but it will introduce version dependencies I'd rather avoid.
Ian
On Thu, 2015-03-19 at 10:09 +0100, Jakub Hrozek wrote:
On Thu, Mar 19, 2015 at 11:05:15AM +0800, Ian Kent wrote:
On Wed, 2015-03-18 at 20:54 +0100, Jakub Hrozek wrote:
On Wed, Mar 18, 2015 at 12:14:32PM -0700, Paul B. Henson wrote:
From: Jakub Hrozek Sent: Wednesday, March 18, 2015 12:29 AM
I'm not sure if we can optimize getautomntent() to read the next entries on demand, I think we need to fetch the whole map there.
Theoretically you could use the pagedResultsControl:
https://www.ietf.org/rfc/rfc2696.txt
I don't know if it would be worth the effort though. If someone has a use case that requires reading the entire map, they probably just need to keep the size of the map small enough that it's not an issue to read it all.
Yes, that's what we use for retrieving large groups. But for automounter maps it's not really practically usable I think because we would have to keep the cookie across requests to the back end.
With large group lookups, we simply do several small internal lookups as part of a single large group lookup, so maintaining the cookie is trivial.
But it would be entirely doable to add another LDAP request that would be invoked by getautomntbyname() and would fetch the single key.
Would that provide more efficient API towards automounter?
Absolutely. My suggestion, as in my reply to Ian, would be to treat autofs maps the same way as users/groups are treated. Do not enumerate by default, and lookup entries as necessary. Also, modify the current enumerate config item to allow enumeration to be enabled/disabled separately for users, groups, and autofs maps. Perhaps extended to allow a list of backends in addition to true/false, where true means enable everything, false means enable nothing, and a specific list "passwd,group,autofs" or "autofs" means enable enumeration only for those data sources.
Hmm. Or maybe that's too complicated? Unlike user/groups, where any random user can run setXXent/getXXent and result in enumeration, are autofs map lookups restricted to autofs? Presumably autofs only calls getautomntent when it actually needs the entire map, so maybe an explicit enumeration enable/disable in sssd isn't required. If someone uses an autofs feature that requires the entire map, the entire map will be downloaded, otherwise it won't be. On further thought, an explicit flag might be confusing when someone tries to use direct maps or browse mode and they just mysteriously don't work because sssd has enumeration disabled 8-/.
The problem is whether the /client/ which is the automounter in this case can/should be modified to leverage the new way of doing things.
Changing the SSSD is one part, but automounter needs to be changed as well if I understood Ian correctly as right now the client only iterates over the map.
Not sure I understand what the new way of doing things would be.
First, only automount(8) should be doing lookups.
The way it is now autofs must always call setautomntent() to pass the map name to sss and then an endautomntent() to finish.
When an entire map read is needed autofs will always use getautomntent() to iterate over the map.
When a key lookup is done autofs will always call getautomntbyname().
OK, so the autofs client module already calls getautomntbyname as appropriate and if we switched sssd to only read the requested key on receiving getautomntbyname, you wouldn't need to do any changes to SSSD?
Don't think so, by the sound of it. I must do specific key lookups because I avoid reading the entire map whenever possible for the same reason that's come up here. Even if I read the entire map I still do individual key lookups to check if the entry has changed (for indirect maps that is).
That's the part that wasn't clear to me..and if that's the case, we should do the SSSD changes to make SSSD perform better in large environments..
Paul, can you file the SSSD bug either way?
So there's definite distinction between the two cases.
Clearly if there are changes to return codes then I can deal with that. If you want to change the semantics I can deal with that too but it will introduce version dependencies I'd rather avoid.
Ian
From: Jakub Hrozek Sent: Thursday, March 19, 2015 2:10 AM
OK, so the autofs client module already calls getautomntbyname as appropriate and if we switched sssd to only read the requested key on receiving getautomntbyname, you wouldn't need to do any changes to SSSD?
There are three components in play here; the generic autofs code, the autofs sss lookup module, and sssd. The generic autofs code is definitely good to go, as it works fine with the autofs ldap lookup module. I took a quick look at the autofs sss lookup module, and I think that does not need any changes. As far as I can tell, it already passes the separate setmntent, getmntent, and getmntbyname calls to the sssd backend, so the change under discussion should be transparent to it. So it looks like Ian is off the hook on this one ;), thanks though for looking at it with us.
Paul, can you file the SSSD bug either way?
Done:
https://fedorahosted.org/sssd/ticket/2607
I also opened Red Hat support case 1386246 referencing the ticket and requesting a fix/backport to RHEL6.
It doesn't seem like this change should be very difficult, basically rather than reading the entire map on the setmntent call, it should only note the map name. Then for getmntbyname calls it can do individual lookups, and only pull over the entire map if getmntent is called. That's basically the same as is already done for users and groups. I might take a look at it myself, but given my lack of familiarity with the sssd code it would probably take me a lot longer than one of you guys to pound it out :).
Thanks.
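As a rough illustration of the difference this change would make, here is a Python sketch comparing a backend that reads the whole map when the map is opened against one that only notes the map name and fetches keys on demand. It is a toy model under stated assumptions, not sssd code: `CountingDirectory`, `EagerBackend`, `LazyBackend` and `read_whole_map` are all hypothetical names, and the directory is just a dictionary that counts how many entries it hands out.

```python
from typing import Dict, List, Optional, Tuple


class CountingDirectory:
    """Fake directory that counts how many entries it sends to clients."""

    def __init__(self, maps: Dict[str, Dict[str, str]]) -> None:
        self._maps = maps
        self.entries_sent = 0

    def search_all(self, map_name: str) -> Dict[str, str]:
        entries = dict(self._maps[map_name])
        self.entries_sent += len(entries)
        return entries

    def search_key(self, map_name: str, key: str) -> Optional[str]:
        value = self._maps[map_name].get(key)
        if value is not None:
            self.entries_sent += 1
        return value


class EagerBackend:
    """Models the behaviour complained about: read everything up front."""

    def __init__(self, directory: CountingDirectory) -> None:
        self._directory = directory
        self._entries: Dict[str, str] = {}

    def setautomntent(self, map_name: str) -> None:
        self._entries = self._directory.search_all(map_name)

    def getautomntbyname(self, key: str) -> Optional[str]:
        return self._entries.get(key)


class LazyBackend:
    """Models the proposal: note the map name, fetch only what is asked for."""

    def __init__(self, directory: CountingDirectory) -> None:
        self._directory = directory
        self._map_name: Optional[str] = None

    def setautomntent(self, map_name: str) -> None:
        self._map_name = map_name  # no directory traffic here

    def getautomntbyname(self, key: str) -> Optional[str]:
        return self._directory.search_key(self._map_name, key)

    def read_whole_map(self) -> List[Tuple[str, str]]:
        # Only an explicit enumeration pulls over the entire map.
        return sorted(self._directory.search_all(self._map_name).items())
```

In this model a single key lookup against a 1000-entry map transfers 1000 entries with the eager backend and one entry with the lazy one.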
On 19 Mar 2015, at 19:34, Paul B. Henson henson@acm.org wrote:
From: Jakub Hrozek Sent: Thursday, March 19, 2015 2:10 AM
OK, so the autofs client module already calls getautomntbyname as appropriate and if we switched sssd to only read the requested key on receiving getautomntbyname, you wouldn't need to do any changes to SSSD?
There are three components in play here; the generic autofs code, the autofs sss lookup module, and sssd. The generic autofs code is definitely good to go, as it works fine with the autofs ldap lookup module. I took a quick look at the autofs sss lookup module, and I think that does not need any changes. As far as I can tell, it already passes the separate setmntent, getmntent, and getmntbyname calls to the sssd backend, so the change under discussion should be transparent to it. So it looks like Ian is off the hook on this one ;), thanks though for looking at it with us.
Paul, can you file the SSSD bug either way?
Done:
https://fedorahosted.org/sssd/ticket/2607
I also opened Red Hat support case 1386246 referencing the ticket and requesting a fix/backport to RHEL6.
Thank you, I added a comment to speed up the process.
It doesn't seem like this change should be very difficult, basically rather than reading the entire map on the setmntent call, it should only note the map name. Then for getmntbyname calls it can do individual lookups, and only pull over the entire map if getmntent is called. That's basically the same as is already done for users and groups. I might take a look at it myself, but given my lack of familiarity with the sssd code it would probably take me a lot longer than one of you guys to pound it out :).
I think we need to split the existing request which loads both the keys and the maps into two and let the current Data Provider handler only call the request that reads the map. Then create another DP handler or modify the existing one to allow calling the second part of the original request separately.
On the responder part, we need to link the setmntent call with the new DP request.
Not too hard, just a bit of coding :-)
From: Jakub Hrozek Sent: Thursday, March 19, 2015 1:13 PM
Thank you, I added a comment to speed up the process.
Cool, thanks; I'm sure that saved me a fair amount of hassle :). Sometimes it can be a bit difficult to push an issue past first-tier and escalate it appropriately.
Not too hard, just a bit of coding :-)
Fortunately we don't have a strong need for it, the current direct autofs ldap mechanism is working fine. I do like the idea from a design perspective though to consolidate all LDAP access with sssd, so hopefully you guys can get it tuned up at some point.
Thanks much for the help.
From: Ian Kent Sent: Tuesday, March 17, 2015 6:45 PM
is about the practical limit for reading the entire map. There would be other side effects in autofs spending that much time reading the entire map too so I don't think it will ever be possible in your case.
I can't really envision a use case for reading the entire map when it is so large.
I can't change the way the lookup and read map works very much specifically because of the problems really large maps bring. If there are any changes made they must also cover the large map requirement.
But we haven't heard from Jakub yet, it may be possible to change sss to lookup and read the map on demand rather than all at once. Not sure though given I'm not familiar with the wider design of sss.
I don't think any changes are required on the autofs side, as I mentioned, when it accesses LDAP directly it works perfectly. I'm sure sssd could support accessing entries on demand. I'm not sure why they decided to just read the entire map always. Based on a design document I found and also I believe mentioned in this thread, the thought was since they didn't know whether they would need the entire map or not, to just go ahead and read the entire map.
Interestingly, the same behavior for user/groups is explicitly disabled by default:
https://fedorahosted.org/sssd/wiki/FAQ#WhenshouldIenableenumerationinSSSDorW...
While sometimes the entire map is required for autofs (direct maps, browse mode) sometimes it's not. Just like for user/groups sometimes the entire map is required (setXXent/getXXent), sometimes it is not. It would make more sense to me to treat them the same way. Although it looks like for user/groups you can only enable enumeration for both or neither, not separately. Ideally, sssd could be enhanced to treat autofs data the same as user/group, with enumeration optional, and also extend the enumeration control to allow enumeration to be toggled individually for users, groups, or autofs.
On 03/18/2015 02:58 PM, Paul B. Henson wrote:
From: Ian Kent Sent: Tuesday, March 17, 2015 6:45 PM
is about the practical limit for reading the entire map. There would be other side effects in autofs spending that much time reading the entire map too so I don't think it will ever be possible in your case.
I can't really envision a use case for reading the entire map when it is so large.
I can't change the way the lookup and read map works very much specifically because of the problems really large maps bring. If there are any changes made they must also cover the large map requirement.
But we haven't heard from Jakub yet, it may be possible to change sss to lookup and read the map on demand rather than all at once. Not sure though given I'm not familiar with the wider design of sss.
I don't think any changes are required on the autofs side, as I mentioned, when it accesses LDAP directly it works perfectly. I'm sure sssd could support accessing entries on demand. I'm not sure why they decided to just read the entire map always. Based on a design document I found and also I believe mentioned in this thread, the thought was since they didn't know whether they would need the entire map or not, to just go ahead and read the entire map.
Interestingly, the same behavior for user/groups is explicitly disabled by default:
https://fedorahosted.org/sssd/wiki/FAQ#WhenshouldIenableenumerationinSSSDorW...
While sometimes the entire map is required for autofs (direct maps, browse mode) sometimes it's not. Just like for user/groups sometimes the entire map is required (setXXent/getXXent), sometimes it is not. It would make more sense to me to treat them the same way. Although it looks like for user/groups you can only enable enumeration for both or neither, not separately. Ideally, sssd could be enhanced to treat autofs data the same as user/group, with enumeration optional, and also extend the enumeration control to allow enumeration to be toggled individually for users, groups, or autofs.
While I agree that it makes sense to improve autofs enumeration and have it configurable, there is really no practical value in decoupling enumeration between users and groups. You either cache both or not. Caching one but not another would not solve any problem.
sssd-users mailing list sssd-users@lists.fedorahosted.org https://lists.fedorahosted.org/mailman/listinfo/sssd-users
From: Dmitri Pal Sent: Wednesday, March 18, 2015 12:05 PM
it configurable, there is really no practical value in decoupling enumeration between users and groups. You either cache both or not. Caching one but not another would not solve any problem.
I said "enumeration", you are saying "caching" -- that's not the same thing. I don't think there would be any value in caching users and not groups, or vice versa, but I can absolutely think of a use case where *enumerating* one but not the other is valuable.
Consider a hypothetical organization with 500,000 users and 1000 groups. They don't want to enable enumeration for users, as that would thrash both their LDAP servers and the clients. On the other hand, they do want to enable enumeration for groups, as they have an application for which that is a requirement. With the current implementation, either their application works and they risk somebody intentionally or accidentally enumerating users and breaking things, or they are not at risk but the application does not work.
Being able to separately configure enumeration for users versus groups would allow this organization to both prevent performance issues and enable their application.
I don't know how frequently such a use case might arise, but I believe I would call it practical :).
On 03/18/2015 04:08 PM, Paul B. Henson wrote:
From: Dmitri Pal Sent: Wednesday, March 18, 2015 12:05 PM
it configurable, there is really no practical value in decoupling enumeration between users and groups. You either cache both or not. Caching one but not another would not solve any problem.
I said "enumeration", you are saying "caching" -- that's not the same thing. I don't think there would be any value in caching users and not groups, or vice versa, but I can absolutely think of a use case where *enumerating* one but not the other is valuable.
Consider a hypothetical organization with 500,000 users and 1000 groups. They don't want to enable enumeration for users, as that would thrash both their LDAP servers and the clients. On the other hand, they do want to enable enumeration for groups, as they have an application for which that is a requirement. With the current implementation, either their application works and they risk somebody intentionally or accidentally enumerating users and breaking things, or they are not at risk but the application does not work.
Being able to separately configure enumeration for users versus groups would allow this organization to both prevent performance issues and enable their application.
I don't know how frequently such a use case might arise, but I believe I would call it practical :).
I really do not want to get into this discussion. When I say users and groups I mean also group membership. Groups by themselves do not have much value for applications unless you also have memberships. In the given example, if you download all groups but not users, you would have to download complex group membership on the fly for every user. This is usually costly. So I think the main decision is: you either enumerate group membership, and thus store users and groups at the same time, or you do not and look things up as needed. This is why it does not make sense to break them apart. It is possible but does not bring any improvement even in the case you suggested above. In the case above it would actually be worse, as you would enumerate all the groups though you might not need all of them.