Hi William,

As usual I want to thank you a lot for all those precious and detailed feedbacks. I will try to summarize your points and to answer them. If I forget a point, please just give me a head up.

best regards
thierry

On 2/14/22 12:23 AM, William Brown wrote:

On 11 Feb 2022, at 19:47, Thierry Bordaz <tbordaz@redhat.com> wrote:

Thank you William for diving into that RFE with as always very good points. I will try to answer.

You are right it is a requirement coming from IPA where future IPA instance will have to support MS client issuing In chain requests. I am not convince it will be very frequent req. It is even possible that this RFE is a kind of supported MR we need for conformity reasons.
Knew it :) 

The request triggers a set of recursive read operations. So it differs from memberof plugin that, does the reads and also update the target entry with memberof values. So it will be by far (I hope) faster than memberof plugin. The drawback is that the result of the computation is lost (not written) so there is no way to benefit from previous req.
It will NOT be faster.

MemberOf is a fundamental trade - trading time during the write path to examine groups and then store these, so that we achieve in a reduction in processing time in the read path.

The plugin is the exact opposite. By saving in the write path (doing nothing), we have to do *all the work* in the read path. LDAP is also a read mostly system, meaning that we are making the exact wrong choice to go down a path to try to generate this in the read op. 

Let's consider a write takes some time value defined as W . And let's assume that the member of takes some time value defined as M. And a read takes time R. 

If we have memberof, with 1 write and 100 reads we have:

1(W + M)
100(R)

With In Chain we have:

1(W)
100(R + M)

So we've increased our work and time by 100M just by moving this. If you want to look at it the other way, MemberOf reduces our time by 100M. 



Further, as mentioned, look at these reports:

https://stackoverflow.com/questions/40024425/1-2-840-113556-1-4-1941-ldap-matching-rule-in-chain-has-performance-problems

"using LDAP_MATCHING_RULE_IN_CHAIN is much slower (in our case, a factor of 10x) than simply doing the recursive "member-of" search that the above quote says LDAP_MATCHING_RULE_IN_CHAIN was intended to replace."

http://dunnry.com/blog/TransitiveLinkValueFilterEvaluation.aspx

"I expected the new filter to run circles around my code and I was pretty shocked when the reverse was actually true ... transitive filter (409ms) as compared to only (41ms) using the recursive search."

https://stackoverflow.com/questions/6143665/single-line-ldap-query-that-enumerates-users-from-a-group-within-a-supergroup

"It works, although the query is slow as hell (~5-10 seconds)"

https://stackoverflow.com/questions/9534669/improving-recursive-active-directory-function

"Code I am using as per JPBlancs's suggestion, whilst this works, it is much slower than my original, am I implementing it incorreclty?"


The problem with memberof is that it updates the target entry. So it is beneficial only if you later search 'memberof'. If you do not it is a waste of time. Also memberof  is very demanding, you need to keep the plugin always enabled to trust the result.
This is a *good thing*. This is the whole point of this trade. We trade write path time and storage for huge improvement in read path time. Especially relevant given that ldap is massively read oriented.

We have seen many cases where memberof was disabled during some admin tasks then reenabled/fixup. So it is not comparing a bad 'in chain' solution vs a perfect 'memberof' solution, both have benefits/drawbacks.
I have previously stated how our MemberOf algorithm is slow and how we could fix it. I have provided proof of these algorithmic improvements, but those suggestions were not applied. 

Finally I have to admit, may be I auto convinced myself ;), that this matching rule provides a quite elegant membership processing without requiring plugins/config. For example with a same MR you can lookup both sides of the memberships (toward the members or toward the groups). (memberof:inChainMatch:=cn=bar) will give all the members of group 'cn=bar' while (member:inChainMatch:=uid=foo) will give  all the groups foo is memberof.
I don't know how to express this but I'm extraordinarily concerned about the impact of this RFE. I think the performance impact has not been considered accurately, and I think it will only compound the performance issues we have within IPA. Especially once we combine with the many pre-existing IPA integration specific performance issues that have repeatedly gone ignored (related to the ipa plugins for ds). 


In order for me to be happy with this RFE as an extension, I would ask that:

* This MUST be a plugin, and ship default disabled. 
* Test cases MUST include group cycles and nesting to deep levels. IE a cycle of 100 items long or similar. We need to prove that we can cycle break since there is a risk of infinite loop and deadlock with this plugin. 
* The implementation must be iterative and not recursive else we may hit stack depth limits. 

Implementation - EITHER:

* This OID internally uses memberOf attributes to perform the resolution IE
  - (member:1.2.840.113556.1.4.1941:=CN=User1,OU=X,DC=domain,DC=com)  -- internally we just search the user, retrieve it's memberOf (which we already have and is pre-computed with nesting) and return the set of groups.
  - (memberOf:1.2.840.113556.1.4.1941:=CN=A Group,OU=Y,DC=domain,DC=com)  -- A no-op, since we already have nesting in memberOf, we just return the result directly. 
  We already have memberOf, and it's already done the work - let's use that information. 
* If the results of this search are virtualAttributes and cached such that repeat searches hit the cache and do not stress the server.

I think having this without a virtual attr cache, or without using existing memberof infrastructure is a recipe for disaster. 



--
Sincerely,

William Brown

Sesion Software Engineer,
Identity and Access Management
SUSE Labs, Australia