On 2/15/22 12:35 AM, William Brown wrote:

On 14 Feb 2022, at 23:23, Thierry Bordaz <tbordaz@redhat.com> wrote:

Hi William,

As usual I want to thank you a lot for all those precious and detailed feedbacks. I will try to summarize your points and to answer them. If I forget a point, please just give me a head up.
All good mate :) 

	• Performance: The InChain search request is expensive, may put high pressure on the server and we should have a mean to protect the server.

I agree that to follow membership link has a cost, a risk (infinite loop) and impacts the server.

On the cost aspect I meant that MR computation (membership traversal) of a *single* search should be much faster than (membership traversal) during memberof plugin call because memberof triggers the same set of searches and in addition         updates the members. This being said, I fully agree that it is much more efficient to do a regular SRCH req '(memberof=cn=bar)' than '(member:1.2.840.113556.1.4.1941:=cn=bar)' and

So there is a very major piece of context that you might be missing here. Since this is related to the (unhealthy obsession) with FreeIPA slowly morphing into AD, there are AD specific behaviours we need to remember in this proposal. 

MS AD Technical Specification - https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-adts/d2435927-0999-4c62-8c6d-13ba31a52e1a

In AD, memberOf is constructed using what's called a "back link attribute" ([MS-ADTS] - v20210625, page 27 ). These back links are derived and inverse from a forward link. If my memory serves correctly, you can consider these as a "pointer". Rather than how we store member / member of with dn's such as:

┌───────────────────┐                  ┌───────────────────┐
│                   │                  │                   │
│dn: cn=group  ◀─ ─ ┼ ─ ─ ┐┌──────────▶│dn: uid=user       │
│member: uid=user───┼──────┴ ─ ─ ─ ─ ─ │memberOf: cn=group │
│                   │                  │                   │
└───────────────────┘                  └───────────────────┘

AD stores these links as a pointer:

┌───────────────────┐                  ┌───────────────────┐
│                   │                  │                   │
│dn: cn=group       │                  │dn: uid=user       │
│linkId: 1  ◀ ─ ─ ─ ┼ ─ ┐┌────────────▶│linkId: 2          │
│member: 2  ────────┼────┴ ─ ─ ─ ─ ─ ─ │memberOf: 1        │
│                   │                  │                   │
└───────────────────┘                  └───────────────────┘

Now, importantly this means that memberOf on AD is NOT able to represent nested groups because:

"""
A forward link attribute can exist with no corresponding back link attribute, but not vice-versa.
"""

What this means is that you can express a member (forward link) and direct memberOf (back link) but not a nested group. 

┌───────────────────┐                 ┌───────────────────┐                  ┌───────────────────┐
│                   │                 │dn: cn=group       │                  │                   │
│dn: cn=group       │   ┌────────────▶│linkId: 2  ◀─ ─ ─ ─│─ ─               │dn: uid=user       │
│linkId: 1 ◀ ─ ─ ─ ─│─ ┐│             │member: 3 ─────────┼───┼────────────▶ │linkId: 3          │
│member: 2 ─────────┼───┴ ─ ─ ─ ─ ─ ─ │memberOf: 1        │    ─ ─ ─ ─ ─ ─ ─ │memberOf: 2        │
│                   │                 │                   │                  │                   │
└───────────────────┘                 └───────────────────┘                  └───────────────────┘


If we were to add memberOf for linkId: 1 notice:

┌───────────────────┐                 ┌───────────────────┐                  ┌───────────────────┐
│                   │                 │dn: cn=group       │                  │                   │
│dn: cn=group       │   ┌────────────▶│linkId: 2  ◀─ ─ ─ ─│─ ─               │dn: uid=user       │
│linkId: 1 ◀ ─ ─ ─ ─│─ ┐│             │member: 3 ─────────┼───┼────────────▶ │linkId: 3          │
│member: 2 ─────────┼──┼┴ ─ ─ ─ ─ ─ ─ │memberOf: 1        │    ─ ─ ─ ─ ─ ─ ─ │memberOf: 2        │
│                   │                 │                   │                 ││memberOf: 1        │
└───────────────────┘  │              └───────────────────┘                  └───────────────────┘
                                                                            │                     
                       └ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─                      

There is no *forward link* from linkId 1 to user (memberOf 1), thus violating the constraints of link attributes in AD. 

This means that in AD memberOf values are not nested and can not express or show nested groups. This is what the chain-link oid solves - since this follows each stack of the links building up the complete group graph, chain link IS able to show group nesting. 

a clever application should prefer 'memberof'.
Because of this fact, a clever application that wants *nested groups* can NOT use memberOf in an AD environment, and is forced to rely on the link chain oid. There is also no magic sauce that an application will see or know about to know they are in a FreeIPA hybrid AD monster, which lets them know that our memberOf state is a full and complete group graph with nesting included so they can prefer that. 

The main drawback of 'memberof' option is that it only works for 'memberof' attribute. While InChain works for any membership attribute.
But the main usage of this will be memberships to groups due to the nesting details above. Groups also tend to have larger and more complex structures. That's why this is a focus. 

Regarding the risk of infinite loop, the current memberof_get_groups() has a loop detection mechanism.

Regarding the impact of the server, you mentioned to enforce a limit of nesting. I agree with this idea and worth to be added to https://github.com/389ds/389-ds-base/issues/5156

Regarding https://github.com/389ds/389-ds-base/issues/5156, I think it is a good time to implement/compare current memberof_get_group implementation with your previous proposal.
Sure, it seems like a good idea. My "previous proposal" is already in use in kanidm, and it's extremely fast for memberOf updates compared to 389.

I think that it would be hugely value for us to consider memberOf a "default and required" element of the server, and to improve it as such. That way our get_group capabilities can just search on memberOf which should be a single search. 

	• Impact of the RFE on the server

Once registered the InChain MR is presented as supported MR on base DSE. That is the only impact until a SRCH use the MR. Should we limit SRCH InChain to list of bound DN ?
I think we should have aci's to limit access to this yes. Alternately, it should have a cn=config switch to enable/disable this control, and it ships default disabled.  

Regarding performance risk, I think it would be somehow similar to what a deref control is (deref drawback being aci while InChain would be nested membership) and there is no limit to the use of deref control.
So IMHO there is no consequence on the server of supporting or not this RFE.

	• InChain being a plugin
It is a plugin (syntax), the change is the move of nested memberships searches into a slapi private function.

	• Using iteration rather than recursive
Knowing that there is a loop detection mechanism, I see not a huge benefit of iteration vs recursive. Indeed in identity membership we can not anticipated huge nesting.
Iteration won't cause your whole server instance to panic and crash because of a weird group setup. 

	• Shortcut InChain 'member' and 'memberof' into a use  of 'memberof' attribute looks a very nice idea. Documentation should warn that InChain 'member' relies on 'memberof' properly updated (risk of enable/disable memberof plugin)
Yep - we already have that data, so we can rely on it and use it. I think we should change memberOf to enabled by default TBH. 


Updating the design with your feedback I am facing a difficulty.
There is a kind of agreement that a search like '(member:1.2.840.113556.1.4.1941:=uid=foo,dc=com)'
that computes all groups 'foo' is memberof, should shortcut the membership computation to use 'memberof' value that is in 'foo' entry.

My concern is related to access control. Let assume we have

dn: G0
member: G1

dn: G1
member: G2

dn: G2
member: foo

dn: foo
memberof: G0
memberof: G1
memberof: G2

Then ACI definition, allows (to the inChain bound user) read/search access to

InChain MR computation will retrieve G2 (because of access granted on G2.member), then will not retrieve G1 (because of lack of rights to access to G1.member) and so not retrieve G0. => returns 'G2'

Shortcut InChain MR, retrieve G0, G1 and G2 (because of access granted to foo.memberof). If it returns G0, G1, G2 it violates lack of access to G1.
If it tests access of 'member' on G0, G1, G2. It may return G0 and G2 (because access is granted) but it violates the fact that G0 was hidden behind G1.
So even if the 'member' are pre-computed, we still have to access (srch) each of the member value to evaluate rights and also to create a graph of the membership to detect "hidden" entries.
In clear I am unsure of the benefit of the shortcut

any thoughts.


best regards
thierry



	• membership attribute being a virtual attribute, is a bad idea. This will results in unindexed searches. Caching result is not possible as virtual attribute evaluation may differ between two searches. May be I misunderstand how you intend to use virtual attribute in such case.
best regards
thierry

On 2/14/22 12:23 AM, William Brown wrote:

          
On 11 Feb 2022, at 19:47, Thierry Bordaz <tbordaz@redhat.com>
 wrote:

Thank you William for diving into that RFE with as always very good points. I will try to answer.

You are right it is a requirement coming from IPA where future IPA instance will have to support MS client issuing In chain requests. I am not convince it will be very frequent req. It is even possible that this RFE is a kind of supported MR we need for conformity reasons.

Knew it :) 


The request triggers a set of recursive read operations. So it differs from memberof plugin that, does the reads and also update the target entry with memberof values. So it will be by far (I hope) faster than memberof plugin. The drawback is that the result of the computation is lost (not written) so there is no way to benefit from previous req.

It will NOT be faster.

MemberOf is a fundamental trade - trading time during the write path to examine groups and then store these, so that we achieve in a reduction in processing time in the read path.

The plugin is the exact opposite. By saving in the write path (doing nothing), we have to do *all the work* in the read path. LDAP is also a read mostly system, meaning that we are making the exact wrong choice to go down a path to try to generate this in the read op. 

Let's consider a write takes some time value defined as W . And let's assume that the member of takes some time value defined as M. And a read takes time R. 

If we have memberof, with 1 write and 100 reads we have:

1(W + M)
100(R)

With In Chain we have:

1(W)
100(R + M)

So we've increased our work and time by 100M just by moving this. If you want to look at it the other way, MemberOf reduces our time by 100M. 



Further, as mentioned, look at these reports:


https://stackoverflow.com/questions/40024425/1-2-840-113556-1-4-1941-ldap-matching-rule-in-chain-has-performance-problems


"using LDAP_MATCHING_RULE_IN_CHAIN is much slower (in our case, a factor of 10x) than simply doing the recursive "member-of" search that the above quote says LDAP_MATCHING_RULE_IN_CHAIN was intended to replace.
"

http://dunnry.com/blog/TransitiveLinkValueFilterEvaluation.aspx

"
I expected the new filter to run circles around my code and I was pretty shocked when the reverse was actually true ... transitive filter (409ms) as compared to only (41ms) using the recursive search.
"

https://stackoverflow.com/questions/6143665/single-line-ldap-query-that-enumerates-users-from-a-group-within-a-supergroup

"
It works, although the query is slow as hell (~5-10 seconds)
"

https://stackoverflow.com/questions/9534669/improving-recursive-active-directory-function

"
Code I am using as per JPBlancs's suggestion, whilst this works, it is much slower than my original, am I implementing it incorreclty?"



The problem with memberof is that it updates the target entry. So it is beneficial only if you later search 'memberof'. If you do not it is a waste of time. Also memberof  is very demanding, you need to keep the plugin always enabled to trust the result.

This is a *good thing*. This is the whole point of this trade. We trade write path time and storage for huge improvement in read path time. Especially relevant given that ldap is massively read oriented.


We have seen many cases where memberof was disabled during some admin tasks then reenabled/fixup. So it is not comparing a bad 'in chain' solution vs a perfect 'memberof' solution, both have benefits/drawbacks.

I have previously stated how our MemberOf algorithm is slow and how we could fix it. I have provided proof of these algorithmic improvements, but those suggestions were not applied. 


Finally I have to admit, may be I auto convinced myself ;), that this matching rule provides a quite elegant membership processing without requiring plugins/config. For example with a same MR you can lookup both sides of the memberships (toward the members or toward the groups). (memberof:inChainMatch:=cn=bar) will give all the members of group 'cn=bar' while (member:inChainMatch:=uid=foo) will give  all the groups foo is memberof.

I don't know how to express this but I'm extraordinarily concerned about the impact of this RFE. I think the performance impact has not been considered accurately, and I think it will only compound the performance issues we have within IPA. Especially once we combine with the many pre-existing IPA integration specific performance issues that have repeatedly gone ignored (related to the ipa plugins for ds). 


In order for me to be happy with this RFE as an extension, I would ask that:

* This MUST be a plugin, and ship default disabled. 
* Test cases MUST include group cycles and nesting to deep levels. IE a cycle of 100 items long or similar. We need to prove that we can cycle break since there is a risk of infinite loop and deadlock with this plugin. 
* The implementation must be iterative and not recursive else we may hit stack depth limits. 

Implementation - EITHER:

* This OID internally uses memberOf attributes to perform the resolution IE
  - (member:1.2.840.113556.1.4.1941:=CN=User1,OU=X,DC=domain,DC=com)  -- internally we just search the user, retrieve it's memberOf (which we already have and is pre-computed with nesting) and return the set of groups.
  - (memberOf:1.2.840.113556.1.4.1941:=CN=A Group,OU=Y,DC=domain,DC=com)  -- A no-op, since we already have nesting in memberOf, we just return the result directly. 
  We already have memberOf, and it's already done the work - let's use that information. 
* If the results of this search are virtualAttributes and cached such that repeat searches hit the cache and do not stress the server.

I think having this without a virtual attr cache, or without using existing memberof infrastructure is a recipe for disaster. 



--
Sincerely,

William Brown

Sesion Software Engineer,
Identity and Access Management
SUSE Labs, Australia


--
Sincerely,

William Brown

Sesion Software Engineer,
Identity and Access Management
SUSE Labs, Australia