Perhaps it is the process. I raised it via the Red Hat support portal as a bug and repeated essentially everything I've mentioned here, along with details on how to reproduce it on demand. I got a canned response pointing me to an access knowledgebase article saying it is not supported.

I've had a very hard time reporting bugs to upstream projects in the past. Oftentimes I get responses years later asking me to test a proposed patch. That makes it a difficult process, and not one I generally have time for in my day-to-day job. IMO, the process needs to be easier.

I'll look at reporting it via bugzilla. Thanks.

From: William Brown <wbrown@suse.de>
Sent: Tuesday, March 9, 2021 4:10 PM
To: 389-users@lists.fedoraproject.org <389-users@lists.fedoraproject.org>
Subject: [389-users] Re: Chain on Update problem
 
> On 9 Mar 2021, at 15:45, Grant Byers <Grant.Byers@aarnet.edu.au> wrote:
>
> FWIW, Red Hat don't want a bar of it given we're not running their IPA or RHDS products. We've enabled anonymous binds on our production masters to work around this issue, but as far as I'm concerned, the bug can languish.

That doesn't seem right ... many of the RH 389 devs are on this list and I'll personally vouch that they care a lot about this kind of issue :) In my experience even if you aren't running RHDS/IPA if you raise an RH bugzilla, they'll support you. Perhaps your issue is with the phone/email support services?

Another option is to raise it upstream against the version you are running.

https://github.com/389ds/389-ds-base

Regardless, once again, I guarantee that the RH developers of 389-ds really do care :)



>
> Regards,
> Grant
> From: William Brown <wbrown@suse.de>
> Sent: Tuesday, March 9, 2021 9:27 AM
> To: 389-users@lists.fedoraproject.org <389-users@lists.fedoraproject.org>
> Subject: [389-users] Re: Chain on Update problem
>
> > On 8 Mar 2021, at 20:42, Pierre Rogier <progier@redhat.com> wrote:
> >
> > Hi Grant,
> >
> > Good finding about the root cause !
> >
> > About the solution, I think we should continue to send an anonymous bind but consider the ping successful if the return code is either LDAP_SUCCESS or LDAP_INAPPROPRIATE_AUTH
> >  (in any case we do not care whether the bind really succeeds; we only want to know that the server
> >    is responsive)
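As a rough illustration of the return-code check proposed above, and not the actual chaining plugin code, the acceptance test could be sketched in C as below. The helper names and the `SKETCH_` constants are hypothetical; the numeric values are the standard LDAP result codes from RFC 4511, which a real build would take from `<ldap.h>` rather than redefine.

```c
#include <assert.h>
#include <stdbool.h>

/* Standard LDAP result codes (RFC 4511). Redefined here only so the
 * sketch compiles without libldap; these names are hypothetical. */
#define SKETCH_LDAP_SUCCESS             0
#define SKETCH_LDAP_INAPPROPRIATE_AUTH  48
#define SKETCH_LDAP_INSUFFICIENT_ACCESS 50

/* Hypothetical helper: returns true when the result of the anonymous
 * ping shows that the farm server answered, whether or not it accepted
 * the anonymous bind. Any other code is treated as "not responsive". */
bool
ping_rc_means_responsive(int rc)
{
    return rc == SKETCH_LDAP_SUCCESS ||
           rc == SKETCH_LDAP_INAPPROPRIATE_AUTH;
}

/* Usage example covering the cases discussed in this thread. */
void
ping_rc_usage_example(void)
{
    /* Server accepted the anonymous bind. */
    assert(ping_rc_means_responsive(SKETCH_LDAP_SUCCESS));
    /* Server refused the anonymous bind but did reply. */
    assert(ping_rc_means_responsive(SKETCH_LDAP_INAPPROPRIATE_AUTH));
    /* Under this proposal, other errors still count as a failed ping. */
    assert(!ping_rc_means_responsive(SKETCH_LDAP_INSUFFICIENT_ACCESS));
}
```

The point of the sketch is only that the ping measures liveness, so a refusal that proves the server replied can be accepted; which codes a given server actually returns for a refused anonymous bind is an assumption to verify.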
> >
> > FYI: My concerns with the search are that:
> >    - we would hit the same issue if an ACL denies the search
>
> But do we see many ACI on the rootdse to deny this? IIRC that would cause a lot of issues.
>
> >    - it would increase the root dse contention
>
> The rootdse is fully in memory and when it populates the response it takes a series of locks over different config types. The extensions and features all look like a read lock, but the sasl mechs are a mutex so that would be my only concern. I don't think that it would be "too much" though, since how often would chaining really need to do these keep alive checks?
>
>
> Perhaps the "best" option here is a config toggle that flips between rootdse or the target backend for the anonymous keep alive check.
>
> Alternately we could specify what DN is checked by anonymous for the check. I'd prefer the toggle just to help limit what we need to test though, and it makes configuration and admin a bit easier.
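To make the toggle idea above concrete, here is a minimal C sketch of how such a setting might select the search base for the keep-alive check. Everything in it is hypothetical: no such option exists in the chaining plugin as far as this thread shows, and the type, function names, and the "rootdse"/"backend" value strings are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toggle: which entry the anonymous keep-alive search reads. */
typedef enum {
    PING_TARGET_ROOTDSE, /* search base "" (the root DSE) */
    PING_TARGET_BACKEND  /* search base is the chained suffix */
} ping_target_t;

/* Parse a hypothetical config value. Returns 0 on success and -1 if the
 * value is missing or not recognised. */
int
ping_target_parse(const char *value, ping_target_t *out)
{
    if (value == NULL || out == NULL) {
        return -1;
    }
    if (strcmp(value, "rootdse") == 0) {
        *out = PING_TARGET_ROOTDSE;
        return 0;
    }
    if (strcmp(value, "backend") == 0) {
        *out = PING_TARGET_BACKEND;
        return 0;
    }
    return -1;
}

/* Pick the base DN to pass to the keep-alive search. */
const char *
ping_search_base(ping_target_t target, const char *backend_suffix)
{
    assert(backend_suffix != NULL);
    return target == PING_TARGET_ROOTDSE ? "" : backend_suffix;
}
```

A two-value toggle keeps the test matrix small, which is the reason given above for preferring it over a free-form DN.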
>
> >
> > Regards
> >    Pierre
> >
> > On Mon, Mar 8, 2021 at 5:56 AM Grant Byers <Grant.Byers@aarnet.edu.au> wrote:
> > Confirmed. I made the following simple change and it allows cb_ping_farm to work with anonymous binds only enabled for rootdse;
> >
> >
> > diff -urN a/ldap/servers/plugins/chainingdb/cb_conn_stateless.c b/ldap/servers/plugins/chainingdb/cb_conn_stateless.c
> > --- a/ldap/servers/plugins/chainingdb/cb_conn_stateless.c       2020-03-17 04:52:57.000000000 +1000
> > +++ b/ldap/servers/plugins/chainingdb/cb_conn_stateless.c       2021-03-08 14:04:48.413647052 +1000
> > @@ -883,7 +883,7 @@
> >      /* NOTE: This will fail if we implement the ability to disable
> >         anonymous bind */
> > -    rc = ldap_search_ext_s(ld, NULL, LDAP_SCOPE_BASE, "objectclass=*", attrs, 1, NULL,
> > +    rc = ldap_search_ext_s(ld, "", LDAP_SCOPE_BASE, "objectclass=*", attrs, 1, NULL,
> >                             NULL, &timeout, 1, &result);
> >      if (LDAP_SUCCESS != rc) {
> >          slapi_ldap_unbind(ld);
> >
> >
> > I don't believe this will break any functionality, but since we're running RHEL7, I'll raise this with Red Hat directly and they can review and/or push upstream.
> >
> > Regards,
> > Grant
> >
> > From: Grant Byers <Grant.Byers@aarnet.edu.au>
> > Sent: Monday, March 8, 2021 12:27 PM
> > To: 389-users@lists.fedoraproject.org <389-users@lists.fedoraproject.org>
> > Subject: Re: [389-users] Re: Chain on Update problem
> >
> > Thanks.
> >
> > I have tested various combinations of the tuning params without success. I've done further debugging and confirmed that it always starts after a bind operation timeout. Looking into the chaining plugin code, I see that an operation timeout results in a call to cb_ping_farm to see if we can find another server in the pool that is available. However, it performs this operation (the comment is telling):
> >
> >     /* NOTE: This will fail if we implement the ability to disable
> >        anonymous bind */
> >     rc = ldap_search_ext_s(ld, NULL, LDAP_SCOPE_BASE, "objectclass=*", attrs, 1, NULL,
> >                            NULL, &timeout, 1, &result);
> >     if (LDAP_SUCCESS != rc) {
> >         slapi_ldap_unbind(ld);
> >         cb_update_failed_conn_cpt(cb);
> >         return LDAP_SERVER_DOWN;
> >     }
> >
> > So basically, because we've disallowed anonymous bind for anything but rootdse, it will always fail to find another available server. I confirmed this by allowing anonymous bind on our masters while the issue was present, after which subsequent binds on the consumers started working again.
> >
> > I would think it more appropriate for that code to do a search against the rootdse instead. Is there any good reason why it shouldn't? If not, I might test modifying it.
> >
> >
> > Thanks,
> > Grant
> >
> >
> > From: William Brown <wbrown@suse.de>
> > Sent: Friday, March 5, 2021 3:52 PM
> > To: 389-users@lists.fedoraproject.org <389-users@lists.fedoraproject.org>
> > Subject: [389-users] Re: Chain on Update problem
> >
> > > On 5 Mar 2021, at 12:03, Grant Byers <Grant.Byers@aarnet.edu.au> wrote:
> > >
> > > Hi All,
> > >
> > > Version: 1.3.10
> > >
> > > In our environment, we'd like to use a chaining backend to push BIND operations up to masters by way of the consumer (rather than client referral). We'd like to do this to ensure password lockout attributes are propagated to all consumers equally via our standard replication agreements. This is described here - https://directory.fedoraproject.org/docs/389ds/howto/howto-chainonupdate.html.
> > >
> > > NOTE, we do not have hubs in our topology. Just masters and consumers, so no intermediate chaining.
> > >
> > > We tested this process in our environment and it worked beautifully until we took it to production. Currently, we have just 2 masters and they are both sitting on some over-subscribed hardware that suffers from I/O starvation at certain times of the day. The plan is to scale out our masters eventually, but we're a little hamstrung with other projects and priorities. It worked extremely well until that time of day when masters suffered from I/O starvation, and hence, very long I/O wait times. This is generally short lived and happens at alternate times of the day for each of the masters. However, it seems that once both nsfarmservers have "failed", there is never any attempt by the consumer to retry them. This leads to bind errors as follows;
> > >
> > > ldapwhoami -x -D "<binddn>" -W
> > > Enter LDAP Password:
> > > ldap_bind: Operations error (1)
> > >         additional info: FARM SERVER TEMPORARY UNAVAILABLE
> > >
> > > Except it is not temporary. It never recovers, even though all members of nsfarmservers are now healthy again (and are never unhealthy at the same time). We can confirm this by performing binds from the consumers directly against the masters. I thought that setting nsConnectionLife to something larger than 0 (indefinite) would help this, but it has not.
> >
> > The chain on update appears to use the chaining plugin timeouts, so you could look at adjusting these parameters which may help.
> >
> > nsBindTimeout
> > nsOperationTimeout
> > nsBindRetryLimit
> > nsMaxResponseDelay
> > nsMaxTestResponseDelay
> >
> >
> >
> > >
> > > Is this by design, a bug, or an implementation fault on my behalf? Configuration below;
> > >
> > > Thanks,
> > > Grant
> > >
> > >
> > >
> > > ## On masters, create a dedicated user for chaining backend
> > > dn: cn=proxy,cn=config
> > > objectClass: person
> > > objectClass: top
> > > cn: proxy
> > > sn: admin
> > >
> > > ## On all consumers, create chaining backend;
> > > dn: cn=chainbe1,cn=chaining database,cn=plugins,cn=config
> > > objectclass: top
> > > objectclass: extensibleObject
> > > objectclass: nsBackendInstance
> > > nsslapd-suffix: <suffix>
> > > nsfarmserverurl: ldaps://<master1>:636 <master2>:636/
> > > nsMultiplexorBindDN: <binddn>
> > > nsMultiplexorCredentials: <bindpw>
> > > nsCheckLocalACI: on
> > > nsConnectionLife: 30
> > > cn: chainbe1
> > >
> > > ## On all consumers, add the backend and repl_chain_on_update function
> > > dn: cn="<suffix>",cn=mapping tree,cn=config
> > > changetype: modify
> > > add: nsslapd-backend
> > > nsslapd-backend: chainbe1
> > > -
> > > add: nsslapd-distribution-plugin
> > > nsslapd-distribution-plugin: libreplication-plugin
> > > -
> > > add: nsslapd-distribution-funct
> > > nsslapd-distribution-funct: repl_chain_on_update
> > >
> > > ## On all servers, enable global password policy
> > > dn: cn=config
> > > changetype: modify
> > > replace: passwordIsGlobalPolicy
> > > passwordIsGlobalPolicy: on
> > >
> > > _______________________________________________
> > > 389-users mailing list -- 389-users@lists.fedoraproject.org
> > > To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org
> > > Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
> > > List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
> > > List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org
> > > Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure
> >
> >
> > Sincerely,
> >
> > William Brown
> >
> > Senior Software Engineer, 389 Directory Server
> > SUSE Labs, Australia
> >
> >
> > --
> >
> > 389 Directory Server Development Team
>
>
> Sincerely,
>
> William Brown
>
> Senior Software Engineer, 389 Directory Server
> SUSE Labs, Australia


Sincerely,

William Brown

Senior Software Engineer, 389 Directory Server
SUSE Labs, Australia