On 19 Nov 2020, at 10:34, Graham Leggett <minfrin(a)sharp.fm> wrote:
> Raised the bug here:
> https://bugzilla.redhat.com/show_bug.cgi?id=1771979
Coming back to this one - got to the bottom of this while investigating something else
that wasn’t working.
This wasn’t a regression in NSS, but rather a regression in the openldap libraries
shipped by RHEL7.5 and above.
For reasons that I haven’t found, there was an architecture change made halfway through
the RHEL7 lifecycle where openldap was linked to openssl instead of NSS.
Openldap's NSS support and openldap’s openssl support differ in a fundamental way -
with NSS, when openldap makes an SSL connection, intermediate certificates are filled in by
the client side as normal. With openssl, when openldap makes an SSL connection,
intermediate certificates are ignored, and the connection breaks.
The hack workaround above fixes this because openldap’s openssl support expects you to
place intermediate certs in your trusted certificate store. 389ds is bound to two different
crypto libraries, and it has its own hack to make replication sort-of work in that state:
it exports the certs trusted in NSS into the CA certificate list passed to openldap. As
soon as you mark the intermediates as trusted in NSS, they are exported too. Openldap then
finds the intermediates and things work.
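
To make the difference concrete, here is a minimal toy sketch of chain building in Python. It is not how NSS, openssl or openldap are actually implemented: certificates are reduced to subject/issuer names with no signatures, and the names Cert, build_chain, trusted and extra are made up for illustration. It only models the behaviour described above: a chain must end at a trusted self-signed root, and it can only be completed if the intermediates are available somewhere.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Cert:
    """Toy stand-in for an X.509 certificate: names only, no signatures."""
    subject: str
    issuer: str


def build_chain(
    leaf: Cert,
    trusted: Iterable[Cert],
    extra: Iterable[Cert] = (),
) -> Optional[List[Cert]]:
    """Follow issuer names from leaf up to a trusted self-signed root.

    trusted: certs in the trust store.
    extra:   untrusted certs that may only be used to fill in the chain.
    Returns the chain (leaf first), or None if no trusted chain exists.
    """
    # Index candidate issuers by subject; a trusted entry wins over an extra one.
    pool = {c.subject: (c, False) for c in extra}
    pool.update({c.subject: (c, True) for c in trusted})

    chain = [leaf]
    current = leaf
    while current.issuer != current.subject:  # stop at a self-signed cert
        entry = pool.get(current.issuer)
        if entry is None or entry[0] in chain:  # missing issuer, or a loop
            return None
        current = entry[0]
        chain.append(current)

    # The chain ended at a self-signed cert; accept it only if it is trusted.
    root_entry = pool.get(current.subject)
    if root_entry is not None and root_entry[1] and root_entry[0] == current:
        return chain
    return None


# Hypothetical three-level hierarchy, as issued by a typical public CA.
root = Cert("Root CA", "Root CA")
inter = Cert("Intermediate CA", "Root CA")
leaf = Cert("ldap2.example.com", "Intermediate CA")

# Intermediates filled in during chain building: a chain is found.
print(build_chain(leaf, trusted=[root], extra=[inter]) is not None)

# Intermediates ignored: no chain can be built.
print(build_chain(leaf, trusted=[root]) is None)

# Workaround: intermediate placed in the trust store itself.
print(build_chain(leaf, trusted=[root, inter]) is not None)
```

All three print calls output True in this model. The third case corresponds to marking the intermediates as trusted.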
Fundamentally there are two bugs:
- https://bugzilla.redhat.com/show_bug.cgi?id=1898924
- An architectural change halfway through the lifecycle of what is supposed to be a
stable OS.
As of the end of 2023, the bug is still present in RHEL9:
[11/Dec/2023:23:02:09.510906411 +0000] - ERR - slapi_ldap_bind - Could not send bind
request for id [(anon)] authentication mechanism [EXTERNAL]: error -1 (Can't contact
LDAP server), system error -5987 (Invalid function argument.), network error 0 (Unknown
error, host "ldap2.example.com:636")
This time, the workaround of forcing the intermediate certificates to be marked trusted no
longer works. We now get a low-level complaint about a certificate verification failure.
The error message doesn’t tell us which certificate failed, but it is clearly an openssl
message.
[11/Dec/2023:19:45:28.115134273 +0000] - ERR - NSMMReplicationPlugin - bind_and_check_pwp
- agmt="cn=ldap2" (thor:636) - Replication bind with EXTERNAL auth failed: LDAP error
-1 (Can't contact LDAP server) (error:0A000086:SSL routines::certificate verify failed
(self-signed certificate in certificate chain))
No self-signed certificates are being used. These are certs issued by public CAs which,
like all public CAs, have intermediate certs.
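
For anyone trying to pick the openssl part out of these 389ds errors, here is a rough Python sketch that splits it into fields. It assumes the openssl text has the form error:<hex code>:<library>:<function>:<reason>, optionally followed by a parenthesised detail, which is what the replication log line quoted above looks like. The function name and the field names are my own, not anything from 389ds or openssl.

```python
import re
from typing import Dict, Optional

# Assumed layout: error:<hex code>:<library>:<function>:<reason> (<detail>)
OPENSSL_ERROR = re.compile(
    r"error:(?P<code>[0-9A-Fa-f]+):(?P<library>[^:]*):(?P<function>[^:]*):"
    r"(?P<reason>[^()]*)(?:\((?P<detail>[^()]*)\))?"
)


def parse_openssl_error(log_line: str) -> Optional[Dict[str, str]]:
    """Return the openssl error fields found in log_line, or None if absent."""
    match = OPENSSL_ERROR.search(log_line)
    if match is None:
        return None
    # Missing optional groups become empty strings; trim stray whitespace.
    return {key: (value or "").strip() for key, value in match.groupdict().items()}


# The replication error above, joined back onto a single line.
line = (
    "Replication bind with EXTERNAL auth failed: LDAP error -1 "
    "(Can't contact LDAP server) (error:0A000086:SSL routines::certificate "
    "verify failed (self-signed certificate in certificate chain))"
)
print(parse_openssl_error(line))
```

The slapi_ldap_bind error further up contains no text in this form, so the function returns None for it.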
The bugs I raised in 2020 were all abandoned and closed.
Regards,
Graham