sssd performance on large domains
by zfnoctis@gmail.com
Hi,
I'm wondering if there are any plans to improve sssd performance on large Active Directory domains (100k+ users, 40k+ groups), or if there are settings I am not aware of that can greatly improve performance, specifically for workstation use cases.
Currently if I do not set "ignore_group_members = True" in sssd.conf, logins can take upwards of 6 minutes and "sssd_be" will max the CPU for up to 20 minutes after logon, which makes it a non-starter. The reason I want to allow group members to be seen is that I want certain domain groups to be able to perform elevated actions using polkit. If I ignore group members, polkit reports that the group is empty and so no one can elevate in the graphical environment.
Ultimately this means that Linux workstations are at a severe disadvantage since they cannot be bound to the domain and have the normal set of access features users and IT expect from macOS or Windows.
Distributions used: Ubuntu 16.04 (sssd 1.13.4-1ubuntu1.1), Ubuntu 16.10 (sssd 1.13.4-3) and Fedora 24 (sssd-1.13.4-3.fc24). All exhibit the same problems.
I've also tried "ldap_group_nesting_level = 1" without seeing any noticeable improvement with respect to performance. Putting the database on /tmp isn't viable as these are workstations that will reboot semi-frequently, and I don't believe this is an I/O bound performance issue anyways.
Thanks for your time.
1 year, 10 months
kcm, gssproxy and klist
by Winberg Adam
With KCM and gssproxy we often see a long list of credentials when doing a 'klist':
[user.u@lxserv2114 ~]$ klist
Ticket cache: KCM:17098:66803
Default principal: user.u@AD
Valid starting Expires Service principal
01/01/1970 00:00:00 01/01/1970 00:00:00 Encrypted/Credentials/v1@X-GSSPROXY:
01/01/1970 00:00:00 01/01/1970 00:00:00 Encrypted/Credentials/v1@X-GSSPROXY:
01/01/1970 00:00:00 01/01/1970 00:00:00 Encrypted/Credentials/v1@X-GSSPROXY:
01/01/1970 00:00:00 01/01/1970 00:00:00 Encrypted/Credentials/v1@X-GSSPROXY:
01/01/1970 00:00:00 01/01/1970 00:00:00 Encrypted/Credentials/v1@X-GSSPROXY:
01/01/1970 00:00:00 01/01/1970 00:00:00 Encrypted/Credentials/v1@X-GSSPROXY:
01/01/1970 00:00:00 01/01/1970 00:00:00 Encrypted/Credentials/v1@X-GSSPROXY:
and so on...
The actual gssproxy credentials at /var/lib/gssproxy/clients/ do not correspond to this output; they only contain what could be expected: a TGT and maybe some service tickets.
The ever-growing 'klist' list of credentials is a problem: after a while the user can no longer get any new credentials and therefore has no access to their NFS homedir (sec=krb5). I'm guessing it's the 'max_uid_ccaches' option in sssd-kcm that prevents new credentials from being obtained.
What is going on here - have we configured gssproxy/kcm wrong or is this a bug?
Regards
Adam
1 year, 11 months
SSSD entry_cache_nowait_percentage/ enum_cache_timeout not working properly?
by Robert Wagensveld
Hi all, I was hoping you could help me with this, as I am essentially clueless at this point. Even setting debug logging to 8 does not give much information as to what the problem might be.
I have chosen to set the enum_cache_timeout to a high value, e.g. 26000 seconds. This is because we have a very large environment in terms of AD groups (we use Kerberos over LDAP) and it takes a long time to retrieve all groups. The weird part is that, although this helped on some clients, it does not actually reduce login/sudo times on others. I have set the following values in sssd.conf:
entry_cache_nowait_percentage = 50
entry_cache_timeout = 60
My reasoning is that defining a value for the nowait percentage makes SSSD automatically update entries in the background. I'm not sure I set the percentage right, though. https://linux.die.net/man/5/sssd.conf
What is wise to do in this regard? My desired behavior would be that it returns entries from cache even while offline as often as possible, and updates the cache in the background. I don't want users to have to wait for SSSD to iterate through all our insane amounts of groups in the foreground.
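To make the interaction of the two settings above concrete, here is a minimal Python sketch of how the nowait threshold is derived, going by the description in the sssd.conf man page (a percentage of entry_cache_timeout, valid values 0-99, 0 disables the feature, never less than 10 seconds). The function name is made up for illustration, and this is not SSSD's actual code.

```python
from typing import Optional


def nowait_threshold(entry_cache_timeout: int, nowait_percentage: int) -> Optional[int]:
    """Seconds after the last cache update from which a lookup is answered
    from cache while a background refresh is also started.

    Returns None when the feature is disabled (percentage 0).
    """
    if not 0 <= nowait_percentage <= 99:
        raise ValueError("entry_cache_nowait_percentage must be in 0..99")
    if nowait_percentage == 0:
        return None
    threshold = entry_cache_timeout * nowait_percentage // 100
    # Per the man page, the nowait timeout is never reduced below 10 seconds.
    return max(threshold, 10)


print(nowait_threshold(60, 50))  # 30
```

With entry_cache_timeout = 60 and entry_cache_nowait_percentage = 50, only lookups arriving between 30 and 60 seconds after the last update would trigger a background refresh; after 60 seconds the entry is expired and the lookup goes to the backend in the foreground.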
Thanks in advance!
2 years, 1 month
Group cache entry of removed member in be
by Thomas HUMMEL
Hello,
I'm trying to figure out how refreshing a user cache entry could
impact (or not) a group entry this user is or has been a member of
in the backend.
I'm using sssd-2.3.0 on RHEL/CentOS 8.3 with
auth_provider = ldap
ldap_schema = AD
access_provider = ldap
id_provider = ldap
authselect provides the following setup of nsswitch:
passwd: sss files systemd
group: sss files systemd
Initial experience on *some* hosts using the same sssd config:
a user <user> who had been removed from group <group> in the AD backend
for some time (longer than any sssd timeout) still appeared as
a member when running getent group <group>
"refreshing" this user with getent passwd <user> would fix the issue
I don't quite get the reasoning behind this behavior.
Does the link between the 2 actions lie in the ghost/unghost mechanism?
I did not manage to reproduce it. The tests I made on another similar
host raised some more questions though:
a) I was able to verify the expected behavior of
entry_cache_group_timeout (or entry_cache_timeout for that matter):
after 1h30m getent group <group> would get the group entry from the
backend. Before that, results were served from the cache and might not be
consistent with the backend
b) I'm having trouble understanding how sss_cache -u or -g works as,
using ldbsearch -H <cache_ldb_file> I can see the dataExpireTimestamp
set to 1 but after what I thought would be a refresh (getent passwd or
group), I still see this value: is this because the sssd RAM cache is not
sync'ed?
Finally, do you confirm that the setting of ldap_purge_cache_timeout
should not change how the membership of a user in a group is returned?
Thanks for your help
--
Thomas HUMMEL
2 years, 1 month
Ubuntu 20.04 server joined to child AD domain unable to auth users from parent domain
by Joseph Agored
I have an Ubuntu 20.04 server that I have successfully joined to my domain, US.EXAMPLE.COM, using realm.
The way our AD is structured is that all machines are joined to the child domain for their region and all users are set up in the parent domain, EXAMPLE.COM. With full trust, etc, of course.
I can successfully look up users in the US.EXAMPLE.COM domain with id or getent passwd, but cannot look up any users in the parent EXAMPLE.COM domain.
I can successfully kinit to the parent domain.
I have tried adding capaths to the krb5.conf as well as adding the parent domain to the sssd.conf file like this:
[domain/EXAMPLE.com]
inherit_from = US.EXAMPLE.com
id_provider = ad
debug_level = 7
krb5_validate = False
The closest I seem to get is an error message saying the server isn't found in the parent domain's Kerberos DB.
Client 'host/SERVER01@US.EXAMPLE.COM' not found in Kerberos database
Surely there is a way to use the existing trust to let the machine joined to the child domain authenticate users from the parent domain? I have done this successfully in this domain before with a Red Hat server using FQDN, but I am trying to get away from Red Hat and move to Ubuntu, and the same configuration tricks did not let the Ubuntu server look up or authenticate users from the parent domain even with FQDNs.
2 years, 1 month
Timing sensitive issue on `testing pam_acct_mgmt` which is also seen on SSH logins
by Aitor Pazos
Hi all,
Let me introduce the symptoms that triggered this investigation.
First, versions:
- OS: Ubuntu 20.04
- SSSD: 2.2.3-3ubuntu0.7
- Platform: x86_64
Some weeks after we started using a new region/provider, we noticed an intermittent issue affecting SSH logins (i.e. clients got `Connection closed by UNKNOWN port 65535` on `ssh ...`). When looking into it, I was able to reproduce it somewhat reliably. It happens after some period of inactivity for that user, but if I constantly issue commands (e.g. `watch sssctl user-checks <user>` or `watch ssh <host> whoami` from a client) the issue won't trigger. If I stopped running those commands and tried again ~30 minutes later, it would fail.
With these symptoms I had these possibilities in mind: a caching issue, an expiring authentication, connection pool issues...
I enabled detailed logging and compared logs from a successful request and a failed one and I found what seems to be the source of the error from sssd logs perspective, but still can't make much sense of it.
Log extracted from sssd_<domain>.log:
```
(Wed Oct 13 19:40:52 2021) [be[<domain>]] [ipa_hbac_rule_info_next] (0x0400): Sending request for next search base: [cn=hbac,dc=ipa,dc=corp,<dc <domain>>][2][(&(objectclass=ipaHBACRule)(ipaenabledflag=TRUE)(accessRuleType=allow)(|(hostCategory=all)(...)))]
(Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_print_server] (0x2000): Searching <ip10.1.10.102:389
(Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step] (0x0400): calling ldap_search_ext with [(&(objectclass=ipaHBACRule)(ipaenabledflag=TRUE)(accessRuleType=allow)(|(hostCategory=all)(...)][cn=hbac,dc=ipa,dc=corp,<dc <domain>>].
(Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [objectclass]
(Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [cn]
(Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [ipauniqueid]
(Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [ipaenabledflag]
(Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [accessRuleType]
(Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [memberUser]
(Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [userCategory]
(Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [memberService]
(Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [serviceCategory]
(Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [sourceHost]
(Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [sourceHostCategory]
(Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [externalHost]
(Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [memberHost]
(Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [hostCategory]
(Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_get_generic_ext_step] (0x2000): ldap_search_ext called, msgid = 18
(Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_op_add] (0x2000): New operation 18 timeout 60
(Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_process_result] (0x2000): Trace: sh[0x55d349f96d00], connected[1], ops[0x55d349ff16e0], ldap[0x55d34a10cba0]
(Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_process_result] (0x2000): Trace: end of ldap_result list
(Wed Oct 13 19:41:52 2021) [be[<domain>]] [sdap_op_timeout] (0x1000): Issuing timeout for 18
(Wed Oct 13 19:41:52 2021) [be[<domain>]] [sdap_op_destructor] (0x1000): Abandoning operation 18
(Wed Oct 13 19:41:52 2021) [be[<domain>]] [generic_ext_search_handler] (0x0040): sdap_get_generic_ext_recv failed [110]: Connection timed out
(Wed Oct 13 19:41:52 2021) [be[<domain>]] [ipa_hbac_rule_info_done] (0x0080): Could not retrieve HBAC rules
(Wed Oct 13 19:41:52 2021) [be[<domain>]] [ipa_pam_access_handler_done] (0x0020): Unable to fetch HBAC rules [110]: Connection timed out
(Wed Oct 13 19:41:52 2021) [be[<domain>]] [dp_req_done] (0x0400): DP Request [PAM Account #21]: Request handler finished [0]: Success
```
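As a small aid for reading the log above, this Python sketch measures the gap between the `sdap_op_add` line and the `sdap_op_timeout` line for operation 18. The two sample lines are copied from the excerpt; the helper name and the timestamp format string are my own assumptions about this log layout, not anything SSSD ships.

```python
from datetime import datetime

# Timestamp layout as it appears in the sssd_<domain>.log excerpt.
LOG_TS_FORMAT = "%a %b %d %H:%M:%S %Y"


def parse_ts(line: str) -> datetime:
    # The timestamp sits between the first "(" and the first ")".
    stamp = line[line.index("(") + 1 : line.index(")")]
    return datetime.strptime(stamp, LOG_TS_FORMAT)


issued = "(Wed Oct 13 19:40:52 2021) [be[<domain>]] [sdap_op_add] (0x2000): New operation 18 timeout 60"
timed_out = "(Wed Oct 13 19:41:52 2021) [be[<domain>]] [sdap_op_timeout] (0x1000): Issuing timeout for 18"

elapsed = (parse_ts(timed_out) - parse_ts(issued)).total_seconds()
print(elapsed)  # 60.0
```

The search sat for exactly the 60 seconds announced by `sdap_op_add` without any result arriving, which is consistent with the request being abandoned on timeout rather than rejected.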
I tried using `ldap_enumeration_search_timeout = 180` which updated the timeout in (`[sdap_op_add] (0x2000): New operation 18 timeout 60 `) but same result (`enumerate` and `subdomain_enumerate` config options are left to their defaults, `false` and `none` respectively).
I looked at the IPA servers' LDAP logs and saw no related requests regarding `ipaHBACRules`, so I assume the request never even left the server.
I tried tweaking some other caching options like `cached_auth_timeout` and `pam_id_timeout` without any impact. Also tried clearing up the caches with `sss_cache -E` or `systemctl stop sssd && rm /var/lib/sss/db/* && systemctl start sssd` and nothing changed.
This is the error returned by `sssctl user-checks` when it fails:
```
$ sudo sssctl user-checks <user>
user: <user>
action: acct
service: system-auth
SSSD nss user lookup result:
- user name: <user>
- user id: <uid>
- group id: <gid>
- gecos: <name>
- home directory: /home/<user>
- shell: /bin/bash
InfoPipe operation failed. Check that SSSD is running and the InfoPipe responder is enabled. Make sure 'ifp' is listed in the 'services' option in sssd.conf.
InfoPipe User lookup with [<user>] failed.
testing pam_acct_mgmt
pam_acct_mgmt: System error
PAM Environment:
- no env -
```
Any pointer here would be very useful.
Thank you!
Aitor
2 years, 1 month
[SSSD] Announcing SSSD 2.6.0
by Pavel Březina
# SSSD 2.6.0
The SSSD team is proud to announce the release of version 2.6.0 of the
System Security Services Daemon. The tarball can be downloaded from:
https://github.com/SSSD/sssd/releases/tag/2.6.0
See the full release notes at:
https://sssd.io/release-notes/sssd-2.6.0.html
RPM packages will be made available for Fedora shortly.
## Feedback
Please provide comments, bugs and other feedback via the sssd-devel
or sssd-users mailing lists:
https://lists.fedorahosted.org/mailman/listinfo/sssd-devel
https://lists.fedorahosted.org/mailman/listinfo/sssd-users
## Highlights
### General information
* Support of legacy json format for ccaches was dropped
* Support of long time deprecated `secrets` responder was dropped.
* Support of long time deprecated `local` provider was dropped.
* This release drops support of `--with-unicode-lib` configure option.
`libunistring` will be used unconditionally for Unicode processing.
* This release removes pcre1 support. pcre2 is used unconditionally.
* p11_child does not stop at the first empty slot when searching for tokens
* A flaw was found in SSSD, where the sssctl command was vulnerable to
shell command injection via the logs-fetch and cache-expire subcommands.
This flaw allows an attacker to trick the root user into running a
specially crafted sssctl command, such as via sudo, to gain root access.
The highest threat from this vulnerability is to confidentiality,
integrity, as well as system availability. This patch fixes a flaw by
replacing `system()` with `execvp()`.
### New features
* Basic support of user's 'subuid and subgid ranges' for IPA provider
  and corresponding plugin for shadow-utils were introduced. Limitations:
  - single subid interval pair (subuid+subgid) per user
  - idviews aren't supported
  - only forward lookup (user -> subid ranges)

  Note that this is an MVP of an experimental feature. Significant changes
  might be required later, after initial feedback. Corresponding support in
  shadow-utils was merged upstream, but since there is no upstream release
  available yet, the SSSD feature isn't built by default. Build can be
  enabled with the `--with-subid` configure option. The plugin's install
  path can be configured with `--with-subid-lib-path=` (`${libdir}` by
  default)
### Important fixes
* KCM now replaces the old credential with the new one when storing an
updated credential that is already present in the ccache, to
avoid unnecessary growth of the ccache.
* Improve mpg search filter to be more reliable with id-overrides and
the new auto_private_groups options.
* Even if the forest root is disabled for lookups all required internal
data is initialized to be able to refresh the list of trusted domains in
the forest from a DC of the forest root.
* ccache files are created with the right ownership during offline
Smartcard authentication
* AD ping is now sent over `ldap` if `cldap` support is not available
during build. This helps to build SSSD on distributions without `cldap`
support in `libldap`.
* CVE-2021-3621
### Configuration changes
* New IPA provider's option `ipa_subid_ranges_search_base` allows
configuration of search base for user's subid ranges. Default:
`cn=subids,%basedn`
2 years, 1 month
feasible to use sssd in mostly offline mode?
by James Ralston
For our on-site Linux machines, we use the sssd-ad provider to both
map users/groups from Active Directory, and to authenticate users via
Kerberos. It works fantastically well, to the point where we have
absolutely no desire to go back to maintaining local users/groups in
/etc/passwd and /etc/group (respectively).
We are contemplating offering remote Linux laptops to our users. But
our InfoSec team is adamant that our AD DCs must not be reachable from
the Internet at large. So unless the owner of the laptop logs in to
the laptop locally and then connects to our VPN, the laptop will have
no access to Active Directory, and therefore sssd will be in offline
mode.
We are wondering whether it would be feasible to set the various sssd
caches to have long values (e.g. 180 days) so that as long as the user
fires up the laptop and connects to the VPN once every 180 days, they
will still be able to login (using their cached password) and
getpwnam/getpwuid/getgrnam/getgrgid will still work for their
uids/gids (because sssd will return the values it cached from the last
time it was able to reach the AD DCs).
It’s not clear to us how we would implement this. We could adjust
various timeouts; e.g., from:
entry_cache_timeout = 5400
cache_credentials = FALSE
To:
entry_cache_timeout = 7776000
cache_credentials = TRUE
…but I don’t think this is going to do what we want. We only want
sssd to keep entries in the cache for 90 days when it is offline. If
sssd comes back online (because the user connected to the VPN and thus
AD is reachable again), we want AD to refresh any cached entries that
are older than 90 minutes.
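For reference, the entry_cache_timeout values above are plain second counts. A quick Python check (the helper names are only illustrative) confirms that 5400 is 90 minutes and 7776000 is 90 days; the 180-day figure mentioned earlier as an example would be 15552000 seconds.

```python
SECONDS_PER_MINUTE = 60
SECONDS_PER_DAY = 24 * 60 * 60


def minutes(n: int) -> int:
    """Convert minutes to seconds."""
    return n * SECONDS_PER_MINUTE


def days(n: int) -> int:
    """Convert days to seconds."""
    return n * SECONDS_PER_DAY


print(minutes(90))  # 5400
print(days(90))     # 7776000
print(days(180))    # 15552000
```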
So, what we really want is an option for sssd that essentially says,
“You are frequently going to be operating in offline mode, so only
kick things out of the cache when you are in online mode, because
otherwise there’s a good chance that the user won’t be able to login
to fire up the VPN so that you can switch to online mode.”
Has anyone attempted running sssd in a mostly-offline environment like
this? If so, how well does it work, and what settings are you using?
Thanks!
2 years, 1 month
https://bugzilla.redhat.com/show_bug.cgi?id=1984591 not understanding the nature of the sssd bug introduced recently…
by Spike White
All (but particularly Sumit since he wrote the comments on
https://bugzilla.redhat.com/show_bug.cgi?id=1984591),
There are at least two problems created by this recently-introduced sssd
bug. One problem is solvable by the suggested work-around, the other is
not. The work-around suggested is:
[domain/name.of.joined.domain]
ad_enabled_domains = dom1.example.com, dom2.example.com,
dom3.example.com
In order to query only the desired AD domains.
What is the bug?
The sssd-ad man page says "The AD provider can be used to get user
information and authenticate users from trusted domains. Currently
only trusted domains in the same forest are recognized.".
What is happening is that untrusted AD domains are being discovered,
and a very specific type of untrusted domain at that: the joined domain
has no trust with the other domain, but that other domain trusts the
joined domain. That is a one-way trust (the wrong way). To the joined
domain, this is an untrusted domain and should not be discovered.
This is actually very common in corporate environments.
You may have a main AD domain, call it CORP.COMPANY.COM. Then for testing
and new production evaluation, you might have a test AD domain called
LAB-TEST.COMPANY.COM. CORP.COMPANY.COM is tightly controlled, with full
audits and corporate security. LAB-TEST.COMPANY.COM is a test AD domain –
it’s the wild, wild west!
So LAB-TEST.COMPANY.COM trusts the main AD domain (in order that users can
log into this test domain with their CORP accounts). But CORP.COMPANY.COM
does not trust LAB-TEST.COMPANY.COM – nor should it!! (That’s the wild,
wild west, doing so would compromise corporate security.)
Thus, a server joined to domain CORP.COMPANY.COM should discover
CORP.COMPANY.COM and any domains trusted by CORP.COMPANY.COM. It should
*NOT* discover LAB-TEST.COMPANY.COM, as CORP.COMPANY.COM does not trust
this domain.
A server joined to LAB-TEST.COMPANY.COM should discover LAB-TEST.COMPANY.COM
and all domains trusted by LAB-TEST.COMPANY.COM. Including CORP.COMPANY.COM,
as LAB-TEST.COMPANY.COM trusts CORP.COMPANY.COM.
The bug is that a server joined to CORP.COMPANY.COM discovers
LAB-TEST.COMPANY.COM, which it shouldn’t.
What problems does this cause?
Two problems.
1. Many of these untrusted discovered “lab” domains are accessible
only to specific network locations. That is, they’re firewalled off to a
particular lab. So sssd attempts to query these inaccessible AD domains
and takes a long time to time out. This problem can be worked around by
the suggested work-around in the Bugzilla:
[domain/corp.company.com]
ad_enabled_domains = corp.company.com
So then, while LAB-TEST.COMPANY.COM is still erroneously discovered, it is
no longer searched. Sssd is again fast.
2. Bogus messages in /var/log/sssd_nss.log file. Even with no debug
level set in the [nss] stanza, these error messages appear multiple times a
second. It quickly fills up the /var/log filesystem.
[root@auspdfdlobv01 sssd]# cat sssd_nss.log |grep "The Data Provider
returned an error"
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider
returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider
returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider
returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider
returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider
returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider
returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider
returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider
returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider
returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider
returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider
returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider
returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider
returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider
returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider
returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider
returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider
returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider
returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider
returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider
returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider
returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider
returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider
returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider
returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider
returned an error [org.freedesktop.DBus.Error.Failed]
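To put a number on "multiple times a second", a short Python sketch like the following could tally matching messages per timestamp. It assumes each entry sits on a single line in the real log file (the mail client wrapped the lines above), and the regular expression and function name are my own guesses at a reasonable parser for this layout, not an SSSD tool.

```python
import re
from collections import Counter

# "(timestamp): [service] [function] (level): message"
LINE_RE = re.compile(
    r"^\((?P<ts>[^)]+)\): \[(?P<svc>[^\]]+)\] \[(?P<func>[^\]]+)\] "
    r"\((?P<level>0x[0-9a-fA-F]+)\): (?P<msg>.*)$"
)


def count_per_second(lines, needle):
    """Count log entries whose message contains `needle`, keyed by timestamp."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and needle in match.group("msg"):
            counts[match.group("ts")] += 1
    return counts


# Sample data mirroring the excerpt above, joined back onto single lines.
sample = [
    "(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): "
    "The Data Provider returned an error [org.freedesktop.DBus.Error.Failed]"
] * 25

print(count_per_second(sample, "The Data Provider returned an error"))
```

Run against the real file (for example by passing an open file object as `lines`), the busiest timestamps give a rough estimate of how quickly the log grows.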
From debug level 9, it is clear that this is arising from a query of these
erroneously-discovered untrusted domains. Here’s an example of one
instance of above with debug level 9 turned on. So
emeaicmd.geodll.company.com is one of these erroneously-discovered
untrusted lab domains, that happens to be firewalled off from this
particular AD client:
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x1000): Got reply from
Data Provider - DP error code: 0 errno: 0 error message: Success
(2021-10-07 9:50:02): [nss] [cache_req_search_cache] (0x0400): CR #9:
Looking up [oracle@company.com] in cache
(2021-10-07 9:50:02): [nss] [cache_req_search_cache] (0x0400): CR #9:
Object [oracle@company.com] was not found in cache
(2021-10-07 9:50:02): [nss] [cache_req_search_ncache_add_to_domain]
(0x0400): CR #9: Adding [oracle@company.com] to negative cache
(2021-10-07 9:50:02): [nss] [is_user_local_by_name] (0x0400): User
oracle@company.com is a local user
(2021-10-07 9:50:02): [nss] [sss_ncache_set_str] (0x0400): Adding
[NCE/USER/company.com/oracle@company.com] to negative cache
(2021-10-07 9:50:02): [nss] [cache_req_validate_domain_type] (0x2000):
Request type POSIX-only for domain EMEAICMD.geodll.company.com type POSIX
is valid
(2021-10-07 9:50:02): [nss] [cache_req_set_domain] (0x0400): CR #9: Using
domain [EMEAICMD.geodll.company.com]
(2021-10-07 9:50:02): [nss] [cache_req_prepare_domain_data] (0x0400): CR
#9: Preparing input data for domain [EMEAICMD.geodll.company.com] rules
(2021-10-07 9:50:02): [nss] [cache_req_search_send] (0x0400): CR #9:
Looking up oracle@emeaicmd.geodll.company.com
(2021-10-07 9:50:02): [nss] [cache_req_search_ncache] (0x0400): CR #9:
Checking negative cache for [oracle@emeaicmd.geodll.company.com]
(2021-10-07 9:50:02): [nss] [sss_ncache_check_str] (0x2000): Checking
negative cache for [NCE/USER/
EMEAICMD.geodll.company.com/oracle@emeaicmd.geodll.company.com]
(2021-10-07 9:50:02): [nss] [cache_req_search_ncache] (0x0400): CR #9: [
oracle@emeaicmd.geodll.company.com] is not present in negative cache
(2021-10-07 9:50:02): [nss] [cache_req_search_cache] (0x0400): CR #9:
Looking up [oracle@emeaicmd.geodll.company.com] in cache
(2021-10-07 9:50:02): [nss] [cache_req_search_cache] (0x0400): CR #9:
Object [oracle@emeaicmd.geodll.company.com] was not found in cache
(2021-10-07 9:50:02): [nss] [cache_req_search_dp] (0x0400): CR #9: Looking
up [oracle@emeaicmd.geodll.company.com] in data provider
(2021-10-07 9:50:02): [nss] [sss_dp_issue_request] (0x0400): Issuing
request for [0x564d6be36a70:3:oracle@emeaicmd.geodll.company.com@
EMEAICMD.geodll.company.com]
(2021-10-07 9:50:02): [nss] [sss_dp_get_account_msg] (0x0400): Creating
request for [EMEAICMD.geodll.company.com
][0x3][BE_REQ_INITGROUPS][name=oracle@emeaicmd.geodll.company.com:-]
(2021-10-07 9:50:02): [nss] [sbus_add_timeout] (0x2000): 0x564d6ccd6670
(2021-10-07 9:50:02): [nss] [sss_dp_internal_get_send] (0x0400): Entering
request [0x564d6be36a70:3:oracle@emeaicmd.geodll.company.com@
EMEAICMD.geodll.company.com]
(2021-10-07 9:50:02): [nss] [cache_req_search_cache] (0x0400): CR #12:
Looking up [oracle@company.com] in cache
(2021-10-07 9:50:02): [nss] [cache_req_search_cache] (0x0400): CR #12:
Object [oracle@company.com] was not found in cache
(2021-10-07 9:50:02): [nss] [cache_req_search_ncache_add_to_domain]
(0x0400): CR #12: Adding [oracle@company.com] to negative cache
(2021-10-07 9:50:02): [nss] [is_user_local_by_name] (0x0400): User
oracle@company.com is a local user
(2021-10-07 9:50:02): [nss] [sss_ncache_set_str] (0x0400): Adding
[NCE/USER/company.com/oracle@company.com] to negative cache
(2021-10-07 9:50:02): [nss] [cache_req_validate_domain_type] (0x2000):
Request type POSIX-only for domain EMEAICMD.geodll.company.com type POSIX
is valid
(2021-10-07 9:50:02): [nss] [cache_req_set_domain] (0x0400): CR #12: Using
domain [EMEAICMD.geodll.company.com]
(2021-10-07 9:50:02): [nss] [cache_req_prepare_domain_data] (0x0400): CR
#12: Preparing input data for domain [EMEAICMD.geodll.company.com] rules
(2021-10-07 9:50:02): [nss] [cache_req_search_send] (0x0400): CR #12:
Looking up oracle@emeaicmd.geodll.company.com
(2021-10-07 9:50:02): [nss] [cache_req_search_ncache] (0x0400): CR #12:
Checking negative cache for [oracle@emeaicmd.geodll.company.com]
(2021-10-07 9:50:02): [nss] [sss_ncache_check_str] (0x2000): Checking
negative cache for [NCE/USER/
EMEAICMD.geodll.company.com/oracle@emeaicmd.geodll.company.com]
(2021-10-07 9:50:02): [nss] [cache_req_search_ncache] (0x0400): CR #12: [
oracle@emeaicmd.geodll.company.com] is not present in negative cache
(2021-10-07 9:50:02): [nss] [cache_req_search_cache] (0x0400): CR #12:
Looking up [oracle@emeaicmd.geodll.company.com] in cache
(2021-10-07 9:50:02): [nss] [cache_req_search_cache] (0x0400): CR #12:
Object [oracle@emeaicmd.geodll.company.com] was not found in cache
(2021-10-07 9:50:02): [nss] [cache_req_search_dp] (0x0400): CR #12:
Looking up [oracle@emeaicmd.geodll.company.com] in data provider
(2021-10-07 9:50:02): [nss] [sss_dp_issue_request] (0x0400): Issuing
request for [0x564d6be36a70:3:oracle@emeaicmd.geodll.company.com@
EMEAICMD.geodll.company.com]
(2021-10-07 9:50:02): [nss] [sss_dp_issue_request] (0x0400): Identical
request in progress: [0x564d6be36a70:3:oracle@emeaicmd.geodll.company.com@
EMEAICMD.geodll.company.com]
(2021-10-07 9:50:02): [nss] [sss_dp_req_destructor] (0x0400): Deleting
request: [0x564d6be36a70:3:oracle@company.com@company.com]
(2021-10-07 9:50:02): [nss] [sbus_remove_timeout] (0x2000): 0x564d6ccd6670
(2021-10-07 9:50:02): [nss] [sbus_dispatch] (0x4000): dbus conn:
0x564d6ccc9300
(2021-10-07 9:50:02): [nss] [sbus_dispatch] (0x4000): Dispatching.
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider
returned an error [org.freedesktop.DBus.Error.Failed]
The suggested work-around does not resolve problem #2.
BTW, here is a listing of the domains discovered on that sssd client:
[root@auspdfdlobv01 ~]# sssctl domain-list
amer.company.com
company.com
japn.company.com
emea.company.com
apac.company.com
EMEAICMD.geodll.company.com
geodll.company.com
EMEAICM.GEODLL.COMPANY.COM
alienware.com
corp.svcs
perotsystems.net
companyservices.dmz
Beer.Town
production.online.company.com
jp-poclab.companypoc.com
emea-poclab.companypoc.com
oldev.preol.company.com
olqa.preol.company.com
ap-poclab.companypoc.com
[root@auspdfdlobv01 ~]#
This sssd client is joined to amer.company.com, so the only trusted domains
are the first 5. The parent domain and the 4 regional domains. All
those other domains below that are untrusted domains. More specifically,
they trust company.com, but company.com does not trust them. (one way
trust – the wrong way.) Some look like the real wild wild west (Beer.Town
?).
Spike
2 years, 1 month