On Thu, Oct 20, 2016 at 05:58:50AM -0000, Aleksey Maksimov wrote:
Hello, Justin!
> "I cannot confirm if the link you provided has the correct steps"
I have tested these steps on several other servers, and everything works.
The problem occurs on only one server (I wrote about this earlier).
That is the strangest part of this situation.
> "I would search for 'mark_offline' in the domain log file and look just
above this to get an idea of what causes the backend to be set offline."
Here's what I found in the domain log:
(Wed Oct 19 16:52:25 2016) [sssd[be[ad.holding.com]]] [fo_set_port_status] (0x0100):
Marking port 389 of server 'msk-dc01.holding.com' as 'not working'
(Wed Oct 19 16:52:25 2016) [sssd[be[ad.holding.com]]] [ad_user_data_cmp] (0x1000):
Comparing LDAP with LDAP
(Wed Oct 19 16:52:25 2016) [sssd[be[ad.holding.com]]] [fo_set_port_status] (0x0400):
Marking port 389 of duplicate server 'msk-dc01.holding.com' as 'not
working'
(Wed Oct 19 16:52:25 2016) [sssd[be[ad.holding.com]]] [ad_user_data_cmp] (0x1000):
Comparing LDAP with LDAP
(Wed Oct 19 16:52:25 2016) [sssd[be[ad.holding.com]]] [sdap_handle_release] (0x2000):
Trace: sh[0x7f590093da10], connected[1], ops[(nil)], ldap[0x7f5900927d10],
destructor_lock[0], release_memory[0]
(Wed Oct 19 16:52:25 2016) [sssd[be[ad.holding.com]]] [remove_connection_callback]
(0x4000): Successfully removed connection callback.
(Wed Oct 19 16:52:25 2016) [sssd[be[ad.holding.com]]] [be_mark_offline] (0x2000): Going
offline!
(Wed Oct 19 16:52:25 2016) [sssd[be[ad.holding.com]]] [be_mark_offline] (0x2000):
Initialize check_if_online_ptask.
(Wed Oct 19 16:52:25 2016) [sssd[be[ad.holding.com]]] [be_ptask_create] (0x0400):
Periodic task [Check if online (periodic)] was created
(Wed Oct 19 16:52:25 2016) [sssd[be[ad.holding.com]]] [be_ptask_schedule] (0x0400): Task
[Check if online (periodic)]: scheduling task 82 seconds from now [1476885227]
(Wed Oct 19 16:52:25 2016) [sssd[be[ad.holding.com]]] [be_run_offline_cb] (0x0080): Going
offline. Running callbacks.
(Wed Oct 19 16:52:25 2016) [sssd[be[ad.holding.com]]] [sdap_id_op_connect_done] (0x4000):
notify offline to op #1
(Wed Oct 19 16:52:25 2016) [sssd[be[ad.holding.com]]] [be_mark_dom_offline] (0x1000):
Marking subdomain holding.com offline
(Wed Oct 19 16:52:25 2016) [sssd[be[ad.holding.com]]] [be_mark_subdom_offline] (0x1000):
Marking subdomain holding.com as inactive
(Wed Oct 19 16:52:25 2016) [sssd[be[ad.holding.com]]] [ad_subdomains_root_conn_done]
(0x0040): Failed to connect to AD server: [11](Resource temporarily unavailable)
(Wed Oct 19 16:52:25 2016) [sssd[be[ad.holding.com]]] [get_subdomains_callback] (0x0400):
Backend returned: (1, 11, <NULL>) [Provider is Offline (Have exhausted maximum
number of retries for service)]
(Wed Oct 19 16:52:25 2016) [sssd[be[ad.holding.com]]] [be_queue_next_request] (0x4000):
Queued request filed successfully.
(Wed Oct 19 16:52:25 2016) [sssd[be[ad.holding.com]]] [sdap_id_op_connect_done] (0x4000):
notify offline to op #2
(Wed Oct 19 16:52:25 2016) [sssd[be[ad.holding.com]]] [be_mark_dom_offline] (0x1000):
Marking subdomain holding.com offline
(Wed Oct 19 16:52:25 2016) [sssd[be[ad.holding.com]]] [be_mark_subdom_offline] (0x4000):
Subdomain already inactive
(Wed Oct 19 16:52:25 2016) [sssd[be[ad.holding.com]]] [ad_subdomains_root_conn_done]
(0x0040): Failed to connect to AD server: [11](Resource temporarily unavailable)
(Wed Oct 19 16:52:25 2016) [sssd[be[ad.holding.com]]] [sdap_id_release_conn_data]
(0x4000): releasing unused connection
(Wed Oct 19 16:52:25 2016) [sssd[be[ad.holding.com]]] [get_subdomains_callback] (0x0400):
Backend returned: (0, 0, <NULL>) [Success (Success)]
(Wed Oct 19 16:52:25 2016) [sssd[be[ad.holding.com]]] [be_queue_next_request] (0x4000):
Request queue is empty.
Why is SSSD trying to access the root domain controller
(msk-dc01.holding.com)?
Because it's trying to read the list of trusted domains in the forest.
In our case, the root domain controllers have access restrictions. This is
normal for large domains.
In my sssd.conf I explicitly specified the domain controllers that SSSD
should use for authorization:
[domain/ad.holding.com]
ad_server = kom-dc01.ad.holding.com, kom-dc02.ad.holding.com
Yes, but the trusted domains are discovered by the "subdomains_provider"
which always uses SRV discovery to read the list of AD DCs. Starting
with sssd-1.14 you can use a new option "ad_enabled_domains" to only
list the domains sssd should be allowed to "see".
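As a rough sketch of that option, assuming sssd-1.14 or later and assuming
that ad.holding.com is the only domain this client needs to see, the domain
section might look something like this (the ad_server line is copied from
the configuration quoted above; the ad_enabled_domains value is only my
guess at what fits this setup):

```ini
[domain/ad.holding.com]
ad_server = kom-dc01.ad.holding.com, kom-dc02.ad.holding.com
# Assumption: only the joined domain should be visible to sssd.
# Add further domains here, comma-separated, if they are needed.
ad_enabled_domains = ad.holding.com
```

Domains not listed there should then be ignored, so it is worth checking the
domain log afterwards to confirm that the backend no longer goes offline.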
In previous versions, the only other workaround is to disable the
subdomains provider completely:
subdomains_provider = none
but then you need to define the domain SID manually if you want to use
ID-mapping.
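For illustration, a minimal sketch of that workaround could look like the
following. I'm assuming here that the SID is supplied through the
ldap_idmap_default_domain_sid option described in sssd-ldap(5); the SID
value shown is a placeholder, not a real one, and has to be replaced with
the actual SID of ad.holding.com.

```ini
[domain/ad.holding.com]
ad_server = kom-dc01.ad.holding.com, kom-dc02.ad.holding.com
# Turn off trusted-domain discovery entirely.
subdomains_provider = none
# Placeholder value (assumption): substitute the real domain SID.
ldap_idmap_default_domain_sid = S-1-5-21-XXXXXXXXXX-XXXXXXXXXX-XXXXXXXXXX
```

With the subdomains provider disabled, users from trusted domains will not
be resolvable, so this only makes sense if the client needs ad.holding.com
alone.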