On 09/04/2015 02:35 PM, Jakub Hrozek wrote:
>Hi,
>
>the attached patches implement
https://fedorahosted.org/sssd/ticket/2637
>
>There are two main use-cases:
> 1) If AD DCs are not reachable on the IPA server itself, we should
> avoid going offline completely, at least the IPA domain should be
> still reachable.
> 2) If SSSD is connected to a non-root AD DC, we still try to contact
> the forest root because only the forest root normally knows all the
> subdomains. But in many setups, the forest root is not reachable
> due to network restrictions.
>
>The full design is described here:
>https://fedorahosted.org/sssd/wiki/DesignDocs/OneWayTrusts#Subdomainofflinestatuschanges
>
>There is one area that I'm not sure about myself -- in the
>ad_subdomains.c changes, I always set ignore_mark_offline to true for
>the root DC. I think it's safe because in this case the root domain is a
>subdomain and we don't want a subdomain to allow marking the be as
>offline, but I would like to ask that this change is double-checked.
I think it is ok as well.
>
>I mostly checked by pausing AD DC VMs in my test setups and making sure
>that the backend stays offline after lookup error, a subsequent lookup
>is answered as if we were online on the backend side and that the
>inactive status is reset. Also, main domain failures still must mark
>sssd as offline.
Hi, I'm about to code-ack those patches. But I would like to make more clear
about the tri-state in the following function.
>+static void be_mark_subdom_offline(struct sss_domain_info *subdom,
>+ struct be_ctx *be_ctx)
>+{
>+ struct timeval tv;
>+ struct tevent_timer *timeout = NULL;
>+ int reset_status_timeout;
>+
>+ reset_status_timeout = get_offline_timeout(be_ctx);
>+ tv = tevent_timeval_current_ofs(reset_status_timeout, 0);
>+
>+ switch (subdom->state) {
>+ case DOM_DISABLED:
>+ DEBUG(SSSDBG_MINOR_FAILURE, "Won't touch disabled
subdomain\n");
>+ return;
>+ case DOM_INACTIVE:
>+ DEBUG(SSSDBG_TRACE_ALL, "Subdomain already disabled\n");
Subdomain is inactive not disabled.
>+ return;
>+ case DOM_ENABLED:
>+ DEBUG(SSSDBG_TRACE_LIBS, "Disabling subdomain %s\n",
subdom->name);
You are marking it as inactive not as disabled.
>+ break;
>+ }
>+
>+ timeout = tevent_add_timer(be_ctx->ev, be_ctx, tv,
>+ be_subdom_reset_status, subdom);
>+ if (timeout == NULL) {
>+ DEBUG(SSSDBG_OP_FAILURE, "Cannot create timer\n");
>+ return;
>+ }
>+
>+ subdom->state = DOM_INACTIVE;
^^
>+}
So are we marking it as disabled, inactive or offline?
Have you considered to rename DOM_INACTIVE -> DOM_OFFLINE maybe? I think it
more complies to the situation.
Also DOM_ENABLED should be DOM_ACTIVE (according to design document and it
makes more sense) or DOM_ONLINE (if you decide to rename DOM_INACTIVE to
DOM_OFFLINE).
I will test the patches tomorrow.
The only reason I didn't name the constants as online/offline is that I
still think this is a bit of a hack and we should have a proper failover
support in the future.
But it's also only an internal change, so I'm fine changing the patches
as well. (I can do that in-tree and then resubmit if there are more
changes required).