On Mon, 2014-06-02 at 17:36 +0200, Joschi Brauchle wrote:
> On 06/02/2014 07:51 AM, John Hodrien wrote:
> > On Mon, 2 Jun 2014, Stephen Gallagher wrote:
> >
> >> This is the real problem. If SSSD can route to the IP address,
> >> then we have to proceed assuming that the LDAP server should be
> >> available (thereby attempting to connect to it and perform
> >> online authentication). There's really no way to determine ahead
> >> of time whether the service is "supposed" to be available.
> >>
> >> You may want to play with the option 'ldap_opt_timeout' (see
> >> sssd-ldap(5)). It controls how long the OpenLDAP client libraries
> >> will wait for a response (in your case, how long it will wait
> >> while the packets are dropped). It defaults to 6s.
> >
> > This should be a one off hit though, right? If I discover the
> > LDAP server is offline, I should remember this, admittedly recheck
> > periodically, but never cause another delay waiting for it to
> > spring back into life. Given the way some of these laptops are
> > used, I'd even quite like to configure it to default to this
> > state.
> >
> > When I last tried this (which was a while ago) these delays would
> > happen repeatedly, so the setup was unusable, and I had to ditch
> > sssd on the laptop.
>
> Well, in most common cases, the LDAP server is unresolvable when not
> on the VPN/inside the network, so SSSD immediately detects that it
> can't get there and the delay is unnoticeable.
>
> It's those cases where the server is addressable but unresponsive that
> are much harder to handle.
>
> Right now, we have a two-minute sleep between operations trying to go
> online again. (I think I saw a patch go in for 1.12 that makes this
> configurable). That's mostly so that we catch cases where you've
> connected to the VPN but for one reason or another SSSD doesn't get
> notified that the network state changed (there are lots of edge-cases
> that cause this).
>
> I am not 100% sure that the LDAP server being unresponsive is the
> cause... Once I have the logs I will know more!
>
> But isn't this a design flaw of the LDAP connectivity test?
> If connectivity is tested only after some application or the system
> requests information from SSSD and the server is unresponsive, this
> causes a long and unpleasant delay if the request is kept pending until
> the connection times out.
>
> Hence, I'd suggest that SSSD periodically tests the LDAP connection in
> the background (or after a network state change) *without* an actual
> request triggering this. As long as the LDAP server is unreachable or
> unresponsive, SSSD should stay in offline mode and answer requests right
> away with cached results.

SSSD should already do this by way of the midway refresh feature.
However, I am not sure it works as expected when the fast cache is in
use.

You can temporarily work around this with a background script (run from
cron, for example) that regularly runs 'getent passwd username', so that
users will hopefully almost always hit SSSD when it is already offline.
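
A minimal sketch of such a script, assuming it is saved under a
hypothetical path such as /usr/local/bin/sssd-warm.sh and that
'username' is replaced by a real account served by SSSD. The function
name and the cron interval are illustrative choices, not anything SSSD
prescribes:

```shell
#!/bin/sh
# Hypothetical helper: trigger an SSSD lookup in the background so that
# an unresponsive LDAP server is (hopefully) noticed before an
# interactive request runs into the timeout.

warm_sssd() {
    # "username" is a placeholder; pass an account that SSSD serves.
    lookup_user="${1:-username}"

    # The output is discarded; only the lookup itself matters.
    # "|| true" keeps a failed lookup from being treated as an error.
    getent passwd "$lookup_user" >/dev/null 2>&1 || true

    # Always report success so cron does not send mail on failures.
    return 0
}

warm_sssd "$@"

# Example crontab entry (assumed interval of two minutes):
# */2 * * * * /usr/local/bin/sssd-warm.sh username
```

Whether this actually keeps the delay away from users depends on how
SSSD handles the lookup, so treat it as a stopgap to try rather than a
fix.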
Simo.
--
Simo Sorce * Red Hat, Inc * New York