On (24/04/15 12:43), Jakub Hrozek wrote:
On Thu, Apr 23, 2015 at 07:29:07AM -0400, Stephen Gallagher wrote:
On Thu, 2015-04-23 at 08:14 +0200, Lukas Slebodnik wrote:
On (20/04/15 14:38), Stephen Gallagher wrote:
On Mon, 2015-04-20 at 08:53 +0200, Lukas Slebodnik wrote:
ehlo,
attached patch fixes crash in https://fedorahosted.org/sssd/ticket/2629
Nack. I'd rather we fixed the root of this problem. I did some digging this afternoon and tracked the issue back to ad_gpo.c line 3499 (in current master). If we get back a NULL result or num_results == 0, then we just skip over this item in the list and start processing the next one. Unfortunately, that leaves an item in the candidate_gpos list that was never properly constructed.
You are right.
We got a referral to GPO and therefore we do not find any attributes.
[sdap_sd_search_send] (0x0400): Searching entry [cn={2BA15B73-9524- 419F-B4B7-185E1F0D3DCF},cn=policies,cn=system,DC=example,DC=com] using SD [sdap_print_server] (0x2000): Searching 10.1.1.14 [sdap_get_generic_ext_step] (0x0400): calling ldap_search_ext with [(objectclass=*)][cn={2BA15B73-9524-419F-B4B7- 185E1F0D3DCF},cn=policies,cn=system,DC=example,DC=com]. [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [nTSecurityDescriptor] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [cn] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [gPCFileSysPath] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [gPCMachineExtensionNames] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [gPCFunctionalityVersion] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [flags] [sdap_get_generic_ext_step] (0x2000): ldap_search_ext called, msgid = 14 [sdap_process_result] (0x2000): Trace: sh[0x7f5d409a8c60], connected[1], ops[0x7f5d409d2c10], ldap[0x7f5d409a9ed0] [sdap_process_result] (0x2000): Trace: ldap_result found nothing! [sdap_process_result] (0x2000): Trace: sh[0x7f5d409a8c60], connected[1], ops[0x7f5d409d2c10], ldap[0x7f5d409a9ed0] [sdap_process_message] (0x4000): Message type: [LDAP_RES_SEARCH_RESULT] [sdap_get_generic_op_finished] (0x0400): Search result: Referral(10), 0000202B: RefErr: DSID-0310063C, data 0, 1 access points ref 1: 'lzb.hq'
[sdap_get_generic_op_finished] (0x1000): Ref: ldap://lzb.hq/cn=%7B2BA15B73-9524-419F-B4B7- 185E1F0D3DCF%7D,cn=policies,cn=system,DC=example,DC=com [ad_gpo_get_gpo_attrs_done] (0x0040): no attrs found for GPO; try next GPO.
Under what circumstances can the secinfo_dacl search return success but with zero results? Is there a bug or a race here (such as the AD server has updated the GPO since we got the list of candidate GPOs?). How best to handle this?
With your patch here, it looks like we're assuming that it's okay to just skip over this GPO. If that's the case, then what we really need to be doing in ad_gpo_get_gpo_attrs_step() is to mark the gp_gpo as being invalid and then after we've gone through them all, shrink the array, removing all of the invalid entries. This will be more future- proof, as it's not just the gpo_sd that is uninitialized here. Every member of this gp_gpo is NULL except for the DN.
If it's *not* okay that we've gotten no results for this lookup (such as in the race case; we don't want to be skipping over a GPO that might properly be denying users), we may need to restart processing at least a couple times to try to avoid the race and go offline if we can't complete the processing (so we at least stick to our cached rules).
I'm sorry I didn't noticed it in log files the first time. Now we know the root of problem. Which of your proposal do you prefer now?
LS
OK, I hadn't considered the referral scenario. (I wasn't actually aware that GPOs could *be* referred to elsewhere). Is this referral pointing to a different machine than the host is enrolled to? It's trying to reach a DN at lzb.hq that is identical to the one we just requested.
I'd really like to know how to reproduce the setup for this situation; is it inheriting a GPO from a parent domain somehow?
Neither of my above proposals are a proper solution for this case, unfortunately. The first one would make the crash go away, but if this GPO really exists and is restrictive, not processing it is *really bad*. Right now we're actually slightly lucky in that the crash has a net result of causing an access denial rather than erroneously resulting in a user login, but if we were to "solve" this problem the way Lukas's original patch intended (or my first alternative above), it would be introducing a security issue. So let's not do either of those.
The second option I suggested there is invalid as well, since the entry *exists*, it's just been converted to a referral. There are therefore only two safe options at this time:
- Quick-and-ugly: If we encounter a referral, skip straight to denial
and loudly warn in the logs that this is a situation we are unable to handle yet. This will avoid the crash, but will leave the GPO behavior equivalently bad to the current state. Might be acceptable if this is only happening on a very small subset of systems.
I'm afraid we need to go this way now.
Users experiencing this might be able to set "ldap_referrals = True" on the affected systems to quietly solve the problem at the expense of all the other side-effects to that (particularly performance)
I wouldn't even suggest enabling referrals. I've seen the LDAP provider misbehave completely with referrals enabled, openldap-libs is not really great at chasing them and LDAP provider would go offline etc.. So enabling referrals doesn't just degrade performance, but also functionality..
I also tested that using Global Catalog wouldn't help here, some attributes we require are not replicated to GC, such as gPCFileSysPath.
Are you sure we're using the right connection to the right domain? We can select the right connection based on the DN that we know. I'm sorry, but I couldn't reproduce the bug myself.
Stephen wrote a reproducer into ticket https://fedorahosted.org/sssd/ticket/2629
We agreed to do the "Quick-and-ugly" way for 1.12 + documentation but he mentioned that it is walid use-case and it would be good to implement it in next release.
LS