On Mon, Oct 05, 2015 at 04:46:55PM +0200, Jakub Hrozek wrote:
On Mon, Oct 05, 2015 at 12:33:48PM +0200, Sumit Bose wrote:
> On Mon, Oct 05, 2015 at 12:08:26PM +0200, Jakub Hrozek wrote:
> > On Fri, Oct 02, 2015 at 10:32:27AM +0200, Jakub Hrozek wrote:
> > > On Thu, Oct 01, 2015 at 01:41:07PM +0200, Jakub Hrozek wrote:
> > > > I don't have a good idea how to reproduce except simulate the
failure in
> > > > gdb, sorry... at least I verified that after setting the context to
NULL:
> > > > (gdb) set clist[1] = 0
> > > > the request runs to completion and SSSD doesn't crash anymore. So
I think
> > > > we should just add a check..
> > >
> > > Turns out this is easier to reproduce and hence more embarassing. Just
> > > add POSIX attributes to AD but don't replicate them to GC..
> >
> > Any takers for review?
>
> (I already took it some time ago in patchworks :-)
>
> While the patch fixes the issue and is also a right way to fix the issue
> I would recommend to change it a bit to make the reason more clear.
>
> As you said to reproduce the issue you need trust to AD configured which
> POSIX attributes but not replicate them to the Global Catalog. If you
> new start with an empty cache a sequence like
>
> getent group group_with_posix_gid(a)ad.domain
> sss_cache -E
> getent group group_with_posix_gid(a)ad.domain
>
> will trigger the crash.
>
> What happens is that during the first lookup SSSD sees that the GC does
> not have the POSIX attributes and disable the GC lookup. As a
> consequence ad_gc_conn_list() will return only a single entry for the
> second lookup and unconditionally accessing the second entry will fail.
> While the fix is correct, I would recommend to add a boolean option to
> ad_gc_conn_list() and set ignore_mark_offline in ad_gc_conn_list().
> Otherwise you have to add extra logic to ipa_get_ad_acct_send() to make
> sure ignore_mark_offline is set even it ad_gc_conn_list() only returns
> one element.
You're right that the logic was scattered all over the place. The
attached patches consolidate it into ad_common.c so all callers use the
same code. It's a larger change than before, but hopefully it's still
digestable and as a bonus we can have tests for different combinations
of contexts, GC status and master/sub domains.
The second patch is not really needed, the only reason I included it is
that I think it's better to have similar code grouped together in small
functions and again, tests.
If you think these are too invasive, we can only apply them to master
and do exactly what you proposed for sssd-1-13.
btw I had a bit of trouble with downstream tests, the seem to fail in
unrelated way, so I'm sending the patches to the list after some manual
testing and will resume the downstream tests tomorrow..
The attached patches fix some more Sumit's review comments. With Lukas'
help I was also able to run downstream ad_forest tests and didn't see
any issues even with the patches.