On Tue, 2012-08-07 at 13:15 -0400, Stephen Gallagher wrote:
On Tue, 2012-08-07 at 19:08 +0200, Michal Zidek wrote:
> On 08/07/2012 04:11 PM, Simo Sorce wrote:
> > On Tue, 2012-08-07 at 15:40 +0200, Michal Zidek wrote:
> >> On 08/06/2012 08:57 PM, Simo Sorce wrote:
> >>> The major question I have is that this new code introduces O(N^2)
> >>> behavior, if there are more then a handful of groups it will be quite
> >>> costly. Can we find a way that is not so expensive ?
> >> I think this is not a big problem. If I understand the code correctly,
> >> the inner cycle only iterates through potential direct parent groups (we
> >> add group member to them later) and it is unlikely to be a big number in
> >> real scenarios (Or am I wrong? Is it common that a group has so many
> >> direct parents that it would be expensive?).
> > On what data do you base your assertion about probability, not saying
> > you are wrong, but is it a gut feeling or do you have actual data ?
> >
>
> It is just my feeling. Database with many groups where each has few
> DIRECT parents sounds OK, but
> many groups with many direct parents sound like a mess to me. But maybe
> I am wrong, I tested it only
> on a very small test database. If you have a large test database with
> complex group hierarchy, you could
> test how much time sssd_be spends in this loops.
>
> Another thing to consider is that this process only takes place when the
> information about group
> memberships can not be retrieved from local cache.
>
> > You may want to add a talloc_free(add); before the continue; but you
> > could get even more gains if you instead of freeing the add array at
> > each loop always just realloc it if grp_count is > than the previous
> > grp_count. Allocations tend to be expensive and here you can spare a
> > lot of allocation given you always leave the array in consistent state
> > by adding the terminating NULL.
>
> Good idea, I rewrote it using the talloc_realloc. Updated patch attached.
My motto here would be "Get it right first". If we find out later that
it's causing a bottleneck, we'll optimize it later. We don't have any
real-world data suggesting that O(n^2) is too slow for this particular
operation.
Ok, makes sense, and ack to the last incarnation of the patch.
Good work Michal.
Simo.
--
Simo Sorce * Red Hat, Inc * New York