On 08/07/2012 07:35 PM, Simo Sorce wrote:
On Tue, 2012-08-07 at 13:15 -0400, Stephen Gallagher wrote:
> On Tue, 2012-08-07 at 19:08 +0200, Michal Zidek wrote:
>> On 08/07/2012 04:11 PM, Simo Sorce wrote:
>>> On Tue, 2012-08-07 at 15:40 +0200, Michal Zidek wrote:
>>>> On 08/06/2012 08:57 PM, Simo Sorce wrote:
>>>>> The major question I have is that this new code introduces O(N^2)
>>>>> behavior; if there are more than a handful of groups it will be quite
>>>>> costly. Can we find a way that is not so expensive?
>>>> I think this is not a big problem. If I understand the code correctly,
>>>> the inner cycle only iterates through potential direct parent groups (we
>>>> add group member to them later) and it is unlikely to be a big number in
>>>> real scenarios (Or am I wrong? Is it common that a group has so many
>>>> direct parents that it would be expensive?).
>>> On what data do you base your assertion about probability? Not saying
>>> you are wrong, but is it a gut feeling, or do you have actual data?
>>>
>>
>> It is just my feeling. A database with many groups where each has few
>> DIRECT parents sounds OK, but many groups with many direct parents
>> sounds like a mess to me. But maybe I am wrong; I tested it only on a
>> very small test database. If you have a large test database with a
>> complex group hierarchy, you could measure how much time sssd_be
>> spends in these loops.
>>
>> Another thing to consider is that this process only takes place when
>> the information about group memberships cannot be retrieved from the
>> local cache.
>>
>>> You may want to add a talloc_free(add); before the continue; but you
>>> could get even more gains if, instead of freeing the add array on
>>> each loop iteration, you just realloc it when grp_count is greater
>>> than the previous grp_count. Allocations tend to be expensive, and
>>> here you can spare a lot of allocations, given that you always leave
>>> the array in a consistent state by adding the terminating NULL.
>>
>> Good idea, I rewrote it using talloc_realloc. Updated patch attached.
>
>
> My motto here would be "Get it right first". If we find out later that
> it's causing a bottleneck, we'll optimize it later. We don't have any
> real-world data suggesting that O(n^2) is too slow for this particular
> operation.
Ok, makes sense, and ack to the last incarnation of the patch.
Good work Michal.
Simo.
I was instructed to test it; the test was successful. Ack.
--
Ondrej Kos
Associate Software Engineer
Identity Management
Red Hat Czech
cell: +420-736-417-909
phone: +420-532-294-558
ext.: 82-62558
irc: okos @ #brno