Hi Simon,
I have to admit that the patch is really quite big and, actually, it
has by far exceeded size and time limits I would normally apply to
patches to third party components.
The patch can be theoretically split into 3 parts:
1. Changes to ldap_child related to returned ticket expiration date;
2. Changes to failover subsystem needed to return number of servers
registered for failover;
3. Changes to LDAP ID backend connection and retry logic.
As you can see, the first two items are really small and absolutely
pointless without the last.
The reason why the changes to LDAP ID backend connection and retry
logic must go together are very simple: old logic relies on gsh member
of sdap_id_ctx, while in new logic there is no such a member.
The reason why gsh needs to go away is as follows:
1. gsh enforces that there will be one and only one connection to DS;
2. When connection is about to expire we can not use it for new
request as it will expire halfway;
3. But at the same time connection could be yet busy with previous request;
4. Therefore we have to make a new connection and close old one as
soon as requests using it are finished.
> The goal of SSSD is to never use more than one connection at a time for
> account information. So your patch is kind of changing our fundamental
> goal by allowing multiple connections. We need to carefully evaluate
> that part.
As I have explained above, the only time we have more then one
connection to DS is when old connection is about to expire and we need
to open new one. So when ticket lifetime is long enough (as it is in
normal Kerberos configuration) there will be no more then 2
connections open.
> I see you started passing around sdap_id_op. The memory hierarchy
> around sdap_is_op is very delicate and required a lot of very careful
> handling to avoid having it disappear under our feet at the wrong time.
> It is meant to represent a single ldap operation tied to a specific ldap
> context, any changes to its use should be in a separate patch that I
> want to review carefully. But ideally sdap_id_op is opaque to most of
> the code and is internal to the processing of replies from the openldap
> libraries. It should never be used out of this context.
I agree that both sdap_id_op and sdap_id_connection are opaque types.
You can move the definitions to ldap_common.c from headers. More over
even declaration of sdap_id_connection can be visible only to
ldap_common.c.
I were not really sure what coding style is used in project. There are
files coded quite differently from each other.
On the other hand I do not see why you find handling of these
structures delicate:
1. sdap_id_op is owned by operation state (e.g. by global_enum_state).
So it will be automatically destroyed as operation (tevent request) is
completed
2. sdap_id_connection is owned by sdap_id_ctx and logic of its
life-cycle boils down to single method - sdap_id_release_connection.
Connection is released when:
a) There is no operation using it
b) It is not cached
c) It is not in connection notify loop (notify_lock == 0)
I hope I have explained why changes were made the way they have been done.
I really do not see a way to split the patch and would appreciate very
much if you give me some advice on how to make it more readable and
easier to understand.
If you have any ideas on how to split the patch, I am ready to discuss
them and implement if needed.
Regards, Eugene