The sdap_id_release_conn_data function is already called inside sdap_id_op_hook_conn_data() so the net result is a double free. I can't think of a reason when you'd want to call release_conn_data but not op_hook_conn_data..
This hunk was changed during the work on the primary servers. I verified that primary/backup server reconnection logic still works and the connection to the backup server is properly closed.
Jan, what was the reason behind the hunk?
Dne úterý 04 září 2012 18:34:23, Jakub Hrozek napsal(a):
The sdap_id_release_conn_data function is already called inside sdap_id_op_hook_conn_data() so the net result is a double free. I can't think of a reason when you'd want to call release_conn_data but not op_hook_conn_data..
This hunk was changed during the work on the primary servers. I verified that primary/backup server reconnection logic still works and the connection to the backup server is properly closed.
Jan, what was the reason behind the hunk?
This hunk is not my best work as I can see in retrospective ;-) You can certainly delete it as hotfix.
The main point of this code was to prevent deleting the connection every time operation ends. The patch introducing the disconnection flag should be solving two major things:
Don't accept new operations for the connection that is dicsonnecting. This is solved in another part of the code.
Don't free the connection immediatelly when you detect it's not working. Rather mark it as disconnecting and let all operations on that connection finish (probably with an error). Only the last finished operation should be deleting the connection. Otherwise you will sometimes get hanging operations. IIRC, that's what I wanted to solve by this hunk of the code. I probably didn't notice the code deleting connection every time the operation ends.
HTH Thanks Jan
On Wed, Sep 05, 2012 at 11:08:36AM +0200, Jan Zelený wrote:
Dne úterý 04 září 2012 18:34:23, Jakub Hrozek napsal(a):
The sdap_id_release_conn_data function is already called inside sdap_id_op_hook_conn_data() so the net result is a double free. I can't think of a reason when you'd want to call release_conn_data but not op_hook_conn_data..
This hunk was changed during the work on the primary servers. I verified that primary/backup server reconnection logic still works and the connection to the backup server is properly closed.
Jan, what was the reason behind the hunk?
This hunk is not my best work as I can see in retrospective ;-) You can certainly delete it as hotfix.
The main point of this code was to prevent deleting the connection every time operation ends. The patch introducing the disconnection flag should be solving two major things:
Don't accept new operations for the connection that is dicsonnecting. This is solved in another part of the code.
Don't free the connection immediatelly when you detect it's not working. Rather mark it as disconnecting and let all operations on that connection finish (probably with an error). Only the last finished operation should be deleting the connection. Otherwise you will sometimes get hanging operations. IIRC, that's what I wanted to solve by this hunk of the code. I probably didn't notice the code deleting connection every time the operation ends.
I pushed the patch to get rid of the segfault and filed https://fedorahosted.org/sssd/ticket/1507 to track the above.
Given that it is hard to follow the code even for the original developer, I think the whole module is up for some commenting and maybe even careful refactoring.
On Tue, Sep 04, 2012 at 06:34:23PM +0200, Jakub Hrozek wrote:
The sdap_id_release_conn_data function is already called inside sdap_id_op_hook_conn_data() so the net result is a double free. I can't think of a reason when you'd want to call release_conn_data but not op_hook_conn_data..
This hunk was changed during the work on the primary servers. I verified that primary/backup server reconnection logic still works and the connection to the backup server is properly closed.
Jan, what was the reason behind the hunk?
ACK
bye, Sumit
On Wed, Sep 05, 2012 at 11:32:54AM +0200, Sumit Bose wrote:
On Tue, Sep 04, 2012 at 06:34:23PM +0200, Jakub Hrozek wrote:
The sdap_id_release_conn_data function is already called inside sdap_id_op_hook_conn_data() so the net result is a double free. I can't think of a reason when you'd want to call release_conn_data but not op_hook_conn_data..
This hunk was changed during the work on the primary servers. I verified that primary/backup server reconnection logic still works and the connection to the backup server is properly closed.
Jan, what was the reason behind the hunk?
ACK
bye, Sumit
Pushed to master.
sssd-devel@lists.fedorahosted.org