As the commit message says, nothing more. Isn't it right to wait for 6 seconds, as the timeout says? Can you add debug output to see which errno is returned (if any)? Or does the code never trigger, so that only the timeout kicks in?
We can revert that change to the tevent flags if it causes a regression, but in that case I want a comment in the code saying that the connect() man page is misleading.
Simo.
On Sat, 2016-06-18 at 20:02 +0200, Lukas Slebodnik wrote:
ehlo,
Simo, did you have a special reason for changing the connection handling in this commit? https://git.fedorahosted.org/cgit/sssd.git/commit/?id=e05d3f5872263aadfbc2f6...
Because with that change it takes much longer until sssd goes into offline mode. I added a few debug messages to sssd_async_connect_done, but the most important thing is the timing of the operations.
old behaviour with TEVENT_FD_READ | TEVENT_FD_WRITE
(Sat Jun 18 11:47:03 2016) [sssd[be[LDAP]]] [generic_ext_search_handler] (0x0040): sdap_get_generic_ext_recv failed [110]: Connection timed out
(Sat Jun 18 11:47:03 2016) [sssd[be[LDAP]]] [sdap_id_op_done] (0x0200): communication error on cached connection, moving to next server
(Sat Jun 18 11:47:03 2016) [sssd[be[LDAP]]] [fo_resolve_service_send] (0x0100): Trying to resolve service 'LDAP'
(Sat Jun 18 11:47:03 2016) [sssd[be[LDAP]]] [sdap_uri_callback] (0x0400): Constructed uri 'ldaps://ibm-x3650m4-01-vm-07.example.com'
(Sat Jun 18 11:47:03 2016) [sssd[be[LDAP]]] [sssd_async_socket_init_send] (0x4000): Using file descriptor [21] for the connection.
(Sat Jun 18 11:47:03 2016) [sssd[be[LDAP]]] [sssd_async_connect_send] (0x0020): Creating request
(Sat Jun 18 11:47:03 2016) [sssd[be[LDAP]]] [sssd_async_connect_send] (0x0020): Before connect
(Sat Jun 18 11:47:03 2016) [sssd[be[LDAP]]] [sssd_async_connect_send] (0x0020): After connect ret:-1
(Sat Jun 18 11:47:03 2016) [sssd[be[LDAP]]] [sssd_async_connect_send] (0x0020): Calling tevent_add_fd TEVENT_FD_READ | TEVENT_FD_WRITE
(Sat Jun 18 11:47:03 2016) [sssd[be[LDAP]]] [sssd_async_socket_init_send] (0x0400): Setting 6 seconds timeout for connecting
(Sat Jun 18 11:47:04 2016) [sssd[be[LDAP]]] [sssd_async_connect_done] (0x0020): Before connect
(Sat Jun 18 11:47:04 2016) [sssd[be[LDAP]]] [sssd_async_connect_done] (0x0020): After connect ret:-1 errno:113
(Sat Jun 18 11:47:04 2016) [sssd[be[LDAP]]] [sssd_async_connect_done] (0x0020): connect failed [113][No route to host].
(Sat Jun 18 11:47:04 2016) [sssd[be[LDAP]]] [sssd_async_socket_init_done] (0x0020): sdap_async_sys_connect request failed: [113]: No route to host.
(Sat Jun 18 11:47:04 2016) [sssd[be[LDAP]]] [sssd_async_socket_state_destructor] (0x0400): closing socket [21]
(Sat Jun 18 11:47:04 2016) [sssd[be[LDAP]]] [sss_ldap_init_sys_connect_done] (0x0020): sssd_async_socket_init request failed: [113]: No route to host.
(Sat Jun 18 11:47:04 2016) [sssd[be[LDAP]]] [sdap_sys_connect_done] (0x0020): sdap_async_connect_call request failed: [113]: No route to host.
(Sat Jun 18 11:47:04 2016) [sssd[be[LDAP]]] [get_port_status] (0x1000): Port status of port 636 for server 'ibm-x3650m4-01-vm-07.example.com' is 'not working'
(Sat Jun 18 11:47:04 2016) [sssd[be[LDAP]]] [fo_resolve_service_send] (0x0020): No available servers for service 'LDAP'
(Sat Jun 18 11:47:04 2016) [sssd[be[LDAP]]] [be_resolve_server_done] (0x1000): Server resolution failed: [5]: Input/output error
(Sat Jun 18 11:47:04 2016) [sssd[be[LDAP]]] [sdap_id_op_connect_done] (0x0020): Failed to connect, going offline (5 [Input/output error])
current behaviour with TEVENT_FD_WRITE
(Sat Jun 18 12:04:27 2016) [sssd[be[LDAP]]] [generic_ext_search_handler] (0x0040): sdap_get_generic_ext_recv failed [110]: Connection timed out
(Sat Jun 18 12:04:27 2016) [sssd[be[LDAP]]] [sdap_id_op_done] (0x0200): communication error on cached connection, moving to next server
(Sat Jun 18 12:04:28 2016) [sssd[be[LDAP]]] [fo_resolve_service_send] (0x0100): Trying to resolve service 'LDAP'
(Sat Jun 18 12:04:28 2016) [sssd[be[LDAP]]] [sdap_uri_callback] (0x0400): Constructed uri 'ldaps://ibm-x3650m4-01-vm-07.example.com'
(Sat Jun 18 12:04:28 2016) [sssd[be[LDAP]]] [sssd_async_connect_send] (0x0020): Creating request
(Sat Jun 18 12:04:28 2016) [sssd[be[LDAP]]] [sssd_async_connect_send] (0x0020): Before connect
(Sat Jun 18 12:04:28 2016) [sssd[be[LDAP]]] [sssd_async_connect_send] (0x0020): After connect ret:-1
(Sat Jun 18 12:04:28 2016) [sssd[be[LDAP]]] [sssd_async_connect_send] (0x0020): Calling tevent_add_fd
(Sat Jun 18 12:04:28 2016) [sssd[be[LDAP]]] [sssd_async_socket_init_send] (0x0400): Setting 6 seconds timeout for connecting
(Sat Jun 18 12:04:34 2016) [sssd[be[LDAP]]] [sssd_async_connect_timeout] (0x0100): The connection timed out
(Sat Jun 18 12:04:34 2016) [sssd[be[LDAP]]] [sssd_async_socket_init_done] (0x0020): sdap_async_sys_connect request failed: [110]: Connection timed out.
(Sat Jun 18 12:04:34 2016) [sssd[be[LDAP]]] [sss_ldap_init_sys_connect_done] (0x0020): sssd_async_socket_init request failed: [110]: Connection timed out.
(Sat Jun 18 12:04:34 2016) [sssd[be[LDAP]]] [sdap_sys_connect_done] (0x0020): sdap_async_connect_call request failed: [110]: Connection timed out.
(Sat Jun 18 12:04:34 2016) [sssd[be[LDAP]]] [sdap_handle_release] (0x2000): Trace: sh[0xfe49dc0], connected[0], ops[(nil)], ldap[(nil)], destructor_lock[0], release_memory[0]
(Sat Jun 18 12:04:34 2016) [sssd[be[LDAP]]] [get_port_status] (0x1000): Port status of port 636 for server 'ibm-x3650m4-01-vm-07.example.com' is 'not working'
(Sat Jun 18 12:04:34 2016) [sssd[be[LDAP]]] [fo_resolve_service_send] (0x0020): No available servers for service 'LDAP'
(Sat Jun 18 12:04:34 2016) [sssd[be[LDAP]]] [be_resolve_server_done] (0x1000): Server resolution failed: [5]: Input/output error
(Sat Jun 18 12:04:34 2016) [sssd[be[LDAP]]] [sdap_id_op_connect_done] (0x0020): Failed to connect, going offline (5 [Input/output error])
If you did not have a special reason for this change, then I would appreciate it if we could change it back.
Two patches attached.
LS