On 04/15/2013 08:26 AM, John Hodrien wrote:
Enumerate = false
Jh
Hi We have that. Are you saying it's case sensitive? Just tested it but it doesn't seem to be.
Question: are we supposed to be able to get a list of all users in ldap using getent passwd? Cheers, Steve
On Mon, 15 Apr 2013, steve wrote:
On 04/15/2013 08:26 AM, John Hodrien wrote:
Enumerate = false
Jh
Hi We have that. Are you saying it's case sensitive? Just tested it but it doesn't seem to be.
Question: are we supposed to be able to get a list of all users in ldap using getent passwd?
Sorry, I was posting in a hurry before I went out. This is my reading of things.
You're disabling enumerate, and then finding out you can't enumerate all users. Surely this shouldn't be a surprise? I'd also question why you even want to be able to do that. What's your reason for wanting to be able to list *all* users?
jh
On Mon, Apr 15, 2013 at 09:13:43AM +0100, John Hodrien wrote:
On Mon, 15 Apr 2013, steve wrote:
On 04/15/2013 08:26 AM, John Hodrien wrote:
Enumerate = false
Jh
Hi We have that. Are you saying it's case sensitive? Just tested it but it doesn't seem to be.
Question: are we supposed to be able to get a list of all users in ldap using getent passwd?
Sorry, I was posting in a hurry before I went out. This is my reading of things.
You're disabling enumerate, and then finding out you can't enumerate all users. Surely this shouldn't be a surprise? I'd also question why you even want to be able to do that. What's your reason for wanting to be able to list *all* users?
jh
I agree with John, enumerate=true is what you're looking for.
On 15/04/13 10:13, John Hodrien wrote:
On Mon, 15 Apr 2013, steve wrote:
On 04/15/2013 08:26 AM, John Hodrien wrote:
Enumerate = false
Jh
Hi We have that. Are you saying it's case sensitive? Just tested it but it doesn't seem to be.
Question: are we supposed to be able to get a list of all users in ldap using getent passwd?
Sorry, I was posting in a hurry before I went out. This is my reading of things.
You're disabling enumerate, and then finding out you can't enumerate all users. Surely this shouldn't be a surprise? I'd also question why you even want to be able to do that. What's your reason for wanting to be able to list *all* users?
jh
Hi I t was that enumerate = true wasn't working either. I've just added the Ubuntu update ppa: https://launchpad.net/~*sssd*/+archive/updates The version that comes with 12.10 seems totally broken. This one works as I expect: getent passwd returns a list of both local and domain users, but it's not workable. Even with only 3 boxes connected, the server just grinds to a halt. The ldap queries never stop. With nslcd, it's just one quick read for the whole lot and the next time you ask, it doesn't read anything at all from the directory. i wonder what sssd is doing with all those reads?
We use getent passwd in many of our scripts which are mostly wrappers around samba-tool which ships with the Samba 4.0 AD. It's nothing that we couldn't do with wbinfo instead of getent, but I would prefer to leave the scripts as they are and stick with nslcd. We like sssd because it's got everything in one place.
Anyway, we've not given up yet. Maybe the method used for enumeration on sssd will change to something more like nslcd or winbind in future. Steve
On Mon, Apr 15, 2013 at 11:22:22AM +0200, steve wrote:
On 15/04/13 10:13, John Hodrien wrote:
On Mon, 15 Apr 2013, steve wrote:
On 04/15/2013 08:26 AM, John Hodrien wrote:
Enumerate = false
Jh
Hi We have that. Are you saying it's case sensitive? Just tested it but it doesn't seem to be.
Question: are we supposed to be able to get a list of all users in ldap using getent passwd?
Sorry, I was posting in a hurry before I went out. This is my reading of things.
You're disabling enumerate, and then finding out you can't enumerate all users. Surely this shouldn't be a surprise? I'd also question why you even want to be able to do that. What's your reason for wanting to be able to list *all* users?
jh
Hi I t was that enumerate = true wasn't working either. I've just added the Ubuntu update ppa: https://launchpad.net/~*sssd*/+archive/updates The version that comes with 12.10 seems totally broken. This one works as I expect: getent passwd returns a list of both local and domain users, but it's not workable. Even with only 3 boxes connected, the server just grinds to a halt. The ldap queries never stop. With nslcd, it's just one quick read for the whole lot and the next time you ask, it doesn't read anything at all from the directory. i wonder what sssd is doing with all those reads?
Have you been able to check what kind of searches (with wireshark or similar or even just looking at the debug logs) sssd is doing? There should be one search for users and one for groups every 300 seconds by default (can be overriden with ldap_enumeration_refresh_timeout).
Also how many users/groups do you have on the server?
We use getent passwd in many of our scripts which are mostly wrappers around samba-tool which ships with the Samba 4.0 AD. It's nothing that we couldn't do with wbinfo instead of getent, but I would prefer to leave the scripts as they are and stick with nslcd. We like sssd because it's got everything in one place.
Anyway, we've not given up yet. Maybe the method used for enumeration on sssd will change to something more like nslcd or winbind in future. Steve
Well, one reason for the enumeration being slower in sssd is that sssd (unlike nscd at least) stores all the entries it downloaded from LDAP to a persistent on-disk cache. That cache write can take up to a couple of seconds, depending on the size of the directory.
On 04/15/2013 11:33 AM, Jakub Hrozek wrote:
Have you been able to check what kind of searches (with wireshark or similar or even just looking at the debug logs) sssd is doing? There should be one search for users and one for groups every 300 seconds by default (can be overriden with ldap_enumeration_refresh_timeout).
Also how many users/groups do you have on the server?
Hi Test setup with 2 users and 2 groups. The ldap server (Samba4 AD) is flooded with requests every second. I attach small bits from each log. If you coud have a look that would be great.
TIA Steve
Samba 4 DC Kerberos: AS-REQ DOLORESDC$@DOLORES.SITE from ipv4:192.168.1.100:51613 for krbtgt/DOLORES.SITE@DOLORES.SITE Kerberos: Client sent patypes: 149 Kerberos: Looking for PKINIT pa-data -- DOLORESDC$@DOLORES.SITE Kerberos: Looking for ENC-TS pa-data -- DOLORESDC$@DOLORES.SITE Kerberos: No preauth found, returning PREAUTH-REQUIRED -- DOLORESDC$@DOLORES.SITE Kerberos: AS-REQ DOLORESDC$@DOLORES.SITE from ipv4:192.168.1.100:50962 for krbtgt/DOLORES.SITE@DOLORES.SITE Kerberos: Client sent patypes: encrypted-timestamp, 149 Kerberos: Looking for PKINIT pa-data -- DOLORESDC$@DOLORES.SITE Kerberos: Looking for ENC-TS pa-data -- DOLORESDC$@DOLORES.SITE Kerberos: ENC-TS Pre-authentication succeeded -- DOLORESDC$@DOLORES.SITE using arcfour-hmac-md5 Kerberos: AS-REQ authtime: 2013-04-15T12:00:40 starttime: unset endtime: 2013-04-15T22:00:40 renew till: 2013-04-16T12:00:40 Kerberos: Client supported enctypes: aes256-cts-hmac-sha1-96, aes128-cts-hmac-sha1-96, des3-cbc-sha1, arcfour-hmac-md5, using arcfour-hmac-md5/arcfour-hmac-md5 Kerberos: Requested flags: renewable-ok, canonicalize Kerberos: TGS-REQ DOLORESDC$@DOLORES.SITE from ipv4:192.168.1.100:39978 for ldap/doloresdc.dolores.site@DOLORES.SITE [canonicalize, renewable] Kerberos: TGS-REQ authtime: 2013-04-15T12:00:40 starttime: 2013-04-15T12:00:40 endtime: 2013-04-15T22:00:40 renew till: 2013-04-16T12:00:40 Not authoritative for 'dolores.site.dragonet.es', forwarding ldb_wrap open of secrets.ldb Terminating connection - 'LDAP_PROTOCOL_ERROR' single_terminate: reason[LDAP_PROTOCOL_ERROR] Terminating connection - 'ldapsrv_call_loop: tstream_read_pdu_blob_recv() - NT_STATUS_CONNECTION_DISCONNECTED' single_terminate: reason[ldapsrv_call_loop: tstream_read_pdu_blob_recv() - NT_STATUS_CONNECTION_DISCONNECTED] ldb_wrap open of secrets.ldb Kerberos: AS-REQ DOLORESDC$@DOLORES.SITE from ipv4:192.168.1.100:60133 for krbtgt/DOLORES.SITE@DOLORES.SITE Kerberos: Client sent patypes: 149 Kerberos: Looking for PKINIT pa-data -- DOLORESDC$@DOLORES.SITE Kerberos: Looking for ENC-TS pa-data -- DOLORESDC$@DOLORES.SITE Kerberos: No preauth found, returning PREAUTH-REQUIRED -- DOLORESDC$@DOLORES.SITE Kerberos: AS-REQ DOLORESDC$@DOLORES.SITE from ipv4:192.168.1.100:43142 for krbtgt/DOLORES.SITE@DOLORES.SITE Kerberos: Client sent patypes: encrypted-timestamp, 149 Kerberos: Looking for PKINIT pa-data -- DOLORESDC$@DOLORES.SITE Kerberos: Looking for ENC-TS pa-data -- DOLORESDC$@DOLORES.SITE Kerberos: ENC-TS Pre-authentication succeeded -- DOLORESDC$@DOLORES.SITE using arcfour-hmac-md5 Kerberos: AS-REQ authtime: 2013-04-15T12:00:43 starttime: unset endtime: 2013-04-15T22:00:43 renew till: 2013-04-16T12:00:43 Kerberos: Client supported enctypes: aes256-cts-hmac-sha1-96, aes128-cts-hmac-sha1-96, des3-cbc-sha1, arcfour-hmac-md5, using arcfour-hmac-md5/arcfour-hmac-md5 Kerberos: Requested flags: renewable-ok, canonicalize Kerberos: TGS-REQ DOLORESDC$@DOLORES.SITE from ipv4:192.168.1.100:36908 for ldap/doloresdc.dolores.site@DOLORES.SITE [canonicalize, renewable] Kerberos: TGS-REQ authtime: 2013-04-15T12:00:43 starttime: 2013-04-15T12:00:43 endtime: 2013-04-15T22:00:43 renew till: 2013-04-16T12:00:43 Not authoritative for 'dolores.site.dragonet.es', forwarding ldb_wrap open of secrets.ldb Terminating connection - 'LDAP_PROTOCOL_ERROR' single_terminate: reason[LDAP_PROTOCOL_ERROR] Terminating connection - 'ldapsrv_call_loop: tstream_read_pdu_blob_recv() - NT_STATUS_CONNECTION_DISCONNECTED' single_terminate: reason[ldapsrv_call_loop: tstream_read_pdu_blob_recv() - NT_STATUS_CONNECTION_DISCONNECTED]
------------------------ sssd.conf [sssd] debug_level = 6 services = nss, pam config_file_version = 2 domains = default
[nss] debug_level = 6
[pam] debug_level = 6
[domain/default] debug_level = 6 ldap_schema = rfc2307bis access_provider = simple Enumerate = true cache_credentials = true id_provider = ldap auth_provider = krb5 chpass_provider = krb5 krb5_realm = DOLORES.SITE krb5_server = 192.168.1.100 krb5_kpasswd = 192.168.1.100
ldap_uri = ldap://192.168.1.100/ ldap_search_base = dc=dolores,dc=site #ldap_tls_cacertdir = /usr/local/samba/private/tls #ldap_id_use_start_tls = true ldap_user_object_class = user ldap_user_name = samAccountName ldap_user_uid_number = uidNumber ldap_user_gid_number = gidNumber ldap_user_home_directory = unixHomeDirectory ldap_user_shell = loginShell ldap_group_object_class = group ldap_group_search_base = dc=dolores,dc=site ldap_group_name = cn ldap_group_member = member ldap_user_search_filter =(&(objectCategory=user)(uidNumber=*))
ldap_sasl_mech = GSSAPI ldap_sasl_authid = DOLORESDC$ ldap_krb5_keytab = /etc/krb5.keytab ldap_krb5_init_creds = true ------------------------------------
ldap_child.log (Mon Apr 15 12:00:34 2013) [[sssd[ldap_child[4238]]]] [main] (0x0400): ldap_child completed successfully (Mon Apr 15 12:00:39 2013) [[sssd[ldap_child[4241]]]] [main] (0x0400): ldap_child started. (Mon Apr 15 12:00:39 2013) [[sssd[ldap_child[4241]]]] [ldap_child_get_tgt_sync] (0x0100): Principal name is: [DOLORESDC$@DOLORES.SITE] (Mon Apr 15 12:00:39 2013) [[sssd[ldap_child[4241]]]] [ldap_child_get_tgt_sync] (0x0100): Using keytab [/etc/krb5.keytab] (Mon Apr 15 12:00:39 2013) [[sssd[ldap_child[4241]]]] [ldap_child_get_tgt_sync] (0x0100): Will canonicalize principals (Mon Apr 15 12:00:39 2013) [[sssd[ldap_child[4241]]]] [prepare_response] (0x0400): Building response for result [0] (Mon Apr 15 12:00:39 2013) [[sssd[ldap_child[4241]]]] [main] (0x0400): ldap_child completed successfully (Mon Apr 15 12:00:40 2013) [[sssd[ldap_child[4254]]]] [main] (0x0400): ldap_child started. (Mon Apr 15 12:00:40 2013) [[sssd[ldap_child[4254]]]] [ldap_child_get_tgt_sync] (0x0100): Principal name is: [DOLORESDC$@DOLORES.SITE] (Mon Apr 15 12:00:40 2013) [[sssd[ldap_child[4254]]]] [ldap_child_get_tgt_sync] (0x0100): Using keytab [/etc/krb5.keytab] (Mon Apr 15 12:00:40 2013) [[sssd[ldap_child[4254]]]] [ldap_child_get_tgt_sync] (0x0100): Will canonicalize principals (Mon Apr 15 12:00:40 2013) [[sssd[ldap_child[4254]]]] [prepare_response] (0x0400): Building response for result [0] (Mon Apr 15 12:00:40 2013) [[sssd[ldap_child[4254]]]] [main] (0x0400): ldap_child completed successfully (Mon Apr 15 12:00:40 2013) [[sssd[ldap_child[4257]]]] [main] (0x0400): ldap_child started. (Mon Apr 15 12:00:40 2013) [[sssd[ldap_child[4257]]]] [ldap_child_get_tgt_sync] (0x0100): Principal name is: [DOLORESDC$@DOLORES.SITE] (Mon Apr 15 12:00:40 2013) [[sssd[ldap_child[4257]]]] [ldap_child_get_tgt_sync] (0x0100): Using keytab [/etc/krb5.keytab] (Mon Apr 15 12:00:40 2013) [[sssd[ldap_child[4257]]]] [ldap_child_get_tgt_sync] (0x0100): Will canonicalize principals (Mon Apr 15 12:00:40 2013) [[sssd[ldap_child[4257]]]] [prepare_response] (0x0400): Building response for result [0] (Mon Apr 15 12:00:40 2013) [[sssd[ldap_child[4257]]]] [main] (0x0400): ldap_child completed successfully (Mon Apr 15 12:00:43 2013) [[sssd[ldap_child[4260]]]] [main] (0x0400): ldap_child started. (Mon Apr 15 12:00:43 2013) [[sssd[ldap_child[4260]]]] [ldap_child_get_tgt_sync] (0x0100): Principal name is: [DOLORESDC$@DOLORES.SITE] (Mon Apr 15 12:00:43 2013) [[sssd[ldap_child[4260]]]] [ldap_child_get_tgt_sync] (0x0100): Using keytab [/etc/krb5.keytab] (Mon Apr 15 12:00:43 2013) [[sssd[ldap_child[4260]]]] [ldap_child_get_tgt_sync] (0x0100): Will canonicalize principals (Mon Apr 15 12:00:43 2013) [[sssd[ldap_child[4260]]]] [prepare_response] (0x0400): Building response for result [0] (Mon Apr 15 12:00:43 2013) [[sssd[ldap_child[4260]]]] [main] (0x0400): ldap_child completed successfully
--------------------------------- sssd_default.log (Mon Apr 15 12:00:40 2013) [sssd[be[default]]] [sdap_get_generic_ext_done] (0x0400): Search result: Success(0), no errmsg set (Mon Apr 15 12:00:40 2013) [sssd[be[default]]] [get_naming_context] (0x0200): Using value from [defaultNamingContext] as naming context. (Mon Apr 15 12:00:40 2013) [sssd[be[default]]] [sdap_set_search_base] (0x0100): Setting option [ldap_sudo_search_base] to [DC=dolores,DC=site]. (Mon Apr 15 12:00:40 2013) [sssd[be[default]]] [common_parse_search_base] (0x0100): Search base added: [SUDO][DC=dolores,DC=site][SUBTREE][] (Mon Apr 15 12:00:40 2013) [sssd[be[default]]] [sdap_get_server_opts_from_rootdse] (0x0100): Setting AD compatibility level to [4] (Mon Apr 15 12:00:40 2013) [sssd[be[default]]] [sdap_kinit_send] (0x0400): Attempting kinit (/etc/krb5.keytab, DOLORESDC$, DOLORES.SITE, 86400) (Mon Apr 15 12:00:40 2013) [sssd[be[default]]] [fo_resolve_service_send] (0x0100): Trying to resolve service 'KERBEROS' (Mon Apr 15 12:00:40 2013) [sssd[be[default]]] [be_resolve_server_process] (0x0200): Found address for server 192.168.1.100: [192.168.1.100] TTL 7200 (Mon Apr 15 12:00:40 2013) [sssd[be[default]]] [set_tgt_child_timeout] (0x0400): Setting 6 seconds timeout for tgt child (Mon Apr 15 12:00:40 2013) [sssd[be[default]]] [write_pipe_handler] (0x0400): All data has been sent! (Mon Apr 15 12:00:40 2013) [sssd[be[default]]] [child_sig_handler] (0x0100): child [4257] finished successfully. (Mon Apr 15 12:00:40 2013) [sssd[be[default]]] [read_pipe_handler] (0x0400): EOF received, client finished (Mon Apr 15 12:00:40 2013) [sssd[be[default]]] [sdap_get_tgt_recv] (0x0400): Child responded: 0 [FILE:/var/lib/sss/db/ccache_DOLORES.SITE], expired on [1366056040] (Mon Apr 15 12:00:40 2013) [sssd[be[default]]] [sdap_cli_auth_step] (0x0100): expire timeout is 900 (Mon Apr 15 12:00:40 2013) [sssd[be[default]]] [sasl_bind_send] (0x0100): Executing sasl bind mech: GSSAPI, user: DOLORESDC$ (Mon Apr 15 12:00:40 2013) [sssd[be[default]]] [fo_set_port_status] (0x0100): Marking port 389 of server '192.168.1.100' as 'working' (Mon Apr 15 12:00:40 2013) [sssd[be[default]]] [set_server_common_status] (0x0100): Marking server '192.168.1.100' as 'working' (Mon Apr 15 12:00:40 2013) [sssd[be[default]]] [sdap_get_users_next_base] (0x0400): Searching for users with base [dc=dolores,dc=site] (Mon Apr 15 12:00:40 2013) [sssd[be[default]]] [sdap_get_generic_ext_step] (0x0400): calling ldap_search_ext with [(&(&(objectclass=user)(samAccountName=*)(uidNumber=*)(gidNumber=*))(&(objectCategory=user)(uidNumber=*)))][dc=dolores,dc=site]. (Mon Apr 15 12:00:41 2013) [sssd[be[default]]] [sdap_process_result] (0x0040): ldap_result error: [Can't contact LDAP server] (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [server_setup] (0x0400): CONFDB: /var/lib/sss/db/config.ldb (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [recreate_ares_channel] (0x0100): Initializing new c-ares channel (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [fo_context_init] (0x0400): Created new fail over context, retry timeout is 30 (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [sysdb_domain_init_internal] (0x0200): DB File for default: /var/lib/sss/db/cache_default.ldb (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [ldb] (0x0400): asq: Unable to register control with rootdse! (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [sbus_init_connection] (0x0200): Adding connection 9A24720 (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [monitor_common_send_id] (0x0100): Sending ID: (%BE_default,1) (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [sss_names_init] (0x0100): Using re [(?P<name>[^@]+)@?(?P<domain>[^@]*$)]. (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [sbus_new_server] (0x0400): D-BUS Server listening on unix:path=/var/lib/sss/pipes/private/sbus-dp_default.4259,guid=c7bed4b88515d11a9bda7a81516bcfcb (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_uri has value ldap://192.168.1.100/ (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_backup_uri has no value (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_search_base has value dc=dolores,dc=site (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_default_bind_dn has no value (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_default_authtok_type has value password (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_default_authtok has no binary value. (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_search_timeout has value 6 (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_network_timeout has value 6 (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_opt_timeout has value 6 (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_tls_reqcert has value hard (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_user_search_base has no value (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_user_search_scope has value sub (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_user_search_filter has value (&(objectCategory=user)(uidNumber=*)) (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_group_search_base has value dc=dolores,dc=site (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_group_search_scope has value sub (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_group_search_filter has no value (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_service_search_base has no value (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_sudo_search_base has no value (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_sudo_full_refresh_interval has value 21600 (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_sudo_smart_refresh_interval has value 900 (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_sudo_use_host_filter is TRUE (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_sudo_hostnames has no value (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_sudo_ip has no value (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_sudo_include_netgroups is TRUE (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_sudo_include_regexp is TRUE (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_autofs_search_base has no value (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_schema has value rfc2307bis (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_offline_timeout has value 60 (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_force_upper_case_realm is FALSE (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_enumeration_refresh_timeout has value 300 (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_purge_cache_timeout has value 10800 (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_tls_cacert has no value (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_tls_cacertdir has no value (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_tls_cert has no value (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_tls_key has no value (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_tls_cipher_suite has no value (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_id_use_start_tls is FALSE (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_id_mapping is FALSE (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_sasl_mech has value GSSAPI (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_sasl_authid has value DOLORESDC$ (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_sasl_realm has no value (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_sasl_minssf has value -1 (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_krb5_keytab has value /etc/krb5.keytab (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_krb5_init_creds is TRUE (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option krb5_server has value 192.168.1.100 (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option krb5_backup_server has no value (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option krb5_realm has value DOLORES.SITE (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option krb5_canonicalize is TRUE (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_pwd_policy has value none (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_referrals is TRUE (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option account_cache_expiration has value 0 (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_dns_service_name has value ldap (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_krb5_ticket_lifetime has value 86400 ------------------------------
sss.log (Mon Apr 15 12:00:40 2013) [sssd] [client_registration] (0x0100): Received ID registration: (pam,1) (Mon Apr 15 12:00:40 2013) [sssd] [mark_service_as_started] (0x0200): Marking pam as started. (Mon Apr 15 12:00:40 2013) [sssd] [sbus_server_init_new_connection] (0x0200): Entering. (Mon Apr 15 12:00:40 2013) [sssd] [sbus_server_init_new_connection] (0x0200): Adding connection 0x8708c60. (Mon Apr 15 12:00:40 2013) [sssd] [sbus_init_connection] (0x0200): Adding connection 8708C60 (Mon Apr 15 12:00:40 2013) [sssd] [sbus_server_init_new_connection] (0x0200): Got a connection (Mon Apr 15 12:00:40 2013) [sssd] [monitor_service_init] (0x0400): Initializing D-BUS Service (Mon Apr 15 12:00:40 2013) [sssd] [client_registration] (0x0100): Received ID registration: (nss,1) (Mon Apr 15 12:00:40 2013) [sssd] [mark_service_as_started] (0x0200): Marking nss as started. (Mon Apr 15 12:00:40 2013) [sssd] [mark_service_as_started] (0x0400): SSSD is initialized, terminating parent process (Mon Apr 15 12:00:40 2013) [sssd] [sbus_dispatch] (0x0080): Connection is not open for dispatching. (Mon Apr 15 12:00:40 2013) [sssd] [mt_svc_exit_handler] (0x0040): Child [default] terminated with signal [6] (Mon Apr 15 12:00:40 2013) [sssd] [mt_svc_restart] (0x0400): Scheduling service default for restart 1 (Mon Apr 15 12:00:40 2013) [sssd] [get_ping_config] (0x0100): Time between service pings for [default]: [10] (Mon Apr 15 12:00:40 2013) [sssd] [get_ping_config] (0x0100): Time between SIGTERM and SIGKILL for [default]: [60] (Mon Apr 15 12:00:40 2013) [sssd] [start_service] (0x0100): Queueing service default for startup (Mon Apr 15 12:00:40 2013) [sssd] [sbus_server_init_new_connection] (0x0200): Entering. (Mon Apr 15 12:00:40 2013) [sssd] [sbus_server_init_new_connection] (0x0200): Adding connection 0x870c7a8. ---------------------------------
sssd_nss.log (Mon Apr 15 12:00:06 2013) [sssd[nss]] [server_setup] (0x0400): CONFDB: /var/lib/sss/db/config.ldb (Mon Apr 15 12:00:06 2013) [sssd[nss]] [sbus_init_connection] (0x0200): Adding connection 97EEC00 (Mon Apr 15 12:00:06 2013) [sssd[nss]] [monitor_common_send_id] (0x0100): Sending ID: (nss,1) (Mon Apr 15 12:00:06 2013) [sssd[nss]] [sss_names_init] (0x0100): Using re [(?P<name>[^@]+)@?(?P<domain>[^@]*$)]. (Mon Apr 15 12:00:06 2013) [sssd[nss]] [sbus_init_connection] (0x0200): Adding connection 97EDD08 (Mon Apr 15 12:00:06 2013) [sssd[nss]] [dp_common_send_id] (0x0100): Sending ID to DP: (1,NSS) (Mon Apr 15 12:00:06 2013) [sssd[nss]] [sysdb_domain_init_internal] (0x0200): DB File for default: /var/lib/sss/db/cache_default.ldb (Mon Apr 15 12:00:06 2013) [sssd[nss]] [ldb] (0x0400): asq: Unable to register control with rootdse! (Mon Apr 15 12:00:06 2013) [sssd[nss]] [sss_process_init] (0x0400): Responder Initialization complete (Mon Apr 15 12:00:06 2013) [sssd[nss]] [sss_parse_name_for_domains] (0x0200): name 'root' matched without domain, user is root (Mon Apr 15 12:00:06 2013) [sssd[nss]] [sss_parse_name_for_domains] (0x0200): using default domain [(null)] (Mon Apr 15 12:00:06 2013) [sssd[nss]] [sss_ncache_set_str] (0x0400): Adding [NCE/USER/default/root] to negative cache permanently (Mon Apr 15 12:00:06 2013) [sssd[nss]] [sss_parse_name_for_domains] (0x0200): name 'root' matched without domain, user is root (Mon Apr 15 12:00:06 2013) [sssd[nss]] [sss_parse_name_for_domains] (0x0200): using default domain [(null)] (Mon Apr 15 12:00:06 2013) [sssd[nss]] [sss_ncache_set_str] (0x0400): Adding [NCE/GROUP/default/root] to negative cache permanently (Mon Apr 15 12:00:06 2013) [sssd[nss]] [nss_get_etc_shells] (0x0400): Found shell /bin/sh in /etc/shells (Mon Apr 15 12:00:06 2013) [sssd[nss]] [nss_get_etc_shells] (0x0400): Found shell /bin/dash in /etc/shells (Mon Apr 15 12:00:06 2013) [sssd[nss]] [nss_get_etc_shells] (0x0400): Found shell /bin/bash in /etc/shells (Mon Apr 15 12:00:06 2013) [sssd[nss]] [nss_get_etc_shells] (0x0400): Found shell /bin/rbash in /etc/shells (Mon Apr 15 12:00:06 2013) [sssd[nss]] [responder_set_fd_limit] (0x0100): Maximum file descriptors set to [8192] (Mon Apr 15 12:00:06 2013) [sssd[nss]] [nss_process_init] (0x0400): NSS Initialization complete (Mon Apr 15 12:00:06 2013) [sssd[nss]] [id_callback] (0x0100): Got id ack and version (1) from Monitor (Mon Apr 15 12:00:06 2013) [sssd[nss]] [dp_id_callback] (0x0100): Got id ack and version (1) from DP (Mon Apr 15 12:00:07 2013) [sssd[nss]] [sbus_dispatch] (0x0020): Performing auto-reconnect (Mon Apr 15 12:00:08 2013) [sssd[nss]] [sbus_reconnect] (0x0080): Making reconnection attempt 1 to [unix:path=/var/lib/sss/pipes/private/sbus-dp_default] (Mon Apr 15 12:00:08 2013) [sssd[nss]] [sbus_reconnect] (0x0020): Failed to open connection: name=org.freedesktop.DBus.Error.NoServer, message=Failed to connect to socket /var/lib/sss/pipes/private/sbus-dp_default: Connection refused (Mon Apr 15 12:00:11 2013) [sssd[nss]] [sbus_reconnect] (0x0080): Making reconnection attempt 2 to [unix:path=/var/lib/sss/pipes/private/sbus-dp_default] (Mon Apr 15 12:00:11 2013) [sssd[nss]] [sbus_reconnect] (0x0020): Failed to open connection: name=org.freedesktop.DBus.Error.NoServer, message=Failed to connect to socket /var/lib/sss/pipes/private/sbus-dp_default: Connection refused (Mon Apr 15 12:00:15 2013) [sssd[nss]] [sss_responder_ctx_destructor] (0x0400): Responder is being shut down (Mon Apr 15 12:00:15 2013) [sssd[nss]] [server_setup] (0x0400): CONFDB: /var/lib/sss/db/config.ldb (Mon Apr 15 12:00:15 2013) [sssd[nss]] [sbus_init_connection] (0x0200): Adding connection 9B13C00 (Mon Apr 15 12:00:15 2013) [sssd[nss]] [monitor_common_send_id] (0x0100): Sending ID: (nss,1) (Mon Apr 15 12:00:15 2013) [sssd[nss]] [sss_names_init] (0x0100): Using re [(?P<name>[^@]+)@?(?P<domain>[^@]*$)]. (Mon Apr 15 12:00:15 2013) [sssd[nss]] [sbus_init_connection] (0x0200): Adding connection 9B12D08 (Mon Apr 15 12:00:15 2013) [sssd[nss]] [dp_common_send_id] (0x0100): Sending ID to DP: (1,NSS) (Mon Apr 15 12:00:15 2013) [sssd[nss]] [sysdb_domain_init_internal] (0x0200): DB File for default: /var/lib/sss/db/cache_default.ldb (Mon Apr 15 12:00:15 2013) [sssd[nss]] [ldb] (0x0400): asq: Unable to register control with rootdse! (Mon Apr 15 12:00:15 2013) [sssd[nss]] [sss_process_init] (0x0400): Responder Initialization complete (Mon Apr 15 12:00:15 2013) [sssd[nss]] [sss_parse_name_for_domains] (0x0200): name 'root' matched without domain, user is root (Mon Apr 15 12:00:15 2013) [sssd[nss]] [sss_parse_name_for_domains] (0x0200): using default domain [(null)] (Mon Apr 15 12:00:15 2013) [sssd[nss]] [sss_ncache_set_str] (0x0400): Adding [NCE/USER/default/root] to negative cache permanently (Mon Apr 15 12:00:15 2013) [sssd[nss]] [sss_parse_name_for_domains] (0x0200): name 'root' matched without domain, user is root (Mon Apr 15 12:00:15 2013) [sssd[nss]] [sss_parse_name_for_domains] (0x0200): using default domain [(null)] ----------------------------------
sssd_pam.log (Mon Apr 15 12:00:06 2013) [sssd[pam]] [server_setup] (0x0400): CONFDB: /var/lib/sss/db/config.ldb (Mon Apr 15 12:00:06 2013) [sssd[pam]] [sbus_init_connection] (0x0200): Adding connection 84E0D88 (Mon Apr 15 12:00:06 2013) [sssd[pam]] [monitor_common_send_id] (0x0100): Sending ID: (pam,1) (Mon Apr 15 12:00:06 2013) [sssd[pam]] [sss_names_init] (0x0100): Using re [(?P<name>[^@]+)@?(?P<domain>[^@]*$)]. (Mon Apr 15 12:00:06 2013) [sssd[pam]] [sbus_init_connection] (0x0200): Adding connection 84E2858 (Mon Apr 15 12:00:06 2013) [sssd[pam]] [dp_common_send_id] (0x0100): Sending ID to DP: (1,PAM) (Mon Apr 15 12:00:06 2013) [sssd[pam]] [sysdb_domain_init_internal] (0x0200): DB File for default: /var/lib/sss/db/cache_default.ldb (Mon Apr 15 12:00:06 2013) [sssd[pam]] [ldb] (0x0400): asq: Unable to register control with rootdse! (Mon Apr 15 12:00:06 2013) [sssd[pam]] [sss_process_init] (0x0400): Responder Initialization complete (Mon Apr 15 12:00:06 2013) [sssd[pam]] [sss_parse_name_for_domains] (0x0200): name 'root' matched without domain, user is root (Mon Apr 15 12:00:06 2013) [sssd[pam]] [sss_parse_name_for_domains] (0x0200): using default domain [(null)] (Mon Apr 15 12:00:06 2013) [sssd[pam]] [sss_ncache_set_str] (0x0400): Adding [NCE/USER/default/root] to negative cache permanently (Mon Apr 15 12:00:06 2013) [sssd[pam]] [sss_parse_name_for_domains] (0x0200): name 'root' matched without domain, user is root (Mon Apr 15 12:00:06 2013) [sssd[pam]] [sss_parse_name_for_domains] (0x0200): using default domain [(null)] (Mon Apr 15 12:00:06 2013) [sssd[pam]] [sss_ncache_set_str] (0x0400): Adding [NCE/GROUP/default/root] to negative cache permanently (Mon Apr 15 12:00:06 2013) [sssd[pam]] [responder_set_fd_limit] (0x0100): Maximum file descriptors set to [8192] (Mon Apr 15 12:00:06 2013) [sssd[pam]] [id_callback] (0x0100): Got id ack and version (1) from Monitor (Mon Apr 15 12:00:06 2013) [sssd[pam]] [dp_id_callback] (0x0100): Got id ack and version (1) from DP (Mon Apr 15 12:00:07 2013) [sssd[pam]] [sbus_dispatch] (0x0020): Performing auto-reconnect (Mon Apr 15 12:00:08 2013) [sssd[pam]] [sbus_reconnect] (0x0080): Making reconnection attempt 1 to [unix:path=/var/lib/sss/pipes/private/sbus-dp_default] (Mon Apr 15 12:00:08 2013) [sssd[pam]] [sbus_reconnect] (0x0020): Failed to open connection: name=org.freedesktop.DBus.Error.NoServer, message=Failed to connect to socket /var/lib/sss/pipes/private/sbus-dp_default: Connection refused (Mon Apr 15 12:00:11 2013) [sssd[pam]] [sbus_reconnect] (0x0080): Making reconnection attempt 2 to [unix:path=/var/lib/sss/pipes/private/sbus-dp_default] (Mon Apr 15 12:00:11 2013) [sssd[pam]] [sbus_reconnect] (0x0020): Failed to open connection: name=org.freedesktop.DBus.Error.NoServer, message=Failed to connect to socket /var/lib/sss/pipes/private/sbus-dp_default: Connection refused (Mon Apr 15 12:00:15 2013) [sssd[pam]] [sss_responder_ctx_destructor] (0x0400): Responder is being shut down
On Mon, Apr 15, 2013 at 12:17:33PM +0200, steve wrote:
On 04/15/2013 11:33 AM, Jakub Hrozek wrote:
Have you been able to check what kind of searches (with wireshark or similar or even just looking at the debug logs) sssd is doing? There should be one search for users and one for groups every 300 seconds by default (can be overriden with ldap_enumeration_refresh_timeout).
Also how many users/groups do you have on the server?
Hi Test setup with 2 users and 2 groups. The ldap server (Samba4 AD) is flooded with requests every second. I attach small bits from each log. If you coud have a look that would be great.
Thanks for the logs. I suspect that what happens is that the sssd_be crashes or is restarted for some reason and after being respawned starts another enumeration which is why you are seeing so many searches.
As a matter of fact, there is a patch on the list to fix a potential crash when enumerating. Which version are you running? Maybe you're hitting that exact problem.
TIA Steve
[snip]
sssd_default.log
[snip]
(Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_dns_service_name has value ldap (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_krb5_ticket_lifetime has value 86400
Is there something more in the log? The part you pasted pretty much describes the option being loaded.
sss.log (Mon Apr 15 12:00:40 2013) [sssd] [client_registration] (0x0100): Received ID registration: (pam,1) (Mon Apr 15 12:00:40 2013) [sssd] [mark_service_as_started] (0x0200): Marking pam as started. (Mon Apr 15 12:00:40 2013) [sssd] [sbus_server_init_new_connection] (0x0200): Entering. (Mon Apr 15 12:00:40 2013) [sssd] [sbus_server_init_new_connection] (0x0200): Adding connection 0x8708c60. (Mon Apr 15 12:00:40 2013) [sssd] [sbus_init_connection] (0x0200): Adding connection 8708C60 (Mon Apr 15 12:00:40 2013) [sssd] [sbus_server_init_new_connection] (0x0200): Got a connection (Mon Apr 15 12:00:40 2013) [sssd] [monitor_service_init] (0x0400): Initializing D-BUS Service (Mon Apr 15 12:00:40 2013) [sssd] [client_registration] (0x0100): Received ID registration: (nss,1) (Mon Apr 15 12:00:40 2013) [sssd] [mark_service_as_started] (0x0200): Marking nss as started. (Mon Apr 15 12:00:40 2013) [sssd] [mark_service_as_started] (0x0400): SSSD is initialized, terminating parent process (Mon Apr 15 12:00:40 2013) [sssd] [sbus_dispatch] (0x0080): Connection is not open for dispatching. (Mon Apr 15 12:00:40 2013) [sssd] [mt_svc_exit_handler] (0x0040): Child [default] terminated with signal [6]
Here is some more evidence, signal [6] is SIGABRT so the back end crashed.
It's quite possible that you hit https://fedorahosted.org/sssd/ticket/1799, if you could collect the backtrace then we would know for sure. We already have a patch for that problem and will include the fix in the shortly upcoming 1.9.5 release.
On Mon, 15 Apr 2013, Jakub Hrozek wrote:
Thanks for the logs. I suspect that what happens is that the sssd_be crashes or is restarted for some reason and after being respawned starts another enumeration which is why you are seeing so many searches.
As a matter of fact, there is a patch on the list to fix a potential crash when enumerating. Which version are you running? Maybe you're hitting that exact problem.
enumerate = true is completely unusable on our setup, and sssd_be definitely appears to exit and restart.
sssd-1.9.2-82.4.el6_4.x86_64
That said, I'm fairly sure I'd never want to use enumerate = true
jh
On Mon, Apr 15, 2013 at 12:13:36PM +0100, John Hodrien wrote:
On Mon, 15 Apr 2013, Jakub Hrozek wrote:
Thanks for the logs. I suspect that what happens is that the sssd_be crashes or is restarted for some reason and after being respawned starts another enumeration which is why you are seeing so many searches.
As a matter of fact, there is a patch on the list to fix a potential crash when enumerating. Which version are you running? Maybe you're hitting that exact problem.
enumerate = true is completely unusable on our setup, and sssd_be definitely appears to exit and restart.
From past conversations I know that your environment is rather big, so I
suspect you are hitting either ldap_enumeration_search_timeout (the actual ldapsearch takes more than 60 seconds) or then saving the results takes so long that the monitor kills the back end.
sssd-1.9.2-82.4.el6_4.x86_64
That said, I'm fairly sure I'd never want to use enumerate = true
I agree, but the reality is that users turn on enumeration quite frequently so longish-term (1.11 maybe, but probably 1.12) we should improve the enumeration task to be able to handle even very large directories.
Currently, as you see from your experience, it's not great when it comes to handling large number of users and groups..
jh _______________________________________________ sssd-devel mailing list sssd-devel@lists.fedorahosted.org https://lists.fedorahosted.org/mailman/listinfo/sssd-devel
On Mon, 15 Apr 2013, Jakub Hrozek wrote:
From past conversations I know that your environment is rather big, so I suspect you are hitting either ldap_enumeration_search_timeout (the actual ldapsearch takes more than 60 seconds) or then saving the results takes so long that the monitor kills the back end.
That sounds plausible.
sssd-1.9.2-82.4.el6_4.x86_64
That said, I'm fairly sure I'd never want to use enumerate = true
I agree, but the reality is that users turn on enumeration quite frequently so longish-term (1.11 maybe, but probably 1.12) we should improve the enumeration task to be able to handle even very large directories.
I guess I'm not clear *why* people feel they need enumeration.
Currently, as you see from your experience, it's not great when it comes to handling large number of users and groups..
But is it *ever* going to be possible to get to a state where full enumeration with large trees is sane?
jh
On Mon, Apr 15, 2013 at 01:20:37PM +0100, John Hodrien wrote:
On Mon, 15 Apr 2013, Jakub Hrozek wrote:
From past conversations I know that your environment is rather big, so I suspect you are hitting either ldap_enumeration_search_timeout (the actual ldapsearch takes more than 60 seconds) or then saving the results takes so long that the monitor kills the back end.
That sounds plausible.
sssd-1.9.2-82.4.el6_4.x86_64
That said, I'm fairly sure I'd never want to use enumerate = true
I agree, but the reality is that users turn on enumeration quite frequently so longish-term (1.11 maybe, but probably 1.12) we should improve the enumeration task to be able to handle even very large directories.
I guess I'm not clear *why* people feel they need enumeration.
It's not completely clear to me either, the most frequent answer I heard was "convenience".
Currently, as you see from your experience, it's not great when it comes to handling large number of users and groups..
But is it *ever* going to be possible to get to a state where full enumeration with large trees is sane?
jh
I haven't really done any experiments or design, but we would have to split the current single huge request into several smaller ones in order to avoid starving the sssd_be event queue. I'm not exactly sure how, though.
On 04/15/2013 12:36 PM, Jakub Hrozek wrote:
On Mon, Apr 15, 2013 at 12:17:33PM +0200, steve wrote:
On 04/15/2013 11:33 AM, Jakub Hrozek wrote:
Have you been able to check what kind of searches (with wireshark or similar or even just looking at the debug logs) sssd is doing? There should be one search for users and one for groups every 300 seconds by default (can be overriden with ldap_enumeration_refresh_timeout).
Also how many users/groups do you have on the server?
Hi Test setup with 2 users and 2 groups. The ldap server (Samba4 AD) is flooded with requests every second. I attach small bits from each log. If you coud have a look that would be great.
Thanks for the logs. I suspect that what happens is that the sssd_be crashes or is restarted for some reason and after being respawned starts another enumeration which is why you are seeing so many searches.
As a matter of fact, there is a patch on the list to fix a potential crash when enumerating. Which version are you running? Maybe you're hitting that exact problem.
TIA Steve
[snip]
sssd_default.log
[snip]
(Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_dns_service_name has value ldap (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_krb5_ticket_lifetime has value 86400
Is there something more in the log? The part you pasted pretty much describes the option being loaded.
Hi The log got bounced because it was too big. I've put it here: https://dl.dropboxusercontent.com/u/45150875/sssd_default.log
sss.log (Mon Apr 15 12:00:40 2013) [sssd] [client_registration] (0x0100): Received ID registration: (pam,1) (Mon Apr 15 12:00:40 2013) [sssd] [mark_service_as_started] (0x0200): Marking pam as started. (Mon Apr 15 12:00:40 2013) [sssd] [sbus_server_init_new_connection] (0x0200): Entering. (Mon Apr 15 12:00:40 2013) [sssd] [sbus_server_init_new_connection] (0x0200): Adding connection 0x8708c60. (Mon Apr 15 12:00:40 2013) [sssd] [sbus_init_connection] (0x0200): Adding connection 8708C60 (Mon Apr 15 12:00:40 2013) [sssd] [sbus_server_init_new_connection] (0x0200): Got a connection (Mon Apr 15 12:00:40 2013) [sssd] [monitor_service_init] (0x0400): Initializing D-BUS Service (Mon Apr 15 12:00:40 2013) [sssd] [client_registration] (0x0100): Received ID registration: (nss,1) (Mon Apr 15 12:00:40 2013) [sssd] [mark_service_as_started] (0x0200): Marking nss as started. (Mon Apr 15 12:00:40 2013) [sssd] [mark_service_as_started] (0x0400): SSSD is initialized, terminating parent process (Mon Apr 15 12:00:40 2013) [sssd] [sbus_dispatch] (0x0080): Connection is not open for dispatching. (Mon Apr 15 12:00:40 2013) [sssd] [mt_svc_exit_handler] (0x0040): Child [default] terminated with signal [6]
Here is some more evidence, signal [6] is SIGABRT so the back end crashed.
It's quite possible that you hit https://fedorahosted.org/sssd/ticket/1799, if you could collect the backtrace then we would know for sure. We already have a patch for that problem and will include the fix in the shortly upcoming 1.9.5 release.
Thanks. Our version is 1.9.4 on Ubuntu 12.10. Will gladly test anything on 1.9.5 and report back. Cheers, Steve
On 04/15/2013 01:29 PM, steve wrote:
On 04/15/2013 12:36 PM, Jakub Hrozek wrote:
On Mon, Apr 15, 2013 at 12:17:33PM +0200, steve wrote:
On 04/15/2013 11:33 AM, Jakub Hrozek wrote:
Have you been able to check what kind of searches (with wireshark or similar or even just looking at the debug logs) sssd is doing? There should be one search for users and one for groups every 300 seconds by default (can be overriden with ldap_enumeration_refresh_timeout).
Also how many users/groups do you have on the server?
Hi Test setup with 2 users and 2 groups. The ldap server (Samba4 AD) is flooded with requests every second. I attach small bits from each log. If you coud have a look that would be great.
Thanks for the logs. I suspect that what happens is that the sssd_be crashes or is restarted for some reason and after being respawned starts another enumeration which is why you are seeing so many searches.
As a matter of fact, there is a patch on the list to fix a potential crash when enumerating. Which version are you running? Maybe you're hitting that exact problem.
TIA Steve
[snip]
sssd_default.log
[snip]
(Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_dns_service_name has value ldap (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_krb5_ticket_lifetime has value 86400
Is there something more in the log? The part you pasted pretty much describes the option being loaded.
Hi The log got bounced because it was too big. I've put it here: https://dl.dropboxusercontent.com/u/45150875/sssd_default.log
sss.log (Mon Apr 15 12:00:40 2013) [sssd] [client_registration] (0x0100): Received ID registration: (pam,1) (Mon Apr 15 12:00:40 2013) [sssd] [mark_service_as_started] (0x0200): Marking pam as started. (Mon Apr 15 12:00:40 2013) [sssd] [sbus_server_init_new_connection] (0x0200): Entering. (Mon Apr 15 12:00:40 2013) [sssd] [sbus_server_init_new_connection] (0x0200): Adding connection 0x8708c60. (Mon Apr 15 12:00:40 2013) [sssd] [sbus_init_connection] (0x0200): Adding connection 8708C60 (Mon Apr 15 12:00:40 2013) [sssd] [sbus_server_init_new_connection] (0x0200): Got a connection (Mon Apr 15 12:00:40 2013) [sssd] [monitor_service_init] (0x0400): Initializing D-BUS Service (Mon Apr 15 12:00:40 2013) [sssd] [client_registration] (0x0100): Received ID registration: (nss,1) (Mon Apr 15 12:00:40 2013) [sssd] [mark_service_as_started] (0x0200): Marking nss as started. (Mon Apr 15 12:00:40 2013) [sssd] [mark_service_as_started] (0x0400): SSSD is initialized, terminating parent process (Mon Apr 15 12:00:40 2013) [sssd] [sbus_dispatch] (0x0080): Connection is not open for dispatching. (Mon Apr 15 12:00:40 2013) [sssd] [mt_svc_exit_handler] (0x0040): Child [default] terminated with signal [6]
Here is some more evidence, signal [6] is SIGABRT so the back end crashed.
It's quite possible that you hit https://fedorahosted.org/sssd/ticket/1799, if you could collect the backtrace then we would know for sure. We already have a patch for that problem and will include the fix in the shortly upcoming 1.9.5 release.
Thanks. Our version is 1.9.4 on Ubuntu 12.10. Will gladly test anything on 1.9.5 and report back. Cheers, Steve
Hi Steve,
Does ldapsearch against the configured server work? Because from the provided logs, I see that there's issue with contacting it.
[sdap_process_result] (0x0040): ldap_result error: [Can't contact LDAP server]
On 04/15/2013 01:59 PM, Ondrej Kos wrote:
On 04/15/2013 01:29 PM, steve wrote:
On 04/15/2013 12:36 PM, Jakub Hrozek wrote:
On Mon, Apr 15, 2013 at 12:17:33PM +0200, steve wrote:
On 04/15/2013 11:33 AM, Jakub Hrozek wrote:
Have you been able to check what kind of searches (with wireshark or similar or even just looking at the debug logs) sssd is doing? There should be one search for users and one for groups every 300 seconds by default (can be overriden with ldap_enumeration_refresh_timeout).
Also how many users/groups do you have on the server?
Hi Test setup with 2 users and 2 groups. The ldap server (Samba4 AD) is flooded with requests every second. I attach small bits from each log. If you coud have a look that would be great.
Thanks for the logs. I suspect that what happens is that the sssd_be crashes or is restarted for some reason and after being respawned starts another enumeration which is why you are seeing so many searches.
As a matter of fact, there is a patch on the list to fix a potential crash when enumerating. Which version are you running? Maybe you're hitting that exact problem.
TIA Steve
[snip]
sssd_default.log
[snip]
(Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_dns_service_name has value ldap (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_krb5_ticket_lifetime has value 86400
Is there something more in the log? The part you pasted pretty much describes the option being loaded.
Hi The log got bounced because it was too big. I've put it here: https://dl.dropboxusercontent.com/u/45150875/sssd_default.log
sss.log (Mon Apr 15 12:00:40 2013) [sssd] [client_registration] (0x0100): Received ID registration: (pam,1) (Mon Apr 15 12:00:40 2013) [sssd] [mark_service_as_started] (0x0200): Marking pam as started. (Mon Apr 15 12:00:40 2013) [sssd] [sbus_server_init_new_connection] (0x0200): Entering. (Mon Apr 15 12:00:40 2013) [sssd] [sbus_server_init_new_connection] (0x0200): Adding connection 0x8708c60. (Mon Apr 15 12:00:40 2013) [sssd] [sbus_init_connection] (0x0200): Adding connection 8708C60 (Mon Apr 15 12:00:40 2013) [sssd] [sbus_server_init_new_connection] (0x0200): Got a connection (Mon Apr 15 12:00:40 2013) [sssd] [monitor_service_init] (0x0400): Initializing D-BUS Service (Mon Apr 15 12:00:40 2013) [sssd] [client_registration] (0x0100): Received ID registration: (nss,1) (Mon Apr 15 12:00:40 2013) [sssd] [mark_service_as_started] (0x0200): Marking nss as started. (Mon Apr 15 12:00:40 2013) [sssd] [mark_service_as_started] (0x0400): SSSD is initialized, terminating parent process (Mon Apr 15 12:00:40 2013) [sssd] [sbus_dispatch] (0x0080): Connection is not open for dispatching. (Mon Apr 15 12:00:40 2013) [sssd] [mt_svc_exit_handler] (0x0040): Child [default] terminated with signal [6]
Here is some more evidence, signal [6] is SIGABRT so the back end crashed.
It's quite possible that you hit https://fedorahosted.org/sssd/ticket/1799, if you could collect the backtrace then we would know for sure. We already have a patch for that problem and will include the fix in the shortly upcoming 1.9.5 release.
Thanks. Our version is 1.9.4 on Ubuntu 12.10. Will gladly test anything on 1.9.5 and report back. Cheers, Steve
Hi Steve,
Does ldapsearch against the configured server work? Because from the provided logs, I see that there's issue with contacting it.
[sdap_process_result] (0x0040): ldap_result error: [Can't contact LDAP server]
Hi Yes. ldapsearch is working. e.g.
sudo ldapsearch -h doloresdc.dolores.site -b 'dc=dolores,dc=site' "(cn=steve2)" -Y GSSAPI SASL/GSSAPI authentication started SASL username: Administrator@DOLORES.SITE SASL SSF: 56 SASL data security layer installed. # extended LDIF # # LDAPv3 # base <dc=dolores,dc=site> with scope subtree # filter: (cn=steve2) # requesting: ALL #
# steve2, Users, dolores.site dn: CN=steve2,CN=Users,DC=dolores,DC=site cn: steve2 instanceType: 4 whenCreated: 20130415110516.0Z uSNCreated: 3794 name: steve2 objectGUID:: jiXWdhBr6kWi1lB53iBxLQ== badPwdCount: 0 codePage: 0 countryCode: 0 badPasswordTime: 0 lastLogoff: 0 lastLogon: 0 primaryGroupID: 513 objectSid:: AQUAAAAAAAUVAAAAVV34Nfc0pG0Q3HZqWgQAAA== logonCount: 0 sAMAccountName: steve2 sAMAccountType: 805306368 userPrincipalName: steve2@dolores.site objectCategory: CN=Person,CN=Schema,CN=Configuration,DC=dolores,DC=site pwdLastSet: 130104975160000000 userAccountControl: 66048 accountExpires: 0 uidNumber: 3000024 gidNumber: 20513 unixHomeDirectory: /home/users/steve2 loginShell: /bin/bash objectClass: top objectClass: posixAccount objectClass: person objectClass: organizationalPerson objectClass: user profilePath: \doloresdc\profiles\steve2 homeDrive: Z: homeDirectory: \doloresdc\users\steve2 whenChanged: 20130415110519.0Z uSNChanged: 3800 memberOf: CN=staff,CN=Users,DC=dolores,DC=site distinguishedName: CN=steve2,CN=Users,DC=dolores,DC=site
# search reference ref: ldap://dolores.site/CN=Configuration,DC=dolores,DC=site
# search reference ref: ldap://dolores.site/DC=DomainDnsZones,DC=dolores,DC=site
# search reference ref: ldap://dolores.site/DC=ForestDnsZones,DC=dolores,DC=site
# search result search: 4 result: 0 Success
# numResponses: 5 # numEntries: 1 # numReferences: 3
Any ideas? Thanks for your time.
On Mon, Apr 15, 2013 at 04:47:15PM +0200, steve wrote:
On 04/15/2013 01:59 PM, Ondrej Kos wrote:
On 04/15/2013 01:29 PM, steve wrote:
On 04/15/2013 12:36 PM, Jakub Hrozek wrote:
On Mon, Apr 15, 2013 at 12:17:33PM +0200, steve wrote:
On 04/15/2013 11:33 AM, Jakub Hrozek wrote:
Have you been able to check what kind of searches (with wireshark or similar or even just looking at the debug logs) sssd is doing? There should be one search for users and one for groups every 300 seconds by default (can be overriden with ldap_enumeration_refresh_timeout).
Also how many users/groups do you have on the server?
Hi Test setup with 2 users and 2 groups. The ldap server (Samba4 AD) is flooded with requests every second. I attach small bits from each log. If you coud have a look that would be great.
Thanks for the logs. I suspect that what happens is that the sssd_be crashes or is restarted for some reason and after being respawned starts another enumeration which is why you are seeing so many searches.
As a matter of fact, there is a patch on the list to fix a potential crash when enumerating. Which version are you running? Maybe you're hitting that exact problem.
TIA Steve
[snip]
sssd_default.log
[snip]
(Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_dns_service_name has value ldap (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_krb5_ticket_lifetime has value 86400
Is there something more in the log? The part you pasted pretty much describes the option being loaded.
Hi The log got bounced because it was too big. I've put it here: https://dl.dropboxusercontent.com/u/45150875/sssd_default.log
sss.log (Mon Apr 15 12:00:40 2013) [sssd] [client_registration] (0x0100): Received ID registration: (pam,1) (Mon Apr 15 12:00:40 2013) [sssd] [mark_service_as_started] (0x0200): Marking pam as started. (Mon Apr 15 12:00:40 2013) [sssd] [sbus_server_init_new_connection] (0x0200): Entering. (Mon Apr 15 12:00:40 2013) [sssd] [sbus_server_init_new_connection] (0x0200): Adding connection 0x8708c60. (Mon Apr 15 12:00:40 2013) [sssd] [sbus_init_connection] (0x0200): Adding connection 8708C60 (Mon Apr 15 12:00:40 2013) [sssd] [sbus_server_init_new_connection] (0x0200): Got a connection (Mon Apr 15 12:00:40 2013) [sssd] [monitor_service_init] (0x0400): Initializing D-BUS Service (Mon Apr 15 12:00:40 2013) [sssd] [client_registration] (0x0100): Received ID registration: (nss,1) (Mon Apr 15 12:00:40 2013) [sssd] [mark_service_as_started] (0x0200): Marking nss as started. (Mon Apr 15 12:00:40 2013) [sssd] [mark_service_as_started] (0x0400): SSSD is initialized, terminating parent process (Mon Apr 15 12:00:40 2013) [sssd] [sbus_dispatch] (0x0080): Connection is not open for dispatching. (Mon Apr 15 12:00:40 2013) [sssd] [mt_svc_exit_handler] (0x0040): Child [default] terminated with signal [6]
Here is some more evidence, signal [6] is SIGABRT so the back end crashed.
It's quite possible that you hit https://fedorahosted.org/sssd/ticket/1799, if you could collect the backtrace then we would know for sure. We already have a patch for that problem and will include the fix in the shortly upcoming 1.9.5 release.
Thanks. Our version is 1.9.4 on Ubuntu 12.10. Will gladly test anything on 1.9.5 and report back. Cheers, Steve
Hi Steve,
Does ldapsearch against the configured server work? Because from the provided logs, I see that there's issue with contacting it.
[sdap_process_result] (0x0040): ldap_result error: [Can't contact LDAP server]
Hi Yes. ldapsearch is working. e.g.
sudo ldapsearch -h doloresdc.dolores.site -b 'dc=dolores,dc=site' "(cn=steve2)" -Y GSSAPI SASL/GSSAPI authentication started SASL username: Administrator@DOLORES.SITE SASL SSF: 56 SASL data security layer installed.
Ah, but here you are using a different SASL username than the SSSD, I think. According to the config you sent earlier, you set the authid to DOLORESDC$.
Can you also try kinit as this principal and then ldapsearch again, please? That would get you very close to what the SSSD is doing.
You would do something like this (modulo different names in case I mistyped):
kdestroy kinit -k DOLORESDC$@DOLORES.SITE ldapsearch -h doloresdc.dolores.site -b 'dc=dolores,dc=site' "(cn=steve2)" -Y GSSAPI
Also one thing I missed is that you are using IP address instead of host name in the config file, at least according to the log on dropbox. I would recommend using hostnames with GSSAPI, sometimes the SASL library might attempt to canonicalize the hostname and fail if it can't.
On 04/15/2013 05:23 PM, Jakub Hrozek wrote:
On Mon, Apr 15, 2013 at 04:47:15PM +0200, steve wrote:
On 04/15/2013 01:59 PM, Ondrej Kos wrote:
On 04/15/2013 01:29 PM, steve wrote:
On 04/15/2013 12:36 PM, Jakub Hrozek wrote:
On Mon, Apr 15, 2013 at 12:17:33PM +0200, steve wrote:
On 04/15/2013 11:33 AM, Jakub Hrozek wrote: > Have you been able to check what kind of searches (with wireshark or > similar > or even just looking at the debug logs) sssd is doing? > There should be > one search for users and one for groups every 300 seconds by default > (can be overriden with ldap_enumeration_refresh_timeout). > > Also how many users/groups do you have on the server? > Hi Test setup with 2 users and 2 groups. The ldap server (Samba4 AD) is flooded with requests every second. I attach small bits from each log. If you coud have a look that would be great.
Thanks for the logs. I suspect that what happens is that the sssd_be crashes or is restarted for some reason and after being respawned starts another enumeration which is why you are seeing so many searches.
As a matter of fact, there is a patch on the list to fix a potential crash when enumerating. Which version are you running? Maybe you're hitting that exact problem.
TIA Steve
[snip]
sssd_default.log
[snip]
(Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_dns_service_name has value ldap (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_krb5_ticket_lifetime has value 86400
Is there something more in the log? The part you pasted pretty much describes the option being loaded.
Hi The log got bounced because it was too big. I've put it here: https://dl.dropboxusercontent.com/u/45150875/sssd_default.log
sss.log (Mon Apr 15 12:00:40 2013) [sssd] [client_registration] (0x0100): Received ID registration: (pam,1) (Mon Apr 15 12:00:40 2013) [sssd] [mark_service_as_started] (0x0200): Marking pam as started. (Mon Apr 15 12:00:40 2013) [sssd] [sbus_server_init_new_connection] (0x0200): Entering. (Mon Apr 15 12:00:40 2013) [sssd] [sbus_server_init_new_connection] (0x0200): Adding connection 0x8708c60. (Mon Apr 15 12:00:40 2013) [sssd] [sbus_init_connection] (0x0200): Adding connection 8708C60 (Mon Apr 15 12:00:40 2013) [sssd] [sbus_server_init_new_connection] (0x0200): Got a connection (Mon Apr 15 12:00:40 2013) [sssd] [monitor_service_init] (0x0400): Initializing D-BUS Service (Mon Apr 15 12:00:40 2013) [sssd] [client_registration] (0x0100): Received ID registration: (nss,1) (Mon Apr 15 12:00:40 2013) [sssd] [mark_service_as_started] (0x0200): Marking nss as started. (Mon Apr 15 12:00:40 2013) [sssd] [mark_service_as_started] (0x0400): SSSD is initialized, terminating parent process (Mon Apr 15 12:00:40 2013) [sssd] [sbus_dispatch] (0x0080): Connection is not open for dispatching. (Mon Apr 15 12:00:40 2013) [sssd] [mt_svc_exit_handler] (0x0040): Child [default] terminated with signal [6]
Here is some more evidence, signal [6] is SIGABRT so the back end crashed.
It's quite possible that you hit https://fedorahosted.org/sssd/ticket/1799, if you could collect the backtrace then we would know for sure. We already have a patch for that problem and will include the fix in the shortly upcoming 1.9.5 release.
Thanks. Our version is 1.9.4 on Ubuntu 12.10. Will gladly test anything on 1.9.5 and report back. Cheers, Steve
Hi Steve,
Does ldapsearch against the configured server work? Because from the provided logs, I see that there's issue with contacting it.
[sdap_process_result] (0x0040): ldap_result error: [Can't contact LDAP server]
Hi Yes. ldapsearch is working. e.g.
sudo ldapsearch -h doloresdc.dolores.site -b 'dc=dolores,dc=site' "(cn=steve2)" -Y GSSAPI SASL/GSSAPI authentication started SASL username: Administrator@DOLORES.SITE SASL SSF: 56 SASL data security layer installed.
Ah, but here you are using a different SASL username than the SSSD, I think. According to the config you sent earlier, you set the authid to DOLORESDC$.
Can you also try kinit as this principal and then ldapsearch again, please? That would get you very close to what the SSSD is doing.
You would do something like this (modulo different names in case I mistyped):
kdestroy kinit -k DOLORESDC$@DOLORES.SITE ldapsearch -h doloresdc.dolores.site -b 'dc=dolores,dc=site' "(cn=steve2)" -Y GSSAPI
Hi OK. steve@doloresdc:~$ sudo kdestroy steve@doloresdc:~$ sudo kinit -k DOLORESDC$ steve@doloresdc:~$ sudo ldapsearch -h doloresdc.dolores.site -b 'dc=dolores,dc=site' "(cn=steve2)" -Y GSSAPI SASL/GSSAPI authentication started SASL username: DOLORESDC$@DOLORES.SITE SASL SSF: 56 SASL data security layer installed. # extended LDIF # # LDAPv3 # base <dc=dolores,dc=site> with scope subtree # filter: (cn=steve2) # requesting: ALL #
# steve2, Users, dolores.site dn: CN=steve2,CN=Users,DC=dolores,DC=site cn: steve2 instanceType: 4 whenCreated: 20130415110516.0Z uSNCreated: 3794 name: steve2 objectGUID:: jiXWdhBr6kWi1lB53iBxLQ== badPwdCount: 0 codePage: 0 countryCode: 0 badPasswordTime: 0 lastLogoff: 0 lastLogon: 0 primaryGroupID: 513 objectSid:: AQUAAAAAAAUVAAAAVV34Nfc0pG0Q3HZqWgQAAA== logonCount: 0 sAMAccountName: steve2 sAMAccountType: 805306368 userPrincipalName: steve2@dolores.site objectCategory: CN=Person,CN=Schema,CN=Configuration,DC=dolores,DC=site pwdLastSet: 130104975160000000 userAccountControl: 66048 accountExpires: 0 uidNumber: 3000024 gidNumber: 20513 unixHomeDirectory: /home/users/steve2 loginShell: /bin/bash objectClass: top objectClass: posixAccount objectClass: person objectClass: organizationalPerson objectClass: user profilePath: \doloresdc\profiles\steve2 homeDrive: Z: homeDirectory: \doloresdc\users\steve2 whenChanged: 20130415110519.0Z uSNChanged: 3800 memberOf: CN=staff,CN=Users,DC=dolores,DC=site distinguishedName: CN=steve2,CN=Users,DC=dolores,DC=site
# search reference ref: ldap://dolores.site/CN=Configuration,DC=dolores,DC=site
# search reference ref: ldap://dolores.site/DC=DomainDnsZones,DC=dolores,DC=site
# search reference ref: ldap://dolores.site/DC=ForestDnsZones,DC=dolores,DC=site
# search result search: 4 result: 0 Success
# numResponses: 5 # numEntries: 1 # numReferences: 3
And the machine cache has this: steve@doloresdc:~$ sudo klist Ticket cache: FILE:/tmp/krb5cc_0 Default principal: DOLORESDC$@DOLORES.SITE
Valid starting Expires Service principal 15/04/2013 18:21 16/04/2013 04:21 krbtgt/DOLORES.SITE@DOLORES.SITE renew until 16/04/2013 18:21 15/04/2013 18:21 16/04/2013 04:21 ldap/doloresdc.dolores.site@ renew until 16/04/2013 18:21 15/04/2013 18:21 16/04/2013 04:21 ldap/doloresdc.dolores.site@DOLORES.SITE renew until 16/04/2013 18:21
Also one thing I missed is that you are using IP address instead of host name in the config file, at least according to the log on dropbox. I would recommend using hostnames with GSSAPI, sometimes the SASL library might attempt to canonicalize the hostname and fail if it can't.
AD giving this error: ldb_wrap open of secrets.ldb Terminating connection - 'LDAP_PROTOCOL_ERROR' single_terminate: reason[LDAP_PROTOCOL_ERROR]
and sssd is still giving many of these: steve@doloresdc:/var/log/sssd$ sudo cat sssd_default.log | grep "Can't" (Mon Apr 15 16:58:11 2013) [sssd[be[default]]] [sdap_process_result] (0x0040): ldap_result error: [Can't contact LDAP server] (Mon Apr 15 16:58:21 2013) [sssd[be[default]]] [sdap_process_result] (0x0040): ldap_result error: [Can't contact LDAP server]
Sorry. . .
On Mon, Apr 15, 2013 at 01:29:52PM +0200, steve wrote:
On 04/15/2013 12:36 PM, Jakub Hrozek wrote:
On Mon, Apr 15, 2013 at 12:17:33PM +0200, steve wrote:
On 04/15/2013 11:33 AM, Jakub Hrozek wrote:
Have you been able to check what kind of searches (with wireshark or similar or even just looking at the debug logs) sssd is doing? There should be one search for users and one for groups every 300 seconds by default (can be overriden with ldap_enumeration_refresh_timeout).
Also how many users/groups do you have on the server?
Hi Test setup with 2 users and 2 groups. The ldap server (Samba4 AD) is flooded with requests every second. I attach small bits from each log. If you coud have a look that would be great.
Thanks for the logs. I suspect that what happens is that the sssd_be crashes or is restarted for some reason and after being respawned starts another enumeration which is why you are seeing so many searches.
As a matter of fact, there is a patch on the list to fix a potential crash when enumerating. Which version are you running? Maybe you're hitting that exact problem.
TIA Steve
[snip]
sssd_default.log
[snip]
(Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_dns_service_name has value ldap (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_krb5_ticket_lifetime has value 86400
Is there something more in the log? The part you pasted pretty much describes the option being loaded.
Hi The log got bounced because it was too big. I've put it here: https://dl.dropboxusercontent.com/u/45150875/sssd_default.log
Thank you, I will cancel the pending request, then.
This is the most important part of the log: (Mon Apr 15 13:16:33 2013) [sssd[be[default]]] [set_server_common_status] (0x0100): Marking server '192.168.1.100' as 'working' (Mon Apr 15 13:16:33 2013) [sssd[be[default]]] [sdap_get_users_next_base] (0x0400): Searching for users with base [dc=dolores,dc=site] (Mon Apr 15 13:16:33 2013) [sssd[be[default]]] [sdap_get_generic_ext_step] (0x0400): calling ldap_search_ext with [(&(&(objectclass=user)(samAccountName=*)(uidNumber=*)(gidNumber=*))(&(objectCategory=user)(uidNumber=*)))][dc=dolores,dc=site]. (Mon Apr 15 13:16:33 2013) [sssd[be[default]]] [client_registration] (0x0100): Cancel DP ID timeout [0x9e3c5f0] (Mon Apr 15 13:16:33 2013) [sssd[be[default]]] [sdap_process_result] (0x0040): ldap_result error: [Can't contact LDAP server]
Now sssd_be crashed and the two following lines are already the newly spawned process.
(Mon Apr 15 13:16:34 2013) [sssd[be[default]]] [server_setup] (0x0400): CONFDB: /var/lib/sss/db/config.ldb (Mon Apr 15 13:16:34 2013) [sssd[be[default]]] [recreate_ares_channel] (0x0100): Initializing new c-ares channel
Judging by the log, this is a different bug :-/ Is it possible to get the core file from the crash? Or maybe a text backtrace?
I tried to simulate the crash by setting up a firewall on my test IPA server, but the sssd survived the test here..
As Ondra asked, can you ldapsearch the directory? When you were testing with nss-pam-ldapd did you also use GSSAPI authentication?
On 04/15/2013 02:09 PM, Jakub Hrozek wrote:
On Mon, Apr 15, 2013 at 01:29:52PM +0200, steve wrote:
On 04/15/2013 12:36 PM, Jakub Hrozek wrote:
On Mon, Apr 15, 2013 at 12:17:33PM +0200, steve wrote:
On 04/15/2013 11:33 AM, Jakub Hrozek wrote:
Have you been able to check what kind of searches (with wireshark or similar or even just looking at the debug logs) sssd is doing? There should be one search for users and one for groups every 300 seconds by default (can be overriden with ldap_enumeration_refresh_timeout).
Also how many users/groups do you have on the server?
Hi Test setup with 2 users and 2 groups. The ldap server (Samba4 AD) is flooded with requests every second. I attach small bits from each log. If you coud have a look that would be great.
Thanks for the logs. I suspect that what happens is that the sssd_be crashes or is restarted for some reason and after being respawned starts another enumeration which is why you are seeing so many searches.
As a matter of fact, there is a patch on the list to fix a potential crash when enumerating. Which version are you running? Maybe you're hitting that exact problem.
TIA Steve
[snip]
sssd_default.log
[snip]
(Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_dns_service_name has value ldap (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_krb5_ticket_lifetime has value 86400
Is there something more in the log? The part you pasted pretty much describes the option being loaded.
Hi The log got bounced because it was too big. I've put it here: https://dl.dropboxusercontent.com/u/45150875/sssd_default.log
Thank you, I will cancel the pending request, then.
This is the most important part of the log: (Mon Apr 15 13:16:33 2013) [sssd[be[default]]] [set_server_common_status] (0x0100): Marking server '192.168.1.100' as 'working' (Mon Apr 15 13:16:33 2013) [sssd[be[default]]] [sdap_get_users_next_base] (0x0400): Searching for users with base [dc=dolores,dc=site] (Mon Apr 15 13:16:33 2013) [sssd[be[default]]] [sdap_get_generic_ext_step] (0x0400): calling ldap_search_ext with [(&(&(objectclass=user)(samAccountName=*)(uidNumber=*)(gidNumber=*))(&(objectCategory=user)(uidNumber=*)))][dc=dolores,dc=site]. (Mon Apr 15 13:16:33 2013) [sssd[be[default]]] [client_registration] (0x0100): Cancel DP ID timeout [0x9e3c5f0] (Mon Apr 15 13:16:33 2013) [sssd[be[default]]] [sdap_process_result] (0x0040): ldap_result error: [Can't contact LDAP server]
Now sssd_be crashed and the two following lines are already the newly spawned process.
(Mon Apr 15 13:16:34 2013) [sssd[be[default]]] [server_setup] (0x0400): CONFDB: /var/lib/sss/db/config.ldb (Mon Apr 15 13:16:34 2013) [sssd[be[default]]] [recreate_ares_channel] (0x0100): Initializing new c-ares channel
Judging by the log, this is a different bug :-/ Is it possible to get the core file from the crash? Or maybe a text backtrace?
Sorry, but I don't know how to do that. Could you guide me?
I tried to simulate the crash by setting up a firewall on my test IPA server, but the sssd survived the test here..
As Ondra asked, can you ldapsearch the directory? When you were testing with nss-pam-ldapd did you also use GSSAPI authentication?
Yes. nslcd uses GSSAPI /etc/nslcd.conf uid nslcd gid nslcd uri ldap://192.168.1.100/ base dc=dolores,dc=site map passwd uid samAccountName map passwd homeDirectory unixHomeDirectory sasl_mech GSSAPI sasl_realm DOLORES.SITE krb5_ccname /tmp/nslcd.tkt
Thanks again.
sssd-devel mailing list sssd-devel@lists.fedorahosted.org https://lists.fedorahosted.org/mailman/listinfo/sssd-devel
On Mon, Apr 15, 2013 at 04:53:30PM +0200, steve wrote:
On 04/15/2013 02:09 PM, Jakub Hrozek wrote:
On Mon, Apr 15, 2013 at 01:29:52PM +0200, steve wrote:
On 04/15/2013 12:36 PM, Jakub Hrozek wrote:
On Mon, Apr 15, 2013 at 12:17:33PM +0200, steve wrote:
On 04/15/2013 11:33 AM, Jakub Hrozek wrote:
Have you been able to check what kind of searches (with wireshark or similar or even just looking at the debug logs) sssd is doing? There should be one search for users and one for groups every 300 seconds by default (can be overriden with ldap_enumeration_refresh_timeout).
Also how many users/groups do you have on the server?
Hi Test setup with 2 users and 2 groups. The ldap server (Samba4 AD) is flooded with requests every second. I attach small bits from each log. If you coud have a look that would be great.
Thanks for the logs. I suspect that what happens is that the sssd_be crashes or is restarted for some reason and after being respawned starts another enumeration which is why you are seeing so many searches.
As a matter of fact, there is a patch on the list to fix a potential crash when enumerating. Which version are you running? Maybe you're hitting that exact problem.
TIA Steve
[snip]
sssd_default.log
[snip]
(Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_dns_service_name has value ldap (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_krb5_ticket_lifetime has value 86400
Is there something more in the log? The part you pasted pretty much describes the option being loaded.
Hi The log got bounced because it was too big. I've put it here: https://dl.dropboxusercontent.com/u/45150875/sssd_default.log
Thank you, I will cancel the pending request, then.
This is the most important part of the log: (Mon Apr 15 13:16:33 2013) [sssd[be[default]]] [set_server_common_status] (0x0100): Marking server '192.168.1.100' as 'working' (Mon Apr 15 13:16:33 2013) [sssd[be[default]]] [sdap_get_users_next_base] (0x0400): Searching for users with base [dc=dolores,dc=site] (Mon Apr 15 13:16:33 2013) [sssd[be[default]]] [sdap_get_generic_ext_step] (0x0400): calling ldap_search_ext with [(&(&(objectclass=user)(samAccountName=*)(uidNumber=*)(gidNumber=*))(&(objectCategory=user)(uidNumber=*)))][dc=dolores,dc=site]. (Mon Apr 15 13:16:33 2013) [sssd[be[default]]] [client_registration] (0x0100): Cancel DP ID timeout [0x9e3c5f0] (Mon Apr 15 13:16:33 2013) [sssd[be[default]]] [sdap_process_result] (0x0040): ldap_result error: [Can't contact LDAP server]
Now sssd_be crashed and the two following lines are already the newly spawned process.
(Mon Apr 15 13:16:34 2013) [sssd[be[default]]] [server_setup] (0x0400): CONFDB: /var/lib/sss/db/config.ldb (Mon Apr 15 13:16:34 2013) [sssd[be[default]]] [recreate_ares_channel] (0x0100): Initializing new c-ares channel
Judging by the log, this is a different bug :-/ Is it possible to get the core file from the crash? Or maybe a text backtrace?
Sorry, but I don't know how to do that. Could you guide me?
In general you should install the packages with debug information (sorry, I don't know exactly how they are called in Ubuntu).
Next, generating core files must be enabled on the system as they might contain sensitive information. At least on Fedora, the limits are defined in /etc/security/limits.conf: soft core <max-size-in-bytes> root soft core <max-size-in-bytes>
And finally, the cores are dumped to location defined by the core_pattern sysctl parameter: # sysctl kernel.core_pattern
But maybe Ubuntu has some tool similar to Fedora's abrt that would help you with generating the core. Timo, is there something like that?
On 15.04.2013 18:34, Jakub Hrozek wrote:
On Mon, Apr 15, 2013 at 04:53:30PM +0200, steve wrote:
On 04/15/2013 02:09 PM, Jakub Hrozek wrote:
On Mon, Apr 15, 2013 at 01:29:52PM +0200, steve wrote:
On 04/15/2013 12:36 PM, Jakub Hrozek wrote:
On Mon, Apr 15, 2013 at 12:17:33PM +0200, steve wrote:
On 04/15/2013 11:33 AM, Jakub Hrozek wrote: > Have you been able to check what kind of searches (with wireshark or similar > or even just looking at the debug logs) sssd is doing? There should be > one search for users and one for groups every 300 seconds by default > (can be overriden with ldap_enumeration_refresh_timeout). > > Also how many users/groups do you have on the server? > Hi Test setup with 2 users and 2 groups. The ldap server (Samba4 AD) is flooded with requests every second. I attach small bits from each log. If you coud have a look that would be great.
Thanks for the logs. I suspect that what happens is that the sssd_be crashes or is restarted for some reason and after being respawned starts another enumeration which is why you are seeing so many searches.
As a matter of fact, there is a patch on the list to fix a potential crash when enumerating. Which version are you running? Maybe you're hitting that exact problem.
TIA Steve
[snip]
sssd_default.log
[snip]
(Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_dns_service_name has value ldap (Mon Apr 15 12:00:43 2013) [sssd[be[default]]] [dp_get_options] (0x0400): Option ldap_krb5_ticket_lifetime has value 86400
Is there something more in the log? The part you pasted pretty much describes the option being loaded.
Hi The log got bounced because it was too big. I've put it here: https://dl.dropboxusercontent.com/u/45150875/sssd_default.log
Thank you, I will cancel the pending request, then.
This is the most important part of the log: (Mon Apr 15 13:16:33 2013) [sssd[be[default]]] [set_server_common_status] (0x0100): Marking server '192.168.1.100' as 'working' (Mon Apr 15 13:16:33 2013) [sssd[be[default]]] [sdap_get_users_next_base] (0x0400): Searching for users with base [dc=dolores,dc=site] (Mon Apr 15 13:16:33 2013) [sssd[be[default]]] [sdap_get_generic_ext_step] (0x0400): calling ldap_search_ext with [(&(&(objectclass=user)(samAccountName=*)(uidNumber=*)(gidNumber=*))(&(objectCategory=user)(uidNumber=*)))][dc=dolores,dc=site]. (Mon Apr 15 13:16:33 2013) [sssd[be[default]]] [client_registration] (0x0100): Cancel DP ID timeout [0x9e3c5f0] (Mon Apr 15 13:16:33 2013) [sssd[be[default]]] [sdap_process_result] (0x0040): ldap_result error: [Can't contact LDAP server]
Now sssd_be crashed and the two following lines are already the newly spawned process.
(Mon Apr 15 13:16:34 2013) [sssd[be[default]]] [server_setup] (0x0400): CONFDB: /var/lib/sss/db/config.ldb (Mon Apr 15 13:16:34 2013) [sssd[be[default]]] [recreate_ares_channel] (0x0100): Initializing new c-ares channel
Judging by the log, this is a different bug :-/ Is it possible to get the core file from the crash? Or maybe a text backtrace?
Sorry, but I don't know how to do that. Could you guide me?
In general you should install the packages with debug information (sorry, I don't know exactly how they are called in Ubuntu).
Next, generating core files must be enabled on the system as they might contain sensitive information. At least on Fedora, the limits are defined in /etc/security/limits.conf: soft core <max-size-in-bytes> root soft core <max-size-in-bytes>
And finally, the cores are dumped to location defined by the core_pattern sysctl parameter: # sysctl kernel.core_pattern
But maybe Ubuntu has some tool similar to Fedora's abrt that would help you with generating the core. Timo, is there something like that?
Yes, there is apport since many years, need to enable it for a released ubuntu version via /etc/default/apport. Crash dumps end up in /var/crash. Debugging symbols for sssd can be found from ddebs:
https://wiki.ubuntu.com/DebuggingProgramCrash#Debug_Symbol_Packages
sssd-devel@lists.fedorahosted.org