Enumerate users from external group from AD trust
by Bolke de Bruin
Hello,
I have sssd 1.13.0 working against a FreeIPA 4.2 domain. This domain has a trust relationship with an Active Directory domain.
One of the systems we are using, Apache Ranger, requires enumerating all users in groups by (unfortunate) design. This is done with
"getent group". During this enumeration the full user list for a group that has a nested external member group* is not always returned, so we thought to
add "getent group mygroup" to get more details. Unfortunately this does not work consistently: sometimes it returns the members and sometimes it does not:
[root@master centos]# getent group ad_users
ad_users:*:1950000004:
[root@master centos]# id bolke@ad.local
uid=1796201107(bolke@ad.local) gid=1796201107(bolke@ad.local) groups=1796201107(bolke@ad.local),1796200513(domain users@ad.local),1796201108(test@ad.local)
[root@master centos]# getent group ad_users
ad_users:*:1950000004:bolke@ad.local
If I clear the cache (sss_cache -E) the entry is gone again:
[root@master centos]# getent group ad_users
ad_users:*:1950000004:
My question is how do I get sssd to enumerate *all users* in a group consistently?
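For reference, my understanding is that listing members with a plain "getent group" (no argument) only works when enumeration is switched on for the domain; a minimal sketch of the option involved (name per sssd.conf(5); the domain header is a placeholder, not my exact config):
[domain/ipa.example]
id_provider = ipa
# off by default; required for "getent group" with no argument
# to list anything from sssd at all
enumerate = true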
Thanks!
Bolke
* https://docs.fedoraproject.org/en-US/Fedora/18/html/FreeIPA_Guide/trust-g...
full_name_format and supplemental groups
by Orion Poplawski
I'm running IPA with an AD trust; users are in AD. I'm trying to use
full_name_format = %1$s to strip the domain from user names. This appears to
break supplemental groups in strange ways.
On the IPA server:
Without full_name_format:
# id orion@ad.nwra.com
uid=470202603(orion@ad.nwra.com) gid=470202603(orion@ad.nwra.com) groups=470202603(orion@ad.nwra.com),470200513(domain users@ad.nwra.com),470204703(pirep rd users@ad.nwra.com),470204714(wireless access@ad.nwra.com),470204715(nwra-users@ad.nwra.com),470204701(boulder@ad.nwra.com),470207608(heimdall users@ad.nwra.com),470200512(domain admins@ad.nwra.com),470207124(andreas admins@ad.nwra.com)
With full_name_format = %1$s:
# id orion@ad.nwra.com
uid=470202603(orion) gid=470202603(orion) groups=470202603(orion)
If I add:
default_domain_suffix = ad.nwra.com
# id orion
uid=470202603(orion) gid=470202603(orion) groups=470202603(orion),470200512(domain admins),470207608(heimdall users),470204714(wireless access),470204715(nwra-users),470204701(boulder),470204703(pirep rd users),470207124(andreas admins),470200513(domain users)
Which I guess makes some sense as you'd need to add the domain suffix back on
to find the groups.
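For clarity, the combination under test, as I understand where these options live (a sketch only; the domains/services lines are illustrative, the two options are as described above):
[sssd]
domains = nwra.com
services = nss, pam
full_name_format = %1$s
default_domain_suffix = ad.nwra.com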
But this appears to completely break IPA clients (with full_name_format = %1$s
and default_domain_suffix = ad.nwra.com):
# id orion@ad.nwra.com
id: orion@ad.nwra.com: no such user
# id orion
id: orion: no such user
From looking at the server logs, it looks like only the IPA domain is searched.
If I reset the server back to normal (drop full_name_format and
default_domain_suffix):
# id orion
uid=470202603(orion) gid=470202603(orion) groups=470202603(orion)
I don't get any supplemental groups. I see sssd errors like:
(Mon Mar 30 15:20:52 2015) [sssd[be[nwra.com]]] [sysdb_mod_group_member]
(0x0400): Error: 2 (No such file or directory)
(Mon Mar 30 15:20:52 2015) [sssd[be[nwra.com]]] [sysdb_update_members_ex]
(0x0020): Could not add member [orion] to group [name=domain
admins,cn=groups,cn=nwra.com,cn=sysdb]. Skipping.
Is it trying "cn=groups,cn=nwra.com,cn=sysdb" instead of
"cn=groups,cn=ad.nwra.com,cn=sysdb"?
--
Orion Poplawski
Technical Manager 303-415-9701 x222
NWRA, Boulder/CoRA Office FAX: 303-415-9702
3380 Mitchell Lane orion@nwra.com
Boulder, CO 80301 http://www.nwra.com
netlink messages on Infiniband causing sssd to exit
by Ryan Novosielski
Over time, I’ve been having seemingly random sssd quits that I’ve not been able to figure out. Today, I finally traced it to fluctuations on my Infiniband fabric:
sssd.log
(Tue Nov 3 13:17:59 2015) [sssd] [message_type] (0x0200): netlink Message type: 16
(Tue Nov 3 13:17:59 2015) [sssd] [link_msg_handler] (0x1000): netlink link message: iface idx 4 (ib0) flags 0x1003 (broadcast,multicast,up)
(Tue Nov 3 13:17:59 2015) [sssd] [message_type] (0x0200): netlink Message type: 16
(Tue Nov 3 13:17:59 2015) [sssd] [link_msg_handler] (0x1000): netlink link message: iface idx 4 (ib0) flags 0x11043 (broadcast,multicast,up,running,lower)
This exactly corresponds to the time in /var/log/messages for the unexplained shutdown:
2015-11-03T13:17:59-05:00 node75 sssd[pam]: Shutting down
2015-11-03T13:17:59-05:00 node75 sssd[be[default]]: Shutting down
2015-11-03T13:17:59-05:00 node75 sssd[nss]: Shutting down
Here is sssd_default.log for good measure:
(Tue Nov 3 13:17:59 2015) [sssd[be[default]]] [sbus_remove_watch] (0x2000): 0x1414770/0x14133d0
(Tue Nov 3 13:17:59 2015) [sssd[be[default]]] [sbus_remove_watch] (0x2000): 0x1414770/0x13fef90
(Tue Nov 3 13:17:59 2015) [sssd[be[default]]] [be_ptask_destructor] (0x0400): Terminating periodic task [Cleanup of default]
(Tue Nov 3 13:17:59 2015) [sssd[be[default]]] [sdap_handle_release] (0x2000): Trace: sh[0x14bd850], connected[1], ops[(nil)], ldap[0x1424260], destructor_lock[0], release_memory[0]
(Tue Nov 3 13:17:59 2015) [sssd[be[default]]] [remove_connection_callback] (0x4000): Successfully removed connection callback.
(Tue Nov 3 13:17:59 2015) [sssd[be[default]]] [sbus_remove_watch] (0x2000): 0x1415970/0x1416430
(Tue Nov 3 13:17:59 2015) [sssd[be[default]]] [remove_socket_symlink] (0x4000): The symlink points to [/var/lib/sss/pipes/private/sbus-dp_default.18702]
(Tue Nov 3 13:17:59 2015) [sssd[be[default]]] [remove_socket_symlink] (0x4000): The path including our pid is [/var/lib/sss/pipes/private/sbus-dp_default.18702]
(Tue Nov 3 13:17:59 2015) [sssd[be[default]]] [remove_socket_symlink] (0x4000): Removed the symlink
(Tue Nov 3 13:17:59 2015) [sssd[be[default]]] [be_client_destructor] (0x0400): Removed PAM client
(Tue Nov 3 13:17:59 2015) [sssd[be[default]]] [be_client_destructor] (0x0400): Removed NSS client
I can duplicate this by manually taking down the Infiniband link:
[root@node24 ~]# service sssd status
sssd (pid 9132) is running...
[root@node24 ~]# ifdown ib0
[root@node24 ~]# service sssd status
sssd dead but pid file exists
I have also noticed that sssd will not start on boot. As I know that Infiniband tends to flutter a little bit before the link comes up, I’m thinking this is probably the same cause.
Can anyone explain this behavior and tell me what I might do to prevent it?
--
____ *Note: UMDNJ is now Rutgers-Biomedical and Health Sciences*
|| \\UTGERS |---------------------*O*---------------------
||_// Biomedical | Ryan Novosielski - Senior Technologist
|| \\ and Health | novosirj(a)rutgers.edu - 973/972.0922 (2x0922)
|| \\ Sciences | OIRT/High Perf & Res Comp - MSB C630, Newark
`'
Caching of automount maps
by Jason L Tibbitts III
I'm trying to better understand how sssd caches automount maps. I've
had everything working for quite some time; autofs will get the maps
(which are stored in ldap) from sssd when the system boots and in
general there are no issues. However, there are a couple of problems
I've been having:
* Autofs is mostly nonfunctional when a system boots without the network.
* Even with a network, autofs sometimes simply fails to start with:
automount[917]: setautomntent: lookup(sss): setautomntent: No such
file or directory
---
With no network, autofs starts but the only thing which operates is the
/net map because that's in /etc/auto.master. The rest of the maps
simply aren't there, and if the network returns, autofs doesn't notice
(which I know is an issue with autofs).
I thought that sssd would be able to cache the master map entries which
come from ldap when the network is offline, but that doesn't seem to be
the case. I know that it couldn't actually mount anything, but if it had
access to a cached version of the map, it would at least start properly.
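For reference, a sketch of the pieces I understand to be involved here (option names from sssd.conf(5) and nsswitch.conf(5); the domain name and timeout are illustrative, not my exact config):
# /etc/nsswitch.conf
automount: sss files
# /etc/sssd/sssd.conf
[domain/example.com]
id_provider = ldap
autofs_provider = ldap
# how long cached automounter map entries are considered valid (seconds)
entry_cache_autofs_timeout = 600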
---
The startup failure is rather difficult to reproduce. My guess is that
autofs is simply coming up before sssd is ready to provide the master
map, but I haven't been able to confirm that. I also haven't been able
to see how much of autofs is actually functioning in that case (because
people keep rebooting their desktops before I can examine one in
detail).
---
By skipping ldap and storing the entire master map in /etc/auto.master,
everything seems to work better. At least, the first problem goes away
entirely, and I haven't yet been able to reproduce the second problem.
With a local auto.master and no network, autofs starts up OK but
automount -m shows no data at all for the maps. This also surprises me
because I figured they'd be cached. As soon as the network returns the
maps appear.
Is this the normal behavior? Am I expecting sssd to cache things it
isn't supposed to be caching?
- J<
stale config used when restarting sssd 1.13.0-40
by Chadwick Banning
I am joining a machine to a domain via Realmd and then filling out the SSSD config with a few more directives such as setting dyndns_update = false. Every once in a while, I'm finding that SSSD is using the old configuration even after restarting the service or starting it interactively.
Sanitized config:
[root@host]# cat /etc/sssd/sssd.conf
[domain/<domain.com>]
access_provider = simple
ad_domain = <domain.com>
ad_hostname = <host.domain.com>
cache_credentials = true
debug_level = 6
default_shell = /bin/bash
dyndns_update = false
fallback_homedir = /home/%u
id_provider = ad
krb5_realm = <DOMAIN.COM>
krb5_store_password_if_offline = true
ldap_id_mapping = true
realmd_tags = manages-system joined-with-adcli
simple_allow_groups = <group>
use_fully_qualified_names = false
[sssd]
config_file_version = 2
domains = <domain.com>
services = nss,pam
If I restart the service, all logs are blank under /var/log/sssd/*, so it is not picking up the debug level in the config, and I also have trouble logging in.
If I start the service interactively:
[root@host]# sssd -d 6 -i
...snip...
(Fri Mar 18 14:23:58 2016) [sssd[be[<domain.com>]]] [ad_failover_init] (0x0100): No primary servers defined, using service discovery
(Fri Mar 18 14:23:58 2016) [sssd[be[<domain.com>]]] [fo_add_srv_server] (0x0400): Adding new SRV server to service 'AD_GC' using 'tcp'.
(Fri Mar 18 14:23:58 2016) [sssd[be[<domain.com>]]] [fo_add_srv_server] (0x0400): Adding new SRV server to service 'AD' using 'tcp'.
(Fri Mar 18 14:23:58 2016) [sssd[be[<domain.com>]]] [_ad_servers_init] (0x0100): Added service discovery for AD
(Fri Mar 18 14:23:58 2016) [sssd[be[<domain.com>]]] [dp_get_options] (0x0400): Option dyndns_update is TRUE
(Fri Mar 18 14:23:58 2016) [sssd[be[<domain.com>]]] [dp_get_options] (0x0400): Option dyndns_refresh_interval has value 86400
(Fri Mar 18 14:23:58 2016) [sssd[be[<domain.com>]]] [dp_get_options] (0x0400): Option dyndns_iface has no value
(Fri Mar 18 14:23:58 2016) [sssd[be[<domain.com>]]] [dp_get_options] (0x0400): Option dyndns_ttl has value 3600
(Fri Mar 18 14:23:58 2016) [sssd[be[<domain.com>]]] [dp_get_options] (0x0400): Option dyndns_update_ptr is TRUE
(Fri Mar 18 14:23:58 2016) [sssd[be[<domain.com>]]] [dp_get_options] (0x0400): Option dyndns_force_tcp is FALSE
(Fri Mar 18 14:23:58 2016) [sssd[be[<domain.com>]]] [dp_get_options] (0x0400): Option dyndns_auth has value gss-tsig
(Fri Mar 18 14:23:58 2016) [sssd[be[<domain.com>]]] [dp_get_options] (0x0400): Option dyndns_server has no value
...snip...
It clearly sees dyndns_update as TRUE even though it's set to false in the config. It remains stuck in this state until I remove /var/lib/sss/db/config.ldb and restart the service, after which everything is fine.
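Spelled out, the workaround that gets it unstuck (systemctl assumed here; substitute the service command where appropriate):
systemctl stop sssd
rm -f /var/lib/sss/db/config.ldb
systemctl start sssd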
Is there any way for me to dig into why the config.ldb file would not be refreshed after config changes and service restart?
accessing ldap through sssd
by Cyril Scetbon
Hi Guys,
I've run some tests and I have a few questions regarding sssd.
We were using pam_ldap, and at first I thought that sssd could work together with pam_ldap, but I didn't find a way to make it work.
If I enable debug mode in the pam section, I don't see anything. Since sssd itself queries ldap for the password and does the caching, that may be the reason why they can't work together.
I've been able to make it work by putting my ldap configuration in the domain section, and I've verified that if the ldap server becomes unavailable, sssd uses the password it has cached:
[sssd[be[default]]] [sdap_pam_auth_done] (0x0100): Password successfully cached for mouser
However, when the ldap server is available, I see that every time I try to log in, it makes an ldap request instead of reusing the value it has cached:
[sssd[be[default]]] [sdap_get_generic_ext_step] (0x0400): calling ldap_search_ext with [(&(uid=myuser)(objectclass=posixAccount))][dc=fti,dc=net]
As entry_cache_timeout is set to 600 by default, I would expect sssd to only query the ldap server every 600 seconds and use the cached value otherwise. What am I missing?
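For reference, a sketch of where I understand this timeout would be tuned (option name from sssd.conf(5); apart from the timeout value I mentioned, the lines are illustrative rather than my exact config):
[domain/default]
id_provider = ldap
auth_provider = ldap
cache_credentials = true
# lifetime of cached user/group entries, in seconds
entry_cache_timeout = 600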
I see that sssd tries to fetch many attributes for my user and that some of them are missing. Could that be the reason it doesn't reuse the cache unless the ldap server is offline?
Thank you
--
Cyril
sssd responders fail regularly on busy servers
by Patrick Coleman
Hi,
We run sssd to bind a number of machines to LDAP for auth. On a subset
of these machines, we have software that makes several thousand IPv6
route changes per second.
Recently, we found that on these hosts the sssd_nss responder process
fails several times a day[1], and will not recover until sssd is
restarted. strace[2] of the main sssd process indicates that sssd is
receiving many, many netlink messages - so many, in fact, that sssd
cannot process them fast enough and is receiving ENOBUFS from
recvmsg(2).
The messages that are received seem to get forwarded[3] to the sssd
responders over the unix socket and flood them until they fail.
From what I can see, the netlink code in
src/monitor/monitor_netlink.c:setup_netlink() subscribes to netlink
notifications with the aim of detecting things like wifi network
changes. This isn't something we'd find useful on our servers and
seems to have performance implications - is there any easy way of
turning off this functionality in sssd that I've missed?
We see this issue running sssd 1.11.7.
Cheers,
Patrick
1. The failures look something like this. I have replaced our sss
domain with "ourdomain"
/var/log/sssd/sssd_nss.log
(Tue Mar 22 02:58:01 2016) [sssd[nss]] [accept_fd_handler] (0x0100):
Client connected!
(Tue Mar 22 02:58:01 2016) [sssd[nss]] [nss_cmd_initgroups] (0x0100):
Requesting info for [systemuser] from [<ALL>]
(Tue Mar 22 02:58:01 2016) [sssd[nss]] [nss_cmd_initgroups_search]
(0x0100): Requesting info for [systemuser@ourdomain]
(Tue Mar 22 02:59:04 2016) [sssd[nss]]
[nss_cmd_initgroups_dp_callback] (0x0040): Unable to get information
from Data Provider
Error: 3, 5, (null)
Will try to return what we have in cache
(Tue Mar 22 02:59:14 2016) [sssd[nss]] [server_setup] (0x0080):
CONFDB: /var/lib/sss/db/config.ldb
(Tue Mar 22 02:59:14 2016) [sssd[nss]] [monitor_common_send_id]
(0x0100): Sending ID: (nss,1)
(Tue Mar 22 02:59:14 2016) [sssd[nss]] [check_file] (0x0020): lstat
for [/var/lib/sss/pipes/private/sbus-dp_ourdomain] failed: [2][No such
file or directory].
(Tue Mar 22 02:59:14 2016) [sssd[nss]] [sbus_client_init] (0x0020):
check_file failed for [/var/lib/sss/pipes/private/sbus-dp_ourdomain].
(Tue Mar 22 02:59:14 2016) [sssd[nss]] [sss_dp_init] (0x0010): Failed
to connect to monitor services.
(Tue Mar 22 02:59:14 2016) [sssd[nss]] [sss_process_init] (0x0010):
fatal error setting up backend connector
(Tue Mar 22 02:59:14 2016) [sssd[nss]] [server_setup] (0x0080):
CONFDB: /var/lib/sss/db/config.ldb
(Tue Mar 22 02:59:14 2016) [sssd[nss]] [monitor_common_send_id]
(0x0100): Sending ID: (nss,1)
(Tue Mar 22 02:59:14 2016) [sssd[nss]] [check_file] (0x0020): lstat
for [/var/lib/sss/pipes/private/sbus-dp_ourdomain] failed: [2][No such
file or directory].
(Tue Mar 22 02:59:14 2016) [sssd[nss]] [sbus_client_init] (0x0020):
check_file failed for [/var/lib/sss/pipes/private/sbus-dp_ourdomain].
(Tue Mar 22 02:59:14 2016) [sssd[nss]] [sss_dp_init] (0x0010): Failed
to connect to monitor services.
(Tue Mar 22 02:59:14 2016) [sssd[nss]] [sss_process_init] (0x0010):
fatal error setting up backend connector
(Tue Mar 22 02:59:14 2016) [sssd[nss]] [server_setup] (0x0080):
CONFDB: /var/lib/sss/db/config.ldb
(Tue Mar 22 02:59:14 2016) [sssd[nss]] [monitor_common_send_id]
(0x0100): Sending ID: (nss,1)
(Tue Mar 22 02:59:14 2016) [sssd[nss]] [check_file] (0x0020): lstat
for [/var/lib/sss/pipes/private/sbus-dp_ourdomain] failed: [2][No such
file or directory].
(Tue Mar 22 02:59:14 2016) [sssd[nss]] [sbus_client_init] (0x0020):
check_file failed for [/var/lib/sss/pipes/private/sbus-dp_ourdomain].
(Tue Mar 22 02:59:14 2016) [sssd[nss]] [sss_dp_init] (0x0010): Failed
to connect to monitor services.
(Tue Mar 22 02:59:14 2016) [sssd[nss]] [sss_process_init] (0x0010):
fatal error setting up backend connector
(no further messages until service restarts)
2. strace sssd -i
sendmsg(12, {msg_name(0)=NULL,
msg_iov(2)=[{"l\1\0\1\0\0\0\0M\10\0\0e\0\0\0\1\1o\0\35\0\0\0/org/fre"...,
120}, {"", 0}], msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 120
gettimeofday({1458663090, 595705}, NULL) = 0
epoll_wait(5, {{EPOLLIN, {u32=150507384, u64=150507384}}}, 1, 1563) = 1
recvmsg(12, {msg_name(0)=NULL,
msg_iov(1)=[{"l\2\1\1\0\0\0\0M\10\0\0\10\0\0\0\5\1u\0M\10\0\0\0\2\1\1\0\0\0\0"...,
2048}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC},
MSG_CMSG_CLOEXEC) = 24
gettimeofday({1458663090, 595978}, NULL) = 0
recvmsg(12, 0xffcdcac0, MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource
temporarily unavailable)
gettimeofday({1458663090, 596110}, NULL) = 0
gettimeofday({1458663090, 596151}, NULL) = 0
epoll_wait(5, {{EPOLLIN|EPOLLERR, {u32=150512424, u64=150512424}}}, 1, 1563) = 1
recvmsg(11, 0xffcdcb3c, 0) = -1 ENOBUFS (No buffer space available)
gettimeofday({1458663090, 596330}, NULL) = 0
stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2819, ...}) = 0
stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2819, ...}) = 0
write(2, "(Tue Mar 22 09:11:30 2016) [sssd"..., 65(Tue Mar 22 09:11:30
2016) [sssd] [netlink_fd_handler] (0x0020): ) = 65
write(2, "Error while reading from netlink"..., 36Error while reading
from netlink fd
) = 36
gettimeofday({1458663090, 596538}, NULL) = 0
epoll_wait(5, {{EPOLLIN, {u32=150512424, u64=150512424}}}, 1, 1563) = 1
recvmsg(11, {msg_name(12)={sa_family=AF_NETLINK, pid=0,
groups=00000400},
msg_iov(1)=[{"\224\0\0\0\31\0\0\0\0\0\0\0\0\0\0\0\n\200\0\0\1\0\0\1\0\2\0\0\10\0\17\0"...,
4096}], msg_controllen=24, {cmsg_len=24, cmsg_level=SOL_SOCKET,
cmsg_type=SCM_CREDENTIALS{pid=0, uid=0, gid=0}}, msg_flags=0}, 0) =
148
gettimeofday({1458663090, 596679}, NULL) = 0
sendmsg(12, {msg_name(0)=NULL,
msg_iov(2)=[{"l\1\0\1\0\0\0\0N\10\0\0e\0\0\0\1\1o\0\35\0\0\0/org/fre"...,
120}, {"", 0}], msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 120
gettimeofday({1458663090, 596777}, NULL) = 0
3. /var/log/sssd/sssd.log
(Wed Mar 23 06:59:07 2016) [sssd] [message_type] (0x0200): netlink
Message type: 24
(Wed Mar 23 06:59:07 2016) [sssd] [route_msg_debug_print] (0x1000):
route idx 209591 flags 0X200 family 10 addr
fd0a:9b09:1f7:0:218:aff:fe34:3d08/128
(Wed Mar 23 06:59:07 2016) [sssd] [network_status_change_cb] (0x2000):
A networking status change detected signaling providers to reset
offline status
(Wed Mar 23 06:59:07 2016) [sssd] [service_signal] (0x0020): Could not
signal service [ourdomain].
(Wed Mar 23 06:59:07 2016) [sssd] [message_type] (0x0200): netlink
Message type: 24
(Wed Mar 23 06:59:07 2016) [sssd] [route_msg_debug_print] (0x1000):
route idx 209591 flags 0X200 family 10 addr
fd0a:9b09:1f7:0:218:aff:fe33:3b66/128
(Wed Mar 23 06:59:07 2016) [sssd] [network_status_change_cb] (0x2000):
A networking status change detected signaling providers to reset
offline status
(Wed Mar 23 06:59:07 2016) [sssd] [service_signal] (0x0020): Could not
signal service [ourdomain].
(Wed Mar 23 06:59:07 2016) [sssd] [message_type] (0x0200): netlink
Message type: 24
$ grep network_status_change_cb sssd.log | grep '06:59:12' | wc -l
1245
Re: sssd-ad login failures after certain period (Striker Leggette)
by Christoph Kaunzner
Hi Striker,
Yes, I have used this method (restart and clear the db) so far to resolve the problem.
Thanks for the info about id_provider, will adjust the config.
I will report again once I have a debug log from when this issue happens.
Thanks,
Christoph
On 03/21/2016 12:49 PM, Striker Legette wrote:
> Hi,
>
> If it works right after joining, then I wouldn't expect anything in your
> config. would be wrong. 'debug_level = 7' in your [domain] section will
> tell more. Does the following command clear the issue for some time?
>
> # service sssd stop ; rm -rf /var/lib/sss/db/* ; service sssd start
>
> Also, you have some duplicates in your config. Since 'auth_provider'
> and 'chpass_provider' are the same as 'id_provider', you do not actually
> need to specify them.
>
> Striker