sssd-users,
I'm having an odd problem with automount that seems to be specific to using sssd for autofs. The operating system is Springdale Open Enterprise Linux 9.1, which is a rebuild of RHEL maintained by Princeton University, so it should be 100% bug compatible with RHEL (and Rocky).
My configuration is using Kerberos for auth, and LDAP directory services. When my nsswitch.conf entry for automount looks like this:
automount: sss files
testing the configuration of automount/sssd with 'automount -m' leads to a segfault:
# automount -m
autofs dump map information
===========================
global options: none configured
Mount point: /p
source(s): Segmentation fault (core dumped)
If I change nsswitch.conf to use files only or use ldap like this:
automount: files
or this:
automount: files ldap
everything works as expected. LDAP searches using ldapsearch work just fine, and using getent to get user and group information (which is stored in LDAP) works just fine. I've increased the debugging levels for the relevant SSSD daemons:
# egrep '^\[|debug_level' /etc/sssd/sssd.conf
[domain/PPPL]
debug_level = 8
[sssd]
debug_level = 8
[nss]
debug_level = 8
[pam]
[autofs]
debug_level = 8
Looking at the related log files, the logs for my default domain show that it is getting information from the LDAP directory, and then it fails, saying it can't contact the LDAP server:
(2023-01-24 16:01:31): [be[default]] [sysdb_set_entry_attr] (0x0200): [RID#6] Entry [name=/lldap:ou\3Dauto.local,ou\3Dmounts,dc\3Dunix,dc\3Dpppl,dc\3Dgov,name=auto.master,cn=autofsmaps,cn=custom,cn=default,cn=sysdb] has set [cache] attrs.
(2023-01-24 16:01:31): [be[default]] [sysdb_entry_attrs_diff] (0x0400): [RID#6] Entry [name=/pfsldap:ou\3Dauto.pfs,ou\3Dmounts,dc\3Dunix,dc\3Dpppl,dc\3Dgov,name=auto.master,cn=autofsmaps,cn=custom,cn=default,cn=sysdb] differs, reason: ts_cache doesn't trace this type of entry.
(2023-01-24 16:01:31): [be[default]] [sysdb_set_entry_attr] (0x0200): [RID#6] Entry [name=/pfsldap:ou\3Dauto.pfs,ou\3Dmounts,dc\3Dunix,dc\3Dpppl,dc\3Dgov,name=auto.master,cn=autofsmaps,cn=custom,cn=default,cn=sysdb] has set [cache] attrs.
(2023-01-24 16:01:31): [be[default]] [fo_resolve_service_send] (0x0100): [RID#6] Trying to resolve service 'LDAP'
(2023-01-24 16:01:31): [be[default]] [get_server_status] (0x1000): [RID#6] Status of server 'host-a.pppl.gov' is 'working'
(2023-01-24 16:01:31): [be[default]] [get_port_status] (0x1000): [RID#6] Port status of port 389 for server 'host-a.pppl.gov' is 'not working'
Not only do the earlier log entries show that sssd_be is actually getting data from LDAP before it reports an error, but I can also run queries from this machine against our LDAP server with 'ldapsearch'. All the other computers in our environment, which run CentOS 7 or Rocky 8 with the same configuration files, work as expected.
Hi,
might be https://github.com/SSSD/sssd/issues/6505 / https://bugzilla.redhat.com/show_bug.cgi?id=2143159 (difficult to confirm without coredump/backtraces).
Should be fixed in C9S/9.2 ( https://composes.stream.centos.org/development/latest-CentOS-Stream/compose/... ...)
_______________________________________________
sssd-users mailing list -- sssd-users@lists.fedorahosted.org
To unsubscribe send an email to sssd-users-leave@lists.fedorahosted.org
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedorahosted.org/archives/list/sssd-users@lists.fedorahosted.o...
Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue
(difficult to confirm without coredump/backtraces).
Would a stack trace of automount or the sssd daemon be sufficient? If so, which sssd daemon should I trace: sssd_be, sssd_autofs, sssd?
Here's what I see when I do an strace of 'automount -m':
newfstatat(AT_FDCWD, "/etc/nsswitch.conf", {st_mode=S_IFREG|0644, st_size=2980, ...}, 0) = 0
newfstatat(AT_FDCWD, "/etc/nsswitch.conf", {st_mode=S_IFREG|0644, st_size=2980, ...}, 0) = 0
openat(AT_FDCWD, "/etc/group", O_RDONLY|O_CLOEXEC) = 7
newfstatat(7, "", {st_mode=S_IFREG|0644, st_size=653, ...}, AT_EMPTY_PATH) = 0
lseek(7, 0, SEEK_SET) = 0
read(7, "root:x:0:\nbin:x:1:\ndaemon:x:2:\ns"..., 4096) = 653
read(7, "", 4096) = 0
close(7) = 0
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x8} ---
+++ killed by SIGSEGV (core dumped) +++
Segmentation fault (core dumped)
Here's what I see when I attach strace to sssd_autofs:
recvfrom(18, "", 1536, 0, NULL, NULL) = 0
write(0, "(2023-01-25 10:52:46): [autofs] "..., 85) = 85
getpid() = 1348
epoll_ctl(3, EPOLL_CTL_DEL, 18, 0x7fff6d209c8c) = 0
close(18) = 0
write(0, "(2023-01-25 10:52:46): [autofs] "..., 107) = 107
getpid() = 1348
epoll_wait(3, 0x7fff6d209e0c, 1, 2171) = -1 EINTR (Interrupted system call)
--- SIGRT_2 {si_signo=SIGRT_2, si_code=SI_TIMER, si_timerid=0, si_overrun=0, si_int=1165744160, si_ptr=0x7efc457bd820} ---
rt_sigreturn({mask=[INT FPE USR1 USR2 PIPE]}) = -1 EINTR (Interrupted system call)
getpid() = 1348
epoll_wait(3, [], 1, 420)
And from an strace of sssd_nss:
write(0, "(2023-01-25 10:54:12): [nss] [cl"..., 83) = 83
getpid() = 1345
epoll_ctl(3, EPOLL_CTL_DEL, 22, 0x7ffdca2b446c) = 0
close(22) = 0
write(0, "(2023-01-25 10:54:12): [nss] [cl"..., 105) = 105
getpid() = 1345
epoll_wait(3, 0x7ffdca2b45ec, 1, 5949) = -1 EINTR (Interrupted system call)
--- SIGRT_2 {si_signo=SIGRT_2, si_code=SI_TIMER, si_timerid=0, si_overrun=0, si_int=959572000, si_ptr=0x7fb93931e820} ---
rt_sigreturn({mask=[INT FPE USR1 USR2 PIPE]}) = -1 EINTR (Interrupted system call)
getpid() = 1345
epoll_wait(3, [], 1, 414) = 0
getpid() = 1345
epoll_wait(3,
And from sssd_be:
write(0, "(2023-01-25 11:11:23): [be[defau"..., 100) = 100
getpid() = 1251
epoll_wait(3, [{events=EPOLLIN, data={u32=2661562720, u64=94882784098656}}], 1, 5775) = 1
recvmsg(17, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\2\1\1\0\0\0\0A\0\0\0 \0\0\0\6\1s\0\v\0\0\0sssd.aut"..., iov_len=2048}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 48
recvmsg(17, {msg_namelen=0}, MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
sendmsg(21, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\2\1\1\0\0\0\0A\0\0\0<\0\0\0\6\1s\0\v\0\0\0sssd.aut"..., iov_len=80}, {iov_base="", iov_len=0}], msg_iovlen=2, msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 80
getpid() = 1251
epoll_wait(3, 0x7ffd39a53ccc, 1, 5775) = -1 EINTR (Interrupted system call)
--- SIGRT_2 {si_signo=SIGRT_2, si_code=SI_TIMER, si_timerid=0, si_overrun=0, si_int=-186955744, si_ptr=0x7fa6f4db4820} ---
rt_sigreturn({mask=[INT FPE PIPE]}) = -1 EINTR (Interrupted system call)
getpid() = 1251
epoll_wait(3, [], 1, 872) = 0
getpid() = 1251
I tried to compile the trivial reproducer from the GitHub issue you linked to, but the build fails with this message:
# gcc sssd_tester.c -o sssd_tester -lsss_nss_idmap -lpthread -ldl
sssd_tester.c: In function ‘thread’:
sssd_tester.c:12:5: warning: implicit declaration of function ‘sss_getpwnam’; did you mean ‘getpwnam’? [-Wimplicit-function-declaration]
   12 |     sss_getpwnam("test", &res, buff, sizeof(buff), &errnop);
      |     ^~~~~~~~~~~~
      |     getpwnam
It looks like sss_getpwnam isn't defined in any of the header files in /usr/include, or I just can't find what package provides it.
Prentice
On Wed, Jan 25, 2023 at 5:34 PM Prentice Bisbal pbisbal@pppl.gov wrote:
(difficult to confirm without coredump/backtraces).
Would a stack trace of automount or the sssd daemon be sufficient?
No, I meant a backtrace from a coredump. `ltrace` might also help, but I doubt `strace` is useful here.
It looks like sss_getpwnam isn't defined in any of the header files in /usr/include, or I just can't find what package provides it.
Link against `/lib64/libnss_sss.so.2`:
```
$ nm -D /lib64/libnss_sss.so.2 | grep sss_getpwnam
0000000000005f10 T _nss_sss_getpwnam_r@@EXPORTED
```
(no need to link against `sss_nss_idmap`)
But why? It won't help confirm that you are facing the same issue.
On Wed, Jan 25, 2023 at 10:40 PM Alexey Tikhonov atikhono@redhat.com wrote:
Link against `/lib64/libnss_sss.so.2`:
Sorry, no need, of course: sss_getpwnam = dlsym(h, "_nss_sss_getpwnam_r");
I accidentally cut off the first few lines of the reproducer when pasting it to GitHub: the includes and the declaration of the `sss_getpwnam` variable (a pointer to function).
Something like:
```
typedef int (*fp)(const char *name, struct passwd *result, char *buffer, size_t buflen, int *errnop);

fp sss_getpwnam;
```
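For anyone who wants to get the reproducer building, here is a minimal sketch of what the missing preamble might look like. It is not the code from the GitHub issue: the helper name `load_sss_getpwnam` is hypothetical, and the include list is a guess based on the calls used. Only the typedef and the `_nss_sss_getpwnam_r` symbol name come from this thread.

```c
#include <dlfcn.h>
#include <pwd.h>
#include <stddef.h>
#include <stdio.h>

/* Signature of the NSS entry point, as given earlier in this thread. */
typedef int (*fp)(const char *name, struct passwd *result,
                  char *buffer, size_t buflen, int *errnop);

/* Filled in at run time by load_sss_getpwnam(); NULL until then. */
static fp sss_getpwnam;

/*
 * Hypothetical helper: open the SSSD NSS client library at `libpath`
 * and resolve _nss_sss_getpwnam_r from it.
 * Returns 0 on success and -1 on failure. On success the library handle
 * is deliberately left open so the function pointer stays valid.
 */
static int load_sss_getpwnam(const char *libpath)
{
    void *h = dlopen(libpath, RTLD_NOW);
    if (h == NULL) {
        fprintf(stderr, "dlopen(%s): %s\n", libpath, dlerror());
        return -1;
    }

    sss_getpwnam = (fp)dlsym(h, "_nss_sss_getpwnam_r");
    if (sss_getpwnam == NULL) {
        fprintf(stderr, "dlsym: %s\n", dlerror());
        dlclose(h);
        return -1;
    }
    return 0;
}
```

With this in place, calling `load_sss_getpwnam("/lib64/libnss_sss.so.2")` before the threads start should let the existing `sss_getpwnam("test", &res, buff, sizeof(buff), &errnop)` call compile, and the build would need only `-lpthread -ldl`, not `-lsss_nss_idmap`.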
Alexey,
I forwarded your previous e-mail to my colleague who maintains the Springdale Linux Distro, and he applied the patch from the SSSD bug report you linked to and built updated RPMs for me to use. I updated my SSSD packages, and everything is working just fine now.
Thanks for your help.
Prentice
On Fri, Jan 27, 2023 at 7:16 PM Prentice Bisbal pbisbal@pppl.gov wrote:
Thank you for the confirmation.