I hope someone can help. I have an odd issue I haven't seen before. I've
done a lot of checking under the hood, but I'm stuck.
We have hundreds of systems running v2.9+ of the daemon (AD and LDAP
providers). We're deploying a new HPC cluster using Rocky Linux 9
containers (all other systems are RHEL 8/9) as stateless compute nodes.
These nodes are ephemeral, so we use the LDAP providers.
The daemons load and run as expected, and the nodes mount NFS file
systems, but file and directory ownership for LDAP users is not resolved
until I manually run "getent" or "id" on any user or group. It doesn't
even have to be a user or group that owns files, so any type of NSS
lookup seems to kick-start the process. From then on the node is fine.
libnfs, libnss, sssd-nfs-idmap, libsss_nss_idmap, etc. are all the same
as on nodes that don't do this.
DNS works, and the daemon configuration is no different from that on
working nodes. The systemd unit files are identical, etc. I cannot figure
out why these nodes need to be poked by NSS to start using NSS. Very
peculiar.
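In case it helps anyone reproduce this, below is a rough Python sketch of
a check for the symptom that only reads numeric ids and never calls
getpwuid/getgrgid itself. It assumes unresolved owners show up as uid/gid
65534 (the usual "nobody" fallback on RHEL-family systems), which may not
match every idmapd setup. The function name and the path in the usage note
are hypothetical.

```python
import os

# Assumption: 65534 is the id that unmapped NFSv4 owners fall back to
# ("nobody" on RHEL-family systems). Adjust if your idmapd setup differs.
UNMAPPED_ID = 65534


def unmapped_entries(directory, unmapped_id=UNMAPPED_ID):
    """Return sorted names in `directory` whose uid or gid is the fallback id.

    Only numeric ids from lstat() are compared; no name lookup is made here.
    """
    names = []
    for entry in os.scandir(directory):
        st = entry.stat(follow_symlinks=False)
        if st.st_uid == unmapped_id or st.st_gid == unmapped_id:
            names.append(entry.name)
    return sorted(names)
```

Calling unmapped_entries("/path/to/nfs/mount") before and after a manual
"id" would show whether the list empties out once NSS has been poked.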
Any insight would be appreciated,
-- lawrence