Kevin Vasko wrote:
Thanks. That is actually one of the other instances where I had trouble,
with a very similar type of experience. I felt I was constantly
resetting the gssproxy and rpc-gssd services, but it would never
automatically pick up the keytab. I left for the day, and the very next
day ran kinit to pick up where I left off, and it had picked up the
keytab and was showing that it expires in like the year 2999.
Granted half the time I don't even know how to begin to "reproduce" the
problem because it's so sporadic. I just constantly tweak, try to reset
and start over, and eventually it ends up working. I never find the
"root cause", it just magically starts working and it works until I have
to update something and then rinse and repeat. Prime example: I dread
touching our EMC Unity system with updates because without fail it
always breaks the kerberos authentication, but we had to update it for
patches this past week. We haven't touched this thing in 6+ months and
it was working properly before the update. Literally no configuration
changes were made to the LDAP/Kerberos settings. We simply uploaded the
updated OS and
applied its patch. Sure enough after the reboot all of the Kerberos
stuff is failing saying it can't reach the LDAP servers. Looking at the
logs shows
```
1681140357: LDAP: 6: LdapService::connect: Connection to Ldap server X.X.X.X SUCCEEDED IP[0/1]=X.X.X.X port=389
1681140357: LDAP: 3: LdapGssAuthenticator::evaluate: gss_unwrap failed - GSS-API major error: A token had an invalid signature
1681140357: LDAP: 3: LdapGssAuthenticator::evaluate: gss_unwrap failed - GSS-API minor error: No error
```
I changed it from Kerberos authentication to "Simple", using a DN, e.g.
(uid=testuser,cn=users,cn=accounts,dc=example,dc=com). On Friday it kept
failing on my client with "mount.nfs4: access denied by server" (it
wouldn't even mount). On a whim I tried it on a different client without
changing a thing, and that client mounted it successfully. I left on
Friday, came back on Monday, tried the mount, and sure enough my client
worked (I was kdestroying on my client, albeit I wasn't restarting
gssproxy or rpc-gssd, but I had restarted the machine that day after I
changed it to Simple). I could literally see that the last command I ran
on Friday was the "mount" command and it failed; I hit the up arrow at
the command prompt, ran it again, and it succeeded.
Most of the clients are Ubuntu, so I'm not sure if this has anything to
do with their packages versus RH based distros and how they interoperate.
If kdestroy, gssproxy, rpc-gssd, sssd indeed are the _only_ things
caching anything and a restart of those services should refresh those
caches, the next time I set one of these systems up, I'm going to
document what I can. I always dread making any changes to kerberos
because of how temperamental it is. Are there any other services I
should be looking at restarting/resetting when dealing with this topic?
I don't believe gssproxy removes all ccaches on restart. You might try
actively removing the ccaches from /var/lib/gssproxy/clients and/or
confirm that they aren't being removed on restart.
rob
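One rough way to check the second half of that suggestion (whether the
ccaches really survive a gssproxy restart) is to snapshot the directory
before and after. This is only a sketch: the directory is the one named
above, the function names are made up for illustration, and reading or
deleting these files will normally need root.

```python
from pathlib import Path

# Directory named in the thread; adjust it if your gssproxy
# configuration stores client ccaches somewhere else.
GSSPROXY_CLIENTS_DIR = Path("/var/lib/gssproxy/clients")


def snapshot_ccaches(directory=GSSPROXY_CLIENTS_DIR):
    """Return {file name: modification time} for each regular file."""
    directory = Path(directory)
    if not directory.is_dir():
        return {}
    return {p.name: p.stat().st_mtime for p in directory.iterdir() if p.is_file()}


def survivors(before, after):
    """Names present in both snapshots with an unchanged mtime."""
    return sorted(name for name, mtime in before.items() if after.get(name) == mtime)


def remove_ccaches(directory=GSSPROXY_CLIENTS_DIR):
    """Delete every regular file in the directory; return the removed names."""
    removed = []
    for path in Path(directory).iterdir():
        if path.is_file():
            path.unlink()
            removed.append(path.name)
    return sorted(removed)
```

Take one snapshot, restart gssproxy, take another, and pass both to
survivors(); anything it returns was neither removed nor rewritten by
the restart. remove_ccaches() is the "actively removing" step and is
destructive, so it is worth looking at the snapshot first.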
On Mon, Apr 10, 2023 at 11:08 AM Rob Crittenden <rcritten(a)redhat.com>
wrote:
Kevin Vasko via FreeIPA-users wrote:
> Hello,
>
> Does anyone have any tips for completely refreshing (forcing cleaning)
> all kerberos tickets on a client from FreeIPA?
>
> I assumed "$ kdestroy -A" should do it, but it certainly doesn't
> completely clear all caches.
>
> What I'm having trouble with is some NFS/NAS servers using kerberos.
> I'll set up a new NFS server with Kerberos, the server will have their
> appropriate keytab and services created.
>
> I'll make sure and clear my local cache on my client with "$ kdestroy
> -A", and then connect to the NFS server. If for some reason I have
> something misconfigured (e.g. time is off) I'll obviously get a "stale
> file handle" or "mount.nfs4: access denied by server". At that point
> I'll correct the issue on the server/client. However, I'll continue
> getting the error even though I destroy the cache. I _know_ it's a
> cache issue _somewhere_ because it will randomly start working (e.g. it
> will be failing, leave for the day and next morning it will mount no
> problem) OR I'll try it on a different client and it will mount
> successfully. It seems so sporadic. I've even been in the situation
> where I've purposefully removed keytabs, LDAP login access and reset
> the cache on the client on these systems, and the NFS mount has still
> worked. It will continue to work when it shouldn't, as I've removed the
> keytab or authentications, so obviously something is cached.
>
> Is there a foolproof list of things I need to do to reset the
> cache(es)? kdestroy, services on client and server? Is there a
> potential forced 15 min TTL or something somewhere I'm missing?
It is probably gssproxy holding the credentials. See
https://pagure.io/gssproxy/blob/master/f/docs/NFS.md
rob
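For anyone wanting to script the client-side reset steps that come up in
this thread, here is a minimal dry-run sketch. It is not a confirmed
foolproof list: the service names are just the ones mentioned in the
thread (unit names may differ between Ubuntu and RH-based distros), and
reset_commands and run_reset are hypothetical helper names.

```python
import subprocess

# Services mentioned in this thread; systemd unit names may differ per distro.
SERVICES = ("gssproxy", "rpc-gssd", "sssd")


def reset_commands(services=SERVICES):
    """Build the commands to run, in order, without executing anything."""
    commands = [["kdestroy", "-A"]]  # clear the calling user's credential caches
    commands += [["systemctl", "restart", name] for name in services]
    return commands


def run_reset(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
```

Calling run_reset(reset_commands()) only prints the plan. Note that
kdestroy acts on the caches of whoever runs it, while the systemctl
restarts need root, so in practice these may have to be run separately.
Removing the ccaches under /var/lib/gssproxy/clients, as suggested
elsewhere in the thread, would be an extra manual step.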