Ok, this is *very* illuminating!
I see this in sssd_amer.company.com.log:
(2021-09-01 3:44:46): [be[amer.company.com]] [ad_machine_account_password_renewal_done] (0x1000): --- adcli output start---
adcli: couldn't connect to amer.company.com domain: Couldn't authenticate as machine account: ZZZKBTDURBOL8: Preauthentication failed
---adcli output end---
However, I don't find that host name ZZZKBTDURBOL8 anywhere on the system.
(By company convention, servers named ZZZ* are test servers that Linux SEs
spin up themselves.)
The server that's not renewing its creds is named
nwpllv8bu100.amer.company.com. It's a standard dev server. In its
/etc/sssd/sssd.conf file, it has that name as its SASL auth ID:
[root@nwpllv8bu100 sssd]# grep sasl /etc/sssd/sssd.conf
ldap_sasl_authid = host/nwpllv8bu100.amer.dell.com@AMER.COMPANY.COM
[root@nwpllv8bu100 sssd]#
If I run 'kinit -k', the principal it obtains from /etc/krb5.keytab has
that name as well:
[root@nwpllv8bu100 sssd]# kinit -k
[root@nwpllv8bu100 sssd]# klist
Ticket cache: KCM:0
Default principal: host/nwpllv8bu100.amer.dell.com@AMER.COMPANY.COM

Valid starting       Expires              Service principal
09/01/2021 11:04:16  09/01/2021 21:04:16  krbtgt/AMER.DELL.COM@AMER.COMPANY.COM
        renew until 09/08/2021 11:04:16
[root@nwpllv8bu100 sssd]#
I searched /etc/sssd/sssd.conf -- no "zzz" or "ZZZ" string is anywhere in
there. So where is sssd picking up this name ZZZKBTDURBOL8 and passing it
to adcli update?
Spike
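The klist output above shows only the principal that 'kinit -k' selected,
while a keytab can hold several principals. As a minimal sketch (not
something proposed in this thread), the Python below lists every distinct
principal in the text printed by 'klist -k' and flags the ones that do not
contain the expected short host name. The function names and the parsing
regex are illustrative assumptions, and the regex assumes plain 'klist -k'
output without the -t or -e columns. Whether any such entry explains the
ZZZKBTDURBOL8 name is not established here.

```python
import re
import subprocess

# Matches entry lines of plain `klist -k` output: "<kvno> <principal>".
# Assumption: no -t/-e options, so the principal is the second token.
ENTRY = re.compile(r"^\s*(\d+)\s+(\S+@\S+)\s*$")


def keytab_principals(klist_output):
    """Return the distinct principals in `klist -k` text, in first-seen order."""
    seen = []
    for line in klist_output.splitlines():
        match = ENTRY.match(line)
        if match and match.group(2) not in seen:
            seen.append(match.group(2))
    return seen


def unexpected_principals(principals, short_hostname):
    """Return principals that do not contain the short host name.

    The comparison is case-insensitive because machine account names are
    often stored upper-cased (for example NAME$@REALM).
    """
    needle = short_hostname.lower()
    return [p for p in principals if needle not in p.lower()]


def read_keytab_listing(path="/etc/krb5.keytab"):
    """Run `klist -k <path>` and return its stdout (usually needs root)."""
    result = subprocess.run(
        ["klist", "-k", path], capture_output=True, text=True, check=True
    )
    return result.stdout
```

On the affected host this could be used as
unexpected_principals(keytab_principals(read_keytab_listing()), "nwpllv8bu100");
an empty list would mean every keytab entry carries the expected host name.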
On Wed, Sep 1, 2021 at 2:46 AM Sumit Bose <sbose(a)redhat.com> wrote:
On Tue, Aug 31, 2021 at 09:53:01PM +0200, Alexey Tikhonov wrote:
> On Tue, Aug 31, 2021 at 6:47 PM Spike White <spikewhitetx(a)gmail.com> wrote:
>
> > All,
> >
> > OK, we have a query we run in AD for machine account passwords of a
> > certain age. In today's run, 31 - 32 days. Then we verify it's
> > pingable.
> >
> > We have found one such suspicious candidate today (two actually, but
> > the other Linux server is quite sick). So one good research candidate.
> > According to both AD and /etc/krb5.keytab file, the machine account
> > password was last set on 7/29. Today is 8/31, so that would be 32
> > days. This 'automatic machine account keytab renewal' background task
> > should trigger again today.
> >
> > sssd service was last started 2 weeks ago and, by all appearances, is
> > healthy. sssctl domain-status <domain> shows online, connected to AD
> > servers (both domain and GC servers). All logins and group
> > enumerations working as expected.
> >
> > Just now, we dynamically set the debug level to 9 with 'sssctl
> > debug-level 9'. This particular server is Oracle Linux 8.4,
> > running sssd-*-2.4.0-9.0.1.el8_4.1.x86_64. Installed July 13th, 2021.
> > So -- very recent sssd version. (This problem occurs with both RHEL &
> > OL 6/7/8, it's just today's candidate happens to be OL8.)
> >
> > We can't keep debug level 9 up for a great many days; it swamps the
> > /var/log filesystem. But we can leave it up for a few days. We
> > purposely did not restart sssd server as we know that would trigger a
> > machine account renewal.
> >
> > Speaking of that -- from Sumit's sssd source code in
> > ad_provider/ad_machine_pw_renewal.c, it appears that sssd is creating a
> > back-end task to call external program /usr/sbin/adcli with certain
> > args. What string can I look for in which sssd log file (now that I
> > have debug level 9 enabled) to tell me when this 'adcli update' task
> > (aka 'automatic machine account keytab renewal') is triggered?
> >
>
> It seems SSSD itself only logs in case of errors. I didn't find any
> explicit logs around `ad_machine_account_password_renewal_send()`.
> But perhaps there will be something like "[be_ptask_execute] (0x0400):
> Task [AD machine account password renewal]: executing task" from generic
> be_ptask_* helpers in the sssd_$domain.log (I'm not sure).
>
> Also at this verbosity level `--verbose` should be supplied to adcli
> itself and I guess output should be captured in sssd_$domain.log as
> well. I'm not familiar with `adcli` internals, you can take a glance at
> https://gitlab.freedesktop.org/realmd/adcli to find its log messages.
Hi,
if SSSD's debug_level is 7 or higher the '--verbose' option is set
when calling adcli and the output is added to the backend logs. It will
start with log message "--- adcli output start---".
HTH
bye,
Sumit
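To make that marker easier to act on, here is a minimal Python sketch that
pulls every adcli output block out of a backend log by scanning for the
start marker quoted above and the matching end marker seen in the log
excerpt at the top of this message. The function name and the example log
path are illustrative assumptions, and the sketch assumes each marker sits
on its own log line, with the adcli text on the lines in between.

```python
START_MARKER = "--- adcli output start---"
END_MARKER = "---adcli output end---"


def adcli_blocks(lines):
    """Return a list of blocks; each block is the list of lines found
    between a start marker line and the next end marker line.

    Text on the marker lines themselves (timestamp, function name) is
    not included. A block that is never closed is discarded.
    """
    blocks = []
    current = None  # None means "not inside an adcli block"
    for raw in lines:
        line = raw.rstrip("\n")
        if current is None:
            if START_MARKER in line:
                current = []
        elif END_MARKER in line:
            blocks.append(current)
            current = None
        else:
            current.append(line)
    return blocks


def adcli_blocks_from_file(path):
    """Read a backend log file and return its adcli output blocks."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return adcli_blocks(handle)
```

For example, adcli_blocks_from_file("/var/log/sssd/sssd_amer.company.com.log")
would return one list of lines per renewal attempt that was logged, assuming
the backend log lives at that path on the host in question.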
>
>
> >
> > I'm less certain now that we've surveyed our env that this background
> > 'adcli update' task is the reason behind 70 - 80 servers / month
> > dropping off the domain. It might be a slight contributor, but I find
> > only a very few pingable servers with machine account last renewal
> > date between 30 and 40 days.
> >
> > Yes, I can disable this default 30 day automatic update and roll my own
> > 'adcli update' cron. But that's a mass deployment, to fix what might
> > not be the problem. I want to verify this is the actual culprit before
> > I take those drastic steps.
> >
> > Spike
> >
> >
> _______________________________________________
> sssd-users mailing list -- sssd-users(a)lists.fedorahosted.org
> To unsubscribe send an email to sssd-users-leave(a)lists.fedorahosted.org
> Fedora Code of Conduct:
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
> List Guidelines:
https://fedoraproject.org/wiki/Mailing_list_guidelines
> List Archives:
https://lists.fedorahosted.org/archives/list/sssd-users@lists.fedorahoste...
> Do not reply to spam on the list, report it:
https://pagure.io/fedora-infrastructure