So the DNS overload was my own fault. I was using 'while' in Ansible and
doing one entry at a time instead of just generating a playbook that adds
multiple entries. I've tested with 100 entries and saw a single update per
zone on the replicas, so I've sorted that. I shouldn't Ansible on almost no
sleep.
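For reference, this is roughly what the batched version looks like. It's a
rough, untested sketch using the ansible-freeipa ipadnsrecord module, with
placeholder host names and addresses (the real list is generated):

  - name: Add DNS entries in batches
    hosts: ipaserver
    become: true
    tasks:
      - name: Add a batch of A records to example.com
        ipadnsrecord:
          ipaadmin_password: "{{ ipaadmin_password }}"
          zone_name: example.com
          state: present
          records:
            # generated list continues for the whole batch
            - name: node0001
              a_ip_address: 10.201.10.1
            - name: node0002
              a_ip_address: 10.201.10.2

One task handles the whole batch, which is what gave me a single update per
zone instead of an update per record.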
However, the question still stands: will ~9k diskless nodes booting and all
running 'ipa-client-install' with the '--force' option overload the cluster
in the same manner?
On Thu, Jan 28, 2021, 11:14 AM Mark Potter <markp(a)dug.com> wrote:
The docs say 2k to 3k hosts per FreeIPA machine. We currently have 1
server and 3 replicas for ~9k hosts. The issue is that the hosts in
question are stateless, so ipa-client-install has to be run on every boot.
We've got that part handled, but something came up that's got me concerned.
I was adding DNS records using ansible-freeipa. With needing DNS for all
of our sites along with BMC and such, we have ~38k valid DNS entries.
I was running two playbooks to add entries in parallel because we need
everything to resolve on both example.com and example1.com. This is an
artifact that can't be avoided, so we end up with ~76k entries across two
zones. The example.com entries were being added with reverse records and
the example1.com entries without.
Based on the time it took for each entry to be added, this should have
taken ~31 hours. At some point the three replicas stopped responding to
any requests. For instance, ipa1.example.com (primary) would validate
while adding a host, but ipa2.example.com (replica) would hang and never
time out. Eventually both playbooks failed at ~33k DNS entries as ssh
wasn't responding on the primary. I wasn't monitoring at that point so I
didn't get to see it happen. There is nothing from OOM in the logs, so it
doesn't look like sshd got killed from memory usage; when I was
monitoring, load never got over 2.
The VMs have 16GiB of memory, 6 cores, and a 10Gb connection. They are
running CentOS 7 with FreeIPA 4.6.5. Logs on ipa1 show:
Jan 27 11:39:14 ipa1 ns-slapd: [27/Jan/2021:11:39:14.363156372 -0600] - WARN - NSMMReplicationPlugin - acquire_replica - agmt="cn=meToipa2.example.com" (ipa2:389): Unable to receive the response for a startReplication extended operation to consumer (Timed out). Will retry later.
For both the left and right replicas (ipa2 and ipa3).
The replicas show:
Jan 27 16:38:02 ipa2 named-pkcs11[2516]: LDAP query timed out. Try to adjust "timeout" parameter
Jan 27 16:38:02 ipa2 named-pkcs11[2516]: zone example.com/IN: serial (1611787052) write back to LDAP failed
Jan 27 16:38:12 ipa2 named-pkcs11[2516]: LDAP query timed out. Try to adjust "timeout" parameter
Jan 27 16:38:12 ipa2 named-pkcs11[2516]: zone 16.172.in-addr.arpa/IN: serial (1611787062) write back to LDAP failed
Which eventually became:
Jan 27 16:57:32 ipa2 named-pkcs11[2516]: zone example.com/IN: serial (1611788192) write back to LDAP failed
Jan 27 16:58:22 ipa2 named-pkcs11[2516]: timeout in ldap_pool_getconnection(): try to raise 'connections' parameter; potential deadlock?
This was happening in the krb5kdc.log on the replicas around the same time:
Jan 27 15:20:41 ipa2.example.com krb5kdc[26712](info): AS_REQ (8 etypes {18 17 20 19 16 23 25 26}) 10.201.1.5: LOOKING_UP_CLIENT: markp(a)EXAMPLE.COM for krbtgt/EXAMPLE.COM(a)EXAMPLE.COM, Server error
In dirsrv/slapd-EXAMPLE-COM/errors in the same timeframe:
[27/Jan/2021:15:58:04.885131721 -0600] - ERR - NSMMReplicationPlugin - bind_and_check_pwp - agmt="cn=ipa2.example.com-to-ipa3.example.com" (ipa3:389) - Replication bind with GSSAPI auth failed: LDAP error -1 (Can't contact LDAP server) ()
Though not frequent, these appeared again until I rebooted the replicas
this morning. I couldn't restart services with `ipactl restart`; it would
just hang. I let it sit for ten minutes at one point. `ipactl status`
consistently showed everything running.
My topology looks like this (both ca and domain are the same):
------------------
5 segments matched
------------------
  Segment name: ipa1.example.com-to-ipa2.example.com
  Left node: ipa1.example.com
  Right node: ipa2.example.com
  Connectivity: both

  Segment name: ipa1.example.com-to-ipa3.example.com
  Left node: ipa1.example.com
  Right node: ipa3.example.com
  Connectivity: both

  Segment name: ipa1.example.com-to-ipa4.example.com
  Left node: ipa1.example.com
  Right node: ipa4.example.com
  Connectivity: both

  Segment name: ipa3.example.com-to-ipa2.example.com
  Left node: ipa3.example.com
  Right node: ipa2.example.com
  Connectivity: both

  Segment name: ipa4.example.com-to-ipa3.example.com
  Left node: ipa4.example.com
  Right node: ipa3.example.com
  Connectivity: both
----------------------------
Number of entries returned 5
----------------------------
Since the play takes slightly more than 2 seconds per entry when creating
with reverse and slightly under 2 seconds when creating without, I don't
see why this should ever overload anything, but I will freely admit I am
not all that familiar with the way DNS is handled. If FreeIPA is sending
the entire zone for every update and it all has to be written to the DB,
then I can see why that would be an issue. I could kill the replication
agreements, load the rest of the entries, then re-add the agreements so
the zone only needs to be transferred once, but it's still a bit
concerning given the scenario I described above.
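If I do go that route, I would probably drive it from ansible-freeipa as
well. A rough, untested sketch, assuming the ipatopologysegment module
behaves the way I expect (segment name and nodes taken from my topology
above):

  - name: Drop the ipa1-to-ipa2 domain segment before the bulk load
    hosts: ipaserver
    become: true
    tasks:
      - ipatopologysegment:
          ipaadmin_password: "{{ ipaadmin_password }}"
          suffix: domain
          name: ipa1.example.com-to-ipa2.example.com
          left: ipa1.example.com
          right: ipa2.example.com
          state: absent

The same task with state: present would re-add the segment once the
remaining entries are loaded.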
If we have a power outage and need to boot ~9k machines, all of which will
run:
ipa-client-install -U -q -p <service account for adding hosts> \
-w <some really secure password> \
--domain=example.com \
--server=ipa1.example.com \
--server=ipa2.example.com \
--server=ipa3.example.com \
--server=ipa4.example.com \
--force-join \
--enable-dns-updates \
--ssh-trust-dns \
--automount-location=<appropriate map>
Are we going to see everything fail in a spectacular manner? And is there
anything I can do to mitigate the failure while adding DNS entries? I
still need to complete the additions and have ~5k entries per zone left
for two zones.
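The only mitigation I can think of on our side for the boot case is to
throttle how many hosts enroll at once rather than letting all ~9k hit the
servers together. A rough, untested sketch, assuming the ansible-freeipa
ipaclient role and its variable names (which I haven't verified), run in
controlled batches instead of at boot:

  - name: Enroll clients in batches
    hosts: ipaclients
    become: true
    serial: 500
    vars:
      ipaclient_domain: example.com
      ipaclient_force_join: true
      ipaadmin_principal: <service account for adding hosts>
      ipaadmin_password: <some really secure password>
    roles:
      - role: ipaclient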