On 22 Dec 2019, at 08:22, Christophe Trefois <trefex(a)gmail.com> wrote:
First off, apologies for double posting to here and ipa mailing list, but we are getting
a bit uneasy, and also the issue seems to come from the code in 389-ds directly, so this
seems more appropriate.
Hi there, thanks for contacting us. Happy for you to post here.
We are using ipa-server ipa-server-4.6.5-11.el7.centos.3.x86_64 with
389-ds-base-188.8.131.52-10.el7.x86_64 on CentOS 7.7.
For a couple of days now, some of our replicas have been logging "csngen_new_csn - Sequence
rollover; local offset updated." messages in the slapd error logs.
This isn't a problem, but you should investigate the possible causes. The short answer
is that we are pushing the Lamport clock ahead due to either a high write rate or the system
clock being stepped backwards.
To see the code look at:
As a sanity check, you should probably investigate:
* If you have high write load in your environment that is not expected
* If you have issues with ntp consistency on your machines (continually advancing or
* Conflict between a virtualised time sync service (i.e. vmware/libvirt) and ntp causing time
For a slightly longer explanation: the CSN is a Lamport clock, i.e. it can only advance and
never step back. It's based on the current unix time in seconds, with a sub-counter
that is 16 bit, i.e. we can have 65535 writes "per second".
This is because if you have say:
Write object A
Ntp syncs clock backwards
Write object B
We need the CSN of these to still reflect the true order of operations, i.e. that A occurs
before B, as we use time as the sync source between replicas rather than
locking/consensus. If the CSN didn't use a Lamport clock, the changelog would show B
before A, which is incorrect for reasons that are extremely complex and subtle.
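To make the A/B example concrete, here is a toy sketch in Python. It is not the 389-ds code; the class name LamportStamp and the clock values 1000 and 990 are made up for illustration. It only compares a raw wall-clock stamp with a stamp that refuses to step back.

```python
class LamportStamp:
    """Toy stamp generator: (highest_seconds_seen, sub_counter)."""

    def __init__(self):
        self.highest = 0   # highest wall-clock second seen so far
        self.seq = 0       # sub-counter within that second

    def next(self, wall_clock):
        if wall_clock > self.highest:
            # Clock moved forward: adopt the new second, reset the counter.
            self.highest, self.seq = wall_clock, 0
        else:
            # Clock stalled or stepped back: stay put, bump the counter.
            self.seq += 1
        return (self.highest, self.seq)


clock_at_a = 1000   # hypothetical system time when A is written
clock_at_b = 990    # ntp has stepped the clock back 10 s before B is written

# Raw wall-clock stamps put B before A.
naive_a, naive_b = clock_at_a, clock_at_b

# Lamport-style stamps keep A before B.
gen = LamportStamp()
lamport_a = gen.next(clock_at_a)
lamport_b = gen.next(clock_at_b)

print("naive order preserved:  ", naive_a < naive_b)
print("lamport order preserved:", lamport_a < lamport_b)
```

Tuples compare element by element, so B's stamp sorts after A's even though the system clock read an earlier time when B was written.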
So with the CSN being a Lamport clock, if ntp sets your time backwards, the CSN stays at
the "highest" time, and the sub-counter keeps incrementing. If this continues for
a long time, we overflow the 16-bit sub-counter. We can't have duplicate CSNs, so the
local offset (aka seconds) is increased to always push the CSNs forward.
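As a rough illustration of the rollover, here is a simplified Python model. It is a sketch under assumptions, not the real csngen code: ToyCsnGenerator, SEQ_MAX and the rollovers counter are hypothetical names, and the "local offset" is folded directly into the seconds value for brevity.

```python
SEQ_MAX = 0xFFFF  # largest value a 16-bit sub-counter can hold


class ToyCsnGenerator:
    """Simplified model: a CSN is just (seconds, sub_counter)."""

    def __init__(self):
        self.seconds = 0
        self.seq = 0
        self.rollovers = 0  # how many times we had to push seconds forward

    def next_csn(self, wall_clock):
        if wall_clock > self.seconds:
            # Normal case: time advanced, start a fresh second.
            self.seconds, self.seq = wall_clock, 0
        elif self.seq < SEQ_MAX:
            # Clock not ahead of us: reuse the highest second, bump the counter.
            self.seq += 1
        else:
            # Sub-counter exhausted: push the seconds forward so the CSN
            # stays unique (the "sequence rollover" situation).
            self.seconds += 1
            self.seq = 0
            self.rollovers += 1
        return (self.seconds, self.seq)


gen = ToyCsnGenerator()
first = gen.next_csn(1000)

# The clock is then stepped back to 990 and stays behind while writes continue.
for _ in range(SEQ_MAX):
    last_before = gen.next_csn(990)

after = gen.next_csn(990)  # this write no longer fits in second 1000

print(first, last_before, after, gen.rollovers)
```

In this model, every write issued while the clock is behind (or within one very busy second) consumes sub-counter space, which is why both write load and clock steps are worth checking.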
That's why I recommend you check your write load and ntp/system time.
Hope that helps,
We use the python "ipa_check_consistency" and replication seems to be fine.
We checked all replicas, and they are all in time sync with ntp (updated) with no visible
Is this anything to worry about, and how can we make those messages stop appearing?
Senior Software Engineer, 389 Directory Server