The fixes for the tickets you mention did change the iteration through
the changelog and how it handles situations where the start CSN is
not found in the changelog. They also changed the logging, so
you might now see messages that were either absent or hidden before.
That was my understanding too.
But I am very surprised to see them so frequently and I would like
to understand it.
First some questions: do you have changelog trimming enabled, and if so
how is it configured? Do you use fractional replication?
Yes to both questions.
Trimming: 14 days
Fractional replication:
nsDS5ReplicatedAttributeList: (objectclass=*) $ EXCLUDE entryusn memberOf
nsDS5ReplicatedAttributeListTotal: (objectclass=*) $ EXCLUDE entryusn
nsds5ReplicaStripAttrs: modifiersName modifyTimestamp internalModifiersName internalModifyTimestamp internalCreatorsname
Changelog:
cn=changelog5,cn=config
objectClass: top
objectClass: extensibleObject
cn: changelog5
nsslapd-changelogdir: /Local/dirsrv/var/lib/dirsrv/slapd-ens/changelogdb
nsslapd-changelogmaxage: 14d
replica:
cn=replica,cn=dc\\3Did\\2Cdc\\3Dpolytechnique\\2Cdc\\3Dedu,cn=mapping tree,cn=config
objectClass: top
objectClass: nsDS5Replica
cn: replica
nsDS5ReplicaId: 1
nsDS5ReplicaRoot: dc=id,dc=polytechnique,dc=edu
nsDS5Flags: 1
nsDS5ReplicaBindDN: cn=RepliX,cn=config
nsds5ReplicaPurgeDelay: 604800
nsds5ReplicaTombstonePurgeInterval: 86400
nsds5ReplicaLegacyConsumer: False
nsDS5ReplicaType: 3
nsState:: AQAAAAAAAADCrc5XAAAAAAAAAAAAAAAAAQAAAAAAAAABAAAAAAAAAA==
nsDS5ReplicaName: eeb6d304-736c11e6-9bc5a1ff-40280b8e
nsds5ReplicaChangeCount: 114948
nsds5replicareapactive: 0
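To compare the two retention settings posted above, here is a minimal Python sketch that converts nsslapd-changelogmaxage and nsds5ReplicaPurgeDelay to seconds. It assumes the changelog age accepts the s/m/h/d/w unit suffixes and that the purge delay is a plain number of seconds. The helper name age_to_seconds is made up for illustration and is not part of 389DS.

```python
# Hypothetical helper for comparing retention settings; not a 389DS API.
# Assumed unit suffixes for nsslapd-changelogmaxage: s, m, h, d, w.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}


def age_to_seconds(value: str) -> int:
    """Convert '14d' style values (or a bare number of seconds) to seconds."""
    value = value.strip()
    if value and value[-1].lower() in UNIT_SECONDS:
        return int(value[:-1]) * UNIT_SECONDS[value[-1].lower()]
    return int(value)


# Values taken from the configuration posted in this thread.
changelog_max_age = age_to_seconds("14d")    # nsslapd-changelogmaxage
purge_delay = age_to_seconds("604800")       # nsds5ReplicaPurgeDelay

print("changelog max age:", changelog_max_age // 86400, "days")
print("replica purge delay:", purge_delay // 86400, "days")
```

With the values above, the changelog is trimmed after 14 days and the purge delay is 7 days.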
Typical replication agreement:
cn=Replication from ldap-lab.<domain name> to ldap-adm.<domain name>,cn=replica,cn=dc\\3Did\\2Cdc\\3Dpolytechnique\\2Cdc\\3Dedu,cn=mapping tree,cn=config
objectClass: top
objectClass: nsDS5ReplicationAgreement
cn: Replication from ldap-lab.<domain name> to ldap-adm.<domain name>
description: Replication agreement from server ldap-lab.<domain name> to server ldap-adm.<domain name>
nsDS5ReplicaHost: ldap-adm.<domain name>
nsDS5ReplicaRoot: dc=id,dc=polytechnique,dc=edu
nsDS5ReplicaPort: 636
nsDS5ReplicaTransportInfo: SSL
nsDS5ReplicaBindDN: cn=RepliX,cn=config
nsDS5ReplicaBindMethod: simple
nsDS5ReplicatedAttributeList: (objectclass=*) $ EXCLUDE entryusn memberOf
nsDS5ReplicatedAttributeListTotal: (objectclass=*) $ EXCLUDE entryusn
nsds5ReplicaStripAttrs: modifiersName modifyTimestamp internalModifiersName internalModifyTimestamp internalCreatorsname
nsds5replicaBusyWaitTime: 5
nsds5ReplicaFlowControlPause: 500
nsds5ReplicaFlowControlWindow: 1000
nsds5replicaTimeout: 120
nsDS5ReplicaCredentials: {AES-...
nsds50ruv: {replicageneration} 57cd7377000000020000
nsds50ruv: {replica 2 ldap://ldap-adm.<domain name>:389}
nsruvReplicaLastModified: {replica 2 ldap://ldap-adm.<domain name>:389} 00000000
nsds5replicareapactive: 0
nsds5replicaLastUpdateStart: 20160906115520Z
nsds5replicaLastUpdateEnd: 20160906115520Z
nsds5replicaChangesSentSinceStartup: 3:13525/670 1:3671/0 2:1/0
nsds5replicaLastUpdateStatus: 0 Replica acquired successfully: Incremental update succeeded
nsds5replicaUpdateInProgress: FALSE
nsds5replicaLastInitStart: 19700101000000Z
nsds5replicaLastInitEnd: 19700101000000Z
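The nsds5replicaChangesSentSinceStartup value above is easier to read once split up. The sketch below assumes each item has the form "replica id:changes sent/changes skipped"; that is my reading of the attribute and should be checked against the documentation. The function name is hypothetical.

```python
from typing import Dict, Tuple


def parse_changes_sent(value: str) -> Dict[int, Tuple[int, int]]:
    """Split 'rid:sent/skipped ...' into {rid: (sent, skipped)}.

    The field meaning (sent/skipped) is an assumption, see the note above.
    """
    result = {}
    for item in value.split():
        rid, counts = item.split(":", 1)
        sent, skipped = counts.split("/", 1)
        result[int(rid)] = (int(sent), int(skipped))
    return result


# Value copied from the replication agreement posted in this thread.
stats = parse_changes_sent("3:13525/670 1:3671/0 2:1/0")
for rid, (sent, skipped) in sorted(stats.items()):
    print(f"replica {rid}: sent={sent} skipped={skipped}")
```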
Next, is it possible to get the access and error logs for a period
of an hour from all servers (you can send them off list)? I would
like to track some of the reported CSNs.
Sure, I will send them to you off list in a moment.
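For anyone who wants to follow individual CSNs through the logs, here is a small Python sketch that splits a CSN into its parts. It assumes the usual 20 hex digit layout: 8 digits of timestamp (seconds since the epoch), 4 of sequence number, 4 of replica id and 4 of sub-sequence number. The CSN type and the decode_csn name are illustrative only.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class CSN(NamedTuple):
    timestamp: int    # seconds since the epoch
    seqnum: int       # sequence number within that second
    replica_id: int   # id of the replica that generated the change
    subseqnum: int    # sub-sequence number


def decode_csn(text: str) -> CSN:
    """Decode a 20 hex digit CSN string (assumed layout, see above)."""
    if len(text) != 20:
        raise ValueError("expected 20 hex digits, got %r" % text)
    return CSN(
        timestamp=int(text[0:8], 16),
        seqnum=int(text[8:12], 16),
        replica_id=int(text[12:16], 16),
        subseqnum=int(text[16:20], 16),
    )


# First CSN reported as "not found" in the error log excerpt of this thread.
csn = decode_csn("57cdfe06000100010000")
when = datetime.fromtimestamp(csn.timestamp, tz=timezone.utc)
print(csn, when.isoformat())
```

The decoded time is in UTC, so it has to be shifted by the +0200 offset before comparing it with the error log timestamps.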
Thank you,
Regards,
Andrey
Regards,
Ludwig
On 09/06/2016 12:31 PM, Ivanov Andrey (M.) wrote:
Hi,
We have been successfully running the compiled 1.3.4 git branch of
389DS in production on CentOS 7 for about a year
(approximately 40 000 entries, about 4000 groups, hundreds
of reads and tens of writes per second).
Our current topology consists of 3 servers in a triangle
(each server is a master replicating to the 2 others, so two
read-write replication agreements on each).
Since the fixes for the Ticket 48766 ("Replication
changelog can incorrectly skip over updates") and Ticket
48954 ("Replication fails because anchorcsn cannot be
found") I’ve started to see the following warnings regularly
in the error logs:
[06/Sep/2016:01:21:43 +0200] clcache_load_buffer_bulk -
changelog record with csn (57cdfe06000100010000) not found
for DB_NEXT
[06/Sep/2016:01:21:43 +0200] agmt="cn=Replication from
ldap-adm.<domain> to ldap-lab.<domain>"
(ldap-lab:636) - Can't locate CSN 57cdfe06000100010000 in
the changelog (DB rc=-30988). If replication stops, the
consumer may need to be reinitialized.
[06/Sep/2016:02:35:25 +0200] - replica_generate_next_csn:
opcsn=57ce0f4e000500020000 <=
basecsn=57ce0f4e000500030000, adjusted
opcsn=57ce0f4e000600020000
[06/Sep/2016:04:10:11 +0200] clcache_load_buffer_bulk -
changelog record with csn (57ce257e000400030000) not found
for DB_NEXT
[06/Sep/2016:05:16:58 +0200] - replica_generate_next_csn:
opcsn=57ce352b000000020000 <=
basecsn=57ce352b000100010000, adjusted
opcsn=57ce352b000100020000
[06/Sep/2016:06:56:04 +0200] agmt="cn=Replication from
ldap-adm.<domain> to ldap-ens.<domain>"
(ldap-ens:636) - Can't locate CSN 57ce4c62000100030000 in
the changelog (DB rc=-30988). If replication stops, the
consumer may need to be reinitialized.
[06/Sep/2016:07:29:00 +0200] agmt="cn=Replication from
ldap-adm.<domain> to ldap-ens.<domain>"
(ldap-ens:636) - Can't locate CSN 57ce541a000200030000 in
the changelog (DB rc=-30988). If replication stops, the
consumer may need to be reinitialized.
[06/Sep/2016:07:34:20 +0200] agmt="cn=Replication from
ldap-adm.<domain> to ldap-lab.<domain>"
(ldap-lab:636) - Can't locate CSN 57ce5559000100010000 in
the changelog (DB rc=-30988). If replication stops, the
consumer may need to be reinitialized.
[06/Sep/2016:07:34:27 +0200] agmt="cn=Replication from
ldap-adm.<domain> to ldap-lab.<domain>"
(ldap-lab:636) - Can't locate CSN 57ce5561000000010000 in
the changelog (DB rc=-30988). If replication stops, the
consumer may need to be reinitialized.
[06/Sep/2016:07:40:17 +0200] clcache_load_buffer_bulk -
changelog record with csn (57ce56c0000500030000) not found
for DB_NEXT
[06/Sep/2016:07:40:24 +0200] clcache_load_buffer_bulk -
changelog record with csn (57ce56c5000100030000) not found
for DB_NEXT
[06/Sep/2016:08:08:36 +0200] clcache_load_buffer_bulk -
changelog record with csn (57ce5d5f000f00010000) not found
for DB_NEXT
[06/Sep/2016:08:12:39 +0200] clcache_load_buffer_bulk -
changelog record with csn (57ce5e54000200030000) not found
for DB_NEXT
[06/Sep/2016:08:12:39 +0200] agmt="cn=Replication from
ldap-adm.<domain> to ldap-ens.<domain>"
(ldap-ens:636) - Can't locate CSN 57ce5e54000200030000 in
the changelog (DB rc=-30988). If replication stops, the
consumer may need to be reinitialized.
[06/Sep/2016:08:26:45 +0200] clcache_load_buffer_bulk -
changelog record with csn (57ce61a3000200030000) not found
for DB_NEXT
[06/Sep/2016:08:27:40 +0200] clcache_load_buffer_bulk -
changelog record with csn (57ce61d8000200030000) not found
for DB_NEXT
[06/Sep/2016:08:27:40 +0200] agmt="cn=Replication from
ldap-adm.<domain> to ldap-ens.<domain>"
(ldap-ens:636) - Can't locate CSN 57ce61d8000200030000 in
the changelog (DB rc=-30988). If replication stops, the
consumer may need to be reinitialized.
[06/Sep/2016:08:31:42 +0200] clcache_load_buffer_bulk -
changelog record with csn (57ce62c8000300010000) not found
for DB_NEXT
[06/Sep/2016:08:34:05 +0200] clcache_load_buffer_bulk -
changelog record with csn (57ce635a000100010000) not found
for DB_NEXT
[06/Sep/2016:08:44:28 +0200] clcache_load_buffer_bulk -
changelog record with csn (57ce65c9000200030000) not found
for DB_NEXT
[06/Sep/2016:08:52:25 +0200] agmt="cn=Replication from
ldap-adm.<domain> to ldap-ens.<domain>"
(ldap-ens:636) - Can't locate CSN 57ce67aa000100030000 in
the changelog (DB rc=-30988). If replication stops, the
consumer may need to be reinitialized.
[06/Sep/2016:08:53:04 +0200] - replica_generate_next_csn:
opcsn=57ce67d1000100020000 <=
basecsn=57ce67d1000200030000, adjusted
opcsn=57ce67d1000200020000
These warnings are present
on all three servers and for all replication agreements. One
of the servers is virtual and the two others are physical.
Replication still seems to work fine in spite of
these warnings. The "replica_generate_next_csn" message is not new:
it has always been present with 1.3.4. The two new warnings
are "clcache_load_buffer_bulk" and "Can't locate CSN ...
in the changelog (DB rc=-30988)". There are no network
problems or anything like that, so the cause could only be the
replication topology (3-master fully-connected triangle)
and/or the servers being rather busy. Is it a bug, a warning
that can be ignored, or something else?
Thank you!
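To get a feeling for how often each of the two new warnings occurs, and for which originating replica, something like the Python sketch below could be run over the error log. It assumes each log entry is on a single line in the real file (the excerpt above is wrapped by the mail client) and that the replica id is hex digits 13 to 16 of the CSN. The function names are made up for illustration.

```python
import re
from collections import Counter

CSN_RE = re.compile(r"\b[0-9a-f]{20}\b")

# Substrings identifying the two new warnings quoted in this thread.
WARNINGS = ("not found for DB_NEXT", "Can't locate CSN")


def summarize(lines):
    """Count warnings per (warning text, replica id of the reported CSN)."""
    counts = Counter()
    for line in lines:
        for warning in WARNINGS:
            if warning not in line:
                continue
            match = CSN_RE.search(line)
            if match:
                # Assumed CSN layout: replica id is hex digits 13-16.
                replica_id = int(match.group(0)[12:16], 16)
                counts[(warning, replica_id)] += 1
    return counts


# Three entries from the excerpt above, unwrapped to one line each.
SAMPLE = """\
[06/Sep/2016:01:21:43 +0200] clcache_load_buffer_bulk - changelog record with csn (57cdfe06000100010000) not found for DB_NEXT
[06/Sep/2016:01:21:43 +0200] agmt="cn=Replication from ldap-adm.<domain> to ldap-lab.<domain>" (ldap-lab:636) - Can't locate CSN 57cdfe06000100010000 in the changelog (DB rc=-30988). If replication stops, the consumer may need to be reinitialized.
[06/Sep/2016:04:10:11 +0200] clcache_load_buffer_bulk - changelog record with csn (57ce257e000400030000) not found for DB_NEXT
""".splitlines()

summary = summarize(SAMPLE)
for (warning, rid), n in sorted(summary.items()):
    print(f"{n:4d}  replica {rid}  {warning}")
```

For a real run, the sample list would be replaced by the lines of the error log file of each server.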
--
389-users mailing list
389-users@lists.fedoraproject.org
https://lists.fedoraproject.org/admin/lists/389-users@lists.fedoraproject.org
--
Red Hat GmbH, http://www.de.redhat.com/, Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Michael Cunningham, Michael O'Neill, Eric Shander