Hi Ludwig,
the fixes for the tickets you mention did change the iteration through the
changelog and how it handles situations when the start CSN is not found in the
changelog. They also changed the logging, so you might now see messages
that were not there, or were hidden, before.
That was my understanding too.
But I am very surprised to see them so frequently and I would like to
understand it.
First, some questions: do you have changelog trimming enabled, and if so how?
Do you have fractional replication?
Yes to both questions.
Trimming: 14 days
Fractional replication:
nsDS5ReplicatedAttributeList: (objectclass=*) $ EXCLUDE entryusn memberOf
nsDS5ReplicatedAttributeListTotal: (objectclass=*) $ EXCLUDE entryusn
nsds5ReplicaStripAttrs: modifiersName modifyTimestamp internalModifiersName
internalModifyTimestamp internalCreatorsname
Changelog:
cn=changelog5,cn=config
objectClass: top
objectClass: extensibleObject
cn: changelog5
nsslapd-changelogdir: /Local/dirsrv/var/lib/dirsrv/slapd-ens/changelogdb
nsslapd-changelogmaxage: 14d
replica:
cn=replica,cn=dc\\3Did\\2Cdc\\3Dpolytechnique\\2Cdc\\3Dedu,cn=mapping tree,cn=config
objectClass: top
objectClass: nsDS5Replica
cn: replica
nsDS5ReplicaId: 1
nsDS5ReplicaRoot: dc=id,dc=polytechnique,dc=edu
nsDS5Flags: 1
nsDS5ReplicaBindDN: cn=RepliX,cn=config
nsds5ReplicaPurgeDelay: 604800
nsds5ReplicaTombstonePurgeInterval: 86400
nsds5ReplicaLegacyConsumer: False
nsDS5ReplicaType: 3
nsState:: AQAAAAAAAADCrc5XAAAAAAAAAAAAAAAAAQAAAAAAAAABAAAAAAAAAA==
nsDS5ReplicaName: eeb6d304-736c11e6-9bc5a1ff-40280b8e
nsds5ReplicaChangeCount: 114948
nsds5replicareapactive: 0
Typical replication agreement:
cn=Replication from ldap-lab.<domain name> to ldap-adm.<domain
name>,cn=replica,cn=dc\\3Did\\2Cdc\\3Dpolytechnique\\2Cdc\\3Dedu,cn=mapping
tree,cn=config
objectClass: top
objectClass: nsDS5ReplicationAgreement
cn: Replication from ldap-lab.<domain name> to ldap-adm.<domain name>
description: Replication agreement from server ldap-lab.<domain name> to server
ldap-adm.<domain name>
nsDS5ReplicaHost: ldap-adm.<domain name>
nsDS5ReplicaRoot: dc=id,dc=polytechnique,dc=edu
nsDS5ReplicaPort: 636
nsDS5ReplicaTransportInfo: SSL
nsDS5ReplicaBindDN: cn=RepliX,cn=config
nsDS5ReplicaBindMethod: simple
nsDS5ReplicatedAttributeList: (objectclass=*) $ EXCLUDE entryusn memberOf
nsDS5ReplicatedAttributeListTotal: (objectclass=*) $ EXCLUDE entryusn
nsds5ReplicaStripAttrs: modifiersName modifyTimestamp internalModifiersName
internalModifyTimestamp internalCreatorsname
nsds5replicaBusyWaitTime: 5
nsds5ReplicaFlowControlPause: 500
nsds5ReplicaFlowControlWindow: 1000
nsds5replicaTimeout: 120
nsDS5ReplicaCredentials: {AES-...
nsds50ruv: {replicageneration} 57cd7377000000020000
nsds50ruv: {replica 2 ldap://ldap-adm.<domain name>:389}
nsruvReplicaLastModified: {replica 2 ldap://ldap-adm.<domain name>:389} 00000000
nsds5replicareapactive: 0
nsds5replicaLastUpdateStart: 20160906115520Z
nsds5replicaLastUpdateEnd: 20160906115520Z
nsds5replicaChangesSentSinceStartup: 3:13525/670 1:3671/0 2:1/0
nsds5replicaLastUpdateStatus: 0 Replica acquired successfully: Incremental update
succeeded
nsds5replicaUpdateInProgress: FALSE
nsds5replicaLastInitStart: 19700101000000Z
nsds5replicaLastInitEnd: 19700101000000Z
Next, is it possible to get the access and error logs for a period of an hour
from all servers (you can send them off list)? I would like to track some of
the reported CSNs.
Sure, I will send them to you off list in a moment.
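As a rough aid for tracking the reported CSNs, here is a minimal Python sketch.
It is not part of 389 DS, and the names decode_csn and csns_in_log are made up
for illustration. It assumes the usual CSN layout of 8 hex digits of Unix
timestamp followed by three 4-hex-digit fields (sequence number, replica ID,
sub-sequence number), and that each message sits on a single line in the real
error log rather than being wrapped as in this mail:

```python
import re
from datetime import datetime, timezone
from typing import Iterable, Iterator, NamedTuple


class CSN(NamedTuple):
    timestamp: datetime   # UTC time encoded in the first 8 hex digits
    seqnum: int           # sequence number within that second
    replica_id: int       # ID of the replica that generated the change
    subseqnum: int        # sub-sequence number


# A CSN is assumed to be exactly 20 lowercase hex digits.
CSN_RE = re.compile(r"\b[0-9a-f]{20}\b")

# Substrings of the two warnings discussed in this thread.
WARNING_MARKERS = ("not found for DB_NEXT", "Can't locate CSN")


def decode_csn(csn: str) -> CSN:
    """Split a 20-hex-digit CSN string into its four fields."""
    if not CSN_RE.fullmatch(csn):
        raise ValueError(f"not a 20-hex-digit CSN: {csn!r}")
    return CSN(
        timestamp=datetime.fromtimestamp(int(csn[0:8], 16), tz=timezone.utc),
        seqnum=int(csn[8:12], 16),
        replica_id=int(csn[12:16], 16),
        subseqnum=int(csn[16:20], 16),
    )


def csns_in_log(lines: Iterable[str]) -> Iterator[str]:
    """Yield the CSNs mentioned in the two 'CSN not found' warnings."""
    for line in lines:
        if any(marker in line for marker in WARNING_MARKERS):
            for match in CSN_RE.finditer(line):
                yield match.group(0)
```

Under these assumptions, decode_csn("57cdfe06000100010000") gives a timestamp of
2016-09-05 23:21:42 UTC with sequence number 1 and replica ID 1. That is one
second before the 01:21:43 +0200 log line that reports it.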
Thank you,
Regards,
Andrey
Regards,
Ludwig
On 09/06/2016 12:31 PM, Ivanov Andrey (M.) wrote:
> Hi,
> We have been successfully running a build of the 1.3.4 git branch of 389DS in
> production on CentOS 7 for about a year (approximately 40 000 entries, about
> 4000 groups, hundreds of reads and tens of writes per second).
> Our current topology consists of 3 servers in a triangle (each server is a master
> replicating to the 2 others, so two read-write replication agreements on each).
> Since the fixes for Ticket 48766 ("Replication changelog can incorrectly
> skip over updates") and Ticket 48954 ("Replication fails because anchorcsn
> cannot be found") I’ve started to see the following regular warnings in the
> error logs:
> [06/Sep/2016:01:21:43 +0200] clcache_load_buffer_bulk - changelog record with
> csn (57cdfe06000100010000) not found for DB_NEXT
> [06/Sep/2016:01:21:43 +0200] agmt="cn=Replication from ldap-adm.<domain> to
> ldap-lab.<domain>" (ldap-lab:636) - Can't locate CSN 57cdfe06000100010000 in
> the changelog (DB rc=-30988). If replication stops, the consumer may need to be
> reinitialized.
> [06/Sep/2016:02:35:25 +0200] - replica_generate_next_csn:
> opcsn=57ce0f4e000500020000 <= basecsn=57ce0f4e000500030000, adjusted
> opcsn=57ce0f4e000600020000
> [06/Sep/2016:04:10:11 +0200] clcache_load_buffer_bulk - changelog record with
> csn (57ce257e000400030000) not found for DB_NEXT
> [06/Sep/2016:05:16:58 +0200] - replica_generate_next_csn:
> opcsn=57ce352b000000020000 <= basecsn=57ce352b000100010000, adjusted
> opcsn=57ce352b000100020000
> [06/Sep/2016:06:56:04 +0200] agmt="cn=Replication from ldap-adm.<domain> to
> ldap-ens.<domain>" (ldap-ens:636) - Can't locate CSN 57ce4c62000100030000 in
> the changelog (DB rc=-30988). If replication stops, the consumer may need to be
> reinitialized.
> [06/Sep/2016:07:29:00 +0200] agmt="cn=Replication from ldap-adm.<domain> to
> ldap-ens.<domain>" (ldap-ens:636) - Can't locate CSN 57ce541a000200030000 in
> the changelog (DB rc=-30988). If replication stops, the consumer may need to be
> reinitialized.
> [06/Sep/2016:07:34:20 +0200] agmt="cn=Replication from ldap-adm.<domain> to
> ldap-lab.<domain>" (ldap-lab:636) - Can't locate CSN 57ce5559000100010000 in
> the changelog (DB rc=-30988). If replication stops, the consumer may need to be
> reinitialized.
> [06/Sep/2016:07:34:27 +0200] agmt="cn=Replication from ldap-adm.<domain> to
> ldap-lab.<domain>" (ldap-lab:636) - Can't locate CSN 57ce5561000000010000 in
> the changelog (DB rc=-30988). If replication stops, the consumer may need to be
> reinitialized.
> [06/Sep/2016:07:40:17 +0200] clcache_load_buffer_bulk - changelog record with
> csn (57ce56c0000500030000) not found for DB_NEXT
> [06/Sep/2016:07:40:24 +0200] clcache_load_buffer_bulk - changelog record with
> csn (57ce56c5000100030000) not found for DB_NEXT
> [06/Sep/2016:08:08:36 +0200] clcache_load_buffer_bulk - changelog record with
> csn (57ce5d5f000f00010000) not found for DB_NEXT
> [06/Sep/2016:08:12:39 +0200] clcache_load_buffer_bulk - changelog record with
> csn (57ce5e54000200030000) not found for DB_NEXT
> [06/Sep/2016:08:12:39 +0200] agmt="cn=Replication from ldap-adm.<domain> to
> ldap-ens.<domain>" (ldap-ens:636) - Can't locate CSN 57ce5e54000200030000 in
> the changelog (DB rc=-30988). If replication stops, the consumer may need to be
> reinitialized.
> [06/Sep/2016:08:26:45 +0200] clcache_load_buffer_bulk - changelog record with
> csn (57ce61a3000200030000) not found for DB_NEXT
> [06/Sep/2016:08:27:40 +0200] clcache_load_buffer_bulk - changelog record with
> csn (57ce61d8000200030000) not found for DB_NEXT
> [06/Sep/2016:08:27:40 +0200] agmt="cn=Replication from ldap-adm.<domain> to
> ldap-ens.<domain>" (ldap-ens:636) - Can't locate CSN 57ce61d8000200030000 in
> the changelog (DB rc=-30988). If replication stops, the consumer may need to be
> reinitialized.
> [06/Sep/2016:08:31:42 +0200] clcache_load_buffer_bulk - changelog record with
> csn (57ce62c8000300010000) not found for DB_NEXT
> [06/Sep/2016:08:34:05 +0200] clcache_load_buffer_bulk - changelog record with
> csn (57ce635a000100010000) not found for DB_NEXT
> [06/Sep/2016:08:44:28 +0200] clcache_load_buffer_bulk - changelog record with
> csn (57ce65c9000200030000) not found for DB_NEXT
> [06/Sep/2016:08:52:25 +0200] agmt="cn=Replication from ldap-adm.<domain> to
> ldap-ens.<domain>" (ldap-ens:636) - Can't locate CSN 57ce67aa000100030000 in
> the changelog (DB rc=-30988). If replication stops, the consumer may need to be
> reinitialized.
> [06/Sep/2016:08:53:04 +0200] - replica_generate_next_csn:
> opcsn=57ce67d1000100020000 <= basecsn=57ce67d1000200030000, adjusted
> opcsn=57ce67d1000200020000
> These warnings are present on all three servers and for all replication
> agreements. One of the servers is virtual and the two others are physical.
> Replication still seems to work fine in spite of these warnings. The
> "replica_generate_next_csn" message is not new; it has always been there with
> 1.3.4. The two new warnings are "clcache_load_buffer_bulk" and "Can't locate
> CSN ... in the changelog (DB rc=-30988)." There are no network problems or
> anything like that. So it could only be the replication topology (3-master
> fully-connected triangle) and/or the servers being rather busy. Is it a bug, a
> warning that can be ignored, or something else?
> Thank you!
--
Red Hat GmbH,
http://www.de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Michael Cunningham, Michael O'Neill, Eric
Shander