On 09/06/2016 02:02 PM, Ivanov Andrey (M.) wrote:
> Hi Ludwig,
>
>
> The fixes for the tickets you mention did change the iteration
> through the changelog and how it handles situations when the start
> CSN is not found in the changelog. They also changed the
> logging, so you might now see messages which were not there or
> were hidden before.
>
> That was my understanding too.
So far I have not seen any replication problems related to these
messages; all generated CSNs seem to be replicated. What makes it a bit
more difficult is that most of the updates are updates of
lastlogintime and the original MOD is not logged. I still do not
understand why we have these messages so frequently; I will try to
reproduce.
Or, if it is possible, could you run the servers for just an hour with
replication logging enabled?
There is no more need for this: I found the messages in a deployment
where replication logging was enabled. I think it happens when the
smallest consumer maxCSN is ahead of the local maxCSN for this replicaID.
It should do no harm, but in some scenarios it could slow down replication
a bit.
I will continue to investigate and work on a fix.
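
For anyone following the CSNs quoted in this thread, here is a minimal Python sketch of decoding and ordering them. It assumes the usual 20-hex-digit layout (8 digits of Unix timestamp, 4 of sequence number, 4 of replica ID, 4 of subsequence number). The names `Csn` and `parse_csn` are made up for this illustration and are not part of 389 DS.

```python
from collections import namedtuple
from datetime import datetime, timezone

# Hypothetical helper type; field order mirrors the assumed CSN layout,
# so plain tuple comparison orders CSNs field by field.
Csn = namedtuple("Csn", "timestamp seqnum replica_id subseqnum")

def parse_csn(text):
    """Split a 20-hex-digit CSN string into its four numeric fields."""
    if len(text) != 20:
        raise ValueError("expected 20 hex digits, got %r" % text)
    return Csn(
        timestamp=int(text[0:8], 16),
        seqnum=int(text[8:12], 16),
        replica_id=int(text[12:16], 16),
        subseqnum=int(text[16:20], 16),
    )

# First CSN from the reported warnings.
csn = parse_csn("57cdfe06000100010000")
when = datetime.fromtimestamp(csn.timestamp, tz=timezone.utc)
print(csn.replica_id, csn.seqnum, when.isoformat())

# The pair from one of the replica_generate_next_csn messages.
opcsn = parse_csn("57ce0f4e000500020000")
basecsn = parse_csn("57ce0f4e000500030000")
print(opcsn <= basecsn)
```

The first CSN decodes to replica ID 1 with a timestamp of 2016-09-05 23:21:42 UTC, one second before the warning logged at 01:21:43 +0200. The same field-by-field ordering is what a maxCSN comparison like the one above would rest on, under the stated layout assumption.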
When looking into the provided data set I noticed three replicated
ops with err=50 (insufficient access). This should not happen and
requires a separate investigation.
>
>
> But I am very surprised to see them so frequently and I would
> like to understand it.
> First, some questions: do you have changelog trimming enabled (and
> how), and do you have fractional replication?
>
> yes for both questions.
>
> Trimming: 14 days
> Fractional replication:
> nsDS5ReplicatedAttributeList: (objectclass=*) $ EXCLUDE entryusn memberOf
> nsDS5ReplicatedAttributeListTotal: (objectclass=*) $ EXCLUDE entryusn
> nsds5ReplicaStripAttrs: modifiersName modifyTimestamp
> internalModifiersName internalModifyTimestamp internalCreatorsname
>
> Changelog:
> cn=changelog5,cn=config
> objectClass: top
> objectClass: extensibleObject
> cn: changelog5
> nsslapd-changelogdir: /Local/dirsrv/var/lib/dirsrv/slapd-ens/changelogdb
> nsslapd-changelogmaxage: 14d
>
>
> replica:
> cn=replica,cn=dc\\3Did\\2Cdc\\3Dpolytechnique\\2Cdc\\3Dedu,cn=mapping
> tree,cn=config
> objectClass: top
> objectClass: nsDS5Replica
> cn: replica
> nsDS5ReplicaId: 1
> nsDS5ReplicaRoot: dc=id,dc=polytechnique,dc=edu
> nsDS5Flags: 1
> nsDS5ReplicaBindDN: cn=RepliX,cn=config
> nsds5ReplicaPurgeDelay: 604800
> nsds5ReplicaTombstonePurgeInterval: 86400
> nsds5ReplicaLegacyConsumer: False
> nsDS5ReplicaType: 3
> nsState:: AQAAAAAAAADCrc5XAAAAAAAAAAAAAAAAAQAAAAAAAAABAAAAAAAAAA==
> nsDS5ReplicaName: eeb6d304-736c11e6-9bc5a1ff-40280b8e
> nsds5ReplicaChangeCount: 114948
> nsds5replicareapactive: 0
>
>
> Typical replication agreement:
>
> cn=Replication from ldap-lab.<domain name> to ldap-adm.<domain
> name>,cn=replica,cn=dc\\3Did\\2Cdc\\3Dpolytechnique\\2Cdc\\3Dedu,cn=mapping
> tree,cn=config
> objectClass: top
> objectClass: nsDS5ReplicationAgreement
> cn: Replication from ldap-lab.<domain name> to ldap-adm.<domain name>
> description: Replication agreement from server ldap-lab.<domain name>
> to server ldap-adm.<domain name>
> nsDS5ReplicaHost: ldap-adm.<domain name>
> nsDS5ReplicaRoot: dc=id,dc=polytechnique,dc=edu
> nsDS5ReplicaPort: 636
> nsDS5ReplicaTransportInfo: SSL
> nsDS5ReplicaBindDN: cn=RepliX,cn=config
> nsDS5ReplicaBindMethod: simple
> nsDS5ReplicatedAttributeList: (objectclass=*) $ EXCLUDE entryusn memberOf
> nsDS5ReplicatedAttributeListTotal: (objectclass=*) $ EXCLUDE entryusn
> nsds5ReplicaStripAttrs: modifiersName modifyTimestamp
> internalModifiersName internalModifyTimestamp internalCreatorsname
> nsds5replicaBusyWaitTime: 5
> nsds5ReplicaFlowControlPause: 500
> nsds5ReplicaFlowControlWindow: 1000
> nsds5replicaTimeout: 120
> nsDS5ReplicaCredentials: {AES-...
> nsds50ruv: {replicageneration} 57cd7377000000020000
> nsds50ruv: {replica 2 ldap://ldap-adm.<domain name>:389}
> nsruvReplicaLastModified: {replica 2 ldap://ldap-adm.<domain
> name>:389} 00000000
> nsds5replicareapactive: 0
> nsds5replicaLastUpdateStart: 20160906115520Z
> nsds5replicaLastUpdateEnd: 20160906115520Z
> nsds5replicaChangesSentSinceStartup: 3:13525/670 1:3671/0 2:1/0
> nsds5replicaLastUpdateStatus: 0 Replica acquired successfully:
> Incremental update succeeded
> nsds5replicaUpdateInProgress: FALSE
> nsds5replicaLastInitStart: 19700101000000Z
> nsds5replicaLastInitEnd: 19700101000000Z
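
The nsds5replicaChangesSentSinceStartup value in the agreement above can be split per replica ID with a short Python sketch. It assumes each token has the form replicaID:sent/skipped. That interpretation is an assumption here, not something stated in the thread, and `parse_changes_sent` is a hypothetical helper name.

```python
def parse_changes_sent(value):
    """Return {replica_id: (sent, skipped)} from 'rid:sent/skipped ...'.

    The token layout is an assumption made for this sketch.
    """
    counts = {}
    for token in value.split():
        rid, _, pair = token.partition(":")
        sent, _, skipped = pair.partition("/")
        counts[int(rid)] = (int(sent), int(skipped))
    return counts

# Value copied from the agreement entry quoted above.
counts = parse_changes_sent("3:13525/670 1:3671/0 2:1/0")
for rid, (sent, skipped) in sorted(counts.items()):
    print("replica %d: sent=%d skipped=%d" % (rid, sent, skipped))
```

Under that reading, only the changes originating from replica 3 show a non-zero second counter (670) on this agreement.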
>
>
>
> Next, is it possible to get the access and error logs for a
> period of an hour from all servers (you can send them off list) ?
> I would like to track some of the reported csns.
>
> Sure, i will send it to you off list in a moment.
>
> Thank you,
>
> Regards,
> Andrey
>
>
>
> Regards,
> Ludwig
>
>
> On 09/06/2016 12:31 PM, Ivanov Andrey (M.) wrote:
>
> Hi,
>
> We have been successfully using the compiled 1.3.4 git branch of
> 389DS in production on CentOS 7 for about a year
> (approximately 40 000 entries, about 4000 groups, hundreds of
> reads and tens of writes per second).
> Our current topology consists of 3 servers in a triangle (each
> server is a master replicating to the 2 others, so two read-write
> replication agreements on each).
>
> Since the fixes for the Ticket 48766 ("Replication changelog
> can incorrectly skip over updates") and Ticket 48954
> ("Replication fails because anchorcsn cannot be found") I’ve
> started to see the following regular warnings in error logs:
>
> [06/Sep/2016:01:21:43 +0200] clcache_load_buffer_bulk -
> changelog record with csn (57cdfe06000100010000) not found
> for DB_NEXT
> [06/Sep/2016:01:21:43 +0200] agmt="cn=Replication from
> ldap-adm.<domain> to ldap-lab.<domain>" (ldap-lab:636) -
> Can't locate CSN 57cdfe06000100010000 in the changelog (DB
> rc=-30988). If replication stops, the consumer may need to be
> reinitialized.
> [06/Sep/2016:02:35:25 +0200] - replica_generate_next_csn:
> opcsn=57ce0f4e000500020000 <= basecsn=57ce0f4e000500030000,
> adjusted opcsn=57ce0f4e000600020000
> [06/Sep/2016:04:10:11 +0200] clcache_load_buffer_bulk -
> changelog record with csn (57ce257e000400030000) not found
> for DB_NEXT
> [06/Sep/2016:05:16:58 +0200] - replica_generate_next_csn:
> opcsn=57ce352b000000020000 <= basecsn=57ce352b000100010000,
> adjusted opcsn=57ce352b000100020000
> [06/Sep/2016:06:56:04 +0200] agmt="cn=Replication from
> ldap-adm.<domain> to ldap-ens.<domain>" (ldap-ens:636) -
> Can't locate CSN 57ce4c62000100030000 in the changelog (DB
> rc=-30988). If replication stops, the consumer may need to be
> reinitialized.
> [06/Sep/2016:07:29:00 +0200] agmt="cn=Replication from
> ldap-adm.<domain> to ldap-ens.<domain>" (ldap-ens:636) -
> Can't locate CSN 57ce541a000200030000 in the changelog (DB
> rc=-30988). If replication stops, the consumer may need to be
> reinitialized.
> [06/Sep/2016:07:34:20 +0200] agmt="cn=Replication from
> ldap-adm.<domain> to ldap-lab.<domain>" (ldap-lab:636) -
> Can't locate CSN 57ce5559000100010000 in the changelog (DB
> rc=-30988). If replication stops, the consumer may need to be
> reinitialized.
> [06/Sep/2016:07:34:27 +0200] agmt="cn=Replication from
> ldap-adm.<domain> to ldap-lab.<domain>" (ldap-lab:636) -
> Can't locate CSN 57ce5561000000010000 in the changelog (DB
> rc=-30988). If replication stops, the consumer may need to be
> reinitialized.
> [06/Sep/2016:07:40:17 +0200] clcache_load_buffer_bulk -
> changelog record with csn (57ce56c0000500030000) not found
> for DB_NEXT
> [06/Sep/2016:07:40:24 +0200] clcache_load_buffer_bulk -
> changelog record with csn (57ce56c5000100030000) not found
> for DB_NEXT
> [06/Sep/2016:08:08:36 +0200] clcache_load_buffer_bulk -
> changelog record with csn (57ce5d5f000f00010000) not found
> for DB_NEXT
> [06/Sep/2016:08:12:39 +0200] clcache_load_buffer_bulk -
> changelog record with csn (57ce5e54000200030000) not found
> for DB_NEXT
> [06/Sep/2016:08:12:39 +0200] agmt="cn=Replication from
> ldap-adm.<domain> to ldap-ens.<domain>" (ldap-ens:636) -
> Can't locate CSN 57ce5e54000200030000 in the changelog (DB
> rc=-30988). If replication stops, the consumer may need to be
> reinitialized.
> [06/Sep/2016:08:26:45 +0200] clcache_load_buffer_bulk -
> changelog record with csn (57ce61a3000200030000) not found
> for DB_NEXT
> [06/Sep/2016:08:27:40 +0200] clcache_load_buffer_bulk -
> changelog record with csn (57ce61d8000200030000) not found
> for DB_NEXT
> [06/Sep/2016:08:27:40 +0200] agmt="cn=Replication from
> ldap-adm.<domain> to ldap-ens.<domain>" (ldap-ens:636) -
> Can't locate CSN 57ce61d8000200030000 in the changelog (DB
> rc=-30988). If replication stops, the consumer may need to be
> reinitialized.
> [06/Sep/2016:08:31:42 +0200] clcache_load_buffer_bulk -
> changelog record with csn (57ce62c8000300010000) not found
> for DB_NEXT
> [06/Sep/2016:08:34:05 +0200] clcache_load_buffer_bulk -
> changelog record with csn (57ce635a000100010000) not found
> for DB_NEXT
> [06/Sep/2016:08:44:28 +0200] clcache_load_buffer_bulk -
> changelog record with csn (57ce65c9000200030000) not found
> for DB_NEXT
> [06/Sep/2016:08:52:25 +0200] agmt="cn=Replication from
> ldap-adm.<domain> to ldap-ens.<domain>" (ldap-ens:636) -
> Can't locate CSN 57ce67aa000100030000 in the changelog (DB
> rc=-30988). If replication stops, the consumer may need to be
> reinitialized.
> [06/Sep/2016:08:53:04 +0200] - replica_generate_next_csn:
> opcsn=57ce67d1000100020000 <= basecsn=57ce67d1000200030000,
> adjusted opcsn=57ce67d1000200020000
>
> These warnings are present on all three servers and for all
> replication agreements. One of the servers is virtual and the two
> others are physical.
>
> The replication still seems to work fine in spite of these
> warnings. The "replica_generate_next_csn" message is not new; it
> has always been present with 1.3.4. The two new warnings are
> "clcache_load_buffer_bulk" and "Can't locate CSN ... in the
> changelog (DB rc=-30988)." There are no network problems or
> anything like that. So it could only be the replication topology
> (3-master fully-connected triangle) and/or the servers being
> rather busy. Is it a bug, a warning that can be ignored, or
> something else?
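
To see which originating replica the two new warnings refer to, the error log can be tallied with a short Python sketch. It assumes the 20-hex-digit CSN layout in which digits 13 to 16 hold the replica ID. The function name `tally_by_replica` and the label strings are invented for this example. The sample text is copied from the excerpt above, including the mail line wrapping.

```python
import re
from collections import Counter

# Patterns for the two new warnings; \s+ tolerates wrapped lines.
NOT_FOUND = re.compile(
    r"changelog\s+record\s+with\s+csn\s+\(([0-9a-f]{20})\)\s+not\s+found")
CANT_LOCATE = re.compile(
    r"Can't\s+locate\s+CSN\s+([0-9a-f]{20})\s+in\s+the\s+changelog")

def tally_by_replica(log_text):
    """Count each warning type per replica ID taken from the CSN."""
    counts = Counter()
    for label, pattern in (("not_found", NOT_FOUND),
                           ("cant_locate", CANT_LOCATE)):
        for csn in pattern.findall(log_text):
            # Assumed layout: hex digits 13-16 are the replica ID.
            counts[(label, int(csn[12:16], 16))] += 1
    return counts

sample = """
[06/Sep/2016:08:12:39 +0200] clcache_load_buffer_bulk -
changelog record with csn (57ce5e54000200030000) not found
for DB_NEXT
[06/Sep/2016:08:12:39 +0200] agmt="cn=Replication from
ldap-adm.<domain> to ldap-ens.<domain>" (ldap-ens:636) -
Can't locate CSN 57ce5e54000200030000 in the changelog (DB
rc=-30988). If replication stops, the consumer may need to be
reinitialized.
[06/Sep/2016:08:31:42 +0200] clcache_load_buffer_bulk -
changelog record with csn (57ce62c8000300010000) not found
for DB_NEXT
"""

counts = tally_by_replica(sample)
for (label, rid), n in sorted(counts.items()):
    print(label, "replica", rid, n)
```

Run against a full error log, a tally like this would show whether the warnings cluster on CSNs from one replica ID or are spread evenly across all three masters.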
>
>
> Thank you!
>
>
>
> --
> 389-users mailing list
> 389-users@lists.fedoraproject.org
> https://lists.fedoraproject.org/admin/lists/389-users@lists.fedoraproject.org
>
>
> --
> Red Hat GmbH, http://www.de.redhat.com/, Registered seat: Grasbrunn,
> Commercial register: Amtsgericht Muenchen, HRB 153243,
> Managing Directors: Charles Cachera, Michael Cunningham, Michael O'Neill,
> Eric Shander
>
--
Red Hat GmbH, http://www.de.redhat.com/, Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Michael Cunningham, Michael O'Neill,
Eric Shander