On 10/15/2013 02:41 PM, Mark Reynolds wrote:
https://fedorahosted.org/389/ticket/47368
So we run into issues when trying to figure out if replicas are in
sync (if those replicas use fractional replication and "strip mods").
What happens is that an update is made on master A, but due to
fractional replication no update is sent to any of the replicas. So if
you look at the ruv in the tombstone entry on each server, it would
appear they are out of sync. So the ruv in the db tombstone is
no longer accurate when using fractional replication.
So what we really want to know is: when comparing RUV max CSN values
shows differences, how much of the difference is due only to
unreplicated operations?
I'm proposing a new ruv to be stored in the backend replica entry,
e.g. cn=replica,cn="dc=example,dc=com",cn=mapping tree,cn=config. I'm
calling this the "replicated ruv". Whenever we actually send an
update to a replica, this ruv will get updated. Since we cannot
compare this "replicated ruv" to the replica's tombstone ruv, we can
instead compare the "replicated ruv" to the ruv in the replica's repl
agreement (unless it is a dedicated consumer; here we might still be
able to look at the db tombstone ruv to determine the status).
Will have to check to see if, on a dedicated consumer, the RUV is
updated by internal operations.
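The comparison proposed above could be sketched roughly as follows.
This is only an illustration, not how repl-monitor.pl works: the
function names are hypothetical, each ruv is reduced to a plain
"replica ID -> max CSN" mapping, and the CSN layout (8 hex digits of
timestamp, then 4 each for sequence number, replica ID and
sub-sequence number) is an assumption.

```python
from typing import Dict, List, NamedTuple


class CSN(NamedTuple):
    """Fields in comparison order, so tuple ordering matches CSN ordering."""
    timestamp: int
    seqnum: int
    rid: int
    subseqnum: int


def parse_csn(text: str) -> CSN:
    # Assumed layout: 8 hex digits of timestamp, then 4 hex digits each
    # for sequence number, replica ID and sub-sequence number.
    if len(text) != 20:
        raise ValueError("expected a 20-character CSN, got %r" % text)
    return CSN(
        int(text[0:8], 16),
        int(text[8:12], 16),
        int(text[12:16], 16),
        int(text[16:20], 16),
    )


def lagging_rids(replicated_ruv: Dict[int, str],
                 agreement_ruv: Dict[int, str]) -> List[int]:
    """Return the replica IDs for which the agreement's ruv is behind.

    replicated_ruv: max CSN per replica ID for updates actually sent
                    (the proposed "replicated ruv").
    agreement_ruv:  max CSN per replica ID recorded in the repl agreement.
    """
    behind = []
    for rid, sent in replicated_ruv.items():
        seen = agreement_ruv.get(rid)
        # A missing element counts as behind: nothing was acknowledged yet.
        if seen is None or parse_csn(seen) < parse_csn(sent):
            behind.append(rid)
    return sorted(behind)


if __name__ == "__main__":
    # Made-up CSN values, for illustration only.
    replicated = {1: "525d9a0f000000010000", 2: "525d9a10000100020000"}
    agreement = {1: "525d9a0f000000010000", 2: "525d9a10000000020000"}
    print(lagging_rids(replicated, agreement))
```

An empty result would mean that everything actually sent has been
seen by the consumer, regardless of what the db tombstone ruv says.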
Problems with this approach:
- All the servers need to have the same replication configuration (the
same fractional replication policy and attribute stripping) to give
accurate results.
- If one replica has an agreement that does NOT filter the updates,
but also has agreements that do filter updates, then we cannot
correctly determine its synchronization state with the fractional
replicas.
- Performance hit from updating another ruv (in cn=config)?
Yes. We already have a lot of churn in dse.ldif due to the uuid
generator, the csn generator, and updating the consumer RUV in each
replication agreement.
Fractional replication simply breaks our monitoring process. I'm not
sure that, without updating the repl protocol, we can cover all
deployment scenarios (mixed fractional repl agmts, etc.). However, I
"think" this approach would work for most deployments (compared to none
at the moment). For IPA, since they don't use consumers, this
approach would work for them. And finally, all of this would have to
be handled by an updated version of repl-monitor.pl.
This is just my preliminary idea on how to handle this. Feedback is
welcome!!
Thanks in advance,
Mark
--
Mark Reynolds
389 Development Team
Red Hat, Inc
mreynolds(a)redhat.com