[389-devel] fractional replication monitoring proposal

Rich Megginson rmeggins at redhat.com
Wed Oct 16 14:51:54 UTC 2013


On 10/15/2013 02:41 PM, Mark Reynolds wrote:
> https://fedorahosted.org/389/ticket/47368
>
> So we run into issues when trying to figure out if replicas are in
> sync (if those replicas use fractional replication and "strip mods").
> What happens is that an update is made on master A, but due to
> fractional replication no update is sent to any of the replicas. So if
> you look at the RUV in the tombstone entry on each server, the servers
> appear to be out of sync. In other words, the RUV in the db tombstone
> is no longer an accurate indicator when fractional replication is used.

So what we really want to know is: when the RUV max CSN values differ
between servers, how much of that difference is due only to operations
that were intentionally not replicated (stripped or filtered out by
fractional replication)?
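
For reference, here is roughly how monitoring reads the database RUV
today - a quick python-ldap sketch, with made-up hostnames and
credentials; the nsuniqueid=ffffffff-... tombstone and the nsds50ruv
attribute are the ones we already expose. Under fractional replication
the two maxCSN maps printed at the end can legitimately differ even
though both servers hold the same (filtered) data:

import ldap

SUFFIX = "dc=example,dc=com"
RUV_FILTER = ("(&(nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff)"
              "(objectclass=nstombstone))")

def db_ruv_maxcsns(url, binddn="cn=Directory Manager", password="secret"):
    """Return {replica_id: maxCSN} parsed from the database RUV tombstone."""
    conn = ldap.initialize(url)
    conn.simple_bind_s(binddn, password)
    entries = conn.search_s(SUFFIX, ldap.SCOPE_SUBTREE, RUV_FILTER, ["nsds50ruv"])
    maxcsns = {}
    for _dn, attrs in entries:
        for raw in attrs.get("nsds50ruv", []):
            val = raw.decode("utf-8") if isinstance(raw, bytes) else raw
            # e.g. "{replica 1 ldap://m1.example.com:389} <minCSN> <maxCSN>"
            if not val.startswith("{replica "):
                continue            # skip the {replicageneration} value
            rid = int(val.split()[1])
            csns = val.split("}", 1)[1].split()
            maxcsns[rid] = csns[-1] if csns else None
    return maxcsns

# Under fractional replication these maps can legitimately differ even
# though both servers hold the same (filtered) data:
print(db_ruv_maxcsns("ldap://m1.example.com:389"))
print(db_ruv_maxcsns("ldap://m2.example.com:389"))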

>
> I'm proposing a new RUV to be stored in the backend replica entry,
> e.g. cn=replica,cn="dc=example,dc=com",cn=mapping tree,cn=config. I'm
> calling this the "replicated RUV". So whenever we actually send an
> update to a replica, this RUV gets updated. Since we can not simply
> compare this "replicated RUV" to the replica's tombstone RUV, we can
> instead compare the "replicated RUV" to the RUV in the replica's repl
> agreement (unless it is a dedicated consumer - there we might still be
> able to look at the db tombstone RUV to determine the status).

Will have to check to see if, on a dedicated consumer, the RUV is 
updated by internal operations.
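
To make that comparison concrete, here is a rough python-ldap sketch.
The attribute name nsds5ReplicatedRUV is a placeholder I made up for
illustration (the proposal has not named the attribute yet); the
agreement's nsds50ruv - the consumer RUV we already cache per
agreement - is real:

import ldap

REPLICA_DN = 'cn=replica,cn="dc=example,dc=com",cn=mapping tree,cn=config'

def parse_maxcsns(values):
    """{replica_id: maxCSN} from a list of nsds50ruv-style values."""
    out = {}
    for raw in values:
        val = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        if val.startswith("{replica "):
            rid = int(val.split()[1])
            csns = val.split("}", 1)[1].split()
            out[rid] = csns[-1] if csns else None
    return out

conn = ldap.initialize("ldap://m1.example.com:389")
conn.simple_bind_s("cn=Directory Manager", "secret")

# The proposed RUV, bumped only when we actually send an update
# ("nsds5ReplicatedRUV" is a placeholder name, not an existing attribute).
replica = conn.search_s(REPLICA_DN, ldap.SCOPE_BASE,
                        attrlist=["nsds5ReplicatedRUV"])
replicated = parse_maxcsns(replica[0][1].get("nsds5ReplicatedRUV", []))

# Compare it to the consumer RUV already cached in each agreement.
for dn, attrs in conn.search_s(REPLICA_DN, ldap.SCOPE_ONELEVEL,
                               "(objectclass=nsds5ReplicationAgreement)",
                               ["nsds50ruv"]):
    agmt = parse_maxcsns(attrs.get("nsds50ruv", []))
    diffs = {rid: (mcsn, agmt.get(rid))
             for rid, mcsn in replicated.items() if agmt.get(rid) != mcsn}
    print(dn, "in sync" if not diffs else "differs: %r" % diffs)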

>
> Problems with this approach:
>
> -  All the servers need to have the same replication configuration (the
> same fractional replication policy and attribute stripping) to give
> accurate results.
>
> -  If one replica has an agreement that does NOT filter updates
> alongside agreements that do filter, then we can not correctly
> determine its synchronization state with the fractional replicas.
>
> -  Performance hit from updating another RUV (in cn=config)?

Yes. We already have a lot of churn in dse.ldif due to the uuid
generator, the csn generator, and updating the consumer RUV in each
replication agreement.

>
>
> Fractional replication simply breaks our monitoring process. I'm not
> sure we can cover all deployment scenarios (mixed fractional repl
> agmts, etc.) without updating the repl protocol. However, I "think"
> this approach would work for most deployments (compared to none at the
> moment). For IPA, since they don't use consumers, this approach would
> work for them. And finally, all of this would have to be handled by an
> updated version of repl-monitor.pl.
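
For what it's worth, the per-peer decision an updated repl-monitor.pl
would have to make could look something like the sketch below (Python
only for illustration; it assumes the proposed replicated RUV exists
and all names are placeholders):

# Rough sketch of the choice an updated monitor would have to make
# (Python only for illustration; the real tool is repl-monitor.pl).
def pick_comparison(peer_is_dedicated_consumer,
                    replicated_ruv, agreement_ruv, db_tombstone_ruv):
    """Return the pair of RUVs worth comparing for a given peer."""
    if peer_is_dedicated_consumer:
        # No outbound agreements on the consumer; its db tombstone RUV may
        # still be usable (pending the internal-operations question above).
        return replicated_ruv, db_tombstone_ruv
    # Otherwise compare what we actually sent against what the agreement
    # last recorded for that peer.
    return replicated_ruv, agreement_ruv
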
>
> This is just my preliminary idea on how to handle this. Feedback is 
> welcome!!
>
> Thanks in advance,
> Mark
>
> -- 
> Mark Reynolds
> 389 Development Team
> Red Hat, Inc
> mreynolds at redhat.com
>
