[389-devel] fractional replication monitoring proposal

Thu Oct 17 08:49:55 UTC 2013

On 10/17/2013 10:15 AM, thierry bordaz wrote:
> On 10/16/2013 05:41 PM, Ludwig Krispenz wrote:
>>
>> On 10/16/2013 05:28 PM, Mark Reynolds wrote:
>>>
>>> On 10/16/2013 11:05 AM, Ludwig Krispenz wrote:
>>>>
>>>> On 10/15/2013 10:41 PM, Mark Reynolds wrote:
>>>>> https://fedorahosted.org/389/ticket/47368
>>>>>
>>>>> So we run into issues when trying to figure out if replicas are in
>>>>> synch (if those replicas use fractional replication and "strip
>>>>> mods").  What happens is that an update is made on master A, but 
>>>>> due to fractional replication there is no update made to any 
>>>>> replicas. So if you look at the ruv in the tombstone entry on each 
>>>>> server, it would appear they are out of synch.  So using the ruv 
>>>>> in the db tombstone is no longer accurate when using fractional 
>>>>> replication.
>>>>>
>>>>> I'm proposing a new ruv to be stored in the backend replica entry: 
>>>>> e.g. cn=replica,cn="dc=example,dc=com",cn=mapping tree,cn=config. 
>>>>> I'm calling this the "replicated ruv". So whenever we actually 
>>>>> send an update to a replica, this ruv will get updated.
>>>> I don't see how this will help. You have additional info on what
>>>> has been replicated (which is available on the consumer as well)
>>>> and you have a max csn, but you don't know if there are outstanding
>>>> fractional changes to be sent.
>>> Well, you will know on master A what operations get replicated (this
>>> updates the new ruv before sending any changes), and you can use
>>> this ruv to compare against the other master B's ruv (in its
>>> replication agreement).  Maybe I am missing your point?
>> My point is that the question is what is NOT yet replicated. Without
>> fractional replication you have states of the ruv on all servers, and
>> if ruv(A) > ruv(B) you know there are updates missing on B. With
>> fractional, if ruv(A) > ruv(B) this might be ok or not. If you keep
>> an additional ruv on A when sending updates to B, you can only
>> record what was sent or attempted to send, but not what still has to
>> be sent.
>
> I agree with you Ludwig, but unless I missed something, would it not be
> enough to know whether replica B is late or in sync?
>
> For example, we have updates U1 U2 U3 and U4. U3 should be skipped by 
> fractional replication.
>
> replica RUV (tombstone) on master_A contains U4 and master_B replica 
> RUV contains U1.
> Let's assume that as initial value of the "replicated ruv" on master_A 
> we have U1.
> Starting a replication session, master_A should send U2 and update the 
> "replicated ruv" to U2.
> If the update is successfully applied on master_B, master_B replica
> ruv is U2 and monitoring the two ruv should show they are in sync.
They are not, since U4 is not yet replicated. On master_A you see the
"normal" ruv as U4 and the "replicated" ruv as U2, but you don't know
how many changes are between U2 and U4 and whether any of them should be
replicated. The replicated ruv is more or less a local copy of the
remote ruv.
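
To make the gap concrete, here is a minimal sketch of the U1..U4 example.
It is only an illustration under simplifying assumptions: CSNs are plain
integers, the changelog is a list, and the names (CHANGELOG,
pending_updates) are hypothetical, not taken from the server code.

```python
# Hypothetical model: CSNs are plain integers and each changelog entry
# records whether the update survives fractional replication.
CHANGELOG = [
    (1, True),   # U1: replicated
    (2, True),   # U2: replicated
    (3, False),  # U3: skipped by fractional replication
    (4, True),   # U4: must still be sent
]

def pending_updates(changelog, replicated_csn):
    """Return CSNs newer than replicated_csn that still have to be sent."""
    return [csn for csn, replicable in changelog
            if csn > replicated_csn and replicable]

local_ruv = max(csn for csn, _ in CHANGELOG)  # the "normal" ruv on master_A
replicated_ruv = 2                            # last update actually sent
consumer_ruv = 2                              # master_B after applying U2

# Comparing the two ruvs reports "in sync" ...
looks_in_sync = replicated_ruv == consumer_ruv
# ... but only a scan of the changelog shows that U4 is still outstanding.
still_to_send = pending_updates(CHANGELOG, replicated_ruv)
print(looks_in_sync, still_to_send)
```

The comparison alone cannot tell U3 (safe to skip) from U4 (still
owed to master_B); that information only exists once the changes between
the two CSNs have been evaluated.
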
> If the update is not applied, master_B replica ruv stays at U1 and
> the two ruv will show out of sync.
>
> In the first case, we have a transient status of 'in sync' because the
> replica agreement will evaluate U3, then U4, then send U4 and store it
> in the "replicated ruv". At this point master_A and master_B will
> appear out of sync until master_B applies U4.
> If U4 was to be skipped by fractional, we have master_B ruv and
> master_A replicated ruv both showing U2, and that is correct: both
> servers are in sync.
>
> Mark, instead of storing the replicated ruv in the replica, would it not
> be possible to store it in the replica agreement (one replicated ruv
> per RA), so that it can solve the problem of different fractional
> replication policies?
>
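
A rough sketch of the "one replicated ruv per RA" idea follows. The
agreement names, the excluded attribute (memberOf) and the integer CSNs
are all assumptions made for illustration; nothing here reflects the
actual agreement schema.

```python
# Hypothetical agreements on master_A, each with its own excluded attributes.
AGREEMENTS = {
    "A_to_B": {"memberof"},  # fractional: memberOf is not replicated
    "A_to_C": set(),         # non-fractional: everything is replicated
}

# Updates as (csn, attributes modified); CSNs are simplified to integers.
UPDATES = [(1, {"cn"}), (2, {"mail"}), (3, {"memberof"}), (4, {"memberof"})]

def replicated_ruv(excluded, updates):
    """Highest CSN that survives the agreement's fractional filter (0 if none)."""
    sent = [csn for csn, attrs in updates if attrs - excluded]
    return max(sent, default=0)

# One replicated ruv per agreement instead of a single one per replica.
per_agreement = {name: replicated_ruv(excluded, UPDATES)
                 for name, excluded in AGREEMENTS.items()}
print(per_agreement)
```

With a single ruv on the replica entry the two agreements would have to
share one value; keeping it per agreement lets the fractional agreement
stop at U2 while the unfiltered one advances to U4.
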
>>> Do you mean changes that have not been read from the changelog yet?  
>>> My plan was to update the new ruv in perform_operation() - right 
>>> after all the "stripping" has been done and there is something to 
>>> replicate.  We need to have a ruv for replicated operations.
>>>
>>> I guess there are other scenarios I didn't think of, like if
>>> replication is in a backoff state, and valid changes are coming in.
>>> Maybe we could do the "stripping" test earlier in the replication
>>> process (when writing to the changelog?), and then update the new ruv
>>> there instead of waiting until we try and send the changes.
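
As a sketch of what such an early "stripping" test could look like:
the function name strip_mods, the tuple layout of the mods and the
attribute sets are hypothetical, chosen only to show the decision, not
how the plugin implements it.

```python
def strip_mods(mods, excluded_attrs, strip_attrs):
    """Return the mods left to replicate, or [] if the update becomes empty."""
    # Drop attributes excluded by the fractional policy.
    kept = [(attr, val) for attr, val in mods
            if attr.lower() not in excluded_attrs]
    # If only "strip" attributes remain, there is nothing worth sending.
    if all(attr.lower() in strip_attrs for attr, _ in kept):
        return []
    return kept

# Assumed example policy: memberOf is excluded, bookkeeping attributes are stripped.
excluded = {"memberof"}
strip = {"modifiersname", "modifytimestamp"}

skipped = strip_mods(
    [("memberOf", "cn=group"), ("modifyTimestamp", "20131017084955Z")],
    excluded, strip)
sent = strip_mods(
    [("mail", "user@example.com"), ("modifyTimestamp", "20131017084955Z")],
    excluded, strip)
print(skipped, sent)
```

Under this model the new ruv would be advanced only when the result is
non-empty, which could be decided when the change is written to the
changelog rather than when it is sent.
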
>>>>> Since we can not compare this "replicated ruv" to the replica's
>>>>> tombstone ruv, we can instead compare the "replicated ruv" to the
>>>>> ruv in the replica's repl agreement (unless it is a dedicated
>>>>> consumer - here we might be able to still look at the db tombstone
>>>>> ruv to determine the status).
>>>>>
>>>>> Problems with this approach:
>>>>>
>>>>> -  All the servers need to have the same replication
>>>>> configuration (the same fractional replication policy and attribute
>>>>> stripping) to give accurate results.
>>>>>
>>>>> -  If one replica has an agreement that does NOT filter the
>>>>> updates, but has other agreements that do filter updates, then we
>>>>> can not correctly determine its synchronization state with the
>>>>> fractional replicas.
>>>>>
>>>>> -  Performance hit from updating another ruv (in cn=config)?
>>>>>
>>>>>
>>>>> Fractional replication simply breaks our monitoring process.  I'm
>>>>> not sure that, without updating the repl protocol, we can
>>>>> cover all deployment scenarios (mixed fractional repl agmts, etc).
>>>>> However, I "think" this approach would work for most
>>>>> deployments (compared to none at the moment).  For IPA, since they
>>>>> don't use consumers, this approach would work for them.  And
>>>>> finally, all of this would have to be handled by an updated version
>>>>> of repl-monitor.pl.
>>>>>
>>>>> This is just my preliminary idea on how to handle this. Feedback 
>>>>> is welcome!!
>>>>>
>>>>> Thanks in advance,
>>>>> Mark
>>>>>
>>>>> -- 
>>>>> Mark Reynolds
>>>>> 389 Development Team
>>>>> Red Hat, Inc
>>>>> mreynolds at redhat.com
>>>>>
>>>>>
>>>>> --
>>>>> 389-devel mailing list
>>>>> 389-devel at lists.fedoraproject.org
>>>>> https://admin.fedoraproject.org/mailman/listinfo/389-devel
>>>>
>>>
>>> -- 
>>> Mark Reynolds
>>> 389 Development Team
>>> Red Hat, Inc
>>> mreynolds at redhat.com
>>
>
