On 09/07/2011 05:06 PM, Noriko Hosoi wrote:
Rich Megginson wrote:
> The problem comes from the method we use to check if the changelog does
> not match the database in replica_check_for_data_reload(). The RUV in
> the database contains obsolete elements from replicas that are no longer
> in use. replica_check_for_data_reload() uses ruv_covers_ruv() to see if
> all of the max csns in the database ruv are in the changelog maxruv, and
> vice versa. It fails because the database ruv contains these obsolete
> elements not found in the changelog maxruv.
> My question is - why do we care? Isn't it sufficient to check that the
> replicageneration in the changelog is the same as the replicageneration
> in the database ruv? The replicageneration is supposed to be the unique
> identifier of the "starting point" of the replicated data.
> If the data is reloaded (e.g. from an ldif not created with db2ldif -r),
> a new replicageneration will be created, and the data will mismatch.
That's right. And is the problem that the database RUV is never updated
once the data is reloaded from such an ldif file?
If the data is reloaded from a "plain" ldif file, a new RUV and new
replicageneration will be created.
Then does the server recreate the changelog every time it is restarted?
If the data is reloaded from a "plain" ldif file, the server will see
that the changelog does not match, and will erase the changelog. This
bug causes the server to recreate the changelog every time it is
restarted because the database RUV contains extra ruv elements that do
not match any of the ruv elements in the changelog max ruv.
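
To make the failure mode concrete, here is a minimal sketch in Python (not the server's C code). It models a RUV as a replicageneration plus a map from replica ID to max CSN. The `Ruv` class, the helper names, and the sample values are hypothetical, and CSNs are compared as plain strings, which is a simplification. It only illustrates the mutual-coverage check as described in this thread, not the actual ruv_covers_ruv() implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Ruv:
    """Hypothetical model: a generation ID plus replica ID -> max CSN."""
    generation: str
    max_csns: dict = field(default_factory=dict)


def covers(a, b):
    """True if `a` has an element for every replica in `b` whose max CSN
    is at least as large as the one in `b` (string comparison, simplified)."""
    return all(
        rid in a.max_csns and a.max_csns[rid] >= csn
        for rid, csn in b.max_csns.items()
    )


def strict_changelog_ok(db, cl):
    """Check as described in the thread: same generation and each RUV
    covers the other."""
    return db.generation == cl.generation and covers(db, cl) and covers(cl, db)


def generation_only_ok(db, cl):
    """Proposed relaxed check: compare the replicageneration only."""
    return db.generation == cl.generation


# Illustrative values only. Replica 7 stands for a decommissioned replica
# that still has an element in the database RUV.
GEN = "4e67a1b2000000010000"
db_ruv = Ruv(GEN, {
    1: "4e67b000000000010000",
    2: "4e67b100000000020000",
    7: "4d000001000000070000",
})
cl_max_ruv = Ruv(GEN, {
    1: "4e67b000000000010000",
    2: "4e67b100000000020000",
})

print(strict_changelog_ok(db_ruv, cl_max_ruv))   # obsolete element breaks the check
print(generation_only_ok(db_ruv, cl_max_ruv))    # generation still matches
```

With the obsolete element for replica 7 present, the strict check fails on every restart even though the generations agree, which matches the behavior described above.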
You mentioned "remove
them" in the proposed warning. Is it the only way to adjust the
As far as I can tell, the only way to adjust the database RUV is to
1) dump data using db2ldif -r
2) manually edit the file to remove the obsolete RUV elements
3) reload the data using ldif2db
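
Step 2 could be scripted rather than done by hand. The following is a rough Python sketch under stated assumptions: each RUV element sits on a single unfolded line of the form `nsds50ruv: {replica <id> <url>} <mincsn> <maxcsn>`, and the replica IDs to drop are known in advance. The function name and the sample entry are hypothetical. Folded (continued) LDIF lines and base64-encoded values are not handled, so the output should be checked before it is reloaded.

```python
import re

# Matches one RUV element line and captures the replica ID.
# "{replicageneration}" does not match because whitespace is required
# right after the word "replica".
RUV_ELEMENT = re.compile(r"^nsds50ruv:\s*\{replica\s+(\d+)[\s}]", re.IGNORECASE)


def drop_ruv_elements(ldif_text, obsolete_ids):
    """Return ldif_text without the nsds50ruv lines for obsolete_ids."""
    kept = []
    for line in ldif_text.splitlines():
        match = RUV_ELEMENT.match(line)
        if match and int(match.group(1)) in obsolete_ids:
            continue  # skip the element that belongs to an obsolete replica
        kept.append(line)
    return "\n".join(kept) + "\n"


# Illustrative RUV entry; host names, suffix, and CSNs are made up.
sample = (
    "dn: nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff,dc=example,dc=com\n"
    "nsds50ruv: {replicageneration} 4e67a1b2000000010000\n"
    "nsds50ruv: {replica 1 ldap://m1.example.com:389} "
    "4e67a1b3000000010000 4e67b000000000010000\n"
    "nsds50ruv: {replica 7 ldap://old.example.com:389} "
    "4d000000000000070000 4d000001000000070000\n"
)

cleaned = drop_ruv_elements(sample, {7})
print(cleaned)
```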
Note that, due to
/export1/share/ds/ds.git(master)>git show e9fa8249|morecommit
Author: Nathan Kinder <nkinder(a)redhat.com>
Date: Tue Jan 18 08:29:50 2011 -0800
Bug 543633 - replication problems if supplier is killed under
ldapmodify to fix the ruv entry will deadlock the server. See
We should definitely fix the deadlock too.
> Or, alternatively, leave the check for all of the ruv elements in, but
> just warn if the database contains ruv elements not in the cl maxruv
> e.g. something like
> "WARNING: The database RUV contains these elements not present in the
> changelog max ruv:
> These elements may be obsolete, in which case you should remove them.
> If they are not obsolete, you should check those servers to make sure
> replication is occurring."
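
As an illustration of the warn-only variant, the Python sketch below lists the replica IDs that appear in the database RUV but not in the changelog max RUV and builds the proposed message. The function names and the dict-based RUV representation (replica ID mapped to max CSN) are assumptions made for the example, not existing server code.

```python
def extra_db_elements(db_max_csns, cl_max_csns):
    """Replica IDs present in the database RUV but absent from the
    changelog max RUV, in sorted order."""
    return sorted(set(db_max_csns) - set(cl_max_csns))


def build_ruv_warning(db_max_csns, cl_max_csns):
    """Return the proposed warning text, or None if nothing is extra."""
    extra = extra_db_elements(db_max_csns, cl_max_csns)
    if not extra:
        return None
    listing = "\n".join(
        f"  replica {rid}: max csn {db_max_csns[rid]}" for rid in extra
    )
    return (
        "WARNING: The database RUV contains these elements not present "
        "in the changelog max ruv:\n"
        f"{listing}\n"
        "These elements may be obsolete, in which case you should remove "
        "them. If they are not obsolete, you should check those servers "
        "to make sure replication is occurring."
    )


# Illustrative values only.
db = {1: "4e67b000000000010000", 7: "4d000001000000070000"}
cl = {1: "4e67b000000000010000"}
print(build_ruv_warning(db, cl))
```

Returning None when there is nothing to report keeps the normal startup path silent, so the message only appears when an administrator actually has something to check.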
If the database RUV is not used at all, I think there is no benefit in
maintaining it... A warning would just confuse users, wouldn't it?
We need to have some way to clean up obsolete ruv elements. I remember
this issue coming up on the 389-users list some time ago, but I did not
know that it could lead to data loss.
I think the warning would be acceptable as long as we had clear
procedures for removing the obsolete ruv elements and checking the
status of the other replicas.
389-devel mailing list