The problem comes from the way replica_check_for_data_reload() checks
whether the changelog matches the database. The RUV in the database
contains obsolete elements from replicas that are no longer in use.
replica_check_for_data_reload() uses ruv_covers_ruv() to see whether
all of the max CSNs in the database RUV are in the changelog maxruv,
and vice versa. The check fails because the database RUV contains these
obsolete elements, which are not found in the changelog maxruv.
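
To make the failure concrete, here is a rough sketch in Python (not the
server's C code). It assumes an RUV can be modelled as a mapping from
replica ID to max CSN, and that "covers" means every element of one RUV
has a counterpart in the other with a max CSN at least as recent. The
function name ruv_covers and all CSN values are made up for illustration.

```python
# Simplified model: an RUV is a dict of replica ID -> max CSN.
# CSNs are modelled as fixed-width hex strings, so string comparison
# orders them chronologically. All values below are invented.

def ruv_covers(covering, covered):
    """Return True if every element of `covered` has a counterpart in
    `covering` whose max CSN is at least as recent (assumed semantics)."""
    return all(
        rid in covering and covering[rid] >= max_csn
        for rid, max_csn in covered.items()
    )

# The database RUV still carries an element for replica 7, which is no
# longer in use; the changelog maxruv has no element for it.
db_ruv = {
    1: "4b0c2f1a000000010000",
    2: "4b0c2f1c000000020000",
    7: "49e1d003000000070000",  # obsolete replica
}
cl_max_ruv = {
    1: "4b0c2f1a000000010000",
    2: "4b0c2f1c000000020000",
}

cl_covers_db = ruv_covers(cl_max_ruv, db_ruv)  # fails on replica 7
db_covers_cl = ruv_covers(db_ruv, cl_max_ruv)
check_passes = cl_covers_db and db_covers_cl
```

In this model the two active replicas agree exactly, yet the two-way
check still fails, purely because of the leftover element for replica 7.
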
My question is: why do we care? Isn't it sufficient to check that the
replicageneration in the changelog is the same as the replicageneration
in the database RUV? The replicageneration is supposed to be the unique
identifier of the "starting point" of the replicated data. If the data
is reloaded (e.g. from an LDIF not created with db2ldif -r), a new
replicageneration will be created, and it will no longer match the
replicageneration recorded in the changelog.
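
A minimal sketch of the generation-only check, again as a Python model
rather than the real implementation. The Ruv class, the same_generation
function, and every generation and CSN value are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Ruv:
    """Toy RUV: a replicageneration plus replica ID -> max CSN."""
    generation: str
    elements: dict = field(default_factory=dict)

def same_generation(db_ruv, cl_max_ruv):
    """Proposed check: compare only the replicageneration."""
    return db_ruv.generation == cl_max_ruv.generation

cl_max_ruv = Ruv("4b0c2e00000000010000", {
    1: "4b0c2f1a000000010000",
    2: "4b0c2f1c000000020000",
})

# Same starting point; the database merely has an obsolete element.
db_ruv = Ruv("4b0c2e00000000010000", {
    1: "4b0c2f1a000000010000",
    2: "4b0c2f1c000000020000",
    7: "49e1d003000000070000",
})
ok_with_obsolete = same_generation(db_ruv, cl_max_ruv)

# Data reloaded from a plain LDIF: a new replicageneration is created.
reloaded_db_ruv = Ruv("4c11a2b3000000010000", {
    1: "4c11a2b3000100010000",
})
ok_after_reload = same_generation(reloaded_db_ruv, cl_max_ruv)
```

Under these assumptions the obsolete element no longer matters, while a
reload is still detected because the generations differ.
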
Alternatively, leave the check for all of the RUV elements in, but
only warn if the database contains RUV elements that are not in the
changelog maxruv, e.g. something like:
"WARNING: The database RUV contains these elements not present in the
changelog max ruv:
....
These elements may be obsolete, in which case you should remove them.
If they are not obsolete, you should check those servers to make sure
replication is occurring."
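
The warn-only variant could look roughly like this. It is a Python
sketch under the same toy model (RUV as replica ID -> max CSN); the
function names and the per-element line format are assumptions, and the
message text is the one suggested above.

```python
def elements_missing_from_changelog(db_ruv, cl_max_ruv):
    """Database RUV elements with no counterpart in the changelog maxruv."""
    return {
        rid: csn
        for rid, csn in sorted(db_ruv.items())
        if rid not in cl_max_ruv
    }

def build_warning(missing):
    """Return the warning text, or None if there is nothing to report."""
    if not missing:
        return None
    lines = [
        "WARNING: The database RUV contains these elements not present "
        "in the changelog max ruv:"
    ]
    lines += [f"  replica {rid} max CSN {csn}" for rid, csn in missing.items()]
    lines.append(
        "These elements may be obsolete, in which case you should remove "
        "them. If they are not obsolete, you should check those servers "
        "to make sure replication is occurring."
    )
    return "\n".join(lines)

# Invented example values: replica 7 is the leftover element.
db_ruv = {
    1: "4b0c2f1a000000010000",
    2: "4b0c2f1c000000020000",
    7: "49e1d003000000070000",
}
cl_max_ruv = {
    1: "4b0c2f1a000000010000",
    2: "4b0c2f1c000000020000",
}

missing = elements_missing_from_changelog(db_ruv, cl_max_ruv)
warning = build_warning(missing)
if warning:
    print(warning)
```

The point of the design is that extra database elements are reported
rather than treated as proof that the changelog is out of date.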