>
> I have a question / concern though. I thought that we want dbscan 2
> ldif for emergency recovery scenarios when all else has gone bad and
> assuming that id2entry is still readable. In the approach you
> described we make the assumption that the parentid index is readable
> as well. So we depend on two files instead of one for exporting the
> database. Does this matter, or do we not care at all?
There are two scenarios here, in my opinion: backup and emergency
backup :-) As I've previously stated, performance is important. It
should not take forever to process a 100 million entry database. I
think the tool should use multiple index files (id2entry + friends) if
we can generate the LDIF faster. But, if some of those indexes are
corrupted, then we need an alternate algorithm to generate it just from
id2entry. Also, if we are dealing with a corrupted db, then performance
is not important; recovery is. So if we can do it fast, do it;
otherwise grind it out.
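
A minimal sketch of the "fast if we can, grind it out otherwise" idea, in Python. Everything here is an illustrative assumption rather than dbscan's real data structures or options: the databases are stand-in dicts, each entry is assumed to carry its own `parentid`, and `id2entry_only` is a hypothetical switch. It only shows that the same parents-before-children ordering can come from either source.

```python
from collections import defaultdict, deque


def children_from_entries(id2entry):
    """Slow path: rebuild the parent -> children map by scanning every entry.

    Assumes (hypothetically) that each entry records its own parent ID.
    """
    children = defaultdict(list)
    for eid, entry in id2entry.items():
        children[entry.get("parentid")].append(eid)
    return children


def export_order(id2entry, parentid_index=None, id2entry_only=False):
    """Return entry IDs ordered so that parents come before children."""
    if parentid_index is not None and not id2entry_only:
        children = parentid_index  # fast path: trust the index
    else:
        children = children_from_entries(id2entry)  # grind it out

    order, seen = [], set()
    queue = deque(sorted(children.get(None, [])))  # roots have no parent
    while queue:
        eid = queue.popleft()
        # Skip IDs the index mentions but id2entry lacks, and guard
        # against loops that corrupt data could introduce.
        if eid in seen or eid not in id2entry:
            continue
        seen.add(eid)
        order.append(eid)
        queue.extend(sorted(children.get(eid, [])))
    return order


if __name__ == "__main__":
    entries = {
        1: {"dn": "dc=example", "parentid": None},
        2: {"dn": "ou=people,dc=example", "parentid": 1},
        3: {"dn": "uid=a,ou=people,dc=example", "parentid": 2},
    }
    index = {None: [1], 1: [2], 2: [3]}
    print(export_order(entries, index))
    print(export_order(entries, id2entry_only=True))
```

Entries whose parent is missing are never reached by this walk, so they are silently left out; a real tool would have to report them.
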
All that being said, there is something we need to consider that I
don't have an answer for: when databases do get corrupted, which files
typically get corrupted? Is it the indexes, or is it id2entry?
To be honest, database corruption doesn't happen very often, but the tool
should be smart enough to realize that the data could be inaccurate.
Perhaps a parent could be missing, etc. So the tool should be robust
enough to use multiple techniques to complete an entry, and if it can't,
it should log something or, better yet, create a rejects file that an
admin can take and repair manually.
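
To make the rejects-file idea concrete, here is a rough Python sketch. The dict layout, the `parentid` and `dn` keys, and the rejects format are all assumptions made up for illustration, not anything dbscan produces today. It walks each entry's parent chain and sets aside any entry whose chain hits a missing ancestor or a loop.

```python
import io


def partition_entries(id2entry):
    """Split entry IDs into (exportable, rejected-with-reason)."""
    good, rejects = [], []
    for eid in sorted(id2entry):
        seen = set()
        cur = eid
        reason = None
        # Follow the parent chain until we reach a root (parentid is None).
        while cur is not None:
            if cur in seen:
                reason = "parent loop"
                break
            if cur not in id2entry:
                reason = f"missing ancestor {cur}"
                break
            seen.add(cur)
            cur = id2entry[cur].get("parentid")
        if reason is None:
            good.append(eid)
        else:
            rejects.append((eid, reason))
    return good, rejects


def write_rejects(rejects, id2entry, out):
    """Write one commented record per rejected entry for manual repair."""
    for eid, reason in rejects:
        out.write(f"# entry {eid} rejected: {reason}\n")
        out.write(f"dn: {id2entry[eid].get('dn', '<unknown>')}\n\n")


if __name__ == "__main__":
    entries = {
        1: {"dn": "dc=example", "parentid": None},
        2: {"dn": "ou=people,dc=example", "parentid": 1},
        5: {"dn": "uid=b,ou=gone,dc=example", "parentid": 4},
    }
    good, rejects = partition_entries(entries)
    buf = io.StringIO()
    write_rejects(rejects, entries, buf)
    print(good)
    print(buf.getvalue(), end="")
```

Keeping the reason next to each rejected entry is a design choice meant to help the admin decide whether to re-parent the entry or drop it.
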
I know this is getting more complicated, but we need to keep these
things in mind.

Regards,
Mark
>
With the current design of id2entry and friends, we can't easily detect
this automatically. I think we should really just have a flag on
dbscan that says "ignore everything BUT id2entry" and recover all you
can. We should leave it to a human to make that call.

If our database had proper checksumming of content and pages, we could
detect this, but today that's not the case :(
--
Sincerely,
William Brown
Software Engineer
Red Hat, Australia/Brisbane