[389-users] db import failure, when setting replication up

Reinhard Nappert rnappert at juniper.net
Tue May 24 13:27:34 UTC 2011


I will do that.

Now, I have two questions:

So, what db version do you recommend?

More importantly, is there a migration path or do I have to reload the existing data? I could see issues migrating replicated environments.

Thanks,
-Reinhard

________________________________
From: 389-users-bounces at lists.fedoraproject.org [mailto:389-users-bounces at lists.fedoraproject.org] On Behalf Of Noriko Hosoi
Sent: Monday, May 23, 2011 1:42 PM
To: 389-users at lists.fedoraproject.org
Subject: Re: [389-users] db import failure, when setting replication up

Hi Reinhard,

That is unfortunate...  I was hoping you were using a newer version. :)  You hit this bug:

Bug 472131<https://bugzilla.redhat.com/show_bug.cgi?id=472131> - dbverify: when a duplicate is large enough to have internal page(s), dbverify issues bogus out-of-order key errors

The bug was fixed by Sleepycat in db 4.8.  We backported the fix to 4.3, but had no chance to do so for 4.2.  So, we cannot use dbverify to check whether the index file is healthy...  Would it be possible to reindex the ancestorid index and see if the error goes away?  (Or you could reinitialize the consumer.  That would be the cleanest.)

Thanks,
--noriko

Reinhard Nappert wrote:
Hi Noriko,

I run it on a CentOS 4.4 box (Linux 2.6.24). I use the db 4.2 libs with all the patches.

Oh yes, dbverify does complain a lot. For all of the db files, I see messages like:

[20/May/2011:11:03:05 -0400] DB verify - verify failed(-30976): /var/lib/dirsrv/slapd-ID/db/userRoot/cn.db4
[20/May/2011:11:03:06 -0400] - libdb: Page 5: out-of-order key at entry 2
[20/May/2011:11:03:06 -0400] - libdb: Page 5: out-of-order key at entry 5
[20/May/2011:11:03:06 -0400] - libdb: Page 5: out-of-order key at entry 8
[20/May/2011:11:03:06 -0400] - libdb: Page 5: out-of-order key at entry 10
[20/May/2011:11:03:06 -0400] - libdb: Page 5: out-of-order key at entry 13
[20/May/2011:11:03:06 -0400] - libdb: Page 5: out-of-order key at entry 16
[20/May/2011:11:03:06 -0400] - libdb: Page 5: out-of-order key at entry 19
[20/May/2011:11:03:06 -0400] - libdb: Page 5: out-of-order key at entry 21
[20/May/2011:11:03:07 -0400] DB verify - verify failed(-30976): /var/lib/dirsrv/slapd-ID/db/userRoot/parentid.db4
DB verify: Passed

That said, I guess I should re-index the entire db. Any idea why this is happening?
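
For anyone triaging similar output, here is a minimal Python sketch that pulls the failed files and the out-of-order page entries from errors-log text like the excerpt above. The function name `summarize_dbverify` and the regular expressions are illustrative assumptions based only on the lines quoted in this thread; they are not part of 389 Directory Server. The sketch deliberately does not try to attribute each libdb message to a particular file, because the log does not state that relationship.

```python
import re

# Patterns modelled on the log lines quoted in this thread (an assumption,
# not an official log format specification).
FAILED = re.compile(r"DB verify - verify failed\((-?\d+)\): (\S+)")
OUT_OF_ORDER = re.compile(r"libdb: Page (\d+): out-of-order key at entry (\d+)")


def summarize_dbverify(log_text):
    """Return (failed_files, entries_by_page) found in dbverify log text."""
    failed_files = []      # list of (path, error_code)
    entries_by_page = {}   # page number -> list of entry numbers
    for line in log_text.splitlines():
        match = FAILED.search(line)
        if match:
            failed_files.append((match.group(2), int(match.group(1))))
            continue
        match = OUT_OF_ORDER.search(line)
        if match:
            page, entry = int(match.group(1)), int(match.group(2))
            entries_by_page.setdefault(page, []).append(entry)
    return failed_files, entries_by_page


SAMPLE = """\
[20/May/2011:11:03:05 -0400] DB verify - verify failed(-30976): /var/lib/dirsrv/slapd-ID/db/userRoot/cn.db4
[20/May/2011:11:03:06 -0400] - libdb: Page 5: out-of-order key at entry 2
[20/May/2011:11:03:06 -0400] - libdb: Page 5: out-of-order key at entry 5
[20/May/2011:11:03:07 -0400] DB verify - verify failed(-30976): /var/lib/dirsrv/slapd-ID/db/userRoot/parentid.db4
DB verify: Passed
"""

if __name__ == "__main__":
    failed, pages = summarize_dbverify(SAMPLE)
    for path, code in failed:
        print(path, code)
    print(pages)
```

Keep in mind that, per the bug discussed in this thread, out-of-order key reports from dbverify on db 4.2 can be bogus, so a summary like this only shows what dbverify reported, not whether the index is actually corrupted.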

Right now, I have a two-master MMR setup, where both masters also have a replication agreement to a third box, which is a dedicated consumer. I run tests in which I simultaneously perform adds and deletes (not on the same object) on all three boxes. I just want to verify how replication behaves in 1.2.8.

-Reinhard

________________________________
From: 389-users-bounces at lists.fedoraproject.org [mailto:389-users-bounces at lists.fedoraproject.org] On Behalf Of Noriko Hosoi
Sent: Thursday, May 19, 2011 5:33 PM
To: 389-users at lists.fedoraproject.org
Subject: Re: [389-users] db import failure, when setting replication up

Hi Reinhard,

Could you tell me the OS version and Berkeley DB version (rpm -q db4)?

Could you run "/usr/lib[64]/dirsrv/slapd-ID/dbverify"?  Does it complain about anything, especially the ancestorid index?  If it does, you may want to re-create the corrupted index...
--noriko

Reinhard Nappert wrote:
Noriko,

I observed one more item, which does not bother me right now, but you may want to see:

I am not sure why or how it happened, but I see the following message on the supplier:

[18/May/2011:13:59:50 -0400] NSMMReplicationPlugin - agmt="cn=supplier2consumer" (consumer:389): Consumer failed to replay change (uniqueid aea3731d-808711e0-83d5fdc8-f32b8f3c, CSN 4dd4085b004800040000): Operations error. Will retry later.

And I see the following on the consumer:
[18/May/2011:13:59:29 -0400] - idl_new.c BAD 22, err=-30988 DB_PAGE_NOTFOUND: Requested page not found
[18/May/2011:13:59:29 -0400] - ancestorid BAD 13120, err=-30988 DB_PAGE_NOTFOUND: Requested page not found

Any idea what happened there?

Thanks,
-Reinhard



________________________________
From: 389-users-bounces at lists.fedoraproject.org [mailto:389-users-bounces at lists.fedoraproject.org] On Behalf Of Noriko Hosoi
Sent: Tuesday, May 17, 2011 4:02 PM
To: General discussion list for the 389 Directory server project.
Subject: Re: [389-users] db import failure, when setting replication up

Hi Reinhard,

Reinhard Nappert wrote:
Hi Noriko,

I have to correct myself. The box which had the import issue was on a 1.2.7.5 system. The other box was running 1.2.8.2.

So, it looks like you have fixed the issue with 1.2.8.2.
*relieved*  Thanks for testing it on 1.2.8.2!
--noriko

Thanks,
-Reinhard

________________________________
From: 389-users-bounces at lists.fedoraproject.org [mailto:389-users-bounces at lists.fedoraproject.org] On Behalf Of Reinhard Nappert
Sent: Tuesday, May 17, 2011 3:21 PM
To: General discussion list for the 389 Directory server project.
Subject: Re: [389-users] db import failure, when setting replication up

1.2.8.2

-Reinhard

________________________________
From: 389-users-bounces at lists.fedoraproject.org [mailto:389-users-bounces at lists.fedoraproject.org] On Behalf Of Noriko Hosoi
Sent: Tuesday, May 17, 2011 2:16 PM
To: General discussion list for the 389 Directory server project.
Subject: Re: [389-users] db import failure, when setting replication up

It looks to me like you have hit this bug...  Which version of 389-ds-base are you running?
Bug 684996<https://bugzilla.redhat.com/show_bug.cgi?id=684996> - Exported tombstone cannot be imported correctly.
The patch should be in version 1.2.8.2.
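
As a rough way to check an installed version string against that release, here is a small Python sketch. It assumes purely numeric dotted versions such as the ones mentioned in this thread; the helper names `parse_version` and `has_tombstone_import_fix` are hypothetical and not part of any 389 tooling.

```python
# First release stated in this thread to contain the patch for bug 684996.
FIXED_IN = "1.2.8.2"


def parse_version(text):
    """Turn a numeric dotted version like '1.2.8.2' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def has_tombstone_import_fix(version):
    """True if `version` is at or after the release that carries the patch."""
    return parse_version(version) >= parse_version(FIXED_IN)


if __name__ == "__main__":
    for candidate in ("1.2.7.5", "1.2.8.2"):
        print(candidate, has_tombstone_import_fix(candidate))
```

Comparing integer tuples avoids the pitfall of plain string comparison, where "1.2.10" would sort before "1.2.8".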
Thanks,
--noriko

On 05/17/2011 11:03 AM, Reinhard Nappert wrote:
Hi,

I have seen the following:

I set up two systems in MMR, and replication worked. I then needed to take one of the boxes out of replication, so I disabled replication on it. Later on, I enabled it again and created the shadowing agreement to the other box. During the import of the db, I saw the following errors:

[17/May/2011:11:46:04 -0400] NSMMReplicationPlugin - multimaster_be_state_change: replica o=base is going offline; disabling replication
[17/May/2011:11:46:07 -0400] - WARNING: Import is running with nsslapd-db-private-import-mem on; No other process is allowed to access the database
[17/May/2011:11:46:08 -0400] - import userRoot: WARNING: Skipping entry "nsuniqueid=06869502-7fe011e0-8f589300-7e7b2163,ou=sample,o=base" which has no parent, ending at line 0 of file "(bulk import)"
[17/May/2011:11:46:08 -0400] - import userRoot: WARNING: bad entry: ID 453
.....

Any idea what is going on there?

Thanks,
-Reinhard


--
389 users mailing list
389-users at lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/389-users