[389-users] Non-contiguous attribute values

Tue Mar 11 22:38:51 UTC 2014

On 03/11/2014 04:09 PM, Timothy Pollard wrote:
> On Tue, 11 Mar 2014 07:17:25 -0600
> Rich Megginson <rmeggins at redhat.com> wrote:
>> On 03/10/2014 09:17 PM, Timothy Pollard wrote:
>>> On Mon, 10 Mar 2014 20:56:08 -0600
>>> Rich Megginson <rmeggins at redhat.com> wrote:
>>>> On 03/10/2014 08:42 PM, Timothy Pollard wrote:
>>>>> A small update; we're now
>>>> Now as opposed to some time in the past?  At what point did you begin
>>>> seeing these messages, and what changed?
>>> It looks like it started after I manually "fixed" the entry.
>> What exactly did you do to fix the entry?
> I edited it and filled in what looked like the missing values (which I copied
> from an old LDIF file):
>
> dNSClass: IN
> zoneName: cvsdude.com
> relativeDomainName: testingstatus
> objectClass: top
> objectClass: dNSZone
> dNSTTL: 100

Did you use ldapdelete to delete the old one and ldapmodify/ldapadd to add
this fixed one?
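
For reference, a delete-then-add done in one pass can be written as two LDIF change records. The sketch below only builds that LDIF text in Python. The DN is a hypothetical placeholder (the thread does not show the real one); the attribute values are the ones quoted above.

```python
# Build an LDIF change file that deletes an entry and then re-adds it.
# ASSUMPTION: this DN is a made-up placeholder; the real DN is not in the thread.
HYPOTHETICAL_DN = "relativeDomainName=testingstatus,ou=dns,dc=example,dc=com"

# Attribute values as quoted in the message above.
ATTRS = [
    ("dNSClass", "IN"),
    ("zoneName", "cvsdude.com"),
    ("relativeDomainName", "testingstatus"),
    ("objectClass", "top"),
    ("objectClass", "dNSZone"),
    ("dNSTTL", "100"),
]


def delete_then_add_ldif(dn, attrs):
    """Return LDIF text: a delete record, a blank separator, then an add record."""
    lines = [f"dn: {dn}", "changetype: delete", ""]
    lines += [f"dn: {dn}", "changetype: add"]
    lines += [f"{name}: {value}" for name, value in attrs]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(delete_then_add_ldif(HYPOTHETICAL_DN, ATTRS), end="")
```

The output could be saved to a file and passed to ldapmodify with -f; the bind options depend on the deployment.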

>
>>> As I said it is a
>>> test entry, so I'm happy to delete it entirely and recreate it if you think
>>> this will fix the issue,
>> I don't think it will fix the issue, but it may help reproduce it more easily.
>>
>>> but I can hold off on that if you'd like me to find
>>> out more information.
>> If you are not experiencing the "non-contiguous" problem now, there's not
>> much information to get.
>>
> We're not seeing the non-contiguous problem any more, but we are seeing
> repeated DB crashes:
>
> [11/Mar/2014:21:57:14 +0000] - libdb: dnsRoot/id2entry.db4 page 36132 is on free list with type 5
> [11/Mar/2014:21:57:14 +0000] - libdb: PANIC: Invalid argument
> [11/Mar/2014:21:57:14 +0000] - libdb: PANIC: fatal region error detected; run recovery
> [11/Mar/2014:21:57:14 +0000] - Serious Error---Failed in dblayer_txn_abort, err=-30974 (DB_RUNRECOVERY: Fatal error, run database recovery)
> [11/Mar/2014:21:57:14 +0000] - libdb: PANIC: fatal region error detected; run recovery
> [11/Mar/2014:21:57:14 +0000] - FATAL ERROR at idl_new.c (1); server stopping as database recovery needed.

I don't suppose you are running out of disk space?  Any other disk 
errors?  Is this a VM with a virtual disk image holding the db?
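
A quick way to rule out the disk-space question is to look at the filesystem that holds the db directory. This is a minimal Python sketch; the path shown is a placeholder and an assumption, so substitute the instance's real db directory.

```python
import shutil


def free_fraction(path):
    """Return the free fraction (0.0 to 1.0) of the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    # Guard against pseudo-filesystems that report a total size of zero.
    return usage.free / usage.total if usage.total else 0.0


if __name__ == "__main__":
    # ASSUMPTION: placeholder path; use the real db directory for this instance.
    db_dir = "/var/lib/dirsrv/slapd-EXAMPLE/db"
    try:
        print(f"{db_dir}: {free_fraction(db_dir):.1%} free")
    except FileNotFoundError:
        print(f"{db_dir} not found; adjust the path for this host")
```

This only reports free space; it says nothing about I/O errors, which would show up in the kernel log instead.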

>
> This happens within a few minutes after every restart of the daemon. I'm not
> sure if this is related though. It (the new DB error) first occurred after
> ns-slapd was killed by the oom-killer. Could that cause database corruption?

It is not supposed to, but it is a possibility.
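
To see whether the first libdb panic lines up with the time of the oom-kill, it may help to pull just the panic lines and their timestamps out of the errors log. A minimal Python sketch, assuming the log lines have the same format as the ones quoted above (the sample lines are copied from that excerpt):

```python
import re

# Timestamp prefix as seen in the quoted log, e.g. "[11/Mar/2014:21:57:14 +0000] - ..."
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\] - (?P<msg>.*)$")


def find_panics(lines):
    """Return (timestamp, message) pairs for lines mentioning PANIC or DB_RUNRECOVERY."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and ("PANIC" in m.group("msg") or "DB_RUNRECOVERY" in m.group("msg")):
            hits.append((m.group("ts"), m.group("msg")))
    return hits


# Sample lines taken from the log excerpt quoted in this thread.
SAMPLE = [
    "[11/Mar/2014:21:57:14 +0000] - libdb: dnsRoot/id2entry.db4 page 36132 is on free list with type 5",
    "[11/Mar/2014:21:57:14 +0000] - libdb: PANIC: Invalid argument",
    "[11/Mar/2014:21:57:14 +0000] - Serious Error---Failed in dblayer_txn_abort, err=-30974 (DB_RUNRECOVERY: Fatal error, run database recovery)",
    "[11/Mar/2014:21:57:14 +0000] - FATAL ERROR at idl_new.c (1); server stopping as database recovery needed.",
]

if __name__ == "__main__":
    for ts, msg in find_panics(SAMPLE):
        print(ts, "|", msg)
```

Running the same function over the whole errors log and comparing the earliest timestamp with the oom-killer entry in the kernel log would show which came first.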

>
> It also looks like we might need to do some memory tuning on 389. Is there some
> suggested documentation on that, or should I just google it?
https://access.redhat.com/site/documentation/en-US/Red_Hat_Directory_Server/9.0/html/Performance_Tuning_Guide/index.html
is a good place to start.
>
> At the moment we've switched to our other master (we use a multi-master
> replication setup), so we'll probably just rebuild the problem server from
> there, but is there anything that I should look at to diagnose the problem first?

I'm not sure.  Looks like we are now working on several different 
problems in various states of knowledge/severity . . .

>
> Thanks,
