On 07/12/2013 05:55 PM, Rich Megginson wrote:
On 07/12/2013 08:22 AM, Mitja Mihelič wrote:
> On 07/09/2013 03:34 PM, Rich Megginson wrote:
>> On 07/09/2013 06:43 AM, Mitja Mihelič wrote:
>>> Hi!
>>>
>>> We are having problems with some of our 389-DS instances. They crash
>>> after receiving an update from the provider.
>>
>> After looking at the stack trace, I think this is
>> https://fedorahosted.org/389/ticket/47391
Yes, it looks like it might be it. When CONSUMER_ONE crashed for the
first time, the last thing replicated was a password change.
Do you perhaps know where I could get a 389-DS version for CentOS 6 that
has the patch? The ticket says it was pushed to 1.2.11, but it would seem
that our 1.2.11.15-14 is still unpatched, and the repositories do not
have any newer versions.
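One rough way to check whether an installed build already carries a given fix is to grep the RPM changelog for the ticket number (a sketch; not every fix is listed by ticket number, so a miss here is not conclusive):

```shell
# Show the installed 389-ds-base version and scan its RPM changelog
# for a mention of ticket 47391.
rpm -q 389-ds-base
rpm -q --changelog 389-ds-base | grep -i 47391

# List the versions the configured repositories offer, to see whether
# anything newer than 1.2.11.15-14 is available.
yum list available 389-ds-base
```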
>>
>>> The crash happened twice after about a week of running without
>>> problems. The crashes happened on two consumer servers but not at
>>> the same time.
>>> The servers are running CentOS 6x with the following 389DS packages
>>> installed:
>>> 389-ds-console-doc-1.2.6-1.el6.noarch
>>> 389-console-1.1.7-1.el6.noarch
>>> 389-adminutil-1.1.15-1.el6.x86_64
>>> 389-dsgw-1.1.10-1.el6.x86_64
>>> 389-ds-base-debuginfo-1.2.11.15-14.el6_4.x86_64
>>> 389-admin-1.1.29-1.el6.x86_64
>>> 389-ds-console-1.2.6-1.el6.noarch
>>> 389-admin-console-doc-1.1.8-1.el6.noarch
>>> 389-ds-1.2.2-1.el6.noarch
>>> 389-ds-base-1.2.11.15-14.el6_4.x86_64
>>> 389-ds-base-libs-1.2.11.15-14.el6_4.x86_64
>>> 389-admin-console-1.1.8-1.el6.noarch
>>>
>>> We are in the process of replacing the CentOS 5.x-based
>>> consumer+provider setup with a CentOS 6.x-based one. For the time
>>> being, the CentOS 6 machines are acting as consumers for the old
>>> server. They run for a while and then the replicated instances
>>> crash, though not at the same time.
>>> One of the servers would not start after the crash,
>>
>> Can you provide the error messages from the errors log?
> I have attached error logs from the provider
> (2013-06-27-provider_error) and the consumer
> (2013-06-27-server_two_error) in question.
>>
>>> so I have run db2index on its database. It's been running for four
>>> days and it has still not finished.
>>
>> Try exporting using db2ldif, then importing using ldif2db.
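The export/import cycle suggested here looks roughly like this with 1.2.11-era tooling (a sketch; the instance name and file paths are placeholders, and on these versions the scripts live under the instance directory):

```shell
# Stop the server first -- db2ldif and ldif2db operate on the database
# files directly and must not run against a live instance.
service dirsrv stop INSTANCE

# Export the userRoot backend to an LDIF file.
/usr/lib64/dirsrv/slapd-INSTANCE/db2ldif -n userRoot -a /tmp/userRoot.ldif

# Re-import the LDIF into the same backend; this rebuilds the database
# and all of its indexes from scratch.
/usr/lib64/dirsrv/slapd-INSTANCE/ldif2db -n userRoot -i /tmp/userRoot.ldif

service dirsrv start INSTANCE
```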
> The export process hangs. After an hour strace still shows:
> futex(0x7f5822670ed4, FUTEX_WAIT, 1, NULL
> The error log for this is attached as
> 2013-07-10-server_two-ldif_import_hangs.
Are you using db2ldif or db2ldif.pl? If you are using db2ldif, is the
server running? If it is, please shut the server down first and then
run db2ldif.
If db2ldif still hangs, then please follow the instructions at
http://port389.org/wiki/FAQ#Debugging_Hangs to get a stack trace of
the hung process.
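For reference, the FAQ's approach to a hung process amounts to attaching gdb and dumping the stacks of every thread (a sketch; it assumes the matching debuginfo package is installed, which it is per the package list above, and that ns-slapd is the hung process -- db2ldif is a wrapper that runs ns-slapd in export mode):

```shell
# Dump a full backtrace of every thread in the hung ns-slapd process
# to a timestamped file.
gdb --batch -ex 'set pagination off' -ex 'thread apply all bt full' \
    -p $(pidof ns-slapd) > stacktrace.$(date +%s).txt 2>&1
```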
I was using db2ldif with the server shut down. I tried it again and it
hung. The LDIF file was created, but its size was zero. The resulting
stack trace is attached as
server_two-db2ldif_hang-stacktrace.1373877200.txt.
>
>>
>>> All I get from db2index now are these outputs:
>>> [09/Jul/2013:13:29:11 +0200] - reindex db: Processed 65095 entries
>>> (pass 1104) -- average rate 53686277.5/sec, recent rate 0.0/sec,
>>> hit ratio 0%
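The reindex invocation behind output like that would be roughly the following (a sketch; the instance name is a placeholder, and the backend is assumed to be userRoot). On a database of ~65k entries a full reindex should normally finish in minutes, so a multi-day run suggests something is wrong with the database files themselves:

```shell
# Offline reindex of all indexes in the userRoot backend.
service dirsrv stop INSTANCE
/usr/lib64/dirsrv/slapd-INSTANCE/db2index -n userRoot
service dirsrv start INSTANCE
```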
>>
>> How many entries do you have in your database?
> The number hovers around 65400. It varies by perhaps 2 user add/delete
> operations a month and 20 attribute changes per week, if that.
>>
>>>
>>> The other instance did start up, but the replication process did
>>> not work anymore. I disabled the replication to this host and set
>>> it up again. I chose "Initialize consumer now" and the consumer
>>> crashed every time.
>>
>> Can you provide a stack trace of the core when the server crashes? It
>> may be different from the stack trace below.
> The last provided stack trace was produced at the last server crash.
> I will provide another stack trace when CONSUMER_ONE crashes again.
> Currently it refuses to crash at initialization time and keeps running.
>>
>>> I have enabled full error logging and could find nothing.
>>> I have read a few threads (not all, I admit) on this list and
>>> http://directory.fedoraproject.org/wiki/FAQ#Debugging_Crashes and
>>> tried to troubleshoot.
>>>
>>> The crash produced the attached core dump, and I could use your help
>>> understanding it, as well as any help with the crash itself. If more
>>> info is needed I will gladly provide it.
>>>
>>> Regards, Mitja
>>>
>>>
>>>
>>> --
>>> 389 users mailing list
>>> 389-users(a)lists.fedoraproject.org
>>> https://admin.fedoraproject.org/mailman/listinfo/389-users
>>
>