On Wed, Feb 15, 2012 at 18:19:10 -0600, Rich Megginson wrote:
> On 02/15/2012 03:51 PM, Iain Morgan wrote:
>> On Wed, Feb 15, 2012 at 15:04:52 -0600, Rich Megginson wrote:
>>> On 02/15/2012 01:56 PM, Iain Morgan wrote:
>>>> On Tue, Feb 14, 2012 at 19:54:39 -0600, Rich Megginson wrote:
>>>>> On 02/14/2012 06:37 PM, Iain Morgan wrote:
>>>>>> Hello,
>>>>>>
>>>>>> On a fairly frequent basis, one of my 389 DS servers hangs after certain
>>>>>> CMP operations. Once this happens, the server cannot be shut down
>>>>>> gracefully. This has been going on for several weeks, and I have not yet
>>>>>> found a solution.
>>>>>>
>>>>>> My setup consists of two systems running RHEL 6.2 with 389 DS 1.2.9.16.
>>>>>> Multimaster replication is enabled between the two servers, but the
>>>>>> client systems (currently just two test systems) preferentially use the
>>>>>> same server, ServerA. The second server, ServerB, is the one which is
>>>>>> experiencing the problem.
>>>>>>
>>>>>> We are using class-of-service entries to set the values for the
>>>>>> shadowMax, shadowMin, and shadowWarning attributes. And we are
>>>>>> conditionally setting a pwdPolicySubentry attribute for some entries in
>>>>>> the same manner.
>>>>>>
>>>>>> If I execute an ldapcompare command, such as the following:
>>>>>>
>>>>>> # ldapcompare uid=imorgan,ou=People,dc=example,dc=com \
>>>>>> pwdpolicysubentry:"cn=Special Policy,ou=Policies,dc=example,dc=com"
>>>>>>
>>>>>> the command will occasionally hang. Most of the time, the command
>>>>>> succeeds and indicates that the attribute is not defined for that entry.
>>>>>> However, once or twice a day it will simply hang.
>>>>>>
>>>>>> The access log shows that the CMP request was received, but no result is
>>>>>> logged. After this occurs, the server will not shut down gracefully. The
>>>>>> init script fails to shut down the server and I end up having to send a
>>>>>> SIGKILL to ns-slapd.
>>>>> When you get the hang, can you attach to the process with gdb?
>>>>> ps -ef|grep ns-slapd
>>>>> gdb /usr/sbin/ns-slapd pid-of-ns-slapd
>>>>>> The error log does not report any issues.
>>>>>>
>>>>>> CMP operations against other attributes, such as loginShell, do not seem
>>>>>> to exhibit this problem. Also, the problem does not occur on ServerA;
>>>>>> only on ServerB. Once the CMP operation has hung, comparisons against
>>>>>> other attributes, even shadowMax, continue to work.
>>>>>>
>>>>>> As noted above, most of the time the CMP operation returns normally.
>>>>>> However, if I reinitialize ServerB from ServerA, the problem occurs with
>>>>>> the first CMP operation against ServerB.
>>>>>>
>>>>>> Both servers have the same set of RPMs, and the dse.ldif files on the two
>>>>>> systems do not have any significant differences.
>>>>>>
>>>>>> Has anyone seen a similar issue? Any suggestions on how to debug or fix
>>>>>> this?
>>>>>>
>>>>>> A somewhat simplified and redacted version of the class-of-service
>>>>>> configuration is listed below.
>>>>>>
>>>>>> Thanks
>>>> A gzip'd copy of the 'thread apply all bt full' output is attached.
>>>>
>>> Thanks. Can you do this again after installing the
>>> 389-ds-base-debuginfo package?
>>> debuginfo-install 389-ds-base
>> Ah, sorry about that. Here's the updated output.
>>
>>> Are you using Views?
>>>
>>> http://docs.redhat.com/docs/en-US/Red_Hat_Directory_Server/9.0/html/Admin...
>> No.
>>
> Thanks! This looks like a symptom of
>
> https://fedorahosted.org/389/ticket/247 fixed in 1.2.10
Hello Rich,
Thanks, I upgraded both of the servers to 1.2.10.1. Unfortunately, it
did not resolve the issue. I also noticed that if I run the same
ldapcompare command after the first try fails, the server crashes. I
can't say whether that is a change in the behaviour, but it is a new
observation.
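
Because the hang is intermittent, it may help to wrap the ldapcompare call in a time limit, so that a hung request is flagged instead of blocking the shell. The following is only a minimal sketch and not something from the thread: the function name probe and the 30-second limit are assumptions, and it relies on timeout(1) from GNU coreutils, which exits with status 124 when the limit expires. Bind and host options are left out, as in the command quoted earlier.

```shell
#!/bin/sh
# Hypothetical helper: run a command under a time limit and classify the result.
# Prints "HANG" if the limit expired, otherwise "rc=<exit status>".
probe() {
    limit=$1
    shift
    timeout "$limit" "$@" >/dev/null 2>&1
    rc=$?
    if [ "$rc" -eq 124 ]; then
        # timeout(1) killed the command: treat it as a hang
        echo "HANG"
    else
        echo "rc=$rc"
    fi
}

# Example use, commented out so the sketch runs without an LDAP client:
# probe 30 ldapcompare "uid=imorgan,ou=People,dc=example,dc=com" \
#     'pwdpolicysubentry:cn=Special Policy,ou=Policies,dc=example,dc=com'
```

Running the probe twice in a row would also cover the new observation above, since the second attempt is the one that appears to bring the server down.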
So with 1.2.10, in addition to the hang, you also get a crash?
Please file a ticket at
along with your
configuration and steps to reproduce.
I've attached gdb output for the case where the first ldapcompare is
hanging. And, I've also attached the gdb analysis of the core dump.
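
For anyone who needs to collect the same kind of trace, the interactive gdb steps mentioned earlier in the thread can probably be scripted with gdb's batch mode. This is a sketch under assumptions rather than the procedure actually used here: the function name dump_stacks, the GDB override variable (handy for a dry run) and the output path are hypothetical. Attaching normally requires running as root or as the user that owns ns-slapd.

```shell
#!/bin/sh
# Hypothetical helper: attach to a running process, dump full backtraces
# for every thread into a file, then detach.
# Arguments: <pid> <output file>. GDB may be overridden for a dry run.
dump_stacks() {
    pid=$1
    out=$2
    "${GDB:-gdb}" -p "$pid" -batch -ex "thread apply all bt full" > "$out" 2>&1
}

# Example use, commented out so the sketch runs without a live server.
# pidof prints every matching PID, so pick one if several instances run.
# dump_stacks "$(pidof ns-slapd)" /tmp/ns-slapd-stacks.txt
```

The traces are only readable with symbols, so the 389-ds-base-debuginfo package should be installed first, as noted above.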