[389-users] CMP operations against pwdPolicySubentry hanging

Rich Megginson rmeggins at redhat.com
Tue Mar 13 00:11:30 UTC 2012


On 03/12/2012 05:40 PM, Iain Morgan wrote:
> On Thu, Feb 23, 2012 at 12:13:18 -0800, Iain Morgan wrote:
>> On Wed, Feb 15, 2012 at 18:19:10 -0600, Rich Megginson wrote:
>>> On 02/15/2012 03:51 PM, Iain Morgan wrote:
>>>> On Wed, Feb 15, 2012 at 15:04:52 -0600, Rich Megginson wrote:
>>>>> On 02/15/2012 01:56 PM, Iain Morgan wrote:
>>>>>> On Tue, Feb 14, 2012 at 19:54:39 -0600, Rich Megginson wrote:
>>>>>>> On 02/14/2012 06:37 PM, Iain Morgan wrote:
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> On a fairly frequent basis, one of my 389 DS servers hangs after certain
>>>>>>>> CMP operations. Once this happens, the server cannot be shut down
>>>>>>>> gracefully. This has been going on for several weeks, and I have not yet
>>>>>>>> found a solution.
>>>>>>>>
>>>>>>>> My setup consists of two systems running RHEL 6.2 with 389 DS 1.2.9.16.
>>>>>>>> Multimaster replication is enabled between the two servers, but the
>>>>>>>> client systems (currently just two test systems) preferentially use the
>>>>>>>> same server, ServerA. The second server, ServerB, is the one which is
>>>>>>>> experiencing the problem.
>>>>>>>>
>>>>>>>> We are using class-of-service entries to set the values for the
>>>>>>>> shadowMax, shadowMin, and shadowWarning attributes. And we are
>>>>>>>> conditionally setting a pwdPolicySubentry attribute for some entries in
>>>>>>>> the same manner.
>>>>>>>>
>>>>>>>> If I execute an ldapcompare command, such as the following:
>>>>>>>>
>>>>>>>> # ldapcompare uid=imorgan,ou=People,dc=example,dc=com \
>>>>>>>> 	pwdpolicysubentry:"cn=Special Policy,ou=Policies,dc=example,dc=com"
>>>>>>>>
>>>>>>>> the command will occasionally hang. Most of the time, the command
>>>>>>>> succeeds and indicates that the attribute is not defined for that entry.
>>>>>>>> However, once or twice a day it will simply hang.
>>>>>>>>
>>>>>>>> The access log shows that the CMP request was received, but no result is
>>>>>>>> logged. After this occurs, the server will not shut down gracefully. The
>>>>>>>> init script fails to shut down the server and I end up having to send a
>>>>>>>> SIGKILL to ns-slapd.
>>>>>>> When you get the hang, can you attach to the process with gdb?
>>>>>>> ps -ef|grep ns-slapd
>>>>>>> gdb /usr/sbin/ns-slapd pid-of-ns-slapd
>>>>>>>> The error log does not report any issues.
>>>>>>>>
>>>>>>>> CMP operations against other attributes, such as loginShell, do not seem
>>>>>>>> to exhibit this problem. Also, the problem does not occur on ServerA;
>>>>>>>> only on ServerB. Once the CMP operation has hung, comparisons against
>>>>>>>> other attributes, even shadowMax, continue to work.
>>>>>>>>
>>>>>>>> As noted above, most of the time the CMP operation returns normally.
>>>>>>>> However, if I reinitialize ServerB from ServerA, the problem occurs with
>>>>>>>> the first CMP operation against ServerB.
>>>>>>>>
>>>>>>>> Both servers have the same set of RPMs, and the dse.ldif files on the
>>>>>>>> two systems do not differ significantly.
>>>>>>>>
>>>>>>>> Has anyone seen a similar issue? Any suggestions on how to debug or fix
>>>>>>>> this?
>>>>>>>>
>>>>>>>> A somewhat simplified and redacted version of the class-of-service
>>>>>>>> configuration is listed below.
>>>>>>>>
>>>>>>>> Thanks
>>>>>> A gzip'd copy of the 'thread apply all bt full' output is attached.
>>>>>>
>>>>> Thanks.  Can you do this again after installing the
>>>>> 389-ds-base-debuginfo package?
>>>>> debuginfo-install 389-ds-base
>>>> Ah, sorry about that. Here's the updated output.
>>>>
>>>>> Are you using Views?
>>>>> http://docs.redhat.com/docs/en-US/Red_Hat_Directory_Server/9.0/html/Administration_Guide/using-views.html
>>>> No.
>>>>
>>> Thanks!  This looks like a symptom of
>>> https://fedorahosted.org/389/ticket/247 fixed in 1.2.10
>> Hello Rich,
>>
>> Thanks, I upgraded both of the servers to 1.2.10.1. Unfortunately, it
>> did not resolve the issue. I also noticed that if I run the same
>> ldapcompare command after the first try fails, the server crashes. I
>> can't say whether that is a change in the behaviour, but it is a new
>> observation.
>>
>> I've attached gdb output for the case where the first ldapcompare is
>> hanging. And, I've also attached the gdb analysis of the core dump.
>>
>> -- 
>> Iain Morgan
> I've tested 1.2.10.3 and can confirm that it addresses the segfault.
> However, the hang (presumably a deadlock) has not gone away. I don't
> seem to be able to update bug #305 now that it is closed, so I am
> attaching the gdb backtrace of ns-slapd 1.2.10.3 during the server hang.
>
Thanks - reopened and attached your stack trace
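
The backtrace collection discussed in this thread (attach gdb to ns-slapd, then run 'thread apply all bt full') can also be scripted, so it can be run as soon as a hang is noticed. The following is a minimal sketch, assuming gdb is installed and the caller has permission to attach to the process; the function name dump_backtraces and the output path in the example are illustrative and do not come from the thread.

```shell
#!/bin/sh
# Sketch only: dump full backtraces of every thread in a running process
# without an interactive gdb session.
# Assumption: gdb is installed and the caller may attach to the target PID.

dump_backtraces() {
    pid="$1"   # PID of the hung ns-slapd process
    out="$2"   # file that receives the backtrace text

    # -batch makes gdb exit after running the -ex commands;
    # -p attaches to the already running process.
    gdb -batch -p "$pid" -ex 'thread apply all bt full' > "$out" 2>&1
}

# Example invocation (hypothetical output path; pgrep -o picks the oldest
# matching process, which assumes a single ns-slapd instance):
#   dump_backtraces "$(pgrep -o ns-slapd)" /tmp/ns-slapd-bt.txt
```

Installing the 389-ds-base-debuginfo package first, as requested earlier in the thread, is what makes the resulting frames readable.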


