Re: [389-users] Lots of abandoned connections from sssd

Monday, 10 November 2014


On 11/10/2014 03:32 PM, Orion Poplawski wrote:
...
 On 11/06/2014 03:14 AM, Rich Megginson wrote:
> On 11/06/2014 04:16 AM, Orion Poplawski wrote:
>> Just recently we're seeing some very strange behavior on our system.
>> Periodically we will see a sssd process start to have an ever greater number
>> of connections to our ldap server until the server runs out of file
>> descriptors.  This seems to be happening with a particular user, who is
>> having trouble logging in at times, particularly with email (dovecot).  We
>> see entries like the following on our server:
>>
>> [05/Nov/2014:17:14:51 -0700] conn=1786153 op=0 EXT
>> oid="1.3.6.1.4.1.1466.20037" name="startTLS"
>> [05/Nov/2014:17:14:51 -0700] conn=1786153 op=0 RESULT err=0 tag=120
>> nentries=0 etime=0
>> [05/Nov/2014:17:14:51 -0700] conn=1786153 SSL 128-bit AES
>> [05/Nov/2014:17:14:51 -0700] conn=1786153 op=1 BIND
>> dn="uid=user,ou=People,dc=domain,dc=com" method=128 version=3
>> [05/Nov/2014:17:14:56 -0700] conn=1786153 op=2 ABANDON targetop=NOTFOUND
>> msgid=2
>> [05/Nov/2014:17:14:56 -0700] conn=1786153 op=3 UNBIND
>> [05/Nov/2014:17:14:56 -0700] conn=1786153 op=3 fd=1022 closed - U1
>>
>> I don't yet have debug info from the sssd process.  Any ideas from the
>> above?
>>
>> Restarting the sssd process seems to clear things up for a while.
>>
>> - Orion
>>
> Try to reproduce the problem while using gdb to capture stack traces every few
> seconds as in http://www.port389.org/docs/389ds/FAQ/faq.html#debugging-hangs
> Ideally, we can get some stack traces of the server during the time between
> the BIND and the ABANDON.
 If I catch the problem early enough I can still get a stack trace.  A series
 of them are in http://www.cora.nwra.com/~orion/ns-slapd-trace.tar.gz.
 Anything useful there?

These traces show the server is almost entirely idle - not working on
client operations, not deadlocked or otherwise waiting on locks, not
waiting on I/O to complete.  So either we're still not catching dirsrv
in the act, or dirsrv is not the problem.
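
To line up stack traces with the right moment, it may help to pull the
affected connections out of the access log first.  In the excerpt above the
BIND on conn=1786153 has no RESULT line before the ABANDON arrives five
seconds later.  Below is a rough Python sketch that looks for that pattern.
It assumes each access log entry sits on one line (the excerpt above is
wrapped by the mail client) and that month names parse in an English locale.
The function name find_abandoned_binds and the SAMPLE list are made up for
illustration; they are not part of 389-ds or sssd.

```python
import re
from datetime import datetime

# Start of an access-log entry, e.g.
# [05/Nov/2014:17:14:51 -0700] conn=1786153 op=1 BIND dn="..." method=128 version=3
ENTRY = re.compile(
    r'^\[(?P<ts>[^\]]+)\] conn=(?P<conn>\d+) op=(?P<op>-?\d+) (?P<rest>.*)$'
)
DN = re.compile(r'dn="(?P<dn>[^"]*)"')


def parse_ts(text):
    """Parse a timestamp such as '05/Nov/2014:17:14:51 -0700'."""
    return datetime.strptime(text, "%d/%b/%Y:%H:%M:%S %z")


def find_abandoned_binds(lines):
    """Return (conn, bind dn, seconds waited) for each BIND that was
    abandoned before the server logged a RESULT for it."""
    pending = {}  # conn -> (op, bind time, dn) for a BIND with no RESULT yet
    found = []
    for line in lines:
        m = ENTRY.match(line.strip())
        if not m:
            continue  # e.g. the "SSL 128-bit AES" line, which has no op=
        conn, op, rest = m["conn"], m["op"], m["rest"]
        when = parse_ts(m["ts"])
        if rest.startswith("BIND "):
            dn = DN.search(rest)
            pending[conn] = (op, when, dn["dn"] if dn else "")
        elif rest.startswith("RESULT "):
            # Only a RESULT for the same op number completes the BIND.
            if conn in pending and pending[conn][0] == op:
                del pending[conn]
        elif rest.startswith("ABANDON ") and conn in pending:
            _, bind_time, dn = pending.pop(conn)
            found.append((conn, dn, (when - bind_time).total_seconds()))
        elif "closed" in rest:
            pending.pop(conn, None)  # connection gone, stop tracking it
    return found


# The entries from the excerpt above, unwrapped to one entry per line.
SAMPLE = [
    '[05/Nov/2014:17:14:51 -0700] conn=1786153 op=0 EXT oid="1.3.6.1.4.1.1466.20037" name="startTLS"',
    '[05/Nov/2014:17:14:51 -0700] conn=1786153 op=0 RESULT err=0 tag=120 nentries=0 etime=0',
    '[05/Nov/2014:17:14:51 -0700] conn=1786153 SSL 128-bit AES',
    '[05/Nov/2014:17:14:51 -0700] conn=1786153 op=1 BIND dn="uid=user,ou=People,dc=domain,dc=com" method=128 version=3',
    '[05/Nov/2014:17:14:56 -0700] conn=1786153 op=2 ABANDON targetop=NOTFOUND msgid=2',
    '[05/Nov/2014:17:14:56 -0700] conn=1786153 op=3 UNBIND',
    '[05/Nov/2014:17:14:56 -0700] conn=1786153 op=3 fd=1022 closed - U1',
]

if __name__ == "__main__":
    for conn, dn, waited in find_abandoned_binds(SAMPLE):
        print(f"conn={conn} dn={dn} abandoned after {waited:.0f}s without a RESULT")
```

To run it against a real log, pass an open file object for the access log
instead of SAMPLE.  The timestamps it reports are the window in which a
stack trace of ns-slapd would be most informative.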
