On 03/14/2012 05:58 PM, Alther, Nicholas wrote:
I am seeking some assistance in isolating a recurring problem with our
389 DS 1.2.8.3 installation. We use the directory server for user
authentication on our website. Every couple of days, our website login
application starts reporting that a user authentication timed out, and
these timeouts become more frequent as time passes. Our current fix is
to restart the directory server, which resolves the problem for a
couple of days before the timeouts start again. I traced one
application timeout back to the DS access logs and found the following
entry at the same time:
[14/Mar/2012:10:23:01 -0500] conn=14730 op=-1 fd=1093 closed error 104
(Connection reset by peer) - TCP connection reset by peer.
I looked through the older logs and the only time this conn/fd was
used was two days ago. Here are the access log entries:
[12/Mar/2012:14:33:06 -0500] conn=14730 fd=1093 slot=1093 connection
from 10.1.xx.xx to 10.1.xx.xx
[12/Mar/2012:14:33:06 -0500] conn=14730 op=0 BIND
dn="uid,dc=domain,dc=com" method=128 version=3
[12/Mar/2012:14:33:06 -0500] conn=14730 op=0 RESULT err=0 tag=97
nentries=0 etime=0 dn="uid=theManager,dc=domain,dc=com"
[12/Mar/2012:14:33:06 -0500] conn=14730 op=1 SRCH
base="ou=users,ou=external,dc=domain,dc=com" scope=2
filter="(&(uid=xxxxx)(objectClass=inetUser))" attrs="1.1"
[12/Mar/2012:14:33:06 -0500] conn=14730 op=1 RESULT err=0 tag=101
nentries=1 etime=0
[12/Mar/2012:14:33:06 -0500] conn=14730 op=2 BIND
dn="uid=xxxxx,ou=users,ou=external,dc=domain,dc=com" method=128 version=3
[12/Mar/2012:14:33:06 -0500] conn=14730 op=2 RESULT err=0 tag=97
nentries=0 etime=0 dn="uid=xxxxx,ou=users,ou=external,dc=domain,dc=com"
[12/Mar/2012:14:33:06 -0500] conn=14730 op=3 BIND
dn="uid,dc=domain,dc=com" method=128 version=3
[12/Mar/2012:14:33:06 -0500] conn=14730 op=3 RESULT err=0 tag=97
nentries=0 etime=0 dn="uid=theManager,dc= domain,dc=com"
[12/Mar/2012:14:35:20 -0500] conn=14730 op=4 SRCH
base="ou=groups,ou=external,dc= domain,dc=com" scope=2
filter="(&(cn=domain)(|(objectClass=groupOfURLs)(objectClass=groupOfNames)))"
attrs="1.1"
[12/Mar/2012:14:35:20 -0500] conn=14730 op=4 RESULT err=0 tag=101
nentries=1 etime=0
[12/Mar/2012:14:35:20 -0500] conn=14730 op=5 SRCH
base="ou=groups,ou=external,dc= domain,dc=com" scope=2
filter="(&(member=cn=domain,ou=groups,ou=external,dc=domain,dc=com)(objectClass=groupOfNames))"
attrs="cn"
[12/Mar/2012:14:35:20 -0500] conn=14730 op=5 RESULT err=0 tag=101
nentries=0 etime=0
[12/Mar/2012:14:36:50 -0500] conn=14730 op=6 SRCH
base="ou=users,ou=external,dc=domain,dc=com" scope=2
filter="(&(uid=xxxxxx)(objectClass=inetUser))" attrs="1.1"
[12/Mar/2012:14:36:50 -0500] conn=14730 op=6 RESULT err=0 tag=101
nentries=1 etime=0
[12/Mar/2012:14:36:50 -0500] conn=14730 op=7 BIND
dn="uid=xxxxxx,ou=users,ou=external,dc=domain,dc=com" method=128 version=3
[12/Mar/2012:14:36:50 -0500] conn=14730 op=7 RESULT err=0 tag=97
nentries=0 etime=0 dn="uid=xxxxxx,ou=users,ou=external,dc=domain,dc=com"
[12/Mar/2012:14:36:50 -0500] conn=14730 op=8 BIND
dn="uid=theManager,dc=domain,dc=com" method=128 version=3
[12/Mar/2012:14:36:50 -0500] conn=14730 op=8 RESULT err=0 tag=97
nentries=0 etime=0 dn="uid=theManager,dc=domain,dc=com"
[12/Mar/2012:14:37:02 -0500] conn=14730 op=9 SRCH
base="ou=groups,ou=external,dc=domain,dc=com" scope=2
filter="(&(cn=domain)(|(objectClass=groupOfURLs)(objectClass=groupOfNames)))"
attrs="1.1"
[12/Mar/2012:14:37:02 -0500] conn=14730 op=9 RESULT err=0 tag=101
nentries=1 etime=0
[12/Mar/2012:14:37:02 -0500] conn=14730 op=10 SRCH
base="ou=groups,ou=external,dc=domain,dc=com" scope=2
filter="(&(member=cn=domain,ou=groups,ou=external,dc=domain,dc=com)(objectClass=groupOfNames))"
attrs="cn"
[12/Mar/2012:14:37:02 -0500] conn=14730 op=10 RESULT err=0 tag=101
nentries=0 etime=0
[12/Mar/2012:14:39:35 -0500] conn=14730 op=11 SRCH
base="ou=users,ou=external,dc=domain,dc=com" scope=2
filter="(&(uid=xxxxxx)(objectClass=inetUser))" attrs="1.1"
[12/Mar/2012:14:39:35 -0500] conn=14730 op=11 RESULT err=0 tag=101
nentries=1 etime=0
[12/Mar/2012:14:39:35 -0500] conn=14730 op=12 BIND
dn="uid=xxxxxx,ou=users,ou=external,dc=domain,dc=com" method=128 version=3
[12/Mar/2012:14:39:35 -0500] conn=14730 op=12 RESULT err=0 tag=97
nentries=0 etime=0 dn="uid=xxxxxx,ou=users,ou=external,dc=domain,dc=com"
[12/Mar/2012:14:39:35 -0500] conn=14730 op=13 BIND
dn="uid=theManager,dc=domain,dc=com" method=128 version=3
[12/Mar/2012:14:39:35 -0500] conn=14730 op=13 RESULT err=0 tag=97
nentries=0 etime=0 dn="uid=theManager,dc=domain,dc=com"
[12/Mar/2012:14:40:23 -0500] conn=14730 op=14 SRCH
base="ou=groups,ou=external,dc=domain,dc=com" scope=2
filter="(&(cn=domain)(|(objectClass=groupOfURLs)(objectClass=groupOfNames)))"
attrs="1.1"
[12/Mar/2012:14:40:23 -0500] conn=14730 op=14 RESULT err=0 tag=101
nentries=1 etime=0
[12/Mar/2012:14:40:23 -0500] conn=14730 op=15 SRCH
base="ou=groups,ou=external,dc=domain,dc=com" scope=2
filter="(&(member=cn=domain,ou=groups,ou=external,dc=domain,dc=com)(objectClass=groupOfNames))"
attrs="cn"
[12/Mar/2012:14:40:23 -0500] conn=14730 op=15 RESULT err=0 tag=101
nentries=0 etime=0
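To put a number on how long a connection like this sat unused before the reset, the access log can be summarised per connection. The following is only a rough Python sketch: the function name connection_summary and the dictionary keys are made up for illustration, and it assumes each log entry is on a single line (as in the real access log, not wrapped as in this mail) and that month names are in English.

```python
import re
from datetime import datetime

# Timestamp and connection number at the start of an access log line.
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\] conn=(?P<conn>\d+) (?P<rest>.*)$")
TS_FORMAT = "%d/%b/%Y:%H:%M:%S %z"  # e.g. 12/Mar/2012:14:33:06 -0500


def connection_summary(lines, conn_id):
    """Summarise open, last RESULT and close times for one conn number."""
    summary = {"opened": None, "last_result": None, "closed": None,
               "close_reason": None, "idle_before_close": None}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match or int(match.group("conn")) != conn_id:
            continue
        ts = datetime.strptime(match.group("ts"), TS_FORMAT)
        rest = match.group("rest")
        if " closed " in rest:
            # e.g. "op=-1 fd=1093 closed error 104 (...)"
            summary["closed"] = ts
            summary["close_reason"] = rest.split(" closed ", 1)[1]
        elif rest.startswith("fd="):
            # e.g. "fd=1093 slot=1093 connection from ..."
            summary["opened"] = ts
        elif " RESULT " in rest:
            summary["last_result"] = ts
    if summary["closed"] and summary["last_result"]:
        summary["idle_before_close"] = summary["closed"] - summary["last_result"]
    return summary


# Three of the entries quoted above, unwrapped onto single lines.
sample = [
    "[12/Mar/2012:14:33:06 -0500] conn=14730 fd=1093 slot=1093 connection from 10.1.xx.xx to 10.1.xx.xx",
    "[12/Mar/2012:14:40:23 -0500] conn=14730 op=15 RESULT err=0 tag=101 nentries=0 etime=0",
    "[14/Mar/2012:10:23:01 -0500] conn=14730 op=-1 fd=1093 closed error 104 (Connection reset by peer) - TCP connection reset by peer.",
]
summary = connection_summary(sample, 14730)
print(summary["idle_before_close"], "|", summary["close_reason"])
```

Running the same function over every conn number that ends in "closed error 104" would show whether the resets always follow a long idle period.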
The scenario seems to be that the DS works fine after a restart until
it runs out of unused connections and/or file descriptors (max FDs =
8192). Once it starts recycling connections and/or file descriptors,
the 104 errors appear more often in the access logs and we get more
authentication errors. We suspect that the original connection was
never terminated correctly, but we do not know whether the application
or a DS setting is at fault.

We fixed some of these sorts of connection issues in 1.2.9.9. I suggest
upgrading to that release.
In the meantime, you could try lowering nsslapd-idletimeout and/or
nsslapd-ioblocktimeout:
http://docs.redhat.com/docs/en-US/Red_Hat_Directory_Server/9.0/html/Confi...
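As a rough illustration of the change suggested above, both attributes live on the cn=config entry and can be replaced with an LDIF modify record. The Python sketch below only builds that record as text. The helper name build_timeout_ldif and the values 300 and 30000 are placeholders, not recommendations, and the units (seconds for nsslapd-idletimeout, milliseconds for nsslapd-ioblocktimeout) are my reading of the configuration reference, so please check them against the documentation for your version.

```python
def build_timeout_ldif(idle_timeout_s, ioblock_timeout_ms):
    """Build an LDIF modify record for the two cn=config timeout attributes."""
    return "\n".join([
        "dn: cn=config",
        "changetype: modify",
        "replace: nsslapd-idletimeout",
        f"nsslapd-idletimeout: {idle_timeout_s}",   # assumed to be seconds
        "-",
        "replace: nsslapd-ioblocktimeout",
        f"nsslapd-ioblocktimeout: {ioblock_timeout_ms}",  # assumed to be milliseconds
        "",
    ])


# Placeholder values for illustration only.
ldif = build_timeout_ldif(300, 30000)
print(ldif)
```

The printed record could then be saved to a file and applied with ldapmodify while bound as the directory manager.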
Our servers have been tuned according to the wiki doc at
http://directory.fedoraproject.org/wiki/Performance_Tuning#Linux
We have set our idle “timeout” to 60 seconds and search “timelimit” to
120 seconds with no change in behavior.
Watching netstat -nap | grep slapd shows established connections that
never drop off; their number just keeps growing.
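To see which client the growing connections come from, the netstat output can be grouped by peer address and TCP state. This is a minimal Python sketch, assuming the usual Linux netstat -nap column layout (protocol, queues, local address, foreign address, state, PID/program). The function name count_slapd_connections and the addresses in the sample are made up for illustration.

```python
from collections import Counter


def count_slapd_connections(netstat_output):
    """Count slapd TCP connections per (peer host, state) from netstat -nap text."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expect: proto recv-q send-q local foreign state pid/program
        if len(fields) < 7 or not fields[0].startswith("tcp"):
            continue
        if "slapd" not in fields[6]:
            continue
        peer_host = fields[4].rsplit(":", 1)[0]  # drop the port
        counts[(peer_host, fields[5])] += 1
    return counts


# Hypothetical sample output; the addresses and PID are invented.
sample_output = """\
tcp        0      0 0.0.0.0:389        0.0.0.0:*          LISTEN      2231/ns-slapd
tcp        0      0 10.1.0.5:389       10.1.0.9:51234     ESTABLISHED 2231/ns-slapd
tcp        0      0 10.1.0.5:389       10.1.0.9:51240     ESTABLISHED 2231/ns-slapd
tcp        0      0 10.1.0.5:22        10.1.0.7:40022     ESTABLISHED 1010/sshd
"""
counts = count_slapd_connections(sample_output)
for (peer, state), n in counts.most_common():
    print(peer, state, n)
```

Comparing these counts a few hours apart would show whether a single application host is holding the connections open.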
Any help would be greatly appreciated.
*Nicholas J Alther*
Sr. Software Developer/Analyst
Black Hills Corporation
Phone: 605.721.2158
Cell: 605.593.1899