I am looking for help isolating a recurring problem with our 389 DS version 1.2.8.3
installation. We use the directory server to authenticate users of our website. Every
couple of days, our website login application starts reporting that user authentication
timed out, and the timeouts become more frequent as time passes. Our current fix is to
restart the directory server, which resolves the problem for a couple of days before the
timeouts start again. I traced one application timeout back to the DS access logs and
found the following entry at the same time:
[14/Mar/2012:10:23:01 -0500] conn=14730 op=-1 fd=1093 closed error 104 (Connection reset
by peer) - TCP connection reset by peer.
I looked through the older logs, and the only other time this conn/fd was used was two
days earlier. Here are the access log entries:
[12/Mar/2012:14:33:06 -0500] conn=14730 fd=1093 slot=1093 connection from 10.1.xx.xx to
10.1.xx.xx
[12/Mar/2012:14:33:06 -0500] conn=14730 op=0 BIND dn="uid,dc=domain,dc=com"
method=128 version=3
[12/Mar/2012:14:33:06 -0500] conn=14730 op=0 RESULT err=0 tag=97 nentries=0 etime=0
dn="uid=theManager,dc=domain,dc=com"
[12/Mar/2012:14:33:06 -0500] conn=14730 op=1 SRCH
base="ou=users,ou=external,dc=domain,dc=com" scope=2
filter="(&(uid=xxxxx)(objectClass=inetUser))" attrs="1.1"
[12/Mar/2012:14:33:06 -0500] conn=14730 op=1 RESULT err=0 tag=101 nentries=1 etime=0
[12/Mar/2012:14:33:06 -0500] conn=14730 op=2 BIND
dn="uid=xxxxx,ou=users,ou=external,dc=domain,dc=com" method=128 version=3
[12/Mar/2012:14:33:06 -0500] conn=14730 op=2 RESULT err=0 tag=97 nentries=0 etime=0
dn="uid=xxxxx,ou=users,ou=external,dc=domain,dc=com"
[12/Mar/2012:14:33:06 -0500] conn=14730 op=3 BIND dn="uid,dc=domain,dc=com"
method=128 version=3
[12/Mar/2012:14:33:06 -0500] conn=14730 op=3 RESULT err=0 tag=97 nentries=0 etime=0
dn="uid=theManager,dc=domain,dc=com"
[12/Mar/2012:14:35:20 -0500] conn=14730 op=4 SRCH
base="ou=groups,ou=external,dc=domain,dc=com" scope=2
filter="(&(cn=domain)(|(objectClass=groupOfURLs)(objectClass=groupOfNames)))"
attrs="1.1"
[12/Mar/2012:14:35:20 -0500] conn=14730 op=4 RESULT err=0 tag=101 nentries=1 etime=0
[12/Mar/2012:14:35:20 -0500] conn=14730 op=5 SRCH
base="ou=groups,ou=external,dc=domain,dc=com" scope=2
filter="(&(member=cn=domain,ou=groups,ou=external,dc=domain,dc=com)(objectClass=groupOfNames))"
attrs="cn"
[12/Mar/2012:14:35:20 -0500] conn=14730 op=5 RESULT err=0 tag=101 nentries=0 etime=0
[12/Mar/2012:14:36:50 -0500] conn=14730 op=6 SRCH
base="ou=users,ou=external,dc=domain,dc=com" scope=2
filter="(&(uid=xxxxxx)(objectClass=inetUser))" attrs="1.1"
[12/Mar/2012:14:36:50 -0500] conn=14730 op=6 RESULT err=0 tag=101 nentries=1 etime=0
[12/Mar/2012:14:36:50 -0500] conn=14730 op=7 BIND
dn="uid=xxxxxx,ou=users,ou=external,dc=domain,dc=com" method=128 version=3
[12/Mar/2012:14:36:50 -0500] conn=14730 op=7 RESULT err=0 tag=97 nentries=0 etime=0
dn="uid=xxxxxx,ou=users,ou=external,dc=domain,dc=com"
[12/Mar/2012:14:36:50 -0500] conn=14730 op=8 BIND
dn="uid=theManager,dc=domain,dc=com" method=128 version=3
[12/Mar/2012:14:36:50 -0500] conn=14730 op=8 RESULT err=0 tag=97 nentries=0 etime=0
dn="uid=theManager,dc=domain,dc=com"
[12/Mar/2012:14:37:02 -0500] conn=14730 op=9 SRCH
base="ou=groups,ou=external,dc=domain,dc=com" scope=2
filter="(&(cn=domain)(|(objectClass=groupOfURLs)(objectClass=groupOfNames)))"
attrs="1.1"
[12/Mar/2012:14:37:02 -0500] conn=14730 op=9 RESULT err=0 tag=101 nentries=1 etime=0
[12/Mar/2012:14:37:02 -0500] conn=14730 op=10 SRCH
base="ou=groups,ou=external,dc=domain,dc=com" scope=2
filter="(&(member=cn=domain,ou=groups,ou=external,dc=domain,dc=com)(objectClass=groupOfNames))"
attrs="cn"
[12/Mar/2012:14:37:02 -0500] conn=14730 op=10 RESULT err=0 tag=101 nentries=0 etime=0
[12/Mar/2012:14:39:35 -0500] conn=14730 op=11 SRCH
base="ou=users,ou=external,dc=domain,dc=com" scope=2
filter="(&(uid=xxxxxx)(objectClass=inetUser))" attrs="1.1"
[12/Mar/2012:14:39:35 -0500] conn=14730 op=11 RESULT err=0 tag=101 nentries=1 etime=0
[12/Mar/2012:14:39:35 -0500] conn=14730 op=12 BIND
dn="uid=xxxxxx,ou=users,ou=external,dc=domain,dc=com" method=128 version=3
[12/Mar/2012:14:39:35 -0500] conn=14730 op=12 RESULT err=0 tag=97 nentries=0 etime=0
dn="uid=xxxxxx,ou=users,ou=external,dc=domain,dc=com"
[12/Mar/2012:14:39:35 -0500] conn=14730 op=13 BIND
dn="uid=theManager,dc=domain,dc=com" method=128 version=3
[12/Mar/2012:14:39:35 -0500] conn=14730 op=13 RESULT err=0 tag=97 nentries=0 etime=0
dn="uid=theManager,dc=domain,dc=com"
[12/Mar/2012:14:40:23 -0500] conn=14730 op=14 SRCH
base="ou=groups,ou=external,dc=domain,dc=com" scope=2
filter="(&(cn=domain)(|(objectClass=groupOfURLs)(objectClass=groupOfNames)))"
attrs="1.1"
[12/Mar/2012:14:40:23 -0500] conn=14730 op=14 RESULT err=0 tag=101 nentries=1 etime=0
[12/Mar/2012:14:40:23 -0500] conn=14730 op=15 SRCH
base="ou=groups,ou=external,dc=domain,dc=com" scope=2
filter="(&(member=cn=domain,ou=groups,ou=external,dc=domain,dc=com)(objectClass=groupOfNames))"
attrs="cn"
[12/Mar/2012:14:40:23 -0500] conn=14730 op=15 RESULT err=0 tag=101 nentries=0 etime=0
The pattern seems to be that the DS works fine after a restart until it runs out of
unused connections and/or file descriptors (max FDs = 8192). Once it starts recycling
connections and/or file descriptors, the 104 errors appear more often in the access logs
and we get more authentication errors. We suspect that the original connection was never
terminated correctly, but we don't know whether the application or a DS setting is at
fault.
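
To check that suspicion across the whole access log rather than one connection, something
like the following Python sketch could help. It assumes each access log entry sits on a
single line in the file (the entries above are only wrapped by the mail client) and that
close entries contain "fd=N closed", as in the 104 entry quoted earlier. The function names
are made up for illustration and are not part of 389 DS.

```python
import re
from datetime import datetime

# Timestamp and connection number at the start of an access log entry.
ENTRY = re.compile(r"^\[(?P<ts>[^\]]+)\] conn=(?P<conn>\d+) (?P<rest>.*)$")
# Assumption: close entries look like "... fd=1093 closed ...".
CLOSED = re.compile(r"\bfd=\d+ closed\b")
TS_FORMAT = "%d/%b/%Y:%H:%M:%S %z"  # e.g. 12/Mar/2012:14:33:06 -0500


def summarize_connections(lines):
    """Map conn number -> open, last-operation and close timestamps."""
    conns = {}
    for line in lines:
        match = ENTRY.match(line.strip())
        if not match:
            continue  # skip wrapped fragments and non-entry lines
        ts = datetime.strptime(match.group("ts"), TS_FORMAT)
        info = conns.setdefault(
            int(match.group("conn")),
            {"opened": None, "last_op": None, "closed": None},
        )
        rest = match.group("rest")
        if CLOSED.search(rest):
            info["closed"] = ts
        elif " connection from " in rest:
            info["opened"] = ts
        elif rest.startswith("op="):
            info["last_op"] = ts
    return conns


def idle_before_close(info):
    """Time between the last logged operation and the close, if both are known."""
    if info["last_op"] is None or info["closed"] is None:
        return None
    return info["closed"] - info["last_op"]


def never_closed(conns):
    """Connection numbers that were opened but have no close entry."""
    return sorted(c for c, i in conns.items() if i["opened"] and not i["closed"])
```

Feeding it the lines of the access log and printing `never_closed(...)` would list the
connections the server still considers open, and `idle_before_close(...)` would show how
long a connection sat unused before it was reset.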
Our servers have been tuned according to the wiki doc at
http://directory.fedoraproject.org/wiki/Performance_Tuning#Linux
We have also set the idle "timeout" to 60 seconds and the search "timelimit" to
120 seconds, with no change in behavior.
Watching netstat -nap | grep slapd shows established connections that never drop off;
the count just keeps growing.
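
To see which client is holding those connections, the netstat output could be grouped by
remote address. The Python sketch below assumes the usual Linux `netstat -nap` column
layout (proto, recv-q, send-q, local address, foreign address, state, PID/program); the
function name and the default program substring "slapd" are illustrative choices.

```python
from collections import Counter


def count_established_by_peer(netstat_output, program="slapd"):
    """Count ESTABLISHED TCP connections per remote address for one program."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: proto recv-q send-q local foreign state pid/program
        if len(fields) < 7 or not fields[0].startswith("tcp"):
            continue
        state, owner = fields[5], fields[6]
        if state != "ESTABLISHED" or program not in owner:
            continue
        peer_host = fields[4].rsplit(":", 1)[0]  # drop the port
        counts[peer_host] += 1
    return counts
```

Running this on successive netstat snapshots and comparing the counts would show whether
the growth comes from a single application host.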
Any help would be greatly appreciated.
Nicholas J Alther
Sr. Software Developer/Analyst
Black Hills Corporation
Phone: 605.721.2158
Cell: 605.593.1899