[389-users] 389 DS is resetting connections

Diego Woitasen diego at woitasen.com.ar
Mon Feb 7 16:41:15 UTC 2011


Hi,
 I have 389 DS 1.2.7.5 running on Debian Squeeze. It was working fine,
but over the last few days the process has started to hang very often.
When I restart the service, it works fine for a few minutes and then
hangs again. The process is still running and accepts connections, but
then resets them.

The only error message that I see is from ldapsearch:

ldap_start_tls: Can't contact LDAP server (-1)
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)

I ran ldapsearch under strace; these are the last lines:

socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 3
fcntl(3, F_SETFD, FD_CLOEXEC)           = 0
setsockopt(3, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0
setsockopt(3, SOL_TCP, TCP_NODELAY, [1], 4) = 0
connect(3, {sa_family=AF_INET, sin_port=htons(389), sin_addr=inet_addr("140.191.48.138")}, 16) = 0
write(3, "0\35\2\1\1w\30\200\0261.3.6.1.4.1.1466.20037", 31) = 31
poll([{fd=3, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, -1) = 1 ([{fd=3, revents=POLLIN|POLLERR|POLLHUP}])
read(3, 0x11ed85f, 8)                   = -1 ECONNRESET (Connection reset by peer)
write(2, "ldap_start_tls: Can't contact LD"..., 47ldap_start_tls: Can't contact LDAP server (-1)
) = 47
write(2, "ldap_sasl_bind(SIMPLE): Can't co"..., 55ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
) = 55
exit_group(-1)
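
For anyone reading the trace, the 31 bytes in the write(3, ...) call can
be decoded by hand. A minimal Python sketch (the helper names read_tlv
and decode_extended_request are made up for illustration, and it only
handles the short BER lengths this message uses) suggests they are an
LDAP ExtendedRequest with message ID 1 whose request name is the StartTLS
OID from RFC 4511, so the reset arrives in reply to the very first
request on the connection:

```python
# Sketch: decode the 31 bytes that ldapsearch wrote in the trace above.
# Only short-form BER lengths (< 128) are handled, which is all this
# message needs. Helper names are illustrative, not from any library.

# The bytes exactly as strace printed them (octal escapes included).
REQUEST = b"0\35\2\1\1w\30\200\0261.3.6.1.4.1.1466.20037"


def read_tlv(buf, pos=0):
    """Return (tag, value, next_pos) for one BER element at buf[pos]."""
    tag = buf[pos]
    length = buf[pos + 1]
    if length & 0x80:
        raise ValueError("long-form length not handled in this sketch")
    start = pos + 2
    return tag, buf[start:start + length], start + length


def decode_extended_request(raw):
    """Decode an LDAPMessage carrying an ExtendedRequest (RFC 4511)."""
    tag, message, end = read_tlv(raw)
    assert tag == 0x30 and end == len(raw)  # outer SEQUENCE spans all bytes
    tag, msg_id, pos = read_tlv(message)
    assert tag == 0x02                      # INTEGER messageID
    tag, ext_req, _ = read_tlv(message, pos)
    assert tag == 0x77                      # [APPLICATION 23] ExtendedRequest
    tag, oid, _ = read_tlv(ext_req)
    assert tag == 0x80                      # [0] requestName
    return int.from_bytes(msg_id, "big"), oid.decode("ascii")


if __name__ == "__main__":
    print(decode_extended_request(REQUEST))
```

Running it prints (1, '1.3.6.1.4.1.1466.20037').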

I also tried tracing ns-slapd, but I don't see anything unusual (except
the ENOTCONN error from getpeername(), and that is on a different FD):

2007  accept(6, {sa_family=AF_INET, sin_port=htons(53395), sin_addr=inet_addr("140.191.48.138")}, [16]) = 34
2007  fcntl(34, F_GETFL)                = 0x2 (flags O_RDWR)
2007  fcntl(34, F_SETFL, O_RDWR|O_NONBLOCK) = 0
2007  fcntl(34, F_DUPFD, 64)            = 64
2007  close(34)                         = 0
2007  setsockopt(64, SOL_TCP, TCP_NODELAY, [0], 4) = 0
2007  getpeername(64, {sa_family=AF_INET, sin_port=htons(53395), sin_addr=inet_addr("140.191.48.138")}, [16]) = 0
2007  getsockname(64, {sa_family=AF_INET, sin_port=htons(389), sin_addr=inet_addr("140.191.48.138")}, [16]) = 0
2007  getpeername(7, 0x7fff1acd6e90, [112]) = -1 ENOTCONN (Transport endpoint is not connected)
2007  poll([{fd=22, events=POLLIN}, {fd=6, events=POLLIN}, {fd=7, events=POLLIN}, {fd=-1}, {fd=64, events=POLLIN}], 5, 250) = 1 ([{fd=64, revents=POLLIN}])
2007  close(64)                         = 0
2007  getpeername(7, 0x7fff1acd6e90, [112]) = -1 ENOTCONN (Transport endpoint is not connected)
2007  poll([{fd=22, events=POLLIN}, {fd=6, events=POLLIN}, {fd=7, events=POLLIN}, {fd=-1}], 4, 250 <unfinished ...>
2010  <... select resumed> )            = 0 (Timeout)
2010  select(0, NULL, NULL, NULL, {0, 100000} <unfinished ...>
2012  <... select resumed> )            = 0 (Timeout)
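
One way to make an excerpt like the one above easier to scan is to follow
each accepted descriptor from accept() to close() and note whether any
read-like call touched it in between. The Python sketch below is only an
illustration: the function name unread_closes and the regular expression
are assumptions, the parsing covers just the strace line shapes seen
here, and TRACE holds a few lines of the ns-slapd excerpt with the line
wrapping undone.

```python
import re

# A few lines from the ns-slapd excerpt above, with the wrapping undone.
TRACE = """\
2007  accept(6, {sa_family=AF_INET, sin_port=htons(53395), sin_addr=inet_addr("140.191.48.138")}, [16]) = 34
2007  fcntl(34, F_GETFL)                = 0x2 (flags O_RDWR)
2007  fcntl(34, F_SETFL, O_RDWR|O_NONBLOCK) = 0
2007  fcntl(34, F_DUPFD, 64)            = 64
2007  close(34)                         = 0
2007  setsockopt(64, SOL_TCP, TCP_NODELAY, [0], 4) = 0
2007  poll([{fd=22, events=POLLIN}, {fd=6, events=POLLIN}, {fd=7, events=POLLIN}, {fd=-1}, {fd=64, events=POLLIN}], 5, 250) = 1 ([{fd=64, revents=POLLIN}])
2007  close(64)                         = 0
"""

# pid, syscall name, argument text, leading integer of the return value.
CALL = re.compile(r"^(\d+)\s+(\w+)\((.*)\)\s+= (-?\d+)")
READ_CALLS = ("read", "recv", "recvfrom", "recvmsg")


def unread_closes(trace):
    """Return fds whose close() ended an accepted connection never read from."""
    conn_of = {}   # fd -> connection id (the fd returned by accept)
    reads = {}     # connection id -> number of read-like calls seen
    flagged = []
    for line in trace.splitlines():
        m = CALL.match(line)
        if not m:
            continue  # e.g. "<... select resumed>" continuation lines
        name, args, ret = m.group(2), m.group(3), int(m.group(4))
        first = args.split(",")[0].strip()
        fd = int(first) if first.isdigit() else None
        if name == "accept" and ret >= 0:
            conn_of[ret] = ret
            reads[ret] = 0
        elif name == "fcntl" and "F_DUPFD" in args and fd in conn_of and ret >= 0:
            conn_of[ret] = conn_of[fd]       # duplicate refers to same connection
        elif name in READ_CALLS and fd in conn_of:
            reads[conn_of[fd]] += 1
        elif name == "close" and fd in conn_of:
            conn = conn_of.pop(fd)
            # Flag only when the last descriptor for the connection goes away.
            if conn not in conn_of.values() and reads[conn] == 0:
                flagged.append(fd)
    return flagged


if __name__ == "__main__":
    print(unread_closes(TRACE))
```

On these lines it returns [64]: fd 64 is reported readable by poll() and
then closed with no read on it in between. That only reflects the lines
shown; a read made outside the excerpt would not be counted.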


Any hints that would help find the problem? I tried different slapd log
levels, but I don't see anything unusual. I don't expect a magical
solution, only a hint to discover what's happening.

Regards,
 Diego

-- 
Diego Woitasen


