Hi

Attached is the gdb output from both servers, captured with the following command.

gdb -ex 'set confirm off' -ex 'set pagination off' -ex 'thread apply all bt full' -ex 'quit' /usr/sbin/ns-slapd `pidof ns-slapd`
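
For repeated captures, the same invocation could be wrapped in a small helper. This is only a sketch: the function names and the output file naming are hypothetical, and it assumes gdb and the ns-slapd debuginfo are installed on the server.

```python
import subprocess
import time


def build_gdb_command(pid, binary="/usr/sbin/ns-slapd"):
    """Return the argv list for the batch gdb invocation quoted above."""
    return [
        "gdb",
        "-ex", "set confirm off",
        "-ex", "set pagination off",
        "-ex", "thread apply all bt full",
        "-ex", "quit",
        binary,
        str(pid),
    ]


def capture_stacktrace(pid, out_dir="."):
    """Run gdb against pid and save its output to a timestamped file.

    The file name pattern is an assumption made for this example.
    """
    stamp = time.strftime("%Y%m%d-%H%M%S")
    out_path = "%s/ns-slapd-stack-%d-%s.txt" % (out_dir, pid, stamp)
    with open(out_path, "w") as out:
        # Send both stdout and stderr of gdb to the same file.
        subprocess.call(build_gdb_command(pid), stdout=out,
                        stderr=subprocess.STDOUT)
    return out_path
```

Calling capture_stacktrace() with the pid reported by `pidof ns-slapd` on each server would leave one file per capture, which makes it easier to compare successive hangs.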

The 389 DS version is: 389-ds-base-1.3.3.8-1.fc21.x86_64

Any help is appreciated. This has been happening in our setup every 10-14 days.

Thanks.
--Prashant

On 3 September 2015 at 10:42, Prashant Bapat <prashant@apigee.com> wrote:
No, nothing much in the error log.

Let me wait for the next occurrence and get the gdb output.

On 3 September 2015 at 22:11, Rich Megginson <rmeggins@redhat.com> wrote:
On 09/03/2015 09:02 AM, Prashant Bapat wrote:
Rich, 

Version is 389-ds-base-1.3.3.8-1.fc21.x86_64

Below is the "ldapsearch" command that works on the LDAP server.

ldapsearch -x -b "uid=testuser,cn=users,cn=accounts,dc=example,dc=com"

In python this would be

conn = ldap.initialize("ldap://localhost") [1]
conn.simple_bind_s() [2]
response = conn.search_s("uid=testuser,cn=users,cn=accounts,dc=example,dc=com",ldap.SCOPE_BASE) [3]

[1] is different than "ipa.example.com" - so one possibility is that DNS is not working correctly due to DS - but it depends on where the script is hung
[2] is the same - anonymous bind
[3] assuming uid is "testuser", then the base is the same in your python script - however, in your python script, you are asking for a specific attribute list
["ipaSshPubKey", "ipaSshSigTimestamp", "loginshell"] - not sure why that would make a difference

So, inconclusive.  Will need to see the stacktrace from gdb when the server is hung.
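
Since it depends on where the script is hung, one way to narrow that down is to time each of the three steps separately and put a timeout on the connection. A minimal sketch, assuming python-ldap is installed; timed_step and probe are hypothetical helper names, and the 10 second timeout is an arbitrary choice.

```python
import time


def timed_step(label, func, *args, **kwargs):
    """Call func, print how long it took, and return (result, error)."""
    start = time.time()
    try:
        result, error = func(*args, **kwargs), None
    except Exception as exc:  # report the failure instead of aborting
        result, error = None, exc
    elapsed = time.time() - start
    status = "ok" if error is None else repr(error)
    print("%-10s %.3fs %s" % (label, elapsed, status))
    return result, error


def probe(uri, base, attrs=None, timeout=10):
    """Run initialize, anonymous bind and a base search, timing each step."""
    import ldap  # python-ldap, imported here so timed_step works without it

    conn, err = timed_step("initialize", ldap.initialize, uri)
    if err is not None:
        return False
    # Give up after `timeout` seconds instead of blocking forever.
    conn.set_option(ldap.OPT_NETWORK_TIMEOUT, timeout)
    conn.set_option(ldap.OPT_TIMEOUT, timeout)
    _, err = timed_step("bind", conn.simple_bind_s)
    if err is not None:
        return False
    _, err = timed_step("search", conn.search_s, base, ldap.SCOPE_BASE,
                        "(objectClass=*)", attrs)
    return err is None
```

Running probe() once against "ldap://localhost" and once against "ldap://ipa.example.com", with and without the attribute list, should show which of the differences noted in [1] and [3] matters, if any.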

Also, do you have any errors in the errors log?


Below is an excerpt of the python script.

#!/usr/bin/env python
import sys
import ldap
from ldap import LDAPError

SUFFIX = "dc=example,dc=com"
LDAPSERVER = "ipa.example.com"

# Expect exactly one argument: the username to look up.
if len(sys.argv) != 2:
    sys.exit("Wrong arguments. Only argument should be the username")

uid = sys.argv[1]
search = "uid=%s,cn=users,cn=accounts,%s" % (uid, SUFFIX)

try:
    conn = ldap.initialize("ldap://%s" % (LDAPSERVER))
    conn.simple_bind_s()
    response = conn.search_s(search, ldap.SCOPE_BASE, "(objectClass=*)", ["ipaSshPubKey", "ipaSshSigTimestamp", "loginshell"])
except LDAPError as e:
    print(e)
    print("Error getting info from LDAP. Either wrong username or issues with LDAP server")
    sys.exit(-1)



On 3 September 2015 at 19:17, Rich Megginson <rmeggins@redhat.com> wrote:
On 09/02/2015 09:45 PM, Prashant Bapat wrote:
Hi,

We have been using 389-ds as part of FreeIPA. In one of our environments, we have two 389-ds installations with replication.

What version?  rpm -q 389-ds-base


Randomly, the 389-ds on either of them freezes completely, and there is a high number of connections in CLOSE_WAIT on port tcp/389.

http://www.port389.org/docs/389ds/FAQ/faq.html#debugging-hangs
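
To put a number on the CLOSE_WAIT build-up over time, the kernel's connection table can be parsed directly. A minimal sketch, assuming a Linux host; count_close_wait is a hypothetical helper, and it relies on /proc/net/tcp encoding the local port in hex and the CLOSE_WAIT state as 08.

```python
CLOSE_WAIT = "08"  # TCP state code for CLOSE_WAIT in /proc/net/tcp


def count_close_wait(proc_net_tcp_text, port=389):
    """Count sockets in CLOSE_WAIT whose local port matches `port`.

    proc_net_tcp_text is the full text of /proc/net/tcp or /proc/net/tcp6.
    """
    count = 0
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        # fields[1] is "HEXADDR:HEXPORT" for the local end of the socket.
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        if local_port == port and fields[3] == CLOSE_WAIT:
            count += 1
    return count
```

Feeding it the contents of both /proc/net/tcp and /proc/net/tcp6 every minute or so and logging the totals would show whether the count climbs gradually before a freeze or jumps all at once.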


The only way to recover from this situation is to either reboot or "kill -9" the ns-slapd process. Graceful restarts get stuck indefinitely.

One curious thing: when this happens, a search using the "ldapsearch" command seems to work, but a search using a python-ldap client does not. FreeIPA does not work either.

Can you be more specific?  What is the exact ldapsearch command line, and can you post/pastebin an excerpt of your python-ldap script?


Any pointers on troubleshooting this would be appreciated. 

Thanks.
--Prashant


--
389 users mailing list
389-users@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/389-users

