[389-users] slapd didn't close connection and get into CLOSE_WAIT state
Chun Tat David Chu
beyonddc.storage at gmail.com
Mon Feb 22 19:47:44 UTC 2010
Hi Rich,
I downloaded OpenJDK source code, modified the
com.sun.jndi.ldap.NamingEventNotifier JAVA class to set the control as
NONCRITICAL instead of CRITICAL, and compiled the OpenJDK.
I reran my test, and as you expected, I no longer see
"-get_ldapmessage_controls failed: 12 (Unavailable critical extension)
(op=Abandon)" in the error log but the connection in CLOSE_WAIT state still
persist after my JAVA application terminated.
Do you have any other recommendation? Do you recommend me to file a bug
report for 389-ds?
Thanks,
David
On Fri, Feb 19, 2010 at 10:32 PM, Rich Megginson <rmeggins at redhat.com>wrote:
> Chun Tat David Chu wrote:
> > Hi Rich,
> >
> > I very appreciate your help.
> >
> > >> Does the client send the Abandon request to OpenDS? If so, does
> > OpenDS abandon the operation, or complain about the control?
> > The client does send the abandon request to OpenDS according to the
> > log file, and I think it processed the request peacefully. If you
> > look at the 3 debug lines below. The first 2 lines showed when OpenDS
> > received the abandon request, and the last line is when my app
> > requested to disconnect. We'll pay attention on the 2nd line only.
> >
> > According to OpenDS Wiki
> > (https://www.opends.org/wiki/page/DefinitionResultCode), result=118
> > means "Canceled", and it is an indication that the server canceled
> > processing on the request at the request of the client.
> >
> > [19/Feb/2010:18:57:00 -0500] ABANDON REQ conn=0 op=13 msgID=14
> > idToAbandon=3
> > [19/Feb/2010:18:57:00 -0500] ABANDON RES conn=0 op=13 msgID=14
> > result=118 etime=1
> > [19/Feb/2010:18:57:13 -0500] DISCONNECT conn=0 reason="Client Disconnect"
> >
> > I am still playing with the OpenDS debug settings to see if I can get
> > more out of it. I also plan to download OpenJDK and modify JNDI code
> > to set NONCRITICAL instead of CRITICAL and give that a try.
> Ok. Then I suppose OpenDS supports the control, or OpenDS does not
> error if an unsupported control is sent with the Abandon operation.
> >
> > - David
> >
> >
> >
> > On Fri, Feb 19, 2010 at 5:26 PM, Rich Megginson <rmeggins at redhat.com
> > <mailto:rmeggins at redhat.com>> wrote:
> >
> > Chun Tat David Chu wrote:
> > > Hi Rich,
> > >
> > > Thank you for your quick response!
> > >
> > > I'll use the information you provided and file a bug on JNDI.
> > >
> > > Do you think this "UnavailableCriticalExtension" is the cause of
> why
> > > connection not being closed successfully?
> > Could be - if it attempts to Abandon the persistent search, but the
> > Abandon request doesn't go through because of the control, the
> > connection will remain open
> > >
> > > I am trying to figure out which end is the issue (Client or
> Server).
> > > For testing purpose, I just downloaded OpenDS and installed it.
> > I ran
> > > the same test scenario for about 20 times, and I don't see any
> > > CLOSE_WAIT connection on the OpenDS process.
> > Does the client send the Abandon request to OpenDS? If so, does
> > OpenDS
> > abandon the operation, or complain about the control?
> > >
> > > - David
> > >
> > > On Fri, Feb 19, 2010 at 4:39 PM, Rich Megginson
> > <rmeggins at redhat.com <mailto:rmeggins at redhat.com>
> > > <mailto:rmeggins at redhat.com <mailto:rmeggins at redhat.com>>> wrote:
> > >
> > > Chun Tat David Chu wrote:
> > > > Hi Rich,
> > > >
> > > > That was quick response! Thank you!
> > > >
> > > > >> That seems like a bug in JNDI - can you file a bug?
> > > > Yes I can do that but I'll need more information to backup
> > my case.
> > > > For example, which RFC stated that application shouldn't
> > send a LDAP
> > > > Abandon request for control that is marked as CRITICAL. I
> > couldn't
> > > > find that information in RFC 22551.
> > > http://www.faqs.org/rfcs/rfc2251.html
> > > The problem will happen with any request - if the client
> > sends any
> > > request (except Unbind) to the server, and attaches a
> > control that is
> > > marked as CRITICAL, and the server does not support that
> > control, the
> > > server returns unavailableCriticalExtension.
> > >
> > > http://www.ietf.org/rfc/rfc4511.txt - 4.1.11. Controls
> > > "Specifically, the criticality field is applied as follows:
> > >
> > > - If the server does not recognize the control type,
> > determines that
> > > it is not appropriate for the operation, or is otherwise
> > unwilling
> > > to perform the operation with the control, and if the
> > criticality
> > > field is TRUE, the server MUST NOT perform the
> > operation, and for
> > > operations that have a response message, it MUST return
> > with the
> > > resultCode set to unavailableCriticalExtension."
> > >
> > > So the problem here is that
> > > 1) JNDI is sending the Abandon request with a control not
> > > supported for
> > > the Abandon operation and/or not supported by the server
> > > - solution - change the Abandon request to send only controls
> > > supported
> > > by the server (I'm not even sure why it is attempting to send a
> > > control
> > > with Abandon)
> > > 2) JNDI is marking the control as CRITICAL
> > > - solution - mark the control as non-critical
> > > 3) JNDI does not know how to handle this standard result in
> > a more
> > > graceful manner
> > > - solution - catch unavailableCriticalExtension and resubmit
> the
> > > request
> > > without the control or with the control marked non-critical
> > > >
> > > > >> Is there some way to verify that? Because the behavior
> you
> > > > describe below would seem to suggest that not all
> > operation results
> > > > have been completely read by the client.
> > > > :-( I am stuck on this part. I am not sure what is the
> > best way to
> > > > verify this because JNDI hides so much from me. I spend the
> > > last two
> > > > hours looking at tcpdump trying to see if there's any special
> > > messages
> > > > that was sent across but nothing pops up to me.
> > > >
> > > > >> Do you only see the problem when persistent search is
> used?
> > > Do you
> > > > see the problem when you don't use persistent search? This
> > > would also
> > > > lead me to believe the problem is that the client has not
> > completed
> > > > all operations successfully, and is therefore attempting to
> > > submit an
> > > > Abandon request to abandon those uncompleted operations.
> > > > Yes, from my observation, this only happen with persistent
> > > search. I
> > > > tried different test scenarios such as 1) Create
> > connection, do
> > > > nothing, close connection and 2) Create connection, add
> > > NamingListener
> > > > (Persistent Search), remove NamingListener, close
> > connection. I ran
> > > > these two scenarios many times, and only when I add/remove
> > > > NamingListener will trigger the CLOSE_WAIT connection on
> > the slapd
> > > > process.
> > > >
> > > > I did found something very interesting...
> > > > If I put a sleep say 6 seconds after I established a LDAP
> > connection
> > > > and before I add a NamingListener then the connection will
> > always
> > > > clean-up successfully when the application exit. However,
> > if I
> > > don't
> > > > put the sleep and add a NamingListener right after
> > establishing the
> > > > LDAP connection then when I my application exits, the
> > slapd process
> > > > will always end up with a CLOSE_WAIT connection. I tried
> > this test
> > > > scenario about 20 times and I always get the same result.
> > Below is
> > > > the code snippet and I'll re-attach the test case I use in
> > this
> > > e-mail.
> > > >
> > > > // Establish a context to LDAP.
> > > > LDAP_CONTEXT =
> > > > (EventDirContext) new InitialDirContext(
> > > >
> > > ldapEnvironmentHashtable).lookup(CONTEXT);
> > > > Thread.sleep(6000);
> > > > LDAP_CONTEXT.addNamingListener(
> > > > SANDBOX_CONTEXT,
> > EventContext.SUBTREE_SCOPE,
> > > > myListener);
> > > interesting
> > > >
> > > > Thanks for reading!
> > > >
> > > > - David
> > > >
> > > >
> > > >
> > > > On Fri, Feb 19, 2010 at 1:31 PM, Rich Megginson
> > > <rmeggins at redhat.com <mailto:rmeggins at redhat.com>
> > <mailto:rmeggins at redhat.com <mailto:rmeggins at redhat.com>>
> > > > <mailto:rmeggins at redhat.com <mailto:rmeggins at redhat.com>
> > <mailto:rmeggins at redhat.com <mailto:rmeggins at redhat.com>>>> wrote:
> > > >
> > > > Chun Tat David Chu wrote:
> > > > > Hi Rich,
> > > > >
> > > > > Thanks for your response. I very appreciate it.
> > > > >
> > > > > >> see if you can specify that no controls are to be
> > sent
> > > with the
> > > > > Abandon request
> > > > > I looked at the JNDI API documentation and I don't see
> > > there's a
> > > > way I
> > > > > can do this. I did verify by reviewing the OpenJDK
> > source
> > > code that
> > > > > it is setting the control as CRITICAL, and it is set
> > in the
> > > > > com.sun.jndi.ldap.NamingEventNotifier class.
> > > > That seems like a bug in JNDI - can you file a bug?
> > > > >
> > > > > >> if you wait for all of the operations and results
> > to be
> > > read by
> > > > > your app, JNDI might notice this and just do an Unbind
> > > instead of an
> > > > > Abandon.
> > > > > The test application that I use done things in a
> > sequential
> > > > order so I
> > > > > believe all the operations and results had already
> > been read.
> > > > Is there some way to verify that? Because the
> > behavior you
> > > describe
> > > > below would seem to suggest that not all operation
> results
> > > have been
> > > > completely read by the client.
> > > > >
> > > > > I did more testings today using the same test I
> > attached in my
> > > > > original e-mail and found out that this problem
> > happen very
> > > > often but
> > > > > sometime the connection does cleaned up correctly.
> > Here's two
> > > > > snippets of the errors log with "Connection management"
> > > log level
> > > > > enabled. The difference is that when it does work,
> > I see the
> > > > > "sasl_io_cleanup for connection" debug statement and it
> > > appears that
> > > > > slapd tries to release the connection and operation
> > in all
> > > cases
> > > > > except often time it doesn't work (when persistent
> > search
> > > is used)
> > > > Do you only see the problem when persistent search is
> > used? Do
> > > > you see
> > > > the problem when you don't use persistent search? This
> > > would also
> > > > lead
> > > > me to believe the problem is that the client has not
> > > completed all
> > > > operations successfully, and is therefore attempting to
> > > submit an
> > > > Abandon request to abandon those uncompleted operations.
> > > > > and the connection get into a CLOSE_WAIT state.
> > > > >
> > > > > I will do more testing and poke more on the JAVA
> client.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > David
> > > > >
> > > > > ###### WORKING LOG ######
> > > > > [19/Feb/2010:13:12:49 -0500] - activity on 68r
> > > > > [19/Feb/2010:13:12:49 -0500] - read activity on 68
> > > > > [19/Feb/2010:13:12:49 -0500] - listener got signaled
> > > > > [19/Feb/2010:13:12:49 -0500] - get_ldapmessage_controls
> > > failed: 12
> > > > > (Unavailable critical extension) (op=Abandon)
> > > > > [19/Feb/2010:13:13:18 -0500] - activity on 68r
> > > > > [19/Feb/2010:13:13:18 -0500] - read activity on 68
> > > > > [19/Feb/2010:13:13:18 -0500] - conn 98 activity
> > level = 3
> > > > > [19/Feb/2010:13:13:18 -0500] - listener got signaled
> > > > > [19/Feb/2010:13:13:18 -0500] Persistent Search -
> conn=98
> > > op=2 The
> > > > > operation has been abandoned
> > > > > [19/Feb/2010:13:13:18 -0500] - conn 98 turbo rank =
> > 3 out
> > > of 5 conns
> > > > > [19/Feb/2010:13:13:18 -0500] Persistent Search -
> > conn=98 op=2
> > > > > Releasing the connection and operation
> > > > > [19/Feb/2010:13:13:18 -0500] - sasl_io_cleanup for
> > > connection 0
> > > > > [19/Feb/2010:13:13:18 -0500] - sasl_pop_IO_layer: no
> > SASL
> > > IO layer
> > > > > [19/Feb/2010:13:13:18 -0500] - listener got signaled
> > > > >
> > > > > ###### NOT WORKING LOG ######
> > > > > [19/Feb/2010:13:14:33 -0500] - activity on 68r
> > > > > [19/Feb/2010:13:14:33 -0500] - read activity on 68
> > > > > [19/Feb/2010:13:14:33 -0500] - conn 99 activity
> > level = 2
> > > > > [19/Feb/2010:13:14:33 -0500] - conn 99 turbo rank =
> > 3 out
> > > of 5 conns
> > > > > [19/Feb/2010:13:14:34 -0500] - get_ldapmessage_controls
> > > failed: 12
> > > > > (Unavailable critical extension) (op=Abandon)
> > > > > [19/Feb/2010:13:14:35 -0500] - listener got signaled
> > > > > [19/Feb/2010:13:14:39 -0500] - activity on 68r
> > > > > [19/Feb/2010:13:14:39 -0500] - read activity on 68
> > > > > [19/Feb/2010:13:14:39 -0500] - conn 99 activity
> > level = 2
> > > > > [19/Feb/2010:13:14:39 -0500] - conn 99 turbo rank =
> > 3 out
> > > of 5 conns
> > > > > [19/Feb/2010:13:14:39 -0500] Persistent Search -
> conn=99
> > > op=2 The
> > > > > operation has been abandoned
> > > > > [19/Feb/2010:13:14:39 -0500] Persistent Search -
> > conn=99 op=2
> > > > > Releasing the connection and operation
> > > > > [19/Feb/2010:13:14:39 -0500] - listener got signaled
> > > > >
> > > > >
> > > > >
> > > > > On Thu, Feb 18, 2010 at 8:14 PM, Rich Megginson
> > > > <rmeggins at redhat.com <mailto:rmeggins at redhat.com>
> > <mailto:rmeggins at redhat.com <mailto:rmeggins at redhat.com>>
> > > <mailto:rmeggins at redhat.com <mailto:rmeggins at redhat.com>
> > <mailto:rmeggins at redhat.com <mailto:rmeggins at redhat.com>>>
> > > > > <mailto:rmeggins at redhat.com
> > <mailto:rmeggins at redhat.com> <mailto:rmeggins at redhat.com
> > <mailto:rmeggins at redhat.com>>
> > > <mailto:rmeggins at redhat.com <mailto:rmeggins at redhat.com>
> > <mailto:rmeggins at redhat.com <mailto:rmeggins at redhat.com>>>>> wrote:
> > > > >
> > > > > Chun Tat David Chu wrote:
> > > > > > Hi All,
> > > > > >
> > > > > > I am running 389 DS version 1.2.5, build number
> > > > 2010.012.2034 on
> > > > > RHEL 5.2.
> > > > > >
> > > > > > I have a problem that slapd didn't close a
> > > connection and
> > > > eventually
> > > > > > get into a CLOSE_WAIT state after my JAVA
> > > application exit.
> > > > > >
> > > > > > The scenario only happen when my application
> > registers a
> > > > > > NamingListener via the JAVA JNDI (Java Naming
> > Directory
> > > > > Interface). I
> > > > > > believe the NamingListener is equivalent to the
> > > Persistent
> > > > Search.
> > > > > > This problem doesn't exist if I don't use the
> JNDI
> > > > NamingListener
> > > > > > capability.
> > > > > >
> > > > > > From my understanding, I did everything
> > correctly in my
> > > > application.
> > > > > > I create a context, add a listener, do some
> > stuffs,
> > > remove the
> > > > > > listener and then close the context.
> > > > > >
> > > > > > One thing I notice is that in the slapd's
> > error log,
> > > I see the
> > > > > > following...
> > > > > > "-get_ldapmessage_controls failed: 12
> (Unavailable
> > > critical
> > > > > extension)
> > > > > > (op=Abandon)".
> > > > > > This message prints out right after I remove the
> > > listener
> > > > and before
> > > > > > my application closes the context.
> > > > > >
> > > > > > The closest bug report I found is this and it
> > said the
> > > > problem has
> > > > > > been resolved.
> > > > > >
> https://bugzilla.redhat.com/show_bug.cgi?id=450575
> > > > > >
> > > > > > At this point, I'm clueless. :-(
> > > > > > Can someone help me or give me some
> recommendation
> > > that I
> > > > could try?
> > > > > It looks as though the app is sending an LDAP
> > Abandon
> > > > request with
> > > > > controls that are marked as critical, and the
> > server does
> > > > not support
> > > > > them. This is standard LDAPv3 behavior. I'm
> > not sure why
> > > > it is using
> > > > > Abandon, perhaps to Abandon any outstanding
> > search or
> > > other
> > > > requests
> > > > > that have not completed. Some things to check:
> > > > > * see if you can specify that no controls are to
> > be sent
> > > > with the
> > > > > Abandon request
> > > > > * if you wait for all of the operations and
> > results to be
> > > > read by your
> > > > > app, JNDI might notice this and just do an Unbind
> > > instead of an
> > > > > Abandon.
> > > > > >
> > > > > > I will attach my JAVA JNDI replicator along
> > with this
> > > > e-mail. You
> > > > > > will need to modify 2-3 lines of code to get it
> > > running in
> > > > your
> > > > > > environment. Search for "MODIFY ME" and that
> > should
> > > be the
> > > > > lines that
> > > > > > you need to modify.
> > > > > >
> > > > > > Thanks!
> > > > > >
> > > > > > David
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> ------------------------------------------------------------------------
> > > > > >
> > > > > > --
> > > > > > 389 users mailing list
> > > > > > 389-users at lists.fedoraproject.org
> > <mailto:389-users at lists.fedoraproject.org>
> > > <mailto:389-users at lists.fedoraproject.org
> > <mailto:389-users at lists.fedoraproject.org>>
> > > > <mailto:389-users at lists.fedoraproject.org
> > <mailto:389-users at lists.fedoraproject.org>
> > > <mailto:389-users at lists.fedoraproject.org
> > <mailto:389-users at lists.fedoraproject.org>>>
> > > > > <mailto:389-users at lists.fedoraproject.org
> > <mailto:389-users at lists.fedoraproject.org>
> > > <mailto:389-users at lists.fedoraproject.org
> > <mailto:389-users at lists.fedoraproject.org>>
> > > > <mailto:389-users at lists.fedoraproject.org
> > <mailto:389-users at lists.fedoraproject.org>
> > > <mailto:389-users at lists.fedoraproject.org
> > <mailto:389-users at lists.fedoraproject.org>>>>
> > > > > >
> > > https://admin.fedoraproject.org/mailman/listinfo/389-users
> > > > >
> > > > > --
> > > > > 389 users mailing list
> > > > > 389-users at lists.fedoraproject.org
> > <mailto:389-users at lists.fedoraproject.org>
> > > <mailto:389-users at lists.fedoraproject.org
> > <mailto:389-users at lists.fedoraproject.org>>
> > > > <mailto:389-users at lists.fedoraproject.org
> > <mailto:389-users at lists.fedoraproject.org>
> > > <mailto:389-users at lists.fedoraproject.org
> > <mailto:389-users at lists.fedoraproject.org>>>
> > > > > <mailto:389-users at lists.fedoraproject.org
> > <mailto:389-users at lists.fedoraproject.org>
> > > <mailto:389-users at lists.fedoraproject.org
> > <mailto:389-users at lists.fedoraproject.org>>
> > > > <mailto:389-users at lists.fedoraproject.org
> > <mailto:389-users at lists.fedoraproject.org>
> > > <mailto:389-users at lists.fedoraproject.org
> > <mailto:389-users at lists.fedoraproject.org>>>>
> > > > >
> > https://admin.fedoraproject.org/mailman/listinfo/389-users
> > > > >
> > > > >
> > > > >
> > > >
> > >
> >
> ------------------------------------------------------------------------
> > > > >
> > > > > --
> > > > > 389 users mailing list
> > > > > 389-users at lists.fedoraproject.org
> > <mailto:389-users at lists.fedoraproject.org>
> > > <mailto:389-users at lists.fedoraproject.org
> > <mailto:389-users at lists.fedoraproject.org>>
> > > > <mailto:389-users at lists.fedoraproject.org
> > <mailto:389-users at lists.fedoraproject.org>
> > > <mailto:389-users at lists.fedoraproject.org
> > <mailto:389-users at lists.fedoraproject.org>>>
> > > > >
> > https://admin.fedoraproject.org/mailman/listinfo/389-users
> > > >
> > > > --
> > > > 389 users mailing list
> > > > 389-users at lists.fedoraproject.org
> > <mailto:389-users at lists.fedoraproject.org>
> > > <mailto:389-users at lists.fedoraproject.org
> > <mailto:389-users at lists.fedoraproject.org>>
> > > > <mailto:389-users at lists.fedoraproject.org
> > <mailto:389-users at lists.fedoraproject.org>
> > > <mailto:389-users at lists.fedoraproject.org
> > <mailto:389-users at lists.fedoraproject.org>>>
> > > >
> https://admin.fedoraproject.org/mailman/listinfo/389-users
> > > >
> > > >
> > > >
> > >
> >
> ------------------------------------------------------------------------
> > > >
> > > > --
> > > > 389 users mailing list
> > > > 389-users at lists.fedoraproject.org
> > <mailto:389-users at lists.fedoraproject.org>
> > > <mailto:389-users at lists.fedoraproject.org
> > <mailto:389-users at lists.fedoraproject.org>>
> > > > https://admin.fedoraproject.org/mailman/listinfo/389-users
> > >
> > > --
> > > 389 users mailing list
> > > 389-users at lists.fedoraproject.org
> > <mailto:389-users at lists.fedoraproject.org>
> > > <mailto:389-users at lists.fedoraproject.org
> > <mailto:389-users at lists.fedoraproject.org>>
> > > https://admin.fedoraproject.org/mailman/listinfo/389-users
> > >
> > >
> > >
> >
> ------------------------------------------------------------------------
> > >
> > > --
> > > 389 users mailing list
> > > 389-users at lists.fedoraproject.org
> > <mailto:389-users at lists.fedoraproject.org>
> > > https://admin.fedoraproject.org/mailman/listinfo/389-users
> >
> > --
> > 389 users mailing list
> > 389-users at lists.fedoraproject.org
> > <mailto:389-users at lists.fedoraproject.org>
> > https://admin.fedoraproject.org/mailman/listinfo/389-users
> >
> >
> > ------------------------------------------------------------------------
> >
> > --
> > 389 users mailing list
> > 389-users at lists.fedoraproject.org
> > https://admin.fedoraproject.org/mailman/listinfo/389-users
>
> --
> 389 users mailing list
> 389-users at lists.fedoraproject.org
> https://admin.fedoraproject.org/mailman/listinfo/389-users
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.fedoraproject.org/pipermail/389-users/attachments/20100222/401ed27b/attachment-0001.html
More information about the 389-users
mailing list