[389-users] Crashing

Rich Megginson rmeggins at redhat.com
Mon Aug 8 20:17:51 UTC 2011


On 08/08/2011 01:25 PM, Wendt, Trevor wrote:
>
> Hello Rich, thanks for the response. I was able to capture a core dump,
> but I think, as you said, our version is too old for this method to
> work. Here is the output from trying to load the dump file:
>
> ======================================================================
>
> [root@ logs]# gdb /opt/fedora-ds/bin/slapd/server/ns-slapd /opt/fedora-ds/slapd-instance/logs/core.5916
>
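This won't work.  As the "No such file or directory" errors below show,
ns-slapd refers to its shared libraries by relative path (./libslapd.so,
../lib/libssl3.so, and so on), so gdb can only find them when it is
started from the server directory.  You have to change to that directory
first: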

cd /opt/fedora-ds/bin/slapd/server
gdb ./ns-slapd ../../../slapd-instance/logs/core.5916
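
If the interactive session never produces a trace file, a non-interactive
run may be easier.  The following is only a sketch, assuming the 1.0.4
directory layout used in this thread; dump_stacktrace is a hypothetical
helper name, not something shipped with Fedora DS.  It uses gdb's -batch
and -ex options to run "thread apply all bt full" and redirects the output
to a file.  Without debugging symbols the trace will be limited, but it may
still show which library the crash is in.

```shell
# Sketch: write a backtrace of every thread in a core file to a text file.
# dump_stacktrace is a hypothetical helper, not part of Fedora DS.
# Pass absolute paths for the core file and the output file, because the
# function changes directory before it runs gdb.
dump_stacktrace() {
    server_dir=$1   # directory holding ns-slapd, e.g. /opt/fedora-ds/bin/slapd/server
    core_file=$2    # absolute path to the core file
    out_file=$3     # absolute path of the trace file to create

    [ -x "$server_dir/ns-slapd" ] || { echo "no ns-slapd in $server_dir" >&2; return 1; }
    [ -r "$core_file" ] || { echo "cannot read $core_file" >&2; return 1; }

    # Subshell: the caller's working directory is left untouched.
    # gdb starts in the server directory so ./libslapd.so and ../lib/*.so resolve.
    (
        cd "$server_dir" || exit 1
        gdb -batch -ex "thread apply all bt full" ./ns-slapd "$core_file" > "$out_file" 2>&1
    )
}

# Example call (paths taken from this thread, output path is an assumption):
# dump_stacktrace /opt/fedora-ds/bin/slapd/server \
#     /opt/fedora-ds/slapd-instance/logs/core.5916 /tmp/stacktrace.txt
```
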
>
> GNU gdb (GDB) Red Hat Enterprise Linux (7.0.1-37.el5_7.1)
>
> Reading symbols from /opt/fedora-ds/bin/slapd/server/ns-slapd...(no 
> debugging symbols found)...done.
>
> [New Thread 5972]
>
> [New Thread 5978]
>
> [New Thread 5977]
>
> [New Thread 5976]
>
> [New Thread 5975]
>
> [New Thread 5974]
>
> [New Thread 5973]
>
> [New Thread 5971]
>
> [New Thread 5970]
>
> [New Thread 5969]
>
> [New Thread 5968]
>
> [New Thread 5967]
>
> [New Thread 5966]
>
> [New Thread 5965]
>
> [New Thread 5964]
>
> [New Thread 5963]
>
> [New Thread 5962]
>
> [New Thread 5961]
>
> [New Thread 5960]
>
> [New Thread 5959]
>
> [New Thread 5958]
>
> [New Thread 5957]
>
> [New Thread 5956]
>
> [New Thread 5955]
>
> [New Thread 5954]
>
> [New Thread 5953]
>
> [New Thread 5952]
>
> [New Thread 5951]
>
> [New Thread 5950]
>
> [New Thread 5949]
>
> [New Thread 5948]
>
> [New Thread 5947]
>
> [New Thread 5946]
>
> [New Thread 5945]
>
> [New Thread 5944]
>
> [New Thread 5943]
>
> [New Thread 5942]
>
> [New Thread 5941]
>
> [New Thread 5940]
>
> [New Thread 5939]
>
> [New Thread 5916]
>
> Error while mapping shared library sections:
>
> ./libslapd.so: No such file or directory.
>
> Error while mapping shared library sections:
>
> ../lib/libssl3.so: No such file or directory.
>
> Error while mapping shared library sections:
>
> ../lib/libnss3.so: No such file or directory.
>
> Error while mapping shared library sections:
>
> ../lib/libsoftokn3.so: No such file or directory.
>
> Error while mapping shared library sections:
>
> ../lib/libssldap60.so: No such file or directory.
>
> Error while mapping shared library sections:
>
> ../lib/libldap60.so: No such file or directory.
>
> Error while mapping shared library sections:
>
> ../lib/libprldap60.so: No such file or directory.
>
> Error while mapping shared library sections:
>
> ../lib/libplc4.so: No such file or directory.
>
> Error while mapping shared library sections:
>
> ../lib/libplds4.so: No such file or directory.
>
> Error while mapping shared library sections:
>
> ../lib/libnspr4.so: No such file or directory.
>
> Error while mapping shared library sections:
>
> ./libdb-4.2.so: No such file or directory.
>
> warning: .dynamic section for "/usr/lib/libstdc++.so.6" is not at the 
> expected address
>
> warning: difference appears to be caused by prelink, adjusting 
> expectations
>
> warning: .dynamic section for "/lib/libm.so.6" is not at the expected 
> address
>
> warning: difference appears to be caused by prelink, adjusting 
> expectations
>
> warning: .dynamic section for "/lib/libresolv.so.2" is not at the 
> expected address
>
> warning: difference appears to be caused by prelink, adjusting 
> expectations
>
> Error while mapping shared library sections:
>
> ../lib/libicui18n.so.34: No such file or directory.
>
> Error while mapping shared library sections:
>
> ../lib/libicuuc.so.34: No such file or directory.
>
> Error while mapping shared library sections:
>
> ../lib/libicudata.so.34: No such file or directory.
>
> Error while mapping shared library sections:
>
> ../lib/libns-dshttpd10.so: No such file or directory.
>
> Error while mapping shared library sections:
>
> ../lib/libfreebl3.so: No such file or directory.
>
> Symbol file not found for ./libslapd.so
>
> Symbol file not found for ../lib/libssl3.so
>
> Symbol file not found for ../lib/libnss3.so
>
> Symbol file not found for ../lib/libsoftokn3.so
>
> Symbol file not found for ../lib/libssldap60.so
>
> Symbol file not found for ../lib/libldap60.so
>
> Symbol file not found for ../lib/libprldap60.so
>
> Symbol file not found for ../lib/libplc4.so
>
> Symbol file not found for ../lib/libplds4.so
>
> Symbol file not found for ../lib/libnspr4.so
>
> Reading symbols from /lib/libdl.so.2...(no debugging symbols 
> found)...done.
>
> Loaded symbols for /lib/libdl.so.2
>
> Reading symbols from /usr/lib/libsasl2.so.2...(no debugging symbols 
> found)...done.
>
> Loaded symbols for /usr/lib/libsasl2.so.2
>
> Reading symbols from /lib/libcrypt.so.1...(no debugging symbols 
> found)...done.
>
> Loaded symbols for /lib/libcrypt.so.1
>
> Reading symbols from /lib/libpthread.so.0...(no debugging symbols 
> found)...done.
>
> [Thread debugging using libthread_db enabled]
>
> Loaded symbols for /lib/libpthread.so.0
>
> Symbol file not found for ./libdb-4.2.so
>
> Reading symbols from /usr/lib/libstdc++.so.6...(no debugging symbols 
> found)...done.
>
> Loaded symbols for /usr/lib/libstdc++.so.6
>
> Reading symbols from /lib/libm.so.6...(no debugging symbols found)...done.
>
> Loaded symbols for /lib/libm.so.6
>
> Reading symbols from /lib/libgcc_s.so.1...(no debugging symbols 
> found)...done.
>
> Loaded symbols for /lib/libgcc_s.so.1
>
> Reading symbols from /lib/libc.so.6...(no debugging symbols found)...done.
>
> Loaded symbols for /lib/libc.so.6
>
> Reading symbols from /lib/ld-linux.so.2...(no debugging symbols 
> found)...done.
>
> Loaded symbols for /lib/ld-linux.so.2
>
> Reading symbols from /lib/libresolv.so.2...(no debugging symbols 
> found)...done.
>
> Loaded symbols for /lib/libresolv.so.2
>
> Reading symbols from /lib/libnss_files.so.2...(no debugging symbols 
> found)...done.
>
> Loaded symbols for /lib/libnss_files.so.2
>
> Reading symbols from /usr/lib/sasl2/liblogin.so.2...(no debugging 
> symbols found)...done.
>
> Loaded symbols for /usr/lib/sasl2/liblogin.so.2
>
> Reading symbols from /usr/lib/sasl2/libsasldb.so.2...(no debugging 
> symbols found)...done.
>
> Loaded symbols for /usr/lib/sasl2/libsasldb.so.2
>
> Reading symbols from /usr/lib/sasl2/libplain.so.2...(no debugging 
> symbols found)...done.
>
> Loaded symbols for /usr/lib/sasl2/libplain.so.2
>
> Reading symbols from /usr/lib/sasl2/libanonymous.so.2...(no debugging 
> symbols found)...done.
>
> Loaded symbols for /usr/lib/sasl2/libanonymous.so.2
>
> Reading symbols from /opt/fedora-ds/lib/syntax-plugin.so...(no 
> debugging symbols found)...done.
>
> Loaded symbols for /opt/fedora-ds/lib/syntax-plugin.so
>
> Reading symbols from /opt/fedora-ds/lib/liblcoll.so...(no debugging 
> symbols found)...done.
>
> Loaded symbols for /opt/fedora-ds/lib/liblcoll.so
>
> Symbol file not found for ../lib/libicui18n.so.34
>
> Symbol file not found for ../lib/libicuuc.so.34
>
> Symbol file not found for ../lib/libicudata.so.34
>
> Reading symbols from /opt/fedora-ds/lib/pwdstorage-plugin.so...(no 
> debugging symbols found)...done.
>
> Loaded symbols for /opt/fedora-ds/lib/pwdstorage-plugin.so
>
> Reading symbols from /opt/fedora-ds/lib/des-plugin.so...(no debugging 
> symbols found)...done.
>
> Loaded symbols for /opt/fedora-ds/lib/des-plugin.so
>
> Reading symbols from /opt/fedora-ds/lib/attr-unique-plugin.so...(no 
> debugging symbols found)...done.
>
> Loaded symbols for /opt/fedora-ds/lib/attr-unique-plugin.so
>
> Reading symbols from /opt/fedora-ds/lib/acl-plugin.so...(no debugging 
> symbols found)...done.
>
> Loaded symbols for /opt/fedora-ds/lib/acl-plugin.so
>
> Symbol file not found for ../lib/libns-dshttpd10.so
>
> Reading symbols from /opt/fedora-ds/lib/chainingdb-plugin.so...(no 
> debugging symbols found)...done.
>
> Loaded symbols for /opt/fedora-ds/lib/chainingdb-plugin.so
>
> Reading symbols from /opt/fedora-ds/lib/cos-plugin.so...(no debugging 
> symbols found)...done.
>
> Loaded symbols for /opt/fedora-ds/lib/cos-plugin.so
>
> Reading symbols from /opt/fedora-ds/lib/http-client-plugin.so...(no 
> debugging symbols found)...done.
>
> Loaded symbols for /opt/fedora-ds/lib/http-client-plugin.so
>
> Reading symbols from /opt/fedora-ds/lib/libback-ldbm.so...(no 
> debugging symbols found)...done.
>
> Loaded symbols for /opt/fedora-ds/lib/libback-ldbm.so
>
> Reading symbols from /opt/fedora-ds/lib/replication-plugin.so...(no 
> debugging symbols found)...done.
>
> Loaded symbols for /opt/fedora-ds/lib/replication-plugin.so
>
> Reading symbols from /opt/fedora-ds/lib/passthru-plugin.so...(no 
> debugging symbols found)...done.
>
> Loaded symbols for /opt/fedora-ds/lib/passthru-plugin.so
>
> Reading symbols from /opt/fedora-ds/lib/referint-plugin.so...(no 
> debugging symbols found)...done.
>
> Loaded symbols for /opt/fedora-ds/lib/referint-plugin.so
>
> Reading symbols from /opt/fedora-ds/lib/retrocl-plugin.so...(no 
> debugging symbols found)...done.
>
> Loaded symbols for /opt/fedora-ds/lib/retrocl-plugin.so
>
> Reading symbols from /opt/fedora-ds/lib/roles-plugin.so...(no 
> debugging symbols found)...done.
>
> Loaded symbols for /opt/fedora-ds/lib/roles-plugin.so
>
> Reading symbols from /opt/fedora-ds/lib/statechange-plugin.so...(no 
> debugging symbols found)...done.
>
> Loaded symbols for /opt/fedora-ds/lib/statechange-plugin.so
>
> Reading symbols from /opt/fedora-ds/lib/views-plugin.so...(no 
> debugging symbols found)...done.
>
> Loaded symbols for /opt/fedora-ds/lib/views-plugin.so
>
> Symbol file not found for ../lib/libfreebl3.so
>
> Reading symbols from /opt/fedora-ds/alias/libnssckbi.so...(no 
> debugging symbols found)...done.
>
> Loaded symbols for /opt/fedora-ds/alias/libnssckbi.so
>
> Core was generated by `./ns-slapd -D /opt/fedora-ds/slapd-bhc -i 
> /opt/fedora-ds/slapd-bhc/logs/pid -w'.
>
> Program terminated with signal 11, Segmentation fault.
>
> #0  0x00f3b80c in ?? ()
>
> (gdb)
>
> ========================================================================
>
> I did try running the rest of the gdb commands to create the stack 
> trace but it never creates a file.
>
> Nothing helpful (to me) in the error logs either:
>
> ========================================================================
>
> [08/Aug/2011:13:39:59 -0500] - conn 4244 activity level = 0
>
> [08/Aug/2011:13:39:59 -0500] - listener got signaled
>
> [08/Aug/2011:13:39:59 -0500] - new connection on 70
>
> [08/Aug/2011:13:39:59 -0500] NSMMReplicationPlugin - 
> ruv_add_csn_inprogress: successfully inserted csn 4e402d7f000000010000 
> into pending list
>
> [08/Aug/2011:13:39:59 -0500] - activity on 70r
>
> [08/Aug/2011:13:39:59 -0500] - read activity on 70
>
> [08/Aug/2011:13:39:59 -0500] - conn 4245 activity level = 0
>
> [08/Aug/2011:13:39:59 -0500] - listener got signaled
>
> ****crash*******
>
> [08/Aug/2011:13:40:44 -0500] - Fedora-Directory/1.0.4 B2006.312.1539 
> starting up
>
> [08/Aug/2011:13:40:44 -0500] - Detected Disorderly Shutdown last time 
> Directory Server was running, recovering database.
>
> [08/Aug/2011:13:40:44 -0500] NSMMReplicationPlugin - 
> agmtlist_config_init: found 0 replication agreements in DIT
>
> [08/Aug/2011:13:40:44 -0500] NSMMReplicationPlugin - Purged state 
> information from entry dc=aquila,dc=com up to CSN 4e36f2e4000000010000
>
> [08/Aug/2011:13:40:44 -0500] - slapd started.  Listening on All 
> Interfaces port 389 for LDAP requests
>
> [08/Aug/2011:13:40:48 -0500] - new connection on 64
>
Which error log level are you using?
What does the access log show just before the crash?
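
One way to answer both questions from the command line is sketched below.
It assumes the instance configuration lives in
/opt/fedora-ds/slapd-YOURINSTANCE/config/dse.ldif and the access log in the
instance's logs directory; show_log_settings is a hypothetical helper name.
It only prints the logging attributes that are actually present in the file.

```shell
# Sketch: print the log-level and access-log buffering settings from a dse.ldif.
# show_log_settings is a hypothetical helper, not part of Fedora DS.
show_log_settings() {
    grep -i -E '^nsslapd-(errorlog-level|accesslog-level|accesslog-logbuffering):' "$1"
}

# Example calls (paths are assumptions based on the 1.0.4 layout):
# show_log_settings /opt/fedora-ds/slapd-YOURINSTANCE/config/dse.ldif
# tail -n 20 /opt/fedora-ds/slapd-YOURINSTANCE/logs/access
```

The tail command shows the last operations logged before the server went
down, which is usually the most useful part of the access log for a crash.
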
>
> ========================================================================
>
> *From:*Rich Megginson [mailto:rmeggins at redhat.com]
> *Sent:* Monday, August 08, 2011 9:14 AM
> *To:* General discussion list for the 389 Directory server project.
> *Cc:* Wendt, Trevor
> *Subject:* Re: [389-users] Crashing
>
> On 08/05/2011 10:46 AM, Wendt, Trevor wrote:
>
> Hello all,
>
> Need some help with tuning and crash debugging. We're running
> Fedora-Directory/1.0.4 B2006.312.1539. The problem is on our
> "Dedicated Consumer" machine running on RHEL 5. We have over 150,000
> users authenticating against our FDS systems. System resources are not
> a problem (~.39 load, free memory, 92k swap).
>
> For months the system is solid, with no issues; then we seem to get a
> large spike in traffic and FDS crashes. I run Monit, so the service is
> restarted automatically, but I cannot figure out why it keeps
> crashing.
>
> FDS was set up and tuned based on:
> http://directory.fedoraproject.org/wiki/Performance_Tuning#Linux
>
> I have reviewed 
> http://directory.fedoraproject.org/wiki/FAQ#Debugging_Crashes as well, 
> but some of that is over my head.
>
> Unfortunately these directions are for 1.1.x and later.  Most of the 
> paths/filenames have changed since 1.0.4 in the move to the FHS style 
> layout, and there is no debuginfo package.  But we may still be able 
> to get a core file and some stack information:
>
> sysctl -w fs.suid_dumpable=1
>
> edit /opt/fedora-ds/slapd-YOURINSTANCE/start-slapd
> somewhere near the top, add the line
> ulimit -c unlimited
>
> restart the directory server
> /opt/fedora-ds/slapd-YOURINSTANCE/restart-slapd
>
> If you get a crash, you should have a core file in 
> /opt/fedora-ds/slapd-YOURINSTANCE/logs
>
> After that, install gdb
>
> follow the instructions at 
> http://directory.fedoraproject.org/wiki/FAQ#Debugging_Crashes
> except:
> cd /opt/fedora-ds/slapd-YOURINSTANCE/logs
> gdb ../../bin/slapd/server/ns-slapd core.PID
>
>
> I have turned buffering off and increased the logging level in the 
> LDAP config.
>
> What is the last operation in the access log before a crash?  Any 
> corresponding errors in the errors log?
>
>
>
> Here is our "monitor" script output:
>
> version: 1
>
> dn: cn=monitor
>
> objectClass: top
>
> objectClass: extensibleObject
>
> cn: monitor
>
> version: Fedora-Directory/1.0.4 B2006.312.1539
>
> threads: 30
>
> currentconnections: 19
>
> totalconnections: 11918
>
> dtablesize: 8192
>
> readwaiters: 0
>
> opsinitiated: 43703
>
> opscompleted: 43702
>
> entriessent: 16086
>
> bytessent: 2911011
>
> currenttime: 20110805164243Z
>
> starttime: 20110805114053Z
>
> nbackends: 2
>
> So about 8700 ops/hour.  Not a heavy load.
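
For reference, that figure comes from dividing opsinitiated by the time
between starttime and currenttime in the monitor output above.  A rough
sketch of the arithmetic, assuming both timestamps fall on the same UTC
day; ops_per_hour is a hypothetical helper name:

```shell
# Sketch: operations per hour from cn=monitor values.
# ops_per_hour START CURRENT OPS
# START and CURRENT are timestamps in YYYYMMDDhhmmssZ form and are assumed
# to fall on the same UTC day; ops_per_hour is a hypothetical helper.
ops_per_hour() {
    awk -v s="$1" -v c="$2" -v ops="$3" 'BEGIN {
        # Seconds since midnight for each timestamp.
        ss = substr(s, 9, 2) * 3600 + substr(s, 11, 2) * 60 + substr(s, 13, 2)
        cs = substr(c, 9, 2) * 3600 + substr(c, 11, 2) * 60 + substr(c, 13, 2)
        if (cs <= ss) exit 1          # no elapsed time, or the day rolled over
        printf "%.0f\n", ops * 3600 / (cs - ss)
    }'
}

# starttime, currenttime and opsinitiated from the monitor output:
ops_per_hour 20110805114053Z 20110805164243Z 43703   # prints 8688
```
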
>
> Here is our "Access Log Analyzer" summary for a 24 hour period:
>
> ---------------------------------------------------------------
>
> Access Log Analyzer 6.0
>
> Filename                        Total Lines     Lines processed
>
> ---------------------------------------------------------------
>
> /opt/fedora-ds/slapd/logs/access  298225                298231
>
> ----------- Access Log Output ------------
>
> Restarts:                     6
>
> Total Connections:            39720
>
> Peak Concurrent Connections:  84
>
> Total Operations:             95471
>
> Total Results:                95393
>
> Overall Performance:          99.9%
>
> Searches:                     48215
>
> Modifications:                167
>
> Adds:                         551
>
> Deletes:                      2
>
> Mod RDNs:                     0
>
> 6.x Stats
>
> Persistent Searches:          0
>
> Internal Operations:          0
>
> Entry Operations:             0
>
> Extended Operations:          845
>
> Abandoned Requests:           0
>
> Smart Referrals Received:     0
>
> VLV Operations:               0
>
> VLV Unindexed Searches:       0
>
> SORT Operations:              0
>
> SSL Connections:              0
>
> Entire Search Base Queries:   0
>
> Unindexed Searches:           6
>
> FDs Taken:                    39720
>
> FDs Returned:                 39657
>
> Highest FD Taken:             93
>
> Broken Pipes:                 0
>
> Connections Reset By Peer:    0
>
> Resource Unavailable:         10872
>
>      -  10872 (T1) Idle Timeout Exceeded
>
> Binds:                        45691
>
> Unbinds:                      27987
>
> LDAP v2 Binds:               15694
>
> LDAP v3 Binds:               29997
>
> SSL Client Binds:            0
>
> Failed SSL Client Binds:     0
>
> SASL Binds:                  0
>
> Directory Manager Binds:     0
>
> Anonymous Binds:             16346
>
> Other Binds:                 29345
>
> ---------------------------------------------------------------
>
> In FDS console:
>
> -- Configuration > Performance tab: Size Limit: 2000, Time Limit: 
> 3600, Idle Timeout: 60, Max file descriptors: 8192.
>
> The idle timeout is 1 minute - could be too low for some of your 
> clients, which is why you're seeing a lot of (T1) Idle Timeout 
> Exceeded connection closes.
>
> -- Configuration > Data > Database Link Settings > Connection 
> Management: Max TCP Connections: 10, Bind timeout: 20, Max binds per 
> connection: 20, Timeout before abandon: 10, Max LDAP Connections: 20, 
> Max bind retries: 3, Max operations per connection: 5, connection 
> life: 60.
>
> Are you using database links?
>
> I also suggest looking at your database cache tuning - see 
> http://docs.redhat.com/docs/en-US/Red_Hat_Directory_Server/8.2/html-single/Administration_Guide/index.html#Monitoring_Server_and_Database_Activity-Monitoring_Database_Activity
>
> We have talked about moving to the latest 389 Directory packages, and I
> have a migration process tested out, so it's a matter of getting the
> OK and the time, but I doubt the upgrade will solve our crashing problem.
>
> I can't say for sure, but 1.0.4 is very old, and since then we have 
> fixed many issues which have caused crashes.
>
> It seems to me we are hitting some limits that just haven't been 
> accounted for yet and that is where I need help.
>
> Let's start with analyzing the crash data - if we can get a core file 
> and a stack trace, then we can work from there to figure out why it's 
> crashing.
>
> Any suggestions on how to stop these crashes are welcome! Thanks for
> reading.
>
> *Trevor*
