On Sun, Jan 22, 2012 at 5:55 PM, Diego Woitasen <diego(a)woitasen.com.ar> wrote:
On Thu, Nov 10, 2011 at 10:39 PM, Diego Woitasen
<diego(a)woitasen.com.ar> wrote:
> No, I'm not running those searches. I'm sure.
>
> I forgot to mention that I have replication working between 4 servers.
> There will be 150 in the future.
>
> Is there a relation between those searches and replication?
>
>
> On Thu, Nov 10, 2011 at 8:56 PM, Noriko Hosoi <nhosoi(a)redhat.com> wrote:
>> Hello,
>>
>> It looks like you are running lots of psearches like this:
>> ps_service_persistent_searches: entry
>> "cn=csidn,cn=replica,cn=ou\3Dcsidn\2Cou\3DConsulados\2Cdc\3Dmrec\2Cdc\3Dar,cn=mapping tree,cn=config"
>> not enqueued on any persistent search lists
>>
>> $ egrep ps_service_persistent_searches errors | wc -l
>> 55
>>
>> I'm curious whether the behavior changes if you shut down the server
>> after killing them?
>> --noriko
>>
>> Diego Woitasen wrote:
>>> Hi,
>>> I have a weird problem with 389DS. It takes more than 5 minutes to
>>> shut down. The init script sends a SIGTERM to the process and it
>>> finishes cleanly. That's clear from looking at the log file too:
>>>
>>> grep "slapd shutting down" errors
>>> [10/Nov/2011:17:55:52 -0300] - slapd shutting down - waiting for 22 threads to terminate
>>> [10/Nov/2011:17:55:52 -0300] - slapd shutting down - closing down internal subsystems and plugins
>>> [10/Nov/2011:17:55:52 -0300] - slapd shutting down - waiting for backends to close down
>>> [10/Nov/2011:18:01:41 -0300] - slapd shutting down - backends closed down
>>>
>>> At first I thought that it was related to my 150 DBs, but I created a
>>> test case with a clean server, 150 DBs and 10,000 entries, and the
>>> shutdown takes 2 seconds.
>>>
>>> The only weird thing that I see is the dse.ldif.tmp file being
>>> truncated and rewritten again and again, several times, until
>>> shutdown. strace shows me that the process is writing configuration
>>> entries too.
>>>
>>> I'm using DS 1.2.9.9 (same problem with 1.2.8.3) on Debian Squeeze.
>>>
>>> I set errorlevel to 1 but I don't know if there is anything
>>> interesting in the log. I uploaded the log here in case someone wants
>>> to have a look:
>>> http://main.woitasen.com.ar/errors
>>>
>>> What can I do to start to discover what's happening here?
>>>
>>> Regards,
>>> Diego
>>>
>>
>> --
>> 389 users mailing list
>> 389-users(a)lists.fedoraproject.org
>>
>> https://admin.fedoraproject.org/mailman/listinfo/389-users
>
>
>
> --
> Diego Woitasen
I'm trying to figure out what's going on with this again. I ran
ns-slapd with strace for a few minutes:
strace -fco /tmp/trace.ldap -s 1000 /opt/dirsrv/sbin/ns-slapd \
    -D /etc/dirsrv/slapd-mreldc03 -i /var/run/dirsrv/slapd-mreldc03.pid \
    -w /var/run/dirsrv/slapd-mreldc03.startpid -d 0 > /tmp/ldap.out 2>&1
I added the -c arg to strace to count the time spent in each syscall
and the top 10 is:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 94.80 4989.357069      181240     27529       234 futex
  4.87  256.464031       17298     14826           select
  0.29   15.404732        1867      8250         1 poll
  0.03    1.594276           5    333723           fsync
  0.00    0.077862           0   5439669           write
  0.00    0.012001           5      2183           mmap
  0.00    0.008001           4      1895           getsockname
  0.00    0.005716           1      5153         2 read
  0.00    0.004300           5       910           rename
  0.00    0.002482           1      3003           sendto
94% of the time is spent in futex; that's really bad, I think. :)
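For anyone who wants to slice the summary differently, here is a rough Python sketch that parses the rows above and ranks the syscalls either by time or by call count. The function names (parse_summary, time_shares) are made up for illustration, and the parser assumes the column layout that strace -c printed here (the errors column may be empty).

```python
# Summary rows copied from the strace -c output in this message.
STRACE_SUMMARY = """\
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 94.80 4989.357069      181240     27529       234 futex
  4.87  256.464031       17298     14826           select
  0.29   15.404732        1867      8250         1 poll
  0.03    1.594276           5    333723           fsync
  0.00    0.077862           0   5439669           write
  0.00    0.012001           5      2183           mmap
  0.00    0.008001           4      1895           getsockname
  0.00    0.005716           1      5153         2 read
  0.00    0.004300           5       910           rename
  0.00    0.002482           1      3003           sendto
"""


def parse_summary(text):
    """Parse strace -c rows into dicts; header and separator lines are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # A data row has 5 fields (errors column empty) or 6 fields.
        if len(fields) not in (5, 6):
            continue
        try:
            seconds = float(fields[1])
            usecs_per_call = int(fields[2])
            calls = int(fields[3])
            errors = int(fields[4]) if len(fields) == 6 else 0
        except ValueError:
            continue  # not a data row (e.g. the dashed separator)
        rows.append({
            "syscall": fields[-1],
            "seconds": seconds,
            "usecs_per_call": usecs_per_call,
            "calls": calls,
            "errors": errors,
        })
    return rows


def time_shares(rows):
    """Return each syscall's fraction of the total time in the listed rows."""
    total = sum(r["seconds"] for r in rows)
    return {r["syscall"]: r["seconds"] / total for r in rows}


rows = parse_summary(STRACE_SUMMARY)
shares = time_shares(rows)
by_calls = sorted(rows, key=lambda r: r["calls"], reverse=True)

for r in by_calls[:3]:
    print(f'{r["syscall"]:<12} calls={r["calls"]:>8} share={shares[r["syscall"]]:.4f}')
```

Ranked by call count rather than by time, write and fsync come out on top, which might fit with the dse.ldif.tmp rewriting mentioned earlier in the thread, though that is only a guess from these numbers.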
Ideas are welcome ...
Regards,
Diego
--
Diego Woitasen
I've discovered that with the "Multimaster replication plugin" disabled,
the shutdown takes 5 secs instead of 5 mins :P
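To compare shutdown times before and after a change like this, the duration can be read off the "slapd shutting down" lines in the errors log. A minimal Python sketch, assuming the timestamp format shown in the log excerpt earlier in the thread and English month names; the function name shutdown_seconds is made up for illustration:

```python
import re
from datetime import datetime

# Log lines from the original report (unwrapped onto single lines).
LOG = """\
[10/Nov/2011:17:55:52 -0300] - slapd shutting down - waiting for 22 threads to terminate
[10/Nov/2011:17:55:52 -0300] - slapd shutting down - closing down internal subsystems and plugins
[10/Nov/2011:17:55:52 -0300] - slapd shutting down - waiting for backends to close down
[10/Nov/2011:18:01:41 -0300] - slapd shutting down - backends closed down
"""

# Captures the bracketed timestamp of each shutdown message.
STAMP = re.compile(r"^\[([^\]]+)\] - slapd shutting down")


def shutdown_seconds(log_text):
    """Seconds between the first and last 'slapd shutting down' lines, or None."""
    stamps = []
    for line in log_text.splitlines():
        match = STAMP.match(line)
        if match:
            # %b assumes English month abbreviations (C/English locale).
            stamps.append(
                datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")
            )
    if len(stamps) < 2:
        return None
    return (stamps[-1] - stamps[0]).total_seconds()


elapsed = shutdown_seconds(LOG)
print(f"shutdown took {elapsed:.0f} seconds")
```

If the log contains more than one shutdown, the text would need to be cut down to a single shutdown sequence first, since this sketch only looks at the first and last matching lines.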
--
Diego Woitasen