[389-users] Strange Disk IO issue

Rich Megginson rmeggins at redhat.com
Wed May 16 23:01:54 UTC 2012


On 05/16/2012 04:06 PM, Nathan Kinder wrote:
> On 05/16/2012 01:09 PM, Brad Schuetz wrote:
>> On 05/16/2012 11:54 AM, Nathan Kinder wrote:
>>> On 05/16/2012 11:19 AM, Brad Schuetz wrote:
>>>> On 05/16/2012 06:16 AM, Paul Robert Marino wrote:
>>>>> The exact timing of the issue is too strange. Is there a backup job
>>>>> running at midnight, or some other timed job that could be eating the
>>>>> RAM or disk IO? Possibly one that relies on LDAP queries that would
>>>>> otherwise be innocuous.
>>>>>
>>>>>
>>>> It doesn't happen at midnight; it's 24 hours from when the process was
>>>> started, so if I restart dirsrv at 3:17pm on Wednesday, right around
>>>> 3:17pm on Thursday that server will go to 100% disk IO usage.
>>> The default tombstone purge interval is 1 day, which seems to fit what
>>> you are seeing.  The tombstone reap thread will start every 24 hours
>>> to find tombstone entries that can be deleted.  The default retention
>>> period for tombstones is 1 week.  It is possible that you have a large
>>> number of tombstone entries that need to be deleted.  This will occur
>>> independently on all of your server instances.  This is controlled by
>>> the "nsDS5ReplicaTombstonePurgeInterval" and "nsDS5ReplicaPurgeDelay"
>>> attributes in your "cn=replica,cn=<suffixDN>,cn=mapping
>>> tree,cn=config" entry.
>>>
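
To check the current values, something like this should work with the
OpenLDAP client tools (dc=example,dc=com is a placeholder suffix; adjust
the host, suffix, and bind credentials for your deployment):

ldapsearch -x -D "cn=Directory Manager" -W \
    -b 'cn=replica,cn="dc=example,dc=com",cn=mapping tree,cn=config' \
    -s base '(objectclass=*)' \
    nsDS5ReplicaTombstonePurgeInterval nsDS5ReplicaPurgeDelay
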
>> I have no "nsDS5ReplicaTombstonePurgeInterval" value set (so it's using
>> that default), and "nsDS5ReplicaPurgeDelay" is set to 3600
> Ok, so this means every 24 hours, the tombstone reap thread will look 
> for tombstones older than 1 hour and remove them.
>>
>>
>>> You can search for "(objectclass=nstombstone)" as Directory Manager to
>>> see how many tombstone entries you have.
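
A quick way to get a count (same placeholder suffix; note that the RUV
entry Nathan mentions below will show up among the matches, and "1.1"
just asks the server not to return any attributes):

ldapsearch -x -D "cn=Directory Manager" -W -b "dc=example,dc=com" \
    "(objectclass=nstombstone)" 1.1 | grep -c '^dn:'
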
>> I have a LOT of tombstone entries, over 200k on this one server (my
>> guess is they have accumulated because I've been restarting the process
>> for over a week now, never letting the cleanup run to completion).
> That's possible if you really do 200k delete operations in 1 week, but 
> that sounds like a lot.  It would seem that these tombstones have been 
> building up for longer than 1 week.
>>
>> So, any suggestions on what I can do to fix this?  The process that's
>> reaping the entries is using so much IO that queries time out; older
>> versions of the software did not exhibit this behavior.  In fact, I can
>> reinitialize the entire replica faster than this thing reaps the
>> entries: it takes 7 minutes to reinit a replica, but when this issue
>> first started I let dirsrv run much longer than that before restarting it.
> Due to the number of matching entries for the tombstone search, it is 
> having to walk your entire database, which is why you see the IO spiking.

Perhaps also try increasing nsslapd-idlistscanlimit so that it can hold 
the entire candidate list of tombstones to delete -
http://docs.redhat.com/docs/en-US/Red_Hat_Directory_Server/9.0/html/Administration_Guide/Managing_Indexes.html#About_Indexes-Overview_of_the_Searching_Algorithm
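
For example, something along these lines (the value 250000 is only
illustrative - pick something larger than your tombstone count - and on
some versions nsslapd-idlistscanlimit is only read at startup, so a
restart may be needed):

ldapmodify -x -D "cn=Directory Manager" -W <<EOF
dn: cn=config,cn=ldbm database,cn=plugins,cn=config
changetype: modify
replace: nsslapd-idlistscanlimit
nsslapd-idlistscanlimit: 250000
EOF
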
> What you could do is to export your database with "db2ldif -r".  This 
> will preserve the replication related data in the LDIF.  You can then 
> remove the tombstone entries in the LDIF file via a script and 
> reimport it.  You would have to do this on each server, or do it on 
> one master and then reinitialize the rest of your servers.  One thing 
> to watch out for is that you do not want to remove the RUV entry, 
> which will have the "nstombstone" objectclass.  This RUV entry will 
> have a "nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff" value that you 
> can use to distinguish it from the rest of the tombstones.
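
A rough sketch of that procedure (userRoot and the file paths are
assumptions for your setup; run the export/import with the instance
stopped, and test on a copy first, since a naive filter like the awk
below can be fooled if LDIF line-wrapping splits an attribute value
across lines):

# export with replication data preserved
db2ldif -r -n userRoot -a /tmp/export.ldif

# drop tombstone entries, keeping the RUV entry (nsuniqueid=ffffffff-...)
awk 'BEGIN { RS=""; ORS="\n\n" }
     { lc = tolower($0) }
     !(lc ~ /objectclass: nstombstone/ &&
       lc !~ /nsuniqueid: ffffffff-ffffffff-ffffffff-ffffffff/)' \
    /tmp/export.ldif > /tmp/cleaned.ldif

# re-import the cleaned LDIF
ldif2db -n userRoot -i /tmp/cleaned.ldif
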
>>
>> Should I make it purge more frequently so there are fewer entries to
>> reap?  Or is this just some weird bug?
> I'd leave the purge settings as they are.
>>
>> -- 
>> Brad



