[389-devel] Re: [discuss] Entry cache and backend txn plugin problems

Monday, 4 March 2019

On 2/22/19 11:46 AM, Mark Reynolds wrote:
...
 I want to start a brief discussion about a major problem we have with 
 backend transaction plugins and the entry caches.  I'm finding that 
 when we get into a nested state of be txn plugins and one of the later 
 plugins that is called fails, then while we don't commit the disk 
 changes (they are aborted/rolled back) we DO keep the entry cache 
 changes!

 For example, a modrdn operation triggers the referential integrity 
 plugin which renames the member attribute in some group and changes 
 that group's entry cache entry, but then later on the memberOf plugin 
 fails for some reason.  The database transaction is aborted, but the 
 entry cache changes that RI plugin did are still present :-(  I have 
 also found other entry cache issues with modrdn and BE TXN plugins, 
 and we know of other currently non-reproducible entry cache crashes as 
 well related to mishandling of cache entries after failed operations.

 It's time to rework how we use the entry cache.  We basically need a 
 transaction style caching mechanism - we should not commit any entry 
 cache changes until the original operation is fully successful.  
 Unfortunately, given the way the entry cache is currently designed and 
 used, changing it will be a major undertaking.

 William wrote up this doc: 
 http://www.port389.org/docs/389ds/design/cache_redesign.html

 But this does not currently cover the nested plugin scenario 
 either (not yet).  I do not know how difficult it would be to 
 implement William's proposal, or how difficult it would be to 
 incorporate the txn style caching into his design.  What kind of time 
 frame could this even be implemented in?  William, what are your thoughts?

 If William's design is too huge of a change that will take too long to 
 safely implement then perhaps we need to look into revising the 
 existing cache design where we use "cache_add_tentative" style 
 functions and only apply them at the end of the op.  This is also not 
 a trivial change.
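
 To make the "tentative" idea concrete, here is a rough sketch in Python 
 rather than the server's C.  It is only a toy model: the class and method 
 names (TentativeEntryCache, begin, add_tentative, commit, abort) are 
 made up for illustration and are not existing cache functions, and it 
 ignores locking and per-thread visibility entirely.

```python
class TentativeEntryCache:
    """Toy model: cache changes are staged per operation and only
    become visible in the committed cache when the operation succeeds."""

    def __init__(self):
        self._committed = {}  # dn -> entry, the state other operations see
        self._staged = None   # dn -> entry for the operation in progress

    def begin(self):
        # Called when the parent operation starts.
        self._staged = {}

    def add_tentative(self, dn, entry):
        # Hypothetical stand-in for a "cache_add_tentative" style call.
        if self._staged is None:
            raise RuntimeError("no operation in progress")
        self._staged[dn] = entry

    def get(self, dn):
        # The running operation sees its own staged changes first.
        if self._staged is not None and dn in self._staged:
            return self._staged[dn]
        return self._committed.get(dn)

    def commit(self):
        # Parent operation and all nested plugins succeeded: apply changes.
        self._committed.update(self._staged)
        self._staged = None

    def abort(self):
        # Any plugin failed: drop the staged changes, committed state is untouched.
        self._staged = None
```

 In this model a plugin failure after the RI plugin has updated a group 
 just calls abort(), and the group's previous cache entry is what readers 
 continue to see.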

 And what impact would changing the entry cache have on Ludwig's 
 pluggable backend work?

 Anyway we need to start thinking about redesigning the entry cache - 
 no matter what approach we want to take.  If anyone has any ideas or 
 comments please share them, but I think due to the severity of this 
 flaw redesigning the entry cache should be one of our next major goals 
 in DS (1.4.1?). 
We are actually seeing more of these cases popping up now, so we need to 
do something soon.  I had proposed we could always just flush the entire 
cache when a backend txn op fails, but Ludwig had a much better idea 
that we could implement a type of csn in the entry cache.  So when a 
backend txn plugin fails, we flush the entry cache entries with a csn >= 
start of the parent operation.
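
As a rough illustration of that idea (a Python sketch, not the server's C 
code): each cache entry is stamped when it is added or changed, the parent 
operation records a stamp when it starts, and on failure every entry with 
a stamp >= that value is dropped.  The names here (StampedEntryCache, 
put, flush_since) are hypothetical, and a plain counter stands in for 
whatever csn-like value we would actually use.

```python
import itertools


class StampedEntryCache:
    """Toy model of an entry cache whose entries carry a csn-like stamp."""

    def __init__(self):
        self._entries = {}                # dn -> (stamp, entry)
        self._clock = itertools.count(1)  # assumption: simple increasing counter

    def next_stamp(self):
        # Record this at the start of the parent operation.
        return next(self._clock)

    def put(self, dn, entry):
        # Every add or modify of a cache entry gets a fresh stamp.
        stamp = self.next_stamp()
        self._entries[dn] = (stamp, entry)
        return stamp

    def get(self, dn):
        item = self._entries.get(dn)
        return item[1] if item else None

    def flush_since(self, start_stamp):
        # Drop every entry touched at or after the start of the failed operation.
        stale = [dn for dn, (stamp, _) in self._entries.items()
                 if stamp >= start_stamp]
        for dn in stale:
            del self._entries[dn]
        return stale
```

Flushed entries would simply be reloaded from the database on the next 
access, and since the database txn was aborted they come back in their 
pre-operation state.  Entries that were not touched by the failed 
operation stay in the cache.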

So until LMDB or a new caching mechanism is implemented this could be a 
viable/realistic option.

Mark

>
> Thanks,
>
> Mark
