[389-devel] RFC: New Design: Fine Grained ID List Size

Sat Sep 7 03:02:40 UTC 2013

On 9/6/2013 8:49 PM, Nathan Kinder wrote:
> This is a good idea, and it is something that we discussed briefly 
> off-list.  The only downside is that we need to change the index 
> format to keep a count of ids for each key.  Implementing this isn't a 
> big problem, but it does mean that the existing indexes need to be 
> updated to populate the count based off of the contents (as you 
> mention above).

I don't think you need to do this (I certainly wasn't advocating doing 
so). The "statistics" state is much the same as that proposed in Rich's 
design. In fact you could probably just use that same information. My 
idea is more about where and how you use the information. All you need 
is something associated with each index that says "not much point 
looking here if you're after something specific, move along, look 
somewhere else instead". This is much the same information as "don't use 
a high scan limit here".
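
To make that concrete, here is a rough sketch of the kind of per-index hint I mean. It is only an illustration and not code from the server: the names IndexInfo, low_selectivity and choose_indexes are all made up for this example, and the only assumption is that each index carries one flag and the filter evaluation consults it before doing a lookup.

```python
from dataclasses import dataclass


@dataclass
class IndexInfo:
    """Hypothetical per-index metadata (not the actual 389 DS structure)."""
    attr: str
    # The "not much point looking here" hint: set when most keys in this
    # index match a large share of the entries.
    low_selectivity: bool = False


def choose_indexes(filter_attrs, indexes):
    """Pick which attributes of an AND filter are worth an index lookup.

    Attributes with no index, or whose index carries the low-selectivity
    hint, are skipped. No per-key id counts are consulted anywhere.
    """
    useful = []
    for attr in filter_attrs:
        info = indexes.get(attr)
        if info is not None and not info.low_selectivity:
            useful.append(attr)
    return useful


# Assumed example configuration: objectclass is flagged, uid is not.
indexes = {
    "objectclass": IndexInfo("objectclass", low_selectivity=True),
    "uid": IndexInfo("uid"),
}

# For a filter like (&(objectclass=person)(uid=jdoe)) only uid is consulted.
plan = choose_indexes(["objectclass", "uid"], indexes)
```

The point of the sketch is that the hint lives on the index as a whole, so nothing in the index file format has to change and no count has to be kept per key.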

>
> In the short term, we are looking for a way to be able to improve 
> performance for specific search filters that are not possible to 
> modify on the client side (for whatever reason) while leaving the 
> index file format exactly as it is.  I still feel that there is 
> potentially great value in keeping a count of ids per key so we can 
> optimize things on the server side automatically without the need for 
> complex index configuration on the administrator's part. I think we 
> should consider this for an additional future enhancement.

I'm saying the same thing. Keeping a cardinality count per key is way 
more than I'm proposing, and I'm not sure how useful that would be 
anyway, unless you want to do OLAP in the DS ;)