[Resultsdb-users] Re: De-duplicating test results: 'scenarios'

Friday, 24 March 2017

On Fri, Mar 24, 2017 at 12:50 AM, Adam Williamson <
adamwill(a)fedoraproject.org> wrote:

...
 On Fri, 2017-03-24 at 00:05 +0100, Josef Skladanka wrote:

 So, wait. Let me be totally sure I'm on the same page here. You mean
 that dist.depcheck has been run on that single package *sixteen
 thousand times*? So part of the problem is, if you just ask for 'all
 test results for package X', occasionally you might get tens of
 thousands of results?
 ....
 If I'm understanding correctly, that certainly helps me to understand
 why there might be a problem, and I'll go back and read kparal's mail
 again in that light.

Yup, that's it. The other part of what kparal and I were discussing was
that if the "query all, deduplicate in consumer" use case is the most
common one, it would make sense to optimize for it. It's not strictly
necessary, but it would be convenient for both parties.
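
To make the "query all, deduplicate in consumer" case concrete, here is a
minimal sketch in Python. It assumes each result is a plain dict with
`item`, `testcase`, `scenario`, `submit_time` and `outcome` keys. That flat
shape and the function name `latest_per_scenario` are made up for
illustration and are not the actual resultsdb API or schema.

```python
def latest_per_scenario(results):
    """Keep only the newest result for each (item, testcase, scenario).

    `results` is an iterable of dicts in the hypothetical flat shape
    described above. `submit_time` is assumed to be an ISO 8601 string
    in a single timezone, so plain string comparison orders it correctly.
    """
    latest = {}
    for result in results:
        key = (result["item"], result["testcase"], result.get("scenario"))
        current = latest.get(key)
        if current is None or result["submit_time"] > current["submit_time"]:
            latest[key] = result
    return list(latest.values())
```

With something like this on the consumer side, sixteen thousand runs of one
testcase on one item collapse to one result per scenario, but the consumer
still has to download all of them first, which is the part worth optimizing.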

...
 Still, if we're mostly ignoring the outliers for now and thinking about
 the 'normal' cases, I think I'd contend that getting 140 results at a
 time should be a pretty 'normal' use case for something like resultsdb.
  I'm not sure I'm a fan of doing something more complex than the
 'scenario' key if the goal is purely to try and reduce the number of
 results that actually need to be returned by the server in the first
 place from, say, 200 to 50. That's just MHO, though. :)

I tend to agree, and the solution Kamil and I discussed is basically "just
that": using scenario. On top of that, we wanted to have some conventions
for the "default scenario value". Say, for koji_builds the "default"
scenario would be `arch`, and if you wanted to differentiate even more,
you'd just fill in the actual scenario field.
The bonus, as we see it, is in fact the convention part of it, since it can
act as a point of reference. I'm sure that saying "do whatever, but we
handle stuff this way" is also a solution. I just feel like it leads to
more of the "well, if you want to do this, then you should do that, but
let's first make sure you understand what we do, so you can base your
decisions on something relevant" conversations with devels, whereas
sensible conventions give most of them the right solution out of the box.
Those conventions (a 'minimal set of relevant fields per item type') also
have the added benefit of letting us do data validation in resultsdb.
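
As a rough illustration of the convention idea, here is a Python sketch. It
is not resultsdb code: the `REQUIRED_FIELDS` table, the `koji_build` type
string, the flat dict shape and both function names are assumptions made up
for the example, and it further assumes that an explicitly filled scenario
simply takes precedence over the default.

```python
# Hypothetical convention table: item type -> minimal set of fields that
# must be present. The same fields make up the default scenario.
REQUIRED_FIELDS = {
    "koji_build": ("arch",),
}


def validate(data):
    """Raise ValueError if a field required for the item type is missing."""
    item_type = data.get("type")
    required = REQUIRED_FIELDS.get(item_type, ())
    missing = [field for field in required if field not in data]
    if missing:
        raise ValueError(
            "missing fields for %s: %s" % (item_type, ", ".join(missing))
        )


def effective_scenario(data):
    """Return the explicit scenario if set, else the conventional default."""
    validate(data)
    if data.get("scenario"):
        return data["scenario"]
    required = REQUIRED_FIELDS.get(data.get("type"), ())
    # Item types without a convention get an empty default scenario.
    return ".".join(str(data[field]) for field in required)
```

The point of keeping the table in one place is that the same data drives
both the default scenario and the validation, so the two cannot drift apart.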

...

 I *think* kparal's approach was an attempt to be more generic than just
 tailored to Bodhi's use case, but I'll have to go back and read it
 again.

At the beginning of our three-day-long flamewar (in person) it certainly
was, but AFAIK the current goal is to have the Bodhi convenience (where
'Bodhi convenience' is the equivalent of 'common use case' for us at the
moment).

Josef
