On Fri, Mar 24, 2017 at 12:50 AM, Adam Williamson <adamwill@fedoraproject.org> wrote:

On Fri, 2017-03-24 at 00:05 +0100, Josef Skladanka wrote:

So, wait. Let me be totally sure I'm on the same page here. You mean
that dist.depcheck has been run on that single package *sixteen
thousand times*? So part of the problem is, if you just ask for 'all
test results for package X', occasionally you might get tens of
thousands of results?
....
If I'm understanding correctly, that certainly helps me to understand
why there might be a problem, and I'll go back and read kparal's mail
again in that light.

Yup, that's it. The other part of what kparal and I were discussing was that if "query all, deduplicate in consumer" is the most common use case, it would make sense to optimize for it. It's not strictly necessary, but it would be convenient for both parties.
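To make the "query all, deduplicate in consumer" pattern concrete, here is a rough sketch of what the consumer side might do. The result dictionaries and their keys (`item`, `testcase`, `scenario`, `submit_time`, `outcome`) are assumptions made up for this example, not the actual resultsdb schema or API.

```python
def latest_results(results):
    """Keep only the most recent result per (item, testcase, scenario)."""
    latest = {}
    for res in results:
        # Hypothetical deduplication key; a real consumer may need more fields.
        key = (res["item"], res["testcase"], res.get("scenario"))
        kept = latest.get(key)
        if kept is None or res["submit_time"] > kept["submit_time"]:
            latest[key] = res
    return list(latest.values())


# Made-up sample data: the same test was run twice for x86_64.
results = [
    {"item": "foo-1.0-1.fc26", "testcase": "dist.depcheck",
     "scenario": "x86_64", "submit_time": 1, "outcome": "FAILED"},
    {"item": "foo-1.0-1.fc26", "testcase": "dist.depcheck",
     "scenario": "x86_64", "submit_time": 5, "outcome": "PASSED"},
    {"item": "foo-1.0-1.fc26", "testcase": "dist.depcheck",
     "scenario": "armhfp", "submit_time": 3, "outcome": "PASSED"},
]

deduped = latest_results(results)
```

The consumer still has to download every result before throwing most of them away, which is exactly the cost we would like to avoid for the outliers.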

Still, if we're mostly ignoring the outliers for now and thinking about
the 'normal' cases, I think I'd contend that getting 140 results at a
time should be a pretty 'normal' use case for something like resultsdb.
I'm not sure I'm a fan of doing something more complex than the
'scenario' key if the goal is purely to try and reduce the number of
results that actually need to be returned by the server in the first
place from, say, 200 to 50. That's just MHO, though. :)

I tend to agree, and the solution Kamil and I discussed is basically "just that" - using scenario. On top of that, we wanted to have some conventions on the "default scenario value" - say for koji_builds the "default" scenario would be `arch`, and if you wanted to differentiate even more, you'd just fill in the actual scenario field.
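A minimal sketch of that convention, assuming results are plain dictionaries with `type`, `arch` and `scenario` keys (the key names, the `koji_build` type string and the helper itself are illustrative assumptions, not existing resultsdb code):

```python
# Hypothetical mapping: item type -> result field used as the default scenario.
DEFAULT_SCENARIO_FIELD = {
    "koji_build": "arch",
}


def effective_scenario(result):
    """Return the explicit scenario if filled in, else the per-type default."""
    if result.get("scenario"):
        return result["scenario"]
    field = DEFAULT_SCENARIO_FIELD.get(result.get("type"))
    return result.get(field) if field else None


implicit = effective_scenario({"type": "koji_build", "arch": "x86_64"})
explicit = effective_scenario(
    {"type": "koji_build", "arch": "x86_64", "scenario": "x86_64.uefi"}
)
```

Here `implicit` falls back to the arch, while `explicit` keeps whatever the submitter put into the scenario field.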

The bonus, as we see it, is in fact the convention part of it, since it can act as a point of reference. I'm sure that saying "do whatever, but we handle stuff this way" is also a solution. I just feel that approach leads to more of the "well, if you want to do this, then you should do that, but let's first make sure you understand what we do, so you can base your decisions on something relevant" conversations with devels, whereas sensible conventions give most of them the right solution out of the box.

An added benefit of those conventions (a 'minimal set of relevant fields per item type') is that they would let us do data validation in resultsdb.
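As a sketch only, such validation could be as simple as checking a per-type list of required fields. The field sets, the `koji_build` type string and the function name below are assumptions for illustration, not an agreed convention:

```python
# Hypothetical 'minimal set of relevant fields' per item type.
REQUIRED_FIELDS = {
    "koji_build": {"item", "arch"},
}


def missing_fields(result):
    """Return the sorted names of required fields that are absent or empty."""
    required = REQUIRED_FIELDS.get(result.get("type"), set())
    return sorted(name for name in required if not result.get(name))


problems = missing_fields({"type": "koji_build", "item": "foo-1.0-1.fc26"})
```

A result with a non-empty `problems` list could then be rejected (or just flagged) at submission time.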

I *think* kparal's approach was an attempt to be more generic than just
tailored to Bodhi's use case, but I'll have to go back and read it
again.

At the beginning of our three-day-long flamewar (in person) it certainly was, but AFAIK the current goal is to have the Bodhi convenience (where 'Bodhi convenience' is the equivalent of 'common use case' for us at the moment).

Josef