Excerpts from Kamil Paral's message of 2017-05-11 10:35 +02:00:
On Thu, May 11, 2017 at 10:13 AM, Josef Skladanka <jskladan(a)redhat.com> wrote:
> ... Following up with that last sentence:
>
> I certainly agree that having thousands of Depcheck results is nonsense.
> And in this special case, I really agree that pruning the data is a good
> idea (if we have a very well-defined way of identifying the "this is just
> a duplicate" cases, that is). I also agree that we don't need to have the
> whole history of all the results in one place with fast read access. But
> it still makes sense (to me at least) to just have an archiving policy,
> rather than a deletion policy, as the first step.
>
I didn't realize that, but that's conceptually no different from what I
proposed, right? It can still be an external process that prunes the last
day's worth of data during the night; instead of throwing the data away
completely, it just saves it to a secondary database. So it can still be
implemented outside of resultsdb. Correct?

Also, this could be OK for RHEL folks: they could have different
policies for archiving, and the old results would still be available, just
in a different database (and with slower access).

Unfortunately this will be basically a non-starter. Imagine that you
need to go back to an old Bodhi update that shipped three years ago for
some kind of auditing purpose. Bodhi needs to be able to show the same
results, waivers, and decisions today as it did three years ago.

Okay, maybe with Bodhi it is okay to just wing it and say "sorry, the
data is gone now", but that doesn't really fly if you imagine Errata Tool
in place of Bodhi.

So I think relying on deleting/moving/archiving data out of the
ResultsDB database just to make it perform well is not a viable option.
There really has to be a proper solution to the "how do I find the
latest relevant results" problem.

(Doing some one-off pruning to handle pathological cases like old
depcheck results is a different story -- that would be more about
reducing the size of Fedora's ResultsDB a little than about being
crucial to making ResultsDB perform well.)

--
Dan Callaghan <dcallagh(a)redhat.com>
Senior Software Engineer, Products & Technologies Operations
Red Hat