De-duplicating test results: 'scenarios'
by Adam Williamson
Hi folks! So rather than send a welcome mail I figured let's get right
into something real ;) I actually got the idea for this list because I
wrote up some thoughts on something, and realized there was nowhere
good to send it. So I invented this mailing list. And now I can send
it! Here it is.
I've been thinking some more about convenient consumption of ResultsDB
results - specifically, about the problem of generic handling of
duplicated / repeated tests.
Say Alice from release engineering wants to decide whether a given
deliverable is of release quality. With ResultsDB, Alice can query all
test results for that deliverable, regardless of where they came from.
Great. However, it's not uncommon for a test to have been repeated. Say
the test failed, the failure was investigated and determined to be a
bug in the test, and the test was repeated and passed. Both results
wind up in ResultsDB; we have a fail, then a pass.
How does Alice conveniently and *without special knowledge of the
system that submitted the result* identify the second result as being
for 'the same test' as the first, and thus know she can consider only
the second (most recent) result, and not worry about the failure? (I'm
expecting that to usually be the desired behaviour). There are also
other situations in which it's useful to be able to identify 'the same
test' for different executions; for instance, `check-compose` needs to
do this when it does its 'system information comparison' checks from
compose to compose.
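To put the goal in code: this is roughly the operation I want to be
generic, with `scenario_key` as the open question (the URL and field
names here are my guesses at the ResultsDB 2.0 API, so treat them as
assumptions):

    import requests

    # where I'd expect a ResultsDB 2.0 API to live; the URL is illustrative
    RESULTSDB = "https://taskotron.fedoraproject.org/resultsdb_api/api/v2.0"

    def latest_per_scenario(item, scenario_key):
        """Fetch all results for an item and keep only the most recent
        result per 'scenario'. scenario_key maps a result to whatever
        identifies 'the same test'; field names are my guesses at the
        ResultsDB 2.0 response layout, and pagination is ignored."""
        resp = requests.get(RESULTSDB + "/results", params={"item": item})
        resp.raise_for_status()
        latest = {}
        # oldest first, so newer results overwrite older ones
        for result in sorted(resp.json()["data"],
                             key=lambda res: res["submit_time"]):
            latest[scenario_key(result)] = result
        return latest

The whole question of this mail is what `scenario_key` should look
like, such that Alice can write it once without knowing anything about
openQA, Taskotron, or any other submitter.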
I guess it's worth noting that this is somewhat related to the similar
question for test 'items' (the 'thing' being tested, in ResultsDB
parlance) - the question of uniquely identifying 'the same' item within
and across composes. At least for productmd 'images', lsedlar and I
are currently discussing that in
https://pagure.io/pungi/issue/525 . Obviously it's more or less a
solved problem for RPMs.
I can think of two possible ways to handle this: via the test case
name, or via the extradata.
openQA has a useful concept here. It specifies which combination of
metadata identifies a unique test like this, and calls it... well,
that: the 'scenario'. There's a constant called SCENARIO_KEYS in
openQA that you can use to discover the appropriate keys. So I'm going
to use the term 'scenario' for this from now on.
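As a rough Python sketch of that idea (the key list is from my memory
of openQA's code, so double-check it against the source):

    # my recollection of openQA's scenario-defining job settings; check
    # SCENARIO_KEYS in the openQA source before relying on the exact list
    SCENARIO_KEYS = ("DISTRI", "VERSION", "FLAVOR", "ARCH", "TEST", "MACHINE")

    def scenario(settings):
        """Build a scenario identifier from a dict of openQA job settings."""
        return ".".join(str(settings[key])
                        for key in SCENARIO_KEYS if key in settings)

A job with DISTRI=fedora, VERSION=25, FLAVOR=server-dvd-iso,
ARCH=x86_64, TEST=install_default and MACHINE=uefi would come out as
exactly the kind of string in the example further down.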
There are kinda two levels of scenario, now that I think about it, depending
on whether you include 'item' identification in the scenario definition
or not. For identifying duplicates within the results for a single
item, you don't need to, but it doesn't hurt; for identifying the same
scenario across multiple composes, you do need to. I suppose someone
may have a case for identifying 'the same' test against different
items; for that purpose, you'd need the lighter 'scenario' definition
(not including the item identifier).
One thing we could do is make it a convention that each test case (and/or
test case name?) indicates a test 'scenario' - such that consumers can
confidently treat all results for the same test case and the same item
as results for the same test 'scenario'. In terms of the sketch above,
`scenario_key` would just be the test case name. This seems to me like
the simplest possibility, but I do see two potential issues.
First, there's a possibility it may result in rather long and unwieldy
test case names in some situations. If we take the more complete
'scenario' definition, the test case name for an openQA test that
includes sufficient information to uniquely identify the item under
test may look something like `fedora.25.server-dvd-
iso.x86_64.install_default.uefi` (and that's with a fairly short test
name).
Second, it makes it difficult to handle the two different kinds of
'scenario' - i.e. it's not obvious how to split off the bits that
identify the 'item' from the bits that identify the 'test scenario'
proper. In this case the 'test scenario' is `install_default.uefi` and
the 'item identifier' is `fedora.25.server-dvd-iso.x86_64`, but there's
no real way to *know* that from the outside, unless we get into
defining separators, which always seems to be a losing game.
Another possibility would be to make it a convention to include some
kind of indication of the test 'scenarios' in the extradata for each
result: a 'scenario' key, or something along those lines. This would
make it much easier to include the 'item identifier' and 'test
scenario' proper separately, and you could simply combine them when you
needed the 'complete' scenario.
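Roughly, I'm imagining each result carrying something like this (all
the key names here are invented for illustration; there's no such
convention yet):

    # hypothetical extradata on a single submitted result; the keys
    # 'item_id' and 'scenario' are invented names, not an existing API
    extradata = {
        "item": "Fedora-25-20161121.n.0",
        "type": "compose",
        "item_id": "fedora.25.server-dvd-iso.x86_64",
        "scenario": "install_default.uefi",
    }

    def complete_scenario(data):
        """Combine the item identifier and the test scenario proper
        into the 'complete' scenario, for cross-compose comparison."""
        return data["item_id"] + "." + data["scenario"]

Consumers de-duplicating within a single item could key on `scenario`
alone, while ones comparing across composes could use the combined
form; either way `scenario_key` from the earlier sketch becomes a
one-liner.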
I'm trying to avoid consumers of ResultsDB data having to start
learning about the details of individual test 'sources' in order to be
able to perform this kind of de-duplication. It'd suck if releng had to
learn the openQA 'scenario keys' concept directly, for instance, then
learn corresponding smarts for any other system that submitted results.
Any thoughts on this? Any better ideas? Any existing work? Thanks!
--
Adam Williamson
Fedora QA Community Monkey
IRC: adamw | Twitter: AdamW_Fedora | XMPP: adamw AT happyassassin . net
http://www.happyassassin.net
Top level namespace?
by Ralph Bean
Looking for opinions here. We're putting together something
internally to put results about AMIs in resultsdb and the question
arose: what should we use for the top-level namespace?
- ami.sometool.sometest.somesubtest
- cloud.sometool.sometest.somesubtest with item_type=ami (see the sketch below)
- something else?
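For concreteness, the second option might look roughly like this as a
submission (a sketch only; the endpoint and payload layout are my
guesses at the ResultsDB 2.0 API):

    import requests

    # instance URL and AMI id are illustrative
    payload = {
        "testcase": "cloud.dva.sometest.somesubtest",
        "outcome": "PASSED",
        "data": {"item": "ami-0abcd1234567890ef", "type": "ami"},
    }
    requests.post("https://resultsdb.example.com/api/v2.0/results",
                  json=payload)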
Ralph
p.s., the test tool here is "dva" https://github.com/RedHatQE/dva
p.p.s., how should I interpret the "dist" namespace? I've always read
it as "distro-level check", i.e., something that fedora-qe enforces
distro-wide.
fedmsg message on koji builds
by Pierre-Yves Chibon
Hi Everyone,
I have been working on a small tool named protron [1] that listens to pagure
pull-request messages, triggers a build, and reports the build status as well
as taskotron/resultsdb test results to the pull-request.
Since protron kicks off the koji build, it knows the build's task identifier
and keeps it in memory, so that it can later distinguish a koji build it
kicked off from a koji build started by something or someone else.
However, taskotron/resultsdb messages do not include the koji task the tests
were run against. So protron queries `koji taskinfo <id>` to get the package
NEVR the task was about, and then keeps that info in memory to recognize the
taskotron messages of interest.
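As a very rough sketch (all names invented, and the taskotron message
layout is an assumption on my part), the bookkeeping amounts to:

    # package NEVR -> pagure pull-request; filled in when protron kicks
    # off a build and resolves its NEVR via 'koji taskinfo <id>'
    nevr_to_pr = {}

    def remember_build(nevr, pull_request):
        nevr_to_pr[nevr] = pull_request

    def pr_for_result(msg):
        """Match an incoming taskotron result message back to a PR.
        The message only carries the NEVR, so that's all we can key on."""
        return nevr_to_pr.get(msg["task"]["item"])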
In our regular workflow this isn't a problem, because the NEVRA restriction
in koji is such that we cannot build the same package twice; but with the PR
model coming to dist-git, we will need to somehow lift that constraint. There
could very well be two people or two systems proposing pull-requests that
make the same update (say the-new-hotness bumps foo to 1.2.3 while missing
something, and a contributor comes by and makes the same update to 1.2.3 with
the patch rebased or so). Both PRs will be building foo-1.2.3, and this will
confuse protron: to which PR does foo-1.2.3 correspond?
If the PRs are created with enough time in between, this will not be a
problem: the newer PR will simply override the older one's info in memory.
However, if the two PRs are created sufficiently close to one another (so
that the tests for the first PR haven't finished running), this will be an
issue.
To avoid this, one way would be to add the task id of the build taskotron
was testing to the taskotron message it sends.
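For example, the message might grow a field along these lines (the
existing field names are from my memory of taskotron's fedmsgs, and
the new key name is just an illustration):

    # sketch of a taskotron result fedmsg with the proposed addition;
    # existing fields are from memory, the new key name is invented
    msg = {
        "topic": "org.fedoraproject.prod.taskotron.result.new",
        "msg": {
            "task": {
                "name": "dist.rpmlint",
                "item": "foo-1.2.3-1.fc26",
                "type": "koji_build",
                "koji_task_id": 12345678,  # the proposed new field
            },
            "result": {"outcome": "PASSED"},
        },
    }

With that, protron could key its in-memory map on the task id instead
of the NEVR, and the two-PRs-building-the-same-NEVR ambiguity would go
away.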
Would this be something that is doable?
Thanks in advance,
Pierre