<p>+1</p>

<p>I really like the proposal and don&#39;t really have anything to add. Makes sense to me. Kudos.</p>

<p>-AdamM (From Android)</p>

<p><blockquote type="cite">On Mar 26, 2010 5:51 PM, &quot;Adam Williamson&quot; &lt;<a href="mailto:awilliam@redhat.com">awilliam@redhat.com</a>&gt; wrote:<br><br>Hi, folks. At the last QA meeting, I volunteered (dumb of me!) to draft<br>


a policy for testing updates - basically, a policy for what kind of<br>

feedback should be posted in Bodhi for candidate updates.<br>

<br>

This turns out to be pretty hard. =) Thinking about it from a<br>

high-level perspective like this, I think it becomes pretty clear that<br>

the current system is just broken.<br>

<br>

The major problem is it attempts to balance things that don&#39;t really<br>

balance. It lets you say &#39;works for me&#39; or &#39;doesn&#39;t work&#39;, then totals<br>

each and subtracts the second from the first to give you a &#39;rating&#39;<br>

for the update.<br>

<br>

This doesn&#39;t really mean anything. As has been rehashed many times,<br>

there are situations where an update with a positive rating shouldn&#39;t go<br>

out, and situations where an update with a negative rating should. So<br>

the current system isn&#39;t really that great.<br>
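<br>

A small illustration of that arithmetic, with made-up votes: three testers<br>
report no problems and one reports a problem, yet the update still ends up<br>
with a positive rating. The vote values and names here are assumptions for<br>
the sake of the example, not taken from Bodhi itself.<br>

```python
# Hypothetical votes on one update: +1 is "works for me", -1 is "does not work".
votes = [1, 1, 1, -1]

works = votes.count(1)
broken = votes.count(-1)

# The single number described above: second total subtracted from the first.
rating = works - broken

print(rating)
```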

<br>

I can&#39;t think of a way to draft a policy to guide the use of the current<br>

system in such a way that it will be really reliable. I think it&#39;d be<br>

much more productive to revise the Bodhi feedback system alongside<br>

producing a policy.<br>

<br>

So, here&#39;s a summary of what the new system should aim for.<br>

<br>

At the high level, what is this system for? It&#39;s there for three<br>

purposes:<br>

<br>

1) to provide maintainers with information they can use in deciding<br>

whether to push updates.<br>

<br>

2) to provide a mechanism for mandating a certain minimum level of<br>

manual testing for &#39;important&#39; packages, under Bill Nottingham&#39;s current<br>

update acceptance criteria proposal.<br>

<br>

3) to provide an &#39;audit trail&#39; we can use to look back on how the<br>

release of a particular update was handled, in the case where there are<br>

problems.<br>

<br>

Given the above, we need to capture the following types of feedback, as<br>

far as I can tell. I don&#39;t think there is any sensible way to assign<br>

numeric values to any of this feedback. I think we have to trust people<br>

to make sensible decisions once this feedback is provided, in accordance<br>

with any policy we decide to implement on what character updates should have.<br>

<br>

1. I have tried this update in my regular day-to-day use and seen no<br>

regressions.<br>

<br>

2. I have tried this update in my regular day-to-day use and seen a<br>

regression: bug #XXXXXX.<br>

<br>

3. (Where the update claims to fix bug #XXXXXX) I have tried this update<br>

and found that it does fix bug #XXXXXX.<br>

<br>

4. (Where the update claims to fix bug #XXXXXX) I have tried this update<br>

and found that it does not fix bug #XXXXXX.<br>

<br>

5. I have performed the following planned testing on the update: (link<br>

to test case / test plan) and it passes.<br>

<br>

6. I have performed the following planned testing on the update: (link<br>

to test case / test plan) and it fails: bug #XXXXXX.<br>

<br>

Testers should be able to file multiple types of feedback in one<br>

operation - for instance, 4+1 (the update didn&#39;t fix the bug it claimed<br>

to, but doesn&#39;t seem to cause any regressions either). Ideally, the<br>

input of feedback should be &#39;guided&#39; with a freeform element, so there&#39;s<br>

a space to enter bug numbers, for instance.<br>
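<br>

As a rough sketch of how such a feedback record might be modelled: the class<br>
and field names below are hypothetical, and only the six types and the<br>
&#39;4+1&#39; combination come from the list above.<br>

```python
from dataclasses import dataclass, field
from enum import Enum


class FeedbackType(Enum):
    # Numbering follows the six feedback types listed above.
    NO_REGRESSION = 1
    REGRESSION = 2
    FIXES_BUG = 3
    DOES_NOT_FIX_BUG = 4
    TEST_PASSED = 5
    TEST_FAILED = 6


@dataclass
class Feedback:
    tester: str
    types: set                                # one or more FeedbackType values
    bugs: list = field(default_factory=list)  # freeform bug numbers
    test_plan: str = ""                       # link to a test case / test plan


# The "4+1" case: the claimed fix did not work, but nothing regressed.
report = Feedback(
    tester="example_tester",
    types={FeedbackType.DOES_NOT_FIX_BUG, FeedbackType.NO_REGRESSION},
    bugs=["XXXXXX"],
)
```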

<br>

There is one type of feedback we don&#39;t really want or need to capture:<br>

&quot;I have tried this update and it doesn&#39;t fix bug #XXXXXX&quot;, where the<br>

update doesn&#39;t claim to fix that bug. This is a quite common &#39;-1&#39; in the<br>

current system, and one we should eliminate.<br>
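<br>

One way a guided form could screen this out, sketched with a hypothetical<br>
helper and invented bug numbers: flag any bug in a &#39;does not fix&#39; report<br>
that the update never claimed to fix.<br>

```python
def unrelated_bugs(claimed_fixes, reported_bugs):
    """Return bugs from a 'does not fix' report that the update never claimed to fix."""
    return sorted(set(reported_bugs) - set(claimed_fixes))


# Hypothetical bug numbers, for illustration only.
claimed = ["100001", "100002"]
print(unrelated_bugs(claimed, ["100002", "100099"]))
```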

<br>

I think Bill&#39;s proposed policy can be modified quite easily to fit this.<br>

All it would need to say is that for &#39;important&#39; updates to be accepted,<br>

they would need to have one &#39;type 1&#39; feedback from a proven tester, and<br>

no &#39;type 2&#39; feedback from anyone (or something along those lines; this<br>

isn&#39;t the main thrust of my post, please don&#39;t sidetrack it too<br>

much :&gt;).<br>
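<br>

The rule suggested above could be checked mechanically. This is a minimal<br>
sketch only: the function name, the tester names, and the representation of<br>
feedback as (tester, type number) pairs are all assumptions.<br>

```python
def acceptable(feedback, proven_testers):
    """feedback is a list of (tester, type_number) pairs, types numbered 1 to 6 as above."""
    # At least one "type 1" report must come from a proven tester.
    has_proven_pass = any(
        kind == 1 and tester in proven_testers for tester, kind in feedback
    )
    # A single "type 2" report from anyone blocks the update.
    has_regression = any(kind == 2 for _, kind in feedback)
    return has_proven_pass and not has_regression


proven = {"alice"}
print(acceptable([("alice", 1), ("bob", 3)], proven))
print(acceptable([("alice", 1), ("bob", 2)], proven))
```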

<br>

The system could do a count of how many of each type of feedback any<br>

given update has received, but I don&#39;t think there&#39;s any way we can<br>

sensibly do some kind of mathematical operation on those numbers and<br>

have a &#39;rating&#39; for the update. Such a system would always give odd /<br>

undesirable results in some cases, I think (just as the current one<br>

does). I believe the above system would be sufficiently clear that there<br>

would be no need for such a number, and we would be able to evaluate<br>

updates properly based just on the information listed.<br>
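<br>

A per-type tally of the kind described above is simple to produce without<br>
combining the numbers into a rating. The feedback list here is invented for<br>
illustration.<br>

```python
from collections import Counter

# Hypothetical feedback received for one update, as type numbers 1 to 6.
received = [1, 1, 3, 2, 5, 1]

counts = Counter(received)

# Report each type separately; no arithmetic across types.
for kind in range(1, 7):
    print("type", kind, ":", counts[kind])
```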

<br>

What are everyone&#39;s thoughts on this? Thanks!<br>

--<br>

Adam Williamson<br>

Fedora QA Community Monkey<br>

IRC: adamw | Fedora Talk: adamwill AT fedoraproject DOT org<br>

<a href="http://www.happyassassin.net" target="_blank">http://www.happyassassin.net</a><br>

<font color="#888888"><br>

--<br>

test mailing list<br>

<a href="mailto:test@lists.fedoraproject.org">test@lists.fedoraproject.org</a><br>

To unsubscribe:<br>

<a href="https://admin.fedoraproject.org/mailman/listinfo/test" target="_blank">https://admin.fedoraproject.org/mailman/listinfo/test</a><br>

</font></blockquote></p>