Proposed updates policy change

Tue Mar 9 19:40:52 UTC 2010

On Tue, 2010-03-09 at 14:10 -0500, Bill Nottingham wrote:

> Proposal
> --------
> 
> For a package to be pushed to the stable updates repository, it must
> meet the following criteria.
> 
> 1) All updates (even security) must pass AutoQA tests.
> 
> Rationale: If a package breaks dependencies, does not install, or
> fails other obvious tests, it should not be pushed. Period. Obviously,
> this proposal would not be enacted until AutoQA is live.

I agree with the idea, but it should be better phrased. It's not the way
the test is implemented that matters, but what's being tested. The fact
that the test is done by AutoQA is really neither here nor there.

It would be safer to explicitly define the types of tests we want all
updates to pass: set out exactly which 'obvious' hurdles every pushed
package must clear (e.g. dependencies should be sane).
*How* we choose to test that is really just an implementation detail.

> 2) Updates that constitute a part of the 'important' package set (defined
> below) must follow the rules as defined for critical path packages for
> pending releases, meaning that they require positive karma from releng
> and/or QA before they go stable. This also includes security updates for
> these packages.
> 
> The 'important' package set is defined as the following:
> - The current critical path package set
> - All major desktop environments' core functionality (GNOME, KDE, XFCE,
>   LXDE)
> - Package updating frameworks (gnome-packagekit, kpackagekit)
> - Major desktop productivity apps. An initial list would be firefox,
>   kdebase (konqueror), thunderbird, evolution, kdepim (kmail).
> 
> Rationale: These are the sets of packages where regressions most affect
> users, and would most prevent them from Getting Their Work Done.
> Furthermore, while I can accept that there may be some packages in Fedora that
> cannot find a significant enough testing base for all potential updates, I
> reject the notion that any desktop widely used enough that we deploy an
> image or spin for it would fit into that category. I accept that this places a
> larger burden on QA, and would expect them to be able to contribute testing
> to this initiative.
> 
> 3) All other non-security updates must either: 
> 
>  a) reach their specified bodhi karma threshold
>  b) spend some minimum amount of time in updates-testing, with a tracked
>     number of downloads.
> 
>  Proposed time would be one week, but is open to negotiation.
> 
>  Rationale: We do want additional eyes on updates wherever possible. We do
>    have one Fedora mirror that Fedora infrastructure controls; we should
>    be able to mine this server for data on updates-testing downloads.
> 
> Any update that wants to bypass these procedures would need majority
> approval from FESCo.
> 
> ....
> 
> Comments, questions, reasoned arguments? Part of me wonders if this should be
> expanded with a sliding scale for update types (enhancements, for example, get
> more stringent treatment than bugfix/security).

This feels broadly correct to me, and that's what I'll say at
the meeting if input from non-FESCo members is okay. I think Matt's
proposal is far too much of a leap at present; it may be true that we
can drive much more feedback in Bodhi than we currently get, but we
should prove that *before* we push a policy that relies on it happening,
not after. I know there's a potential chicken/egg problem there, but we
haven't even really tried very hard yet.

It also has other issues that have already been discussed, which are
mainly shortcomings in the current Bodhi process. As many have pointed
out (and QA has basically accepted), relying on a simple overall Bodhi
'score' is pretty unreliable, because we have no really good
process for defining what positive and negative Bodhi votes actually
mean. It's just not a robust enough process to make all updates depend
on reaching a hard-defined overall Bodhi score at present. It can go
wrong in both directions: updates that shouldn't get pushed can reach +3
(the obvious example being a kernel that boots fine for 7 people and
breaks for 3, which would score +4), and updates that should get pushed
might not reach +3.
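
To make the arithmetic in the kernel example concrete, here is a minimal
Python sketch. It is illustrative only: the function name, the +1/-1 vote
encoding and the KARMA_THRESHOLD constant are assumptions made up for this
example, not Bodhi's actual code or API.

```python
# Illustrative only: this is not Bodhi's code. Votes are modelled as +1 / -1.
KARMA_THRESHOLD = 3  # the +3 figure used in the discussion above


def summarize_karma(votes):
    """Return the positive count, negative count and combined net score."""
    positive = sum(1 for v in votes if v > 0)
    negative = sum(1 for v in votes if v < 0)
    return {"positive": positive, "negative": negative, "net": positive - negative}


# A kernel that boots fine for 7 testers and breaks for 3.
kernel = summarize_karma([+1] * 7 + [-1] * 3)
print(kernel, kernel["net"] >= KARMA_THRESHOLD)

# A perfectly good update that only two people got around to testing.
quiet = summarize_karma([+1] * 2)
print(quiet, quiet["net"] >= KARMA_THRESHOLD)
```

The kernel clears the threshold with a net of +4 even though 3 of its 10
testers could not boot, while the harmless update stalls at +2. Keeping the
positive and negative counts separate, rather than collapsing them into one
number, is what makes the three failures visible at all.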

Right now we should limit reliance on Bodhi to critpath packages at
most, and even for that we need to tighten up Bodhi procedures quite
urgently. Even then, the policy shouldn't rely on an arbitrary combined
positive/negative score.
-- 
Adam Williamson
Fedora QA Community Monkey
IRC: adamw | Fedora Talk: adamwill AT fedoraproject DOT org
http://www.happyassassin.net