Drawing lessons from fatal SELinux bug #1054350

Fri Jan 24 19:14:50 UTC 2014

On Fri, 2014-01-24 at 19:26 +0100, Michael Schwendt wrote:

> > * That update made it out to the stable updates! In other words, the 
> > draconian Update Policies that were enacted in a vain attempt to prevent 
> > such issues from happening utterly failed at catching this bug.
> 
> Those policies are not "draconian" enough [1]. On erroneous belief that
> a +1 from three different testers would mean that the update has seen
> enough testing, the test update has been published with the default karma
> threshold of +3. The testers have failed. It's too simple for testers to
> rush through the voting in bodhi without testing the updates
> painstakingly. "The faster the better" has led to a fatal mistake in
> this case.

I think that's being unnecessarily harsh on the testers. It's not at all
obvious to anyone that you ought to test update/install of another
package in order to validate an update to selinux-policy-targeted.
Hell, I don't do that.

Hate to sound like a broken record, but really the problem here is just
the complete lack of granularity in the karma system: to phrase it
theoretically, we know there is a huge spectrum of meanings for both +1
and -1:

+1
--

* I installed it and nothing blew up
* I installed it, rebooted and nothing blew up
* I installed it, ran the entire test suite, grabbed the source tarball
and inspected it line-by-line for vulnerabilities, fuzz tested all the
variable handling, then deployed it to my extensive test farm for a week
and assessed the results
* It fixes my bug, and I didn't test anything else
* It fixes my bug, and nothing blew up
* It fixes my bug, and...(you see where I'm going with this)
* It installs, it works, maybe it fixes some bugs, but it also
introduces this other regression
* I like the update text / the update submitter / candy

-1
--

* It failed to install
* I installed it, and something blew up
* I installed it, rebooted and something blew up
* (etc)
* It doesn't fix my bug (and that's the only bug the update was meant to
fix)
* It doesn't fix my bug (but the update also fixes 50 other bugs,
successfully)
* It doesn't fix this other bug I have that the update didn't even claim
to fix
* It installs, it works, maybe it fixes some bugs, but it also
introduces this other new bug (yes, this is identical to one of the +1
entries. That is the point. The same thing can also be registered as 0,
giving us the perfect set. Depending on the details of what's fixed and
what's broken, and the individual karma submitter's instincts, it can
seem 'right' to file this as any one of the three possible values.)
* It installs, it works, it doesn't exactly introduce any bugs, but I
think it is not compliant with the update policy (i.e. too drastic a
change in behaviour from the previous package)
* I don't like the update text / the update submitter / candy

The 'comment' field exists to allow people to express all these things,
but as it's just a completely free-form text field, it's intrinsically
impossible to really base any programmatic stuff or even policy on it.
In theory maintainers could submit updates without using autokarma and
then keep a careful eye on the feedback and 'tend' their updates
manually, but I think it's pretty clear that in practice, this is not
what happens: maintainers really want to be able to use the karma system
as a 'helper', they want to farm out the evaluation process to Bodhi/the
karma system. But our current system is too stupid to handle this
perfectly, so we get these breakdowns.
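
To make the failure mode concrete, here is a minimal sketch of a
scalar karma tally. It is a simplified model, not Bodhi's actual code:
the Comment class and the autopush() function are hypothetical names,
and the threshold of 3 is just the default mentioned earlier in the
thread. The free-form text never reaches the decision, so a +1 that
reports a regression counts exactly like any other +1.

```python
from dataclasses import dataclass


@dataclass
class Comment:
    vote: int  # +1, 0 or -1
    text: str  # free-form; the automation below never looks at it


def autopush(comments, threshold=3):
    """Hypothetical model: push once the summed votes reach the threshold."""
    return sum(c.vote for c in comments) >= threshold


comments = [
    Comment(+1, "I installed it and nothing blew up"),
    Comment(+1, "It fixes my bug, and I didn't test anything else"),
    Comment(+1, "It works, but it also introduces this other regression"),
]

# The regression report is only in the text, so the update still goes out.
print(autopush(comments))
```

Three votes with three very different meanings are indistinguishable
to the tally.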

With a more flexible karma system we have a *lot* of opportunity to do
much cleverer stuff. We can provide presets for all the above different
things that are currently commonly expressed via +1 or -1 with a
comment. This opens up possibilities at two different levels: the distro
policy level, and the packager level. We can make the distro policy much
more fine-grained, if we want to - we can require certain of the 'karma
types' to be available in all updates, and for instance, block any
update where X people pull the 'it's completely busted' or 'it
introduces a security vulnerability' cord, regardless of how much
broadly-categorized 'positive' karma it has. At the packager level, the
packager gets the freedom to define a much more fine-grained policy for
when they're happy that updates to their package are 'good to go', but
they still don't have to sit there reading the emails and manually
interpreting what people have written. You get to define the policy that
makes the most sense for your package, within the confines of the
distro-wide policy - if you have a good package-specific test suite, you
can say to the auto-karma system 'don't send this update out until at
least one person sets the "I ran the test suite and it passed" karma
property'.
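
A rough sketch of how the two policy levels could fit together,
assuming feedback arrives as preset karma types rather than a bare
number. Everything here is illustrative: the Karma members, the Policy
class, the evaluate() function and the specific counts (one blocker
vote, two install reports) are assumptions made up for the example,
not an existing Bodhi interface.

```python
from collections import Counter
from dataclasses import dataclass, field
from enum import Enum


class Karma(Enum):
    # Hypothetical presets, loosely based on the meanings listed above.
    INSTALLED_OK = "installed it and nothing blew up"
    FIXES_MY_BUG = "it fixes my bug"
    TEST_SUITE_PASSED = "I ran the test suite and it passed"
    REGRESSION = "it introduces a regression"
    COMPLETELY_BUSTED = "it's completely busted"
    SECURITY_VULN = "it introduces a security vulnerability"


@dataclass
class Policy:
    # karma type -> number of reports that blocks the update
    blockers: dict = field(default_factory=dict)
    # karma type -> number of reports needed before pushing
    required: dict = field(default_factory=dict)


def evaluate(feedback, distro, package):
    """Return 'blocked', 'waiting' or 'push' for a list of Karma values."""
    counts = Counter(feedback)
    policies = (distro, package)
    # Blockers win regardless of how much positive karma exists.
    for policy in policies:
        for kind, limit in policy.blockers.items():
            if counts[kind] >= limit:
                return "blocked"
    # Otherwise every requirement from both levels must be met.
    for policy in policies:
        for kind, needed in policy.required.items():
            if counts[kind] < needed:
                return "waiting"
    return "push"


# Assumed distro-wide rules: one 'busted' or 'vulnerability' report blocks.
distro = Policy(
    blockers={Karma.COMPLETELY_BUSTED: 1, Karma.SECURITY_VULN: 1},
    required={Karma.INSTALLED_OK: 2},
)
# Assumed per-package rule for a package with a good test suite.
package = Policy(required={Karma.TEST_SUITE_PASSED: 1})

positive = [Karma.INSTALLED_OK] * 3
print(evaluate(positive, distro, package))
print(evaluate(positive + [Karma.TEST_SUITE_PASSED], distro, package))
print(evaluate(positive * 2 + [Karma.COMPLETELY_BUSTED], distro, package))
```

The packager's rule can only add conditions on top of the distro's, so
the package policy stays within the confines of the distro-wide one.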

Those are just examples: the point is that what we badly need here is a
more expressive and flexible system (as well as, as I've said elsewhere
in the discussion, a good automated test for this specific and
well-known category of 'delayed action' update problems).
-- 
Adam Williamson
Fedora QA Community Monkey
IRC: adamw | Twitter: AdamW_Fedora | XMPP: adamw AT happyassassin . net
http://www.happyassassin.net