Criterion revision proposal: KDE default applications

Fri Dec 13 06:28:40 UTC 2013

On Fri, 2013-12-13 at 05:38 +0000, "Jóhann B. Guðmundsson" wrote:
> On fös 13.des 2013 04:06, Adam Williamson wrote:
> > it was -1'ed at the go/no-go meeting in about five seconds. No-one voted +1
> > blocker on it.
> 
> For one thing, I was not present there, since I arrived later than usual 
> due to $dayjob, but otherwise I would have voted +1 on it (which would 
> then have been just me), and most of the individuals present have been 
> voting to push the release out the door to meet that arbitrary deadline 
> we never make anyway, which renders this argument moot.

Well, it's not an arbitrary deadline. It's the release schedule.
Fedora's supposed to be strongly tied to its schedule; we're not a
feature-based release project. So, it inevitably has to be the case that
we have to calibrate our quality standards to what it's practical to
achieve, in terms of testing and fixing, approximately within that
cycle. The way Fedora is set up, it is reasonable for us to slip, oh,
say up to three weeks a cycle, to fix really serious problems. But we
shouldn't be slipping almost every milestone for every release, which we
are. We shouldn't be putting ourselves into positions, _repeatedly_,
where we have the choice of fudging our quality standards or delaying
releases for multiple months. When those things are happening, what it
indicates is that there is a fundamental mismatch between the quality
standards we're setting ourselves, the development goals we're setting
ourselves, and the resources - in terms of overall tester/developer
hours, i.e. a product of the number of devs and testers we have and the
release cycle we chose - we have to achieve those things.

The last few cycles we've theoretically set our standards at a certain
point, and then not really got close to living up to them. If we say
we're going to block on every possible partition layout, we need to test
at least some reasonable representative sample of all possible partition
layouts by, at latest, Beta RC1. That didn't happen. If we say we're
going to block on every single app installed by default for either of
two desktops, we need to actually test every single app installed by
default on either of those two desktops by Beta RC1, and ideally keep
doing it with every relevant change pushed stable after that. That
doesn't happen.

The current state is that we're setting a standard which we clearly
don't have the QA resources to stand behind sufficiently within the
timeframe the project is comfortable with spending on a release, and on
freeze/test/stabilize cycles. Pete knows what's coming out of the
three-product proposal, but somehow I don't see it making the freezes
longer or the set of deliverables any smaller.

If we want to maintain the standards of quality we claimed to be setting
for F18, F19 and F20 and actually live up to them, we would need to
either drastically increase the resources we dedicate to QA and
bugfixing, or reduce the development goals we set. It never seems like
it's on the cards to reduce Fedora's pace of development: no-one seems
to have the appetite for it. No-one wants to be the person who says 'no,
that is a nice sounding feature but we don't have time for it. Put it in
the next release.' It never happens.

So we're left with the resources. Red Hat does not have a dozen interns
lying around the place we can feed into the Fedora QA hopper. We've
certainly got excellent community testers, and some who've started
contributing or contributing more in recent releases and helped hugely
to make them less of a disaster than they could have been. But it's not
like we're getting a dozen new volunteers who'll put in multiple hours
per day on testing either. Extending the release cycle or lengthening
freezes seems to be as unlikely to happen as reducing the development
churn; look at how this thing with trying to make the F21 cycle longer
is going with FESCo.

So, the way I see it, if we can't significantly reduce the pace of
Fedora development, significantly increase the number of QA and
developer people we have available, or lengthen the release cycle or
freezes to give us the effect of more resources dedicated to
stabilization with the same number of people, we either reduce the
expectations we say we have, or we keep on basically lying about what
our quality expectations are and then coming up with paper-thin excuses
as to why not meeting them doesn't 'really' mean we have to block, while
running around like headless chickens throwing builds against the wall
one day before go/no-go to fix a bug we didn't know about until 1.1 days
before go/no-go. I dunno about you, but the headless chicken act and the
hero validation runs are wearing on me. Maybe if we do what I'm
suggesting, and we do well for a while, we'll start feeling like we're
on top of everything and we have the confidence to move in the direction
of increasing the standards again. But right now, honestly, can you say
we're in a position where we're able to realistically meet all the
standards we are claiming to set for ourselves? I certainly can't.

I was the one who proposed the desktop criteria in the first place, and
some of the pushback against them was based on the worry that we
wouldn't have time to enforce them. At the time I said I'd stand behind
them and make sure the testing got done. Well, for a few releases I think
we did: I remember that back around F15, F16 and F17 I actually spent quite
a bit of time _testing the desktops_, rather than spending four months
straight running anaconda for five hours a day. I also had time to talk to the
desktop SIGs and ask them for help and just generally look after the
desktop testing process. From F18 on, with the extra workload from
anaconda being much more unstable (in the technical sense of 'changing a
lot'), with the addition of ARM as a primary arch and cloud as a
first-class deliverable and UEFI testing and more USB testing and all
the other expansions that have crept into our workload, I really haven't
been able to do that the way I did before, and no-one else seems to have
stepped in and taken over. So now we have these desktop criteria we (QA)
supposedly commit to 'policing', in the sense of actually running the
tests and making sure things get fixed on a reasonable timeline, and I
don't think that in practice that is actually _happening_.

Now a couple of people have suggested reducing the KDE package set for
the DVD install. Like adjusting the criteria, that would also have the
effect of reducing the testing workload, so it seems like a reasonable
approach. If that's what we want to do instead, fine. But we need the
KDE SIG to agree with that approach and commit to it and land the
changes, like, _now_. Comfortably in advance of F21 TC1. And we need to
be sure that we (QA) can provide the resources to actually do the
testing required to properly back the criteria: we need to be actually
running all the desktop tests, frequently, during Alpha and Beta, and
working with the desktop and KDE SIGs to fix the bugs.

> The fact is it should not matter whether I install from a live image or 
> from the DVD; the end result should be the same.

That might be your opinion of what's correct. And hey, maybe everyone
agrees with you. But the practical fact of the matter is that is not, at
all, how Fedora as it exists right now works. For both GNOME and KDE, if
you install from the DVD you get significantly more packages than
if you install from the live image. You do not get the same result. If
we all agree that you should, and we go away and _actually implement
that_ within the next, say, month, then great: that helps solve all our
problems. Does everyone want to do that?
-- 
Adam Williamson
Fedora QA Community Monkey
IRC: adamw | Twitter: AdamW_Fedora | XMPP: adamw AT happyassassin . net
http://www.happyassassin.net