Investigation of the F23 mass rebuild

Adam Jackson ajax at redhat.com
Thu Jul 2 17:47:11 UTC 2015


On Thu, 2015-07-02 at 11:09 -0400, Matthew Miller wrote:
> On Thu, Jul 02, 2015 at 10:49:37AM -0400, Adam Jackson wrote:
> > Beyond that, the fact that we have such blatant packaging errors, and
> > that nearly 4% of our binary packages haven't rebuilt in F23, is quite
> > worrisome.
> 
> I agree. What can and should we do about it?

Good question.  I'm not entirely sure, but I have opinions.

The binaries-in-/usr/share/doc thing is the sort of clearly obviously
wrong thing that, ideally, would get your build rejected.  I would have
hoped AutoQA or similar would be sufficiently potent for this by now.
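
A check of this kind is cheap to write.  As a minimal sketch (this is
not AutoQA's actual implementation; the function name find_elf_files
and the idea of running it over an unpacked package tree are assumptions
made for illustration), one can walk a doc directory and flag any
regular file that starts with the ELF magic bytes:

```python
import os

# Every ELF object (executable, shared library, .o) starts with these bytes.
ELF_MAGIC = b"\x7fELF"


def find_elf_files(root):
    """Return the sorted paths of regular files under root that look like ELF objects."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as fh:
                    if fh.read(4) == ELF_MAGIC:
                        hits.append(path)
            except OSError:
                # Unreadable files are ignored in this sketch.
                continue
    return sorted(hits)
```

Pointing this at the unpacked /usr/share/doc of a built package and
failing on a non-empty result would catch the case above.  It only
looks for ELF objects, so it says nothing about scripts or other
executable content.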

The longstanding FTBFS (fails to build from source) thing is harder.
In principle we do actually retire things that haven't built for
multiple releases; in practice, things apparently get missed.
pathfinder, logiweb, and python-rpi-gpio, for example, were all on the
chopping block for F21:

https://lists.fedoraproject.org/pipermail/devel/2014-June/199524.html

Though all mysteriously disappeared from the final warning:

https://lists.fedoraproject.org/pipermail/devel/2014-July/200694.html

I don't see anything in any related thread to indicate they were
repaired, so, who knows.  All three are still present in F23, and none
has been successfully rebuilt since F19.

Apparently for F22 we only retired packages with broken dependencies,
and didn't consider long-term FTBFS:

https://lists.fedoraproject.org/pipermail/devel/2015-April/210208.html

I'm not sure that was intentional.  We should be consistent.

People were notified of the F23 mass rebuild failures, in a sense:

https://lists.fedoraproject.org/pipermail/devel/2015-June/211496.html

If I'd seen that, I might have clicked through.  If I'd seen the list in
the email itself - like in the retirement notices above - I'd have been
more likely to read the results.

Common to all of this is a certain reactive posture.  There's not a
dashboard view of "sick packages".  Which could be useful along a number
of axes, really.  How far behind is a package relative to its upstream's
releases?  For a given sick package, how many packages depend on it?
How idle has pkg git been relative to the incoming bug rate for a
package?  The data exists, but we're not looking at it.  Obviously not
all metrics are going to be comparable across packages; for the kernel,
say, we might want more of a moving average than a raw counter.
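
Two of those axes are easy to sketch.  The following is only an
illustration, assuming the dependency data has already been extracted
into a plain mapping of package name to the packages it requires; the
function names and the package names in the usage note are hypothetical
and do not refer to any existing Fedora tool.  It counts the transitive
dependents of a sick package and smooths a per-period counter (say,
incoming bugs) with a trailing moving average:

```python
from collections import defaultdict, deque


def dependents_of(pkg, requires):
    """Return the set of packages that depend on pkg, directly or transitively.

    requires maps each package name to an iterable of packages it requires.
    """
    # Invert the graph: for each package, who requires it?
    reverse = defaultdict(set)
    for name, deps in requires.items():
        for dep in deps:
            reverse[dep].add(name)

    # Breadth-first walk over the inverted graph, tolerating cycles.
    seen = set()
    queue = deque([pkg])
    while queue:
        current = queue.popleft()
        for parent in reverse.get(current, ()):
            if parent != pkg and parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def moving_average(values, window):
    """Return the trailing moving average of values over the given window size."""
    if window <= 0:
        raise ValueError("window must be positive")
    averages = []
    total = 0
    for i, value in enumerate(values):
        total += value
        if i >= window:
            # Drop the value that just fell out of the window.
            total -= values[i - window]
        averages.append(total / min(i + 1, window))
    return averages
```

With requires = {"app-a": ["libfoo"], "app-b": ["app-a"]},
dependents_of("libfoo", requires) yields both app-a and app-b, which is
the number a dashboard would want to surface next to a package that
fails to build.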

It's also worth remembering that the more developers we have, the fewer
are generalists.  Clearly we don't have someone routinely inspecting
packages to ensure CFLAGS is set properly, for example.  More to the
point, when we do have systemic issues, we don't have people able or
willing to dive into arbitrary packages and fix them, and we certainly
don't have anyone _tasked_ with that.

When we do make systemwide changes like this, we don't have known points
of contact for the resulting bugs, or we don't communicate them well.
The gnutls spec, for example, claims to support building with the
hardened flags in the changelog, but also says this just before
%configure:

# this overrides the -znow from hardened builds.
CFLAGS="$RPM_OPT_FLAGS -Wl,-z,lazy"
export CFLAGS

I mean, yes, it's honest, that does in fact override -z now.  It also
_completely defeats the purpose_ of the hardened flags.  A proper fix
could have been to apply that workaround only to the guile modules that
(apparently) need it.  But who would the maintainer have asked for
advice about that?  In some sense the answer is fesco, who approved the
change and know who is involved with it, but well, I'm on fesco, and
this never came up there.  (The hardening change did, the gnutls thing
did not.)
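
Cases like this can at least be detected after the fact.  As a rough
sketch, assuming the input is the text printed by "readelf -d" for a
built binary (the exact layout can vary between binutils versions, and
the function name has_bind_now is made up for illustration), a check
for immediate binding could look like this:

```python
import re


def has_bind_now(dynamic_section_text):
    """Return True if readelf -d style output indicates immediate binding (-z now)."""
    for line in dynamic_section_text.splitlines():
        # Dynamic entries look roughly like: " 0x... (TAG)   value".
        match = re.search(r"\((\w+)\)\s*(.*)", line)
        if not match:
            continue
        tag, value = match.group(1), match.group(2)
        tokens = value.replace("Flags:", "").split()
        if tag == "BIND_NOW":
            return True
        if tag == "FLAGS" and "BIND_NOW" in tokens:
            return True
        if tag == "FLAGS_1" and "NOW" in tokens:
            return True
    return False
```

A library linked with -Wl,-z,lazy as in the spec fragment above would
be expected to come back False here, which is exactly the kind of
result a distribution-wide report could flag.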

- ajax


