Secondary arch build changes

Tuesday, 9 February 2016

Hi All,

In a panel session at DevConf this past weekend, Dennis mentioned some
possible plans to change how secondary architectures work.  Primarily,
the build for every arch in a package's set would be required to
succeed, even if the arch is a secondary arch.  However, the compose
of that architecture could fail and the rest would still be pushed.
I hope I summarized that correctly.

That leaves me a bit confused.  The major distinction from a
developer's perspective today is that build failures on a secondary
architecture do not fail the build on primary.  The compose of a
secondary architecture is even one step further removed from their
workflow.  With the proposed change, there is effectively no
distinction between primary and secondary architectures for a package
maintainer.

The assumption here seems to be that they can ExcludeArch a failing
architecture and then resubmit the build.  That is certainly possible,
particularly with the proposed notification to the secondary
architecture maintainers helping.  However, for packages which take a
significant amount of time to build in general, this is going to have
an impact.  Even if we assume there is no difference between arches in
terms of build performance, waiting 3 or 4 hours for a build to fail
because a secondary arch fails is really irritating.

While it doesn't solve the overall irritation factor, I have a small
suggestion.  Today, if an architecture fails to build, the remainder
of the builds are immediately canceled.  If this proposed change to
koji happens, I would like to suggest we not do that.  Instead, I
would suggest letting the builds on all architectures run to their
natural completion and, if one fails, sending a failure notification
on a per-arch basis as soon as that task fails.  This allows the maintainer
to verify which arches a package builds on and which it does not.  If
they wish to cancel the build upon an arch failure notification, they
still can do so with koji cancel.  The build as a whole could still be
failed, but only after all arch tasks are complete.
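
To make the suggestion concrete, here is a minimal Python sketch of
the policy.  It is not koji code and uses no koji API; run_build,
BuildResult, and the notify callback are hypothetical names invented
for illustration, and the arch tasks run one after another here for
simplicity, whereas a real build system would run them in parallel.

```python
from dataclasses import dataclass, field


@dataclass
class BuildResult:
    """Outcome of one build across all arches (hypothetical structure)."""
    passed: list = field(default_factory=list)
    failed: list = field(default_factory=list)

    @property
    def state(self):
        # The build as a whole fails if any arch failed, but this is only
        # evaluated once every arch task has finished.
        return "FAILED" if self.failed else "COMPLETE"


def run_build(arch_tasks, notify):
    """Run every arch task to completion; never cancel the siblings.

    arch_tasks maps an arch name to a callable returning True on success.
    notify(arch) is called as soon as that arch's task fails.
    """
    result = BuildResult()
    for arch, task in arch_tasks.items():
        try:
            ok = task()
        except Exception:
            ok = False
        if ok:
            result.passed.append(arch)
        else:
            result.failed.append(arch)
            notify(arch)  # per-arch failure notice, sent immediately
    return result


if __name__ == "__main__":
    tasks = {
        "x86_64": lambda: True,
        "armv7hl": lambda: False,  # assumed failure, for illustration
        "i686": lambda: False,     # assumed failure, for illustration
    }
    outcome = run_build(tasks, lambda arch: print("arch failed:", arch))
    print(outcome.state, outcome.passed, outcome.failed)
```

The maintainer gets one notice per failing arch and a full list of
which arches passed, and the overall state is only decided at the end.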

This might not seem like an issue for most packages, but I do know that
in the kernel we hit different failures on different arches at
different points in a single build quite often.  E.g. a driver will
fail to build on arm and cancel the whole build early.  Then perf will
fail to build on i686 but work on x86_64, which comes much later in
the build process.  In a theoretical world of expanding architecture
support, I very much don't want to rinse and repeat a build any more
than is necessary.  Allowing us to see how each arch fares
individually helps avoid that problem.
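
A rough way to count the rinse-and-repeat cost is sketched below in
Python.  It is a toy model under stated assumptions: the failure
minutes are made up, each failure is fixed completely before the next
submission, and cancel-on-first-failure reveals only the earliest
failure per submission.  submissions_needed is a hypothetical helper,
not anything koji provides.

```python
def submissions_needed(fail_minute, cancel_on_first_failure):
    """Count build submissions until one succeeds on every arch.

    fail_minute maps arch -> minute at which its build fails, or None
    if that arch builds cleanly.  All values are illustrative.
    """
    pending = {a: t for a, t in fail_minute.items() if t is not None}
    submissions = 0
    while True:
        submissions += 1
        if not pending:
            return submissions  # this submission built everywhere
        if cancel_on_first_failure:
            # Only the earliest failure is seen before the rest are
            # canceled, so only that one gets fixed this round.
            first = min(pending, key=pending.get)
            del pending[first]
        else:
            # Every arch ran to completion, so all failures are seen
            # and fixed before the next submission.
            pending.clear()


if __name__ == "__main__":
    # Hypothetical timings for the kernel example: a driver fails
    # early on arm, perf fails much later on i686.
    kernel = {"arm": 20, "i686": 150, "x86_64": None}
    print(submissions_needed(kernel, cancel_on_first_failure=True))
    print(submissions_needed(kernel, cancel_on_first_failure=False))
```

In this model the cancel-early behavior needs one submission per
failing arch plus a final clean one, while letting every arch finish
needs two no matter how many arches fail.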

As an aside, I'm not fully convinced this koji change is a great idea.
ExcludeArch is the hammer that will get used most to "fix" failures,
and that isn't helping resolve the underlying issues.  For things like
the kernel, gcc, or glibc, it isn't even really an option.  Yes, we
can use ExcludeArch but if we do so then there is no possibility of
doing a successful _useful_ compose anyway.  However, maybe it won't
be so bad.  I prefer to focus on my suggested idea above for now, so
just log this paragraph as a note of caution perhaps.

josh