On Tue, Aug 4, 2015 at 9:05 AM, Peter MacKinnon <pmackinn(a)redhat.com> wrote:
On 8/3/15 12:38 PM, Christopher wrote:
>
> To some extent, Fedora has employed each of these strategies already:
> repeated patching, convincing upstream, and supporting multiple
> versions. If a particular package is problematic, or the maintainer
> isn't up to the task to decide which combination of strategies is
> best... well, that's where co-maintainers can be a really big help.
> I'd only support dropping if maintainers can't be found who are
> willing to support the packages to the extent they need. Isn't that
> how things typically work in Fedora?
Will had an excellent and succinct rationale for dropping Apache Spark
in the other related thread.
"I'm actually planning to retire Spark because the upstream has diverged too
far from what we can actually support in Fedora -- too many things depend on
bundled-but-modified libraries, nonstandard or older versions, etc., and
even in the 0.9.1 package we had to carry a lot of patches to make
everything work. I think that, given the development direction the Spark
community has taken since Spark entered Fedora, the most sensible way to
have Spark integrated with Fedora going forward is in a Software Collection.
This is a bummer because it was a lot of work to get Spark into Fedora, but
my concern is that having a (necessarily) limited Spark that is integrated
with Fedora is not going to be sufficiently compelling for end-users to
warrant not just using upstream binaries, and it's a lot of bandwidth for
volunteers to take on. I'm open to discussion on this matter, but I think
this is the best way forward."
The Fedora upstream feedback virtues you describe are laudable but the game
has changed. That model works well for packages that are limited in both scope
and dependency set. But these projects are massive, under rapid
development, and IMHO the upstreams aren't interested in a continuous and
steady stream of PRs or JIRA tickets to upgrade versions, unbundle, etc. for the sake
of Fedora. We already went down that road with Hadoop for example and JIRAs
just languished as we continually rebased them for a time. It wasn't for
want of effort downstream. Neither does that make the upstreams "bad" or
"evil".
My $0.02,
\Pete
Perhaps it makes sense to drop Spark (I don't have a vested interest
in Spark), and some others. Perhaps it's even worth it to drop the
idea of a Big Data SIG (or mutate it to reflect the proposed focus on
Docker, or something else). However, I still have an interest in some
of the Big Data packages (ZooKeeper, Accumulo, Hadoop, etc.), and am
willing to put in some effort to keep them alive as standard packages, and
I hope I can count on other Fedora packagers to support this, as needed,
even if it is no longer the focus of the Big Data SIG.