Is it time to remove the existing Big Data packages from Fedora?

Peter MacKinnon pmackinn at redhat.com
Tue Aug 4 13:05:19 UTC 2015


On 8/3/15 12:38 PM, Christopher wrote:
> To some extent, Fedora has employed each of these strategies already:
> repeated patching, convincing upstream, and supporting multiple
> versions. If a particular package is problematic, or the maintainer
> isn't up to the task to decide which combination of strategies is
> best... well, that's where co-maintainers can be a really big help.
> I'd only support dropping if maintainers can't be found who are
> willing to support the packages to the extent they need. Isn't that
> how things typically work in Fedora?

Will gave an excellent and succinct rationale for dropping Apache
Spark in the other related thread:

"I'm actually planning to retire Spark because the upstream has diverged 
too far from what we can actually support in Fedora -- too many things 
depend on bundled-but-modified libraries, nonstandard or older versions, 
etc., and even in the 0.9.1 package we had to carry a lot of patches to 
make everything work. I think that, given the development direction the
Spark community has taken since Spark entered Fedora, the most sensible 
way to have Spark integrated with Fedora going forward is in a Software 
Collection. This is a bummer because it was a lot of work to get Spark 
in to Fedora, but my concern is that having a (necessarily) limited 
Spark that is integrated with Fedora is not going to be sufficiently 
compelling for end-users to warrant not just using upstream binaries, 
and it's a lot of bandwidth for volunteers to take on. I'm open to 
discussion on this matter, but I think this is the best way forward."

The Fedora upstream feedback virtues you describe are laudable, but the
game has changed. That model works well for packages that have both
limited scope and limited dependency sets. But these projects are massive, under
rapid development, and IMHO the upstreams aren't interested in a
continuous and steady stream of PRs or JIRAs to upgrade versions,
unbundle, etc. for the sake of Fedora. We already went down that road
with Hadoop, for example, and the JIRAs just languished as we continually
rebased them for a time. It wasn't for want of effort downstream.
Nor does that make the upstreams "bad" or "evil".

My $0.02,
\Pete


