apache spark, anyone?

Gerald Henriksen ghenriks at gmail.com
Thu Jul 30 20:20:47 UTC 2015


On Wed, 29 Jul 2015 20:54:32 -0400, you wrote:

>I fear that we are tilting at windmills here. Containers and projects 
>like Atomic and CoreOS are redefining what application composition can 
>look like. The base OS is no longer driving this since I can compose an 
>application with containers that could be a fabulous mix of components 
>running on Fedora, CentOS, Ubuntu, and so on.
>
>Language ecosystems have evolved for delivering applications. But they 
>_are not reliant on, nor interested in,_ the whims of any particular distro 
>packaging rules. I have an uber jar with my application and its bundled 
>runtime dependencies. Do you have a current compatible JVM? Great! Let's 
>get work done!
>
>But it's not clear that Fedora is evolving. The CVE paranoia only 
>travels so far up the stack to a point _where the value of the 
>application outweighs the security risks of having multiple log4j 
>versions installed_. A "conform or be cast out" ethos is the road to 
>irrelevance IMHO.

My feeling is that if Fedora (and, I suppose by default, the Big Data
SIG) wants to have Hadoop and company running on Fedora, then we need
to stop trying to package them as RPMs.

My personal opinion, having looked into the idea of working on Big
Data packages, is that Java apps are unsuitable for packaging.

I understand that people have spent a lot of time over the years
coming up with guidelines, and helper apps, for packaging Java stuff.
But it is still a mess, and that is without considering what the
developers of those JVM-based apps are doing.

So forget the idea of trying to get exceptions to the rules, or
creating alternate repos.

In my mind the best thing that can be done is:

1) actively test Hadoop and other component releases on Fedora and
make them as reliable and bug-free on Fedora as possible.  This means
making sure they work on OpenJDK, given that most in the Hadoop
community seem to use Oracle's JVM, and making sure that stuff works
on the newer OpenJDK versions, as Hadoop and company seem to be very
reluctant to move on from JVM versions that are no longer supported.

2) if you want to have Hadoop (or Spark, etc.) as an easy-to-run
thing, then forget packaging and instead create Docker containers.
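To make (2) concrete, the container route might look something like
the Dockerfile below. This is only a sketch: the Fedora base tag, the
Spark version, and the download URL are illustrative examples, not an
official build.

```dockerfile
# Sketch of a Spark container on Fedora -- versions/URL are examples.
FROM fedora:22

# Use the distro's OpenJDK rather than Oracle's JVM.
RUN dnf install -y java-1.8.0-openjdk-headless tar gzip && dnf clean all

# Fetch an upstream Spark binary release (version is an example).
ADD https://archive.apache.org/dist/spark/spark-1.4.1/spark-1.4.1-bin-hadoop2.6.tgz /opt/
RUN tar -xzf /opt/spark-1.4.1-bin-hadoop2.6.tgz -C /opt/ \
    && ln -s /opt/spark-1.4.1-bin-hadoop2.6 /opt/spark \
    && rm /opt/spark-1.4.1-bin-hadoop2.6.tgz

ENV SPARK_HOME=/opt/spark
CMD ["/opt/spark/bin/spark-shell"]
```

Something like `docker build -t spark-test .` followed by
`docker run -it spark-test` would give a runnable Spark shell without
any RPM packaging, while still exercising the distro's OpenJDK as
suggested in (1).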


More information about the bigdata mailing list