apache spark, anyone?

Martin Bukatovic mbukatov at redhat.com
Thu Jul 30 16:22:23 UTC 2015


On 07/28/2015 10:27 PM, Haïkel wrote:
> 2015-07-28 6:24 GMT+02:00 Gerald Henriksen <ghenriks at gmail.com>:
>>
>> I would disagree.
>>
>> Would I like to see these projects packaged in Fedora?  Yes.
>>
>> But amongst the things the Fedora brand means is attempting to do
>> things the right way.  This is why there are rules and guidelines to
>> be followed.
>>
>> If you allow SIGs to ship software that violates the Fedora guidelines
>> (whether it be in packaging, license, or any other) then you dilute
>> what the Fedora name stands for.
>>
> 
> I understand your concerns, but there are conflicting goals:
> * making Fedora relevant for Big Data by shipping projects consumable
> by Fedora users
> * Upholding our current guidelines strictly
> 
> Unless we accept some compromise, we can't maintain both.
>
> That's why it seems perfectly acceptable to allow layered products:
> 1. they'll be based upon Fedora and follow its lifecycle
> 2. they'll be able to lift *some* guidelines (and not all of them) by
> explicit approval from a technical committee (FESCo? FPC?)
> => not all guidelines would be suitable for such exceptions, like
> licensing ones or FHS; we're Fedora, we have ethics.
> 3. packages will be shipped in separate repositories and will be under
> the responsibility of the SIG.
> 4. Fixing such issues on a best effort basis.
> 
> That's the best compromise to ship components relevant to Fedora users without
> compromising too much with quality.

I see your point, but the only way to do this your way is to
package Big Data projects into dedicated software collection(s)
(as has already been mentioned in this thread) and maintain them
in a copr repo. Then you can:

* relax some requirements based on your needs
* have your nasty library bundling stuff isolated from the rest of
  the system
* have an easy way to enable your repo (dnf has a copr plugin to
  handle this)

On the other hand, I'm not sure you could claim that this is an
official part of the Fedora project.

> As far as bundled libraries are concerned, Fedora repositories ship many
> packages with bundled libraries, with or without FPC permission.
> And though we have a process to allow bundled libraries on a
> case-by-case basis, it would completely jam FPC if we filed a request
> for each library/package.

This is just a direct consequence of library bundling being a terrible
idea.

> Moreover, this prevents us from finding a better way to track CVEs in
> bundled libraries, like leveraging Provides to identify them and allow
> automated tracking.

Yep, it's a pain to maintain bundled libraries, and I guess that has
something to do with why Fedora does not allow it by default.

>> Yes, it seems a lot of the Java ecosystem is a dependency mess.  But
>> the solution isn't to turn Fedora into a mess as well.
>>
> 
> Why would Fedora turn into a mess?
> 
> Assuming that it would only concern a limited set of packages, shipped in a
> separate repository, respecting most guidelines, maintained by trusted people.
> By default, packages will have to follow all Fedora guidelines as of
> now, and that must not change.

You still need to separate it into a software collection, since you can't
guarantee that you won't break the rest of the policy-abiding system
via your policy-relaxing layered repository.

>>
>> I would also hope that the exceptions would be denied.
>>
> 
> Or package maintainers will just ignore FPC and not give a shit
> about bundled libraries.
> Before blindly enforcing guidelines, you need to know why they're here
> in the first place.

The fact that upstream developers don't follow good practices doesn't
mean it's ok to ignore them.

> Moreover, according to their own rules, Big Data packages are likely to
> be granted exceptions (though through the most detestable path) and
> shipped in the main repository.
> I prefer not wasting packagers' and FPC's time with tedious procedures,
> and not mixing packages with higher and lower quality standards in the
> same repository.
> To me, that would be an improvement to the current policy: keeping
> stricter standards for the core while shipping a highly curated set of
> packages through trusted repositories.
> rpmfusion is not a solution; its goal is to provide packages that are
> not allowed in Fedora due to licensing/patent issues.
> 
> Of course, we can keep a strict attitude, but that would mean that
> Fedora would be irrelevant for large audiences (i.e. data scientists,
> operators, etc.).

As I said, why not start with a bigdata copr software collection
repository? You can experiment with it without breaking anything or
asking anybody, since it's outside of Fedora, while the discussion goes on.

> We may discuss this, but it really falls under the Council's
> competencies, and I encourage the objective lead to consider such an
> arrangement for the good of Fedora and its success.


-- 
Martin Bukatovic

