RPM and GEMS - Unholy marriage

Mon Jun 28 08:42:55 UTC 2010

Bert Meerman wrote:
> Hi everyone,
> 
> I've been following the discussion on RPM and GEMS conflict of interest. 
> I found it quite interesting, and I would like to discuss some packaging 
> options already present within ruby and rails.
> 
> There are three things I would like to add to Jeroen's page:
> 1) A valid Rails RPM package *explicit* links to gems required.
> Strictly speaking, you can lazy load the gems an application needs. This 
> might work, especially if you use gems that are very common (like 
> Rails). Problem is, that a critical dependency is missing. You can add 
> this info in the RPM spec, but I think it is better to make it mandatory 
> more downstream. For RPM packaging, all you need to do is ask for the 
> dependent gems.
> My question: what are your thoughts on this?
> 

With a packaged Rails product, for example, one aspect of the RPM package will 
most likely be:

  Requires:  rubygem(rails)

One can enforce a specific version of the Ruby gem:

  Requires:  rubygem(rails) = 2.3.5

in order to prevent breakage on platform or application upgrades, through 
system capabilities. Pretty much the same goes for Ruby gems, not just 
packaged applications from downstream consumers or vendors.

Normally, in cases where the upstream of the dependent package is using sane 
versioning, one could even:

  Requires: rubygem(rails) >= 2.3, rubygem(rails) < 2.4

which, logically, should result in the Rails RPM package changing for bug and 
security fixes only. However, that does not seem to be common practice in the 
larger Ruby community. This is a problem mid-streams like Fedora will have to 
deal with - or let it be a free-for-all, but then what are we doing here?
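
To make the range concrete on the Ruby side, here is a minimal sketch using 
RubyGems' own Gem::Requirement class. It only mirrors the spec line above; it 
is not how RPM evaluates Requires:, and the list of version numbers is made up 
for illustration.

```ruby
require 'rubygems'

# Range mirroring "Requires: rubygem(rails) >= 2.3, rubygem(rails) < 2.4".
range = Gem::Requirement.new('>= 2.3', '< 2.4')

# RubyGems' pessimistic operator states a similar constraint in one clause.
pessimistic = Gem::Requirement.new('~> 2.3.0')

# Illustrative version numbers, not a list of actual Rails releases.
%w[2.2.3 2.3.0 2.3.5 2.3.11 2.4.0 3.0.0].each do |v|
  version = Gem::Version.new(v)
  puts format('%-7s range=%-5s pessimistic=%s',
              v, range.satisfied_by?(version), pessimistic.satisfied_by?(version))
end
```

Both forms accept every 2.3.x and refuse 2.4 and later, which is only useful 
if upstream keeps 2.3.x free of API changes.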

Anyway, while the application may have (with emphasis on "may have"):

  RAILS_GEM_VERSION = '2.3.2'

this does not correspond with the capabilities provided by the system in any 
way. It locks the application down to using 2.3.2 and 2.3.2 only, so it should 
not be used lightly in any situation.
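
A small sketch of that mismatch, assuming the pin ends up as an exact RubyGems 
requirement and assuming (hypothetically) that the platform ships 2.3.5 as its 
rubygem(rails):

```ruby
require 'rubygems'

# What the application asks for: an exact pin, as RAILS_GEM_VERSION = '2.3.2'
# is assumed to produce.
app_pin = Gem::Requirement.new('= 2.3.2')

# Hypothetical: the version the distribution currently provides.
system_rails = Gem::Version.new('2.3.5')

if app_pin.satisfied_by?(system_rails)
  puts "system rails #{system_rails} satisfies the application"
else
  puts "system rails #{system_rails} does NOT satisfy '#{app_pin}'"
end
```

The RPM dependency can be perfectly satisfied while the application still 
refuses to start, because the two checks never see each other.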

Because Gem brings its own little package management, we have two completely 
separate dependency paths, each of them trying to resolve dependencies (but 
only one of them through global dependencies), and each is as dependent on a 
closed dependency tree as the other.

However, if you install a Ruby gem RPM and then gem update the system, your 
RPM is still there, you can depend on it, and you can be sure that any attempt 
to remove the system-wide capability makes the transaction fail because of 
missing dependencies.

Imagine you gem install rails but then yum remove ruby - just to emphasize the 
point I'm trying to make here.

In another example, gem install mysql and then yum update mysql from 5.0 to 
5.1, or yum update ruby from 1.8.6 to 1.8.7.
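
The mysql case can be sketched with a toy model. This is not RPM's resolver; 
the Package struct, the package names and the capability names are all 
hypothetical stand-ins, chosen only to show what the package database can and 
cannot see.

```ruby
# Toy model of a package database; not RPM's actual resolver.
Package = Struct.new(:name, :provides, :requires)

# List every requirement that no installed package provides.
def unmet_requirements(db)
  provided = db.flat_map(&:provides)
  db.flat_map do |pkg|
    (pkg.requires - provided).map { |cap| "#{pkg.name} needs #{cap}" }
  end
end

# Hypothetical capability names standing in for the client library versions.
old_libs = Package.new('mysql-libs', ['libmysqlclient.so.15'], [])
new_libs = Package.new('mysql-libs', ['libmysqlclient.so.16'], [])
gem_rpm  = Package.new('rubygem-mysql', [], ['libmysqlclient.so.15'])

# Before the update everything resolves.
before = unmet_requirements([old_libs, gem_rpm])

# Case 1: the mysql gem was installed as an RPM. The database records its
# requirement, so the proposed update is caught before anything changes.
with_rpm = unmet_requirements([new_libs, gem_rpm])

# Case 2: the gem came from "gem install mysql". The database has no record
# of it, the update looks safe, and the gem is left broken afterwards.
with_gem_install = unmet_requirements([new_libs])

p before
p with_rpm
p with_gem_install
```

In the second case nothing is reported, which is exactly the problem: the 
breakage only shows up at runtime.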

There can only be one package manager doing all the work. It's RPM.

Does that limit anyone from using Gem? No, we're shipping it and in the end, 
it's your system. Knock yourself out.

Do you have the liberty to deviate from what we provide, either by using gem or 
apt or ...? Sure! Have fun.

Does that mean the RPMs provided are worthless? Euh, not so much.

For one, RPM builds are reproducible. If a test fails on one build system, it 
will fail on another build system in very much the same way (exceptions 
apply). If compilation fails because it's using a dynamic library instead of a 
static library to compile against, or because it uses the latest and greatest 
version rather than the compat- version, we'll figure that out and fix it. 
Documentation for the upstream gem can only account for so many aspects of so 
many systems.

> 2) A valid Rails RPM package must contain the version when specifying 
> gems and the version must be limited.
> Example:
> # require 'aws/s3'
> config.gem 'aws-s3', :lib => 'aws/s3', :version => '0.4.0',      #     
> :source => "http://code.whytheluckystiff.net"

Actually, this is not the case. In a config.gem statement, only the name is 
required; all other parameters are optional.

For example, one may require a mysql gem, but not care about the version - 
because the platforms the package or application is distributed for include 
the correct version and the mysql gem doesn't change that much.
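
As a rough illustration of what the optional version means, here is a sketch 
against RubyGems directly. It assumes config.gem hands the name and :version 
value on to RubyGems more or less unchanged; the version numbers are only 
examples.

```ruby
require 'rubygems'

# Name only: RubyGems falls back to its default requirement, which any
# released version satisfies.
any_version = Gem::Dependency.new('mysql')
puts any_version.requirement

# A bare version string is read as an exact match, not "this or newer".
exact = Gem::Requirement.new('0.4.0')
puts exact
puts exact.satisfied_by?(Gem::Version.new('0.4.1'))

# "This or newer" has to be spelled out with an operator.
at_least = Gem::Requirement.new('>= 0.4.0')
puts at_least.satisfied_by?(Gem::Version.new('0.4.1'))
```

So leaving the version out accepts whatever the platform provides, while a 
bare '0.4.0' pins the application to that one version.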

> (see also 
> http://api.rubyonrails.org/classes/Rails/Configuration.html#M002064)
> This means that the module will use any version greater than 0.4.0. You 
> can also specify smaller, a range or a specific number. The trick is 
> this: if you do this correctly, you can use multiple gem versions on the 
> same system. In theory, this must work. This is what was causing the 
> issue with Lighthouse. I think you should fix the version or make it a 
> range, never a 'greater than' (as in the example and with Lighthouse).
> My question: what are your thoughts on this, especially the mandatory 
> 'limit the version number'?
> 

Multiple versions of a single application are useful in a situation in which, 
from the perspective of a midstream distributor, each and every version can be 
appropriately maintained. Never mind the bug fixes and let's go straight to 
security fixes. We, the mid-stream distributor, have a tendency to solve most 
of those security issues for you, and give you the appropriate, fixed version, 
without breaking anything. It's something virtually unheard of in the greater 
Ruby community, where feature enhancements and API breakage go hand-in-hand in 
a release marked as a security-related release.

That said, we do have mechanisms in place that allow us to install more than 
one version of an application through RPM. We currently use that mechanism for 
kernel packages. It has a setting that is called "installonly_limit", so you 
may choose how many versions (or actually, RPM releases, not application 
versions) of the package should be preserved.

> 3) Gems can be vendorized.
> OK, this is the moment where a lot of people are going to be amazed, 
> mouth open and staring at the screen for the next five minutes: you can 
> make a Rails application and include the gems in it, as it where local 
> code. This is called vendorization.

</sarcasm> ;-)

> A vendorized gem is only available 
> to the application using it: it is part of the source code of that 
> application and located in the /vendor directory of the application root 
> (not in the normal gem directory). In fact, you can vendorize a gem and 
> then even modify the code.
> On one hand, if you say all gems should be vendorized, you don't have 
> any version conflicts. On the other hand, I heard someone complain about 
> static linking...
> Personally, I disagree with the vendorization of gems. It obfuscates 
> version management.
> Question: does Fedore prefer (or mandatorize) a specific approach? Why?
> 

We do not allow the shipping of vendored code or libraries. A good explanation 
of what vendored code entails is at 
https://bugzilla.redhat.com/show_bug.cgi?id=470696#c37

That does not mean, however, that downstream consumers or Independent Software 
Vendors are not allowed to vendor parts of their project or product. We 
discourage it, for all the right reasons might I add, but we can not (and 
should not) prohibit it.

Vendored code is not allowed for the parts that the Fedora Project 
distributes, hence if a product is ever to be a part of the distribution, the 
vendored code needs to go.

-- Jeroen