Modular Kernel Packaging for Cloud

Don Zickus dzickus at redhat.com
Thu Mar 6 14:57:17 UTC 2014


On Thu, Mar 06, 2014 at 08:16:00AM +0900, Sandro "red" Mathys wrote:
> On Thu, Mar 6, 2014 at 12:13 AM, Don Zickus <dzickus at redhat.com> wrote:
> > On Wed, Mar 05, 2014 at 10:02:17AM -0500, Josh Boyer wrote:
> >> On Wed, Mar 5, 2014 at 9:54 AM, Don Zickus <dzickus at redhat.com> wrote:
> >> > On Wed, Mar 05, 2014 at 08:25:12PM +0900, Sandro "red" Mathys wrote:
> >> > For example, let's start with a 100 MB package requirement for the kernel (and
> >> > say 2 GB for userspace).  This way the kernel team can implement
> >> > reasonable changes and monitor proper usage (because things grow over
> >> > time).
> >> >
> >> > If later on you realize 100 MB is not competitive enough, come back and
> >> > chop it down to say 50 MB and let the kernel team figure it out.
> >> >
> >> > But please do not come in here with a 'every MB counts' approach.  It is
> >> > not very sustainable for future growth nor really easy to implement from
> >> > an engineering approach.
> >> >
> >> > Is that acceptable?  The kernel team can start with a hard limit of 100MB
> >> > package requirement (or something reasonably less)?  Let's work off budget
> >> > requirements please.
> >>
> >> This is a fair point.  To be honest, I've ignored the "every MB
> >> counts" aspect entirely for now.  I've instead been focusing on
> >> required functionality, because that's likely going to be the main
> >> driver of what the resulting size will be.
> 
> That's the point, we want a reasonably small package while still
> providing the required functionality. Not sure how providing a fixed
> size number is helping in this. But most of all, I didn't throw in a
> number because I have no idea what is reasonably possible. I really
> only just said "every MB counts" because the question came up before
> (in Josh's old thread) and I hoped I could stop this discussion from
> happening again before we have any numbers for this.

Ever work in the embedded space?  Every MB counts there too. :-)  This was
solved by creating budgets for size and memory requirements.  This helped
control bloat, which is going to be your biggest problem with cloud
deployment.

What concerns me is that you don't know what size your cloud deployment is
but expect everyone to just chop chop chop.  How do we know if the kernel
is already at the right size?

There is a huge difference between re-architecting the kernel packaging
to save 1 or 2 MB (off the ~143 MB size currently) vs. re-architecting to
save 50 MB.  The former is really a wasted exercise in the bigger picture,
whereas the latter (if proven needed) accomplishes something.

But again it comes down to understanding your environment.  Understanding
your environment revolves around control.  I get the impression you are
not sure what size your environment should be.

So I was proposing the kernel stay put, or maybe create _one_ extras
package that gets installed in addition to the bzImage.  But from the
sound of it, the chopping is really only going to get you savings of
~30 MB or so.
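
(If anyone wants to see where those savings would actually come from, here
is a quick sketch that breaks down on-disk module size by directory; the
paths assume a stock Fedora /lib/modules layout and the grouping is just
illustrative.)

#!/usr/bin/env python3
# Quick sketch: break down on-disk module size by directory under
# /lib/modules/<version>/kernel, to see where an "extras" split would
# actually save space.  Assumes a stock Fedora layout.
import os
from collections import defaultdict

kver = os.uname().release
base = os.path.join("/lib/modules", kver, "kernel")

sizes = defaultdict(int)
for root, _dirs, files in os.walk(base):
    for name in files:
        if ".ko" not in name:        # skip anything that is not a module
            continue
        path = os.path.join(root, name)
        # Group by the first two path components (drivers/net, fs/nfs, ...)
        parts = os.path.relpath(path, base).split(os.sep)
        sizes[os.sep.join(parts[:2])] += os.path.getsize(path)

for group, size in sorted(sizes.items(), key=lambda kv: -kv[1])[:15]:
    print(f"{size / 1048576:7.1f} MB  {group}")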

The thing is, a few years ago the kernel converted a lot of modules to
built-ins (inside the kernel).  What that means, from a novice perspective,
is that we took a whole bunch of external bloat that could be stripped away
and stuffed it into the kernel binary itself.  The goal at the time was to
speed up boot times (because loading modules was slow).

Now with the re-design of module loading in the last couple of years,
maybe this can be reverted.  This could shave MBs off the kernel binary
itself.  And then you can only package the modules cloud really needs.

This at least has the ability to scale across other SIGs too.
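
(To get a feel for how much is currently built in versus modular on a given
install, a minimal sketch along these lines works; it assumes the usual
modules.builtin file ships in /lib/modules/<version>.)

#!/usr/bin/env python3
# Minimal sketch: how much of the running kernel is built in vs. modular.
# Assumes the usual modules.builtin file in /lib/modules/<version>.
import glob
import os

kver = os.uname().release
moddir = os.path.join("/lib/modules", kver)

# Drivers compiled directly into vmlinuz are listed in modules.builtin.
with open(os.path.join(moddir, "modules.builtin")) as f:
    builtin = [line.strip() for line in f if line.strip()]

# Everything else ships as loadable .ko (possibly compressed) files.
loadable = glob.glob(os.path.join(moddir, "kernel", "**", "*.ko*"),
                     recursive=True)

print(f"built into vmlinuz : {len(builtin)} drivers")
print(f"loadable modules   : {len(loadable)} files")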

However, this all hinges on _how much chopping we should do_.  I doubt
Josh really wants to embark on this thrashing without a very good,
convincing reason to do so.

So having some numbers to work off of gives us the right idea of how much
work, and what type of work, needs to get done (little tweaks vs. re-thinking
the whole approach).


> 
> > Of course. :-)
> >
> >>
> >> FWIW, the existing kernel package installed today (a debug kernel
> >> even) is ~142 MB.  123MB of that is the /lib/modules content.  ~6MB of
> >> that is vmlinuz.  The remaining 13MB is the initramfs, which is
> >> actually something that composes on the system during install and not
> >> something we can shrink from a packaging standpoint.
> >
> > It also helps with monitoring.  3-4 years from now, after all the chopping,
> > these packages bloat right back up and everyone forgets why we chopped in
> > the first place.  Hard requirements help keep everything in check and
> > force people to request more space, which the cloud team can evaluate
> > properly and still control their environment.
> 
> Well, if we can remember why we put up a fixed size requirement, why
> can't we remember why we did the chopping? ;) Anyway, I think it's

Heh.  Ever work with open source projects?  The turnover and lost
knowledge are one thing.  However, to me the biggest problem would be the
'lack of caring'.

Trust me, you can convince everyone to chop MBs out of their packages.  You
might even get something really small for your cloud deployment.

Fast forward a year.  Do you think package maintainers are going to
remember to go through the painful exercise of chopping again when they
rebase their packages or add new features?  Probably not.  Bloat will
shoot right back up.

Heck, look at Fedora over the last few years.  It went from running fine on
a laptop with 2GB of memory to now needing a laptop with 4GB of memory and
lots of disk space.  Why?  Because everyone thinks memory and disk are
cheap.

However, cloud folks have a different attitude.  Memory and disk are
_not_ cheap and can impact their business.

Without disk and memory budgets how are you going to manage the bloat and
make sure the cloud deployments are competitive and provide the host
providers the most bang for the resource buck?

And I get that it is hard to come up with numbers like this.  So I would
start with an image and see how it compares.  I think Cloud already has an
image.  Is it too big?  What makes you believe that?  Are competitors that
much smaller?

Then start chopping, say, 5% off at a time to see how low you can bring it
before everyone screams too much.
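
(Just to illustrate the compounding with made-up numbers; the 500 MB
starting size below is purely an assumption, not a real measurement.)

# Illustration only: successive 5% cuts from an assumed 500 MB image.
size = 500.0                # MB, made-up starting point
for step in range(1, 6):
    size *= 0.95            # chop 5% off the current size
    print(f"after cut {step}: {size:6.1f} MB")
# after cut 1: 475.0 MB ... after cut 5: 386.9 MB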


> fair to define a "kernel-core should be smaller than X MB" requirement
> but I don't think it's fair to say Y MB because I like the number Y. I
> also don't like that we might throw out e.g. NFS just because we're
> 1MB over the limit.

Why not?  If the host provider can buy a 1 TB disk and carve out X number
of cloud deployments on it, then if Fedora Cloud bloats up 10%, that
means fewer cloud deployments and less revenue.  So yes, I would be concerned
over the 1 MB.  In fact, I would come back and ask the kernel team to find
something else to chop to bring it down in size.  Or look to see if some
other package can trade the size.
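
(Rough, made-up numbers to show the effect, reusing the 2 GB userspace
figure tossed around above as the image size.)

# Illustration with made-up numbers: how 10% image bloat eats into the
# number of deployments a provider can carve out of one disk.
disk_gb = 1000          # ~1 TB of disk
image_gb = 2.0          # assumed image size (the "2 GB userspace" figure)

base = int(disk_gb / image_gb)
bloated = int(disk_gb / (image_gb * 1.10))
print(f"deployments per disk today     : {base}")
print(f"deployments after 10% bloat    : {bloated}")
print(f"lost deployments (lost revenue): {base - bloated}")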

> 
> But if it helps the kernel team to have a fixed number, someone tell
> me what we roughly save by throwing out the stuff we discussed and we
> can discuss what number would long-termish make sense, I guess. Also,
> I'm not sure whether we should measure the extracted files, the
> compressed RPM or both.

It should be the opposite.  Tell us what number makes long-term sense to
you and we can figure out what to throw out.  I would calculate the number
based on deployment size (that is ultimately what you need to manage).
That includes the kernel binary, the initrd and the /lib/modules content
(though the initrd image is thrown out once the disk is mounted).
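
(That deployment-size number can be pulled straight off an installed image;
a rough sketch, assuming a stock Fedora layout and the usual dracut
initramfs file name.)

#!/usr/bin/env python3
# Rough sketch: total per-deployment kernel footprint on disk.
# Paths assume a stock Fedora install; adjust the initramfs name if needed.
import os

kver = os.uname().release
mb = 1024 * 1024

def tree_size(path):
    """Sum the sizes of all regular files under path, in bytes."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            fp = os.path.join(root, name)
            if os.path.isfile(fp):
                total += os.path.getsize(fp)
    return total

modules   = tree_size(f"/lib/modules/{kver}") / mb
vmlinuz   = os.path.getsize(f"/boot/vmlinuz-{kver}") / mb
initramfs = os.path.getsize(f"/boot/initramfs-{kver}.img") / mb

print(f"/lib/modules : {modules:6.1f} MB")
print(f"vmlinuz      : {vmlinuz:6.1f} MB")
print(f"initramfs    : {initramfs:6.1f} MB")
print(f"total        : {modules + vmlinuz + initramfs:6.1f} MB")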

I can also tell you how much memory that will take up (as I did a
similar analysis to reduce the kdump memory usage on RHEL).

Based on Josh's rough number of ~143 MB, I would selfishly propose a 150 MB
disk space requirement. :-)  The memory requirement for that is going to
be about 300 MB, I believe (I would have to re-run the numbers and figure
out which modules are loaded [probably a _lot_ less than that, though]).
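
(The module piece of that memory estimate is easy to sanity-check on a
running guest, since /proc/modules reports each loaded module's size; this
sketch covers only that piece, not a full kdump-style analysis.)

#!/usr/bin/env python3
# Sketch: sum the memory used by currently loaded modules, as reported
# in /proc/modules (second column is the size in bytes).  This is only
# the module piece of the footprint, not the whole kernel memory usage.
with open("/proc/modules") as f:
    rows = [line.split() for line in f]

total = sum(int(fields[1]) for fields in rows)
print(f"{len(rows)} modules loaded, ~{total / 1048576:.1f} MB of module memory")

# Biggest offenders first:
for fields in sorted(rows, key=lambda r: -int(r[1]))[:10]:
    print(f"{int(fields[1]) / 1024:8.0f} KB  {fields[0]}")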


Cheers,
Don

