PROPOSAL: Core size reduction "bug day"

Lamar Owen lowen at pari.edu
Sun Jul 25 01:17:42 UTC 2004


On Fri, 2004-07-23 at 20:28, David Nielsen wrote:
> As everyone probably knows a few days ago a suggestion was brought up
> that we start moving non-essential stuff like KDE, XFce and a lot of
> the other duplication into Extras in order to reduce the size of Core.

KDE is essential.

There are other candidates, one of which I mentioned to Michael Tiemann the 
other day.  These are but a few suggestions, based upon sorting by size and 
running test removals under Synaptic.  All sizes quoted below are installed 
sizes as reported by Synaptic.  It appears to me that Synaptic underreports 
sizes by a significant margin, though.

* octave (do you know anyone who uses it?)  If anything, it should be replaced 
with R, which is larger, unfortunately.  I haven't had octave installed in a 
very long time.  Definite extras material.  This one I mentioned to Michael 
the other day.

* 4Suite (system-config-bind and system-config-httpd are dependants; is there 
another package that can provide the functionality that those two 
configurators require?  Does anyone use 4Suite outside these two packages?) 
(24MB)

* Gnucash (72MB).  Extras, not Core.  I have this one installed, but it still 
is not Core functionality.

* kdeedu (46MB) Extras, not Core.  kdeedu-devel too.

* MySQL. (:-P)  Red Hat shipped PostgreSQL first, but any RDBMS might easily 
be considered Extras material.  Along goes tora.

* koffice. While I think KDE proper (though not necessarily every KDE 
module) should be in Core, OOo is pretty much the standard office suite, even 
though I personally use koffice much more heavily, as it is not as bloated.  
But since koffice is not the default, it should probably be Extras material.

* The X.org 100DPI fonts, unless and until a GUI config utility can be had to 
easily switch between the 75DPI fonts and these. (and those that might say 
that I just need a higher resolution to appreciate them, well, I have one 
reply: I'm at 1400x1050 now; want me to go higher? :-))  Just how many fonts 
need to be Core material?

* The double compilers.  If we're going gcc3.4, let's get there and lose the 
other versions.  The second compiler set costs lots of space. Looking at 
RawHide I see that it is just as bad, since we now have real gcc-3.4.1, but 
we duplicate to gcc35-3.5.0.  The source RPM is 25MB; I can imagine how 
bloated the binaries are... :-(

* Until and unless the soundsystem can be made to use it, scrap timidity++.  
My sound card has no wavetable, yet getting a simple configuration for 
playing MIDI files (in KDE, which is where I live; I do not and will not 
install GNOME for technical and practical reasons, the best of which is that 
kmail is the best GUI e-mail package available for Linux bar none) is not 
easily found.

* Pick one of: vim, emacs, xemacs to toss.  Preferably toss xemacs.  Talk 
about redundancy: we have two full-bore EMACS!  This is a worse problem than 
two different and competing desktops!  Can't check the sizes of them, though, 
since I don't have either *macs installed...

* Does anyone use the GNU Ada compiler?
(Again, I'm talking about tossing out of Core into Extras, not about throwing 
it out altogether).  But splitting subpackages into separate repositories 
might prove technically challenging.

* Mozilla.  Go to Firefox instead.  The Mozilla mail client and the other 
things of Mozilla, except the browser itself, are not the default 
applications.

* Any application that pulls in scads of libraries that only it uses.  
Distribution bloat is easily traceable to the Favorite Tools syndrome.  "I 
like guile, you like scheme, doo-da, doo-da, We need more LISPers in the 
meme, oh the doo-da day.... "(Going to code all night, going to code all day, 
bet my money on the K-D-E, oh the doo-da day)  (I'm not going to a second 
verse featuring clisp and gcl, though....)

(In case you're wondering, I'm tracking bloat using the new Synaptic (0.52), 
which allows sorting by SIZE within categories, or across the whole 
installation! Then you can track dependencies and dependants easily.)

* The whole Java infrastructure is pretty bloated, and is apparently why FC2 
is *blessed* with a second compiler.  But we then still have the 3.3 libgcj, 
which only has gcc-java and libgcj-devel as dependants!   Those three 
packages take 13+22+4=39MB.  And that's the UNUSED gcj!  The used libgcj is 
larger, and has large dependants.  Actually, lessee... of the things 
depending upon libgcj34 that I have installed, if I select to remove 
libgcj34, it tells me 81.7MB will be freed, which is not as large as I 
thought.  

* While the Python subsystem is large, it is also core to all things 
system-config and must stay.

* Tcl, OTOH, can go.  The ISDN4K stuff has a dependency, but then again why 
exactly is ISDN4K in CORE?  That's one of the first packages I trim out.  I 
do, however, keep tcl, but it's a custom tcl to run OpenACS/AOLserver, which 
requires a multithreaded tcl, which AFAIK is not the default build.  And the 
FC tcl is usually pretty far behind the curve.  A pity, since tcl is quite 
small, only weighing in at a couple of MB.

* Tetex dependants.  While this is fairly core stuff, its dependants mass over 
a hundred MB. (Synaptic tells me 113MB).  Is DocBook Core material?  
Possibly, but there is lots of stuff that could be looked at.

* SELinux.  Yes, I know, this sounds wrong, and I know it is core technology 
for RHEL, or at least will be at some point.  But when the sample policy 
package masses 22MB, something is wrong.  How many FC users have SELinux 
disabled?  If it can be made more compact then by all means include away.  
But 22MB is too large, IMHO.

* FWIW, the kdelibs are the largest single library package outside of the core 
glibc stuff (that I have installed, at least), weighing in at 35M.  I wonder 
how much of that is i18n?  FWIW, if I select to remove qt and all its 
dependants, 733M is selected.  But I have much more installed than the 
regular KDE.  I can't check GNOME the same way, since I don't have GNOME 
installed (there are a few things that depend upon gtk that I do have 
installed, hmmm...in my NON-GNOME installation it marks 616MB of packages to 
be removed if GTK2 is removed.  I wonder how much more if I actually had a 
usable GNOME installed?)  Maybe gtk2 should go to extras?  (:-), just 
kidding, it's too core for that. GNOME, OTOH....)
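
The test removals in the list above boil down to summing installed sizes over 
a package and everything that directly or transitively depends on it.  Here 
is a minimal sketch of that bookkeeping, not what Synaptic actually does 
internally.  The dependency map and the function names are made up for 
illustration, and the assignment of 13, 22 and 4 MB to the three individual 
gcj packages is my assumption (only the 39MB total is quoted above).

```python
from collections import deque


def removal_set(package, depends_on):
    """Return `package` plus every package that directly or transitively
    requires it (i.e. what a test removal would mark)."""
    # Invert the "requires" map so dependants can be looked up directly.
    required_by = {}
    for pkg, deps in depends_on.items():
        for dep in deps:
            required_by.setdefault(dep, set()).add(pkg)

    removed = {package}
    queue = deque([package])
    while queue:
        current = queue.popleft()
        for dependant in required_by.get(current, ()):
            if dependant not in removed:
                removed.add(dependant)
                queue.append(dependant)
    return removed


def freed_mb(package, depends_on, size_mb):
    """Total installed size (MB) freed by removing `package` and its dependants."""
    return sum(size_mb[p] for p in removal_set(package, depends_on))


# Hypothetical data: the old 3.3 libgcj and its two dependants.
# The per-package split of the sizes is assumed; the total is 39 MB.
DEPENDS_ON = {
    "libgcj": [],
    "gcc-java": ["libgcj"],
    "libgcj-devel": ["libgcj"],
}
SIZE_MB = {"libgcj": 13, "gcc-java": 22, "libgcj-devel": 4}

print(freed_mb("libgcj", DEPENDS_ON, SIZE_MB))
```

Removing a leaf such as gcc-java frees only that package, while removing 
libgcj drags both dependants along with it, which is why a small library can 
show a large "to be freed" figure.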

Other things to help the installed bloat:
* i18n resources for OpenOffice.org split into locale-aware packaging.  This 
is, IMO, one of the serious deficiencies of the kde-redhat repository, where 
the OOo i18n package is _427_MB, which is absolutely ridiculous.  If I never 
plan on using Korean or Japanese I don't need them installed.  OTOH, the 
Korean and Japanese (to use a couple of handy examples) users need those 
packages. 

* i18n trimming all around.  Now, this is done for some things already, but 
each package should be selected-locales-aware.  That is, after all, why the 
installer ASKS which languages you want installed.  All packages should honor 
those choices; however, packagers need to know how to get to those choices 
and set up the proper locales.

* Trimming down of the printer kluge.  Yes, kluge.  How many hundred megabytes 
does the printing stack take these days?  Well, lessee, you have Omni (55MB), 
Foomatic (31MB), ghostscript, etc.  Lots and lots of bloat here, not 
counting CUPS proper.  At least we finally rid ourselves of the other 
printing subsystem, lprNG.

Oh, as another benchmark, the kernel itself is the 16th largest package on my 
system, massing 35MB (!!!!) (A BIG part of that is 
the /lib/modules/$version/build tree).  That means that there are fifteen 
packages on my NON-GNOME system that are over 35MB in size, with 
OpenOffice.org topping the scale with no close competitors.  This is wrong.  
Very wrong.
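
For anyone who wants the same kind of ranking without Synaptic, here is a 
rough sketch.  It assumes the input is the "SIZE NAME" lines you get from 
rpm -qa --queryformat '%{SIZE} %{NAME}\n' (sizes in bytes); the function 
names are my own invention, and the sample data is just a handful of the 
Synaptic figures quoted in this message, not a full package list.  rpm's 
numbers may well differ from Synaptic's.

```python
import subprocess

MB = 1024 * 1024


def parse_sizes(text):
    """Parse 'SIZE NAME' lines into (name, size_in_bytes) pairs,
    skipping anything that does not look like such a line."""
    packages = []
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        packages.append((parts[1].strip(), int(parts[0])))
    return packages


def largest_first(packages):
    """Sort packages by installed size, biggest first."""
    return sorted(packages, key=lambda p: p[1], reverse=True)


def count_larger_than(packages, threshold_bytes):
    """How many packages are strictly larger than the threshold."""
    return sum(1 for _, size in packages if size > threshold_bytes)


def installed_packages():
    """Query the local RPM database (only works on an RPM-based system)."""
    out = subprocess.run(
        ["rpm", "-qa", "--queryformat", "%{SIZE} %{NAME}\n"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_sizes(out)


# Sample input built from sizes (in MB) quoted elsewhere in this message.
SAMPLE_TEXT = "\n".join(
    f"{mb * MB} {name}"
    for name, mb in [("kernel", 35), ("gnucash", 72), ("Omni", 55),
                     ("kdeedu", 46), ("kdelibs", 35), ("foomatic", 31),
                     ("4Suite", 24)]
)
SAMPLE_PACKAGES = parse_sizes(SAMPLE_TEXT)

for name, size in largest_first(SAMPLE_PACKAGES):
    print(f"{size / MB:6.0f} MB  {name}")
print(count_larger_than(SAMPLE_PACKAGES, 35 * MB), "packages over 35 MB")
```

On a real box, swap SAMPLE_PACKAGES for installed_packages() to rank 
everything that is actually installed.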

Now, let me tell you a story of a package that got cut out of the shipping Red 
Hat Linux because of size limits.  This was an innocuous package; it was a 
useful package, and, doggone it, it was MY package!  But out it went.  The 
package was postgresql-test, which is a package of the regression test suite 
for PostgreSQL. (The whole PostgreSQL set of packages is in a sense MY set of 
packages, even though Tom Lane is the Official Red Hat Maintainer (I am the 
PostgreSQL Global Development Group's maintainer, which is confusing since 
Tom is a member of the PostgreSQL steering committee.  I was maintaining the 
community package before he started working for Red Hat)).  It was just a few 
MB, but it did get canned (it has since been resurrected, but can once again get 
canned for that matter, although it might be difficult to have the 
subpackages from one source RPM split between Core and Extras, assuming 
PostgreSQL stays in Core).

That is just the tip of the iceberg.  There is significant bit-rot in the 
distribution as a whole.  There are lots of things that can be trimmed.  The 
list above is about an hour's worth of looking.

On the point about Extras itself: I think making fedora.us the 'de facto' 
Extras is the wrong thing to do.  Nothing wrong with fedora.us; the problem 
is that Extras needs to really be spun out of Red Hat, IMO, so that it will 
have a similitude of being official.  I personally much prefer some of the 
other repositories over fedora.us due to the packages that are there.  Some 
packages' quality is not so good, unfortunately.  None of the different 
repositories really cooperate well; I have to be careful mixing and matching 
them due to the many differing intersections.  But with Synaptic things are 
very easy to configure; very easy to see what you are doing, and very easy to 
back out if you decide that you do not want to do that.  

I think the various repository maintainers need to see once again if they can 
cooperate a little better.

Incidentally, while I prefer using the kde-redhat KDE packages, they do come 
with their problems.  First, they are very invasive with your system, 
insisting upon your using their OpenOffice.org, their gtk2, and others.  You 
get kde-redhat on your box, and you get lots more than just KDE installed.
-- 
Lamar Owen
Director of Information Technology
Pisgah Astronomical Research Institute
1 PARI Drive
Rosman, NC  28772
(828)862-5554
www.pari.edu




