Package categorization and distribution construction

Bill Nottingham notting at redhat.com
Wed Jan 11 22:57:35 UTC 2012


AKA, taking a blowtorch to the comps file.

TL;DR version - come to the talk at FUDCon!

I'm here to propose a reworking of how we handle the data in the comps file.
(If you don't know what that is, it likely doesn't concern you.)

Currently, we have two main use cases for the package groups and listings in
the comps file:

1) Distribution construction

We define groups of packages that are then organized into:

- Live spins
- Install media
- Installation package sets (via package selection in anaconda, yum,
  kickstart, etc.)

Examples would be 'core', 'base', 'gnome-desktop', 'electronics-lab'.

2) Package categorization for browsing

We define lists of packages from which individual packages can be selected, in:

- anaconda
- assorted yum/PK frontends (gnome-packagekit, apper, yumex, etc.)

Examples would be 'games'.

In the context of the anaconda redesign, it occurred to me that these two use
cases really aren't aligned that much, and likely shouldn't be using the
same data store. So, I propose that for Fedora 17, we split these use cases
into two separate logical data stores (that could still live in the same
file), and adjust our processes and technologies accordingly.

Given that assumption, we simply need to define how we want to tackle the
two cases. Here's what I propose:

== Distribution construction ==

For this, we will continue to use groups in comps.

PRO:
- Don't have to change any distribution tools
- Don't have to change kickstarts

CON that can be fixed:
- Doesn't allow for tracking what groups a user has installed
- Doesn't allow for adjusting installations to new groups
- The 'group removal' operation does not behave in a way users expect

By using something like what's suggested in:
	http://lists.baseurl.org/pipermail/yum-devel/2010-December/007740.html

we can make groups persistent objects in yum, so that yum records which
groups the user has installed, which packages are in those groups, and so on.
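
To make the idea concrete, here is a minimal sketch in Python of what a persistent group record could look like. It is not yum's API and not the design from the linked thread; the class name, the JSON file, and the "removable" rule (only drop packages no other installed group claims, ignoring dependencies) are all assumptions for illustration.

```python
import json
from pathlib import Path


class InstalledGroups:
    """Hypothetical on-disk record of which groups the user has installed."""

    def __init__(self, path):
        self.path = Path(path)
        # Map of group id -> sorted list of packages installed for that group.
        if self.path.exists():
            self.groups = json.loads(self.path.read_text())
        else:
            self.groups = {}

    def mark_installed(self, group_id, packages):
        """Remember that a group was installed with these packages."""
        self.groups[group_id] = sorted(set(packages))
        self._save()

    def new_packages(self, group_id, current_packages):
        """Packages added to the group definition since it was installed."""
        recorded = set(self.groups.get(group_id, []))
        return sorted(set(current_packages) - recorded)

    def removable(self, group_id):
        """Packages claimed only by this group (assumed removal rule)."""
        claimed_elsewhere = set()
        for gid, pkgs in self.groups.items():
            if gid != group_id:
                claimed_elsewhere.update(pkgs)
        return sorted(set(self.groups.get(group_id, [])) - claimed_elsewhere)

    def _save(self):
        self.path.write_text(json.dumps(self.groups, indent=2, sort_keys=True))
```

With a record like this, "adjusting installations to new groups" becomes a comparison between the stored package list and the current group definition, and group removal can avoid taking out packages that another installed group also lists.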

CON that still needs to be solved:
- Doesn't allow for defining higher level organization of groups (other than
  categories, which suck from a spin and presentation perspective)

An example:

Someone wants to create a desktop product. They'll define the product as:

MyDesktop
--> @gnome-desktop
 |--> @fonts
 |--> @input-methods
 |--> @x11
   |--> @core
   |--> @base

They also want to define some set of addons that could be optionally
presented in the installer, such as:

MyDesktopAddons
-> @office-suite
-> @eclipse

Right now, we merely have the 'categories' hierarchy in comps, which is
pushed to the anaconda UI. Anything that is conceptually an addon (such as
office-suite or eclipse) is a common group that applies to any product that
might use the groups. What would be good to have is something where each
product that is defined can define its own addons.
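
One way to picture a per-product definition is the Python sketch below. Nothing here exists in comps today: the `Product` class, the `GROUP_INCLUDES` table, and the `expand` helper are hypothetical, and the nesting is just one reading of the MyDesktop tree above (gnome-desktop pulling in fonts, input-methods and x11, with x11 pulling in core and base).

```python
from dataclasses import dataclass, field


@dataclass
class Product:
    """Hypothetical product definition: required groups plus its own addons."""
    name: str
    groups: list
    addons: list = field(default_factory=list)


# Assumed nesting: group id -> groups it pulls in.
GROUP_INCLUDES = {
    "gnome-desktop": ["fonts", "input-methods", "x11"],
    "x11": ["core", "base"],
}


def expand(groups, includes=GROUP_INCLUDES):
    """Flatten nested groups depth-first, keeping first-seen order."""
    seen = []
    stack = list(reversed(groups))
    while stack:
        group = stack.pop()
        if group in seen:
            continue
        seen.append(group)
        # Push children reversed so they pop in their listed order.
        stack.extend(reversed(includes.get(group, [])))
    return seen


my_desktop = Product(
    name="MyDesktop",
    groups=["gnome-desktop"],
    addons=["office-suite", "eclipse"],
)
```

Here `expand(my_desktop.groups)` gives the full set of groups the product installs, while `my_desktop.addons` is the list an installer could offer as optional extras for that product only.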

In any case, I intend to start working on this now. What this would mean is
that for any group that is essentially a 'distribution construction' group,
it becomes a single entity with *NO* optional components. It is also no
longer exposed for post-install package selection.

As part of this, we'd create more of these sorts of groups by subdividing
some of the existing groups into smaller, feature-based units.
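
As a rough aid for that subdividing work, the Python sketch below lists the groups in a comps file that still carry optional packages, i.e. the ones that would need splitting under this proposal. The sample XML is a trimmed-down stand-in with made-up package lists, not the real Fedora comps file, and `groups_with_optional` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Illustrative sample only; real comps groups contain far more fields.
SAMPLE_COMPS = """\
<comps>
  <group>
    <id>core</id>
    <packagelist>
      <packagereq type="mandatory">bash</packagereq>
      <packagereq type="mandatory">yum</packagereq>
    </packagelist>
  </group>
  <group>
    <id>gnome-desktop</id>
    <packagelist>
      <packagereq type="mandatory">gdm</packagereq>
      <packagereq type="default">nautilus</packagereq>
      <packagereq type="optional">gnome-games</packagereq>
    </packagelist>
  </group>
</comps>
"""


def groups_with_optional(comps_xml):
    """Return {group id: [optional packages]} for groups that have any."""
    root = ET.fromstring(comps_xml)
    result = {}
    for group in root.findall("group"):
        optional = [
            req.text
            for req in group.findall("packagelist/packagereq")
            if req.get("type") == "optional"
        ]
        if optional:
            result[group.findtext("id")] = optional
    return result
```

Running `groups_with_optional(SAMPLE_COMPS)` on the sample flags only gnome-desktop, since core has no optional entries.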

== Package categorization for browsing ==

In PackageKit parlance, these are 'collections'. Here, I have a few proposals.

1) Continue to use groups in comps

PRO:
- Don't have to change any package tools
CON:
- Requires editing to list a package
- Need to separate them from groups used for distribution construction

2) Give packages tags, and sort packages for browsing by them

Options would be:
- as a header in the RPM package, extracted into metadata
- as an entry in the package database, extracted into some usable form
- as part of the pkgtags yum metadata

PRO:
- Decentralized. Allows packages to organize themselves
CON:
- Requires touching every package (if it's set in the package)
- Requires RPM changes (if it's set in the package)
- Requires a mechanism to generate metadata 
- Requires modifying packaging tools (mostly PackageKit) to support this
  metadata
- Decentralized. Any package can mess up the display and organization by
  using a garbage tag.
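
To illustrate both sides of the "decentralized" point, here is a small Python sketch that builds a browsing index from per-package tags. The input mapping, the function name, and the optional `known_tags` filter are assumptions; the filter is just one conceivable way to keep a garbage tag out of the display, not something any of the tools above implement.

```python
from collections import defaultdict


def browse_index(package_tags, known_tags=None):
    """Build {tag: [packages]} from {package: [tags]}.

    If known_tags is given, tags outside that set are ignored
    (hypothetical guard against garbage tags).
    """
    index = defaultdict(list)
    for package, tags in package_tags.items():
        for tag in tags:
            if known_tags is not None and tag not in known_tags:
                continue
            index[tag].append(package)
    # Sort tags and packages so the listing is stable for display.
    return {tag: sorted(pkgs) for tag, pkgs in sorted(index.items())}


# Made-up packages and tags for illustration.
packages = {
    "nethack": ["games"],
    "wesnoth": ["games", "strategy"],
    "foo": ["zzz-best-package-ever"],
}
```

Without the filter, `browse_index(packages)` gives the garbage tag its own category next to 'games'; passing `known_tags={"games", "strategy"}` drops it, at the cost of bringing back a centrally maintained list.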

3) Invent some other curated listing, similar to groups in comps, but not
in comps

CON:
- ... why bother, other than to just move the data?

== Summary ==

In any case, I'm pitching a FUDCon talk about this. So, if you're going to
be at FUDCon Blacksburg, stop by...
