Hi all,
I've done the initial work for a new library, libosinfo (better name recommendations appreciated). This library will provide OS metadata for use in virt applications, replacing the dictionary we currently keep in virtinst.
This is based off of a post by Dan Berrange:
https://www.redhat.com/archives/et-mgmt-tools/2009-March/msg00028.html
The code can be fetched with:
git clone http://fedorapeople.org/~crobinso/osinfo/.git
Check out the TODO list for a simple roadmap.
http://fedorapeople.org/~crobinso/osinfo/TODO
The public API looks like:
/**
 * Values stored in the OS dictionary
 */
enum _os_value_type {
    OS_VALUE_NAME = 1,            /** Human readable family/distro... name */
    OS_VALUE_MEDIA_INSTALL_URL,   /** URL to an install tree */
};
typedef enum _os_value_type os_value_t;

int os_init();
void os_close();

int os_find_families (char ***list);
int os_find_distros  (const char *parent_id, char ***list);
int os_find_releases (const char *parent_id, char ***list);
int os_find_updates  (const char *parent_id, char ***list);

int os_lookup_value (os_value_t value_type, const char *os_id, char **value);
The unique identifier for each distro is its 'id', which is a simple human readable string, similar to values we use for virt-install --os-variant today.
The user will ask the API for available families/distros/releases/updates, which will return a list of ids. We then pass an id to os_lookup_value to actually retrieve data. The family/distro/... separation will likely be removed pretty soon, in favor of an arbitrary hierarchy, where every OS can have child OSes: no doubt hardcoding the family/distro/... split would come back to bite us in the ass.
As an example, the following code will list the 'Name' of every id at the 'family' level, where name is the full name of the OS (id = rhel5.3, name = Red Hat Enterprise Linux 5.3):

char *value = NULL;
char **namelist = NULL;
int i, num_fams;

if (os_init() < 0)
    goto error;

if ((num_fams = os_find_families(&namelist)) < 0)
    goto error;

for (i = 0; i < num_fams; ++i) {
    if (os_lookup_value(OS_VALUE_NAME, namelist[i], &value) < 0)
        goto error;

    printf("%s\n", value);
    free(value);
    value = NULL;   /* avoid a double free at the error label */
}

error:
    free(namelist);
    free(value);
    os_close();
There is a simple tool called 'osinfo-tool' which allows listing an ASCII repr of the OS hierarchy, and listing all values for an individual id (which is pretty sparse at the moment since most of these values haven't been filled in yet).
So, things that I'm interested in feedback on:
- How do we expect apps to list OS choices? Currently, virt-manager lists type (linux, windows, unix, etc.) and associated distros (Fedora 8, RHEL4, Debian Lenny, etc.). The linux/windows/unix info isn't represented in the xml (should it be?) so the best way seems to be:
Distro
  |
  --> Release
         |
         --> Update

Ex.

RHEL
  |
  -> RHEL5
       |
       -> 5.0
          5.1
          5.2
If we do away with the family/distro/... distinction, the user won't have much choice in the matter, but the 'family' concept (e.g. value of 'Red Hat') isn't very useful to expose to a user.
- How should we handle derivatives like Scientific Linux + CentOS: should we expect users to understand they are based on RHEL, or give them explicit IDs?
- Querying for device values (supported buses, models, etc.). Dan's original proposal talks about this; to recommend a default with the best chance of actually working, we need to know:
- OS being installed
- Virt type ('hvm' vs. 'xen')
- Guest Architecture (i386, x86_64, ...)
- Hypervisor (kvm, qemu, xen, vbox, ...)
- Hypervisor version
- Libvirt version
We would need to find the intersection of what the OS, the hypervisor, and libvirt support, and return what we decide is the best choice.
How to expose this in the API? We could simply have one long function
os_lookup_device_value(char *os_id, char *virt_type, char *arch, ...)
It works, but it's pretty tedious, and I'm afraid that we would need even more info to make a correct choice in the future, and the above isn't flexible. We may also need some of the above info for other values (ACPI/APIC settings, returning a proper install url may depend on arch). Any suggestions?
- os_init and os_close: Any better ideas for this? os_init just parses the xml document, os_close frees it. We could run os_init with the first API call, but I think that makes it less clear that the user would then need to call os_close().
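On the device-value question above, the "intersection" step could be sketched roughly as follows. This is only an illustration: pick_device_model() and its parameters are hypothetical names, not part of the posted API, and it assumes the OS's list is ordered best-first.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if 'item' appears in the 'n'-element string array 'list'. */
static int contains(const char *const *list, size_t n, const char *item)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(list[i], item) == 0)
            return 1;
    return 0;
}

/*
 * Hypothetical helper: walk the OS's list (assumed to be ordered best-first)
 * and return the first entry that the hypervisor and libvirt also support,
 * or NULL if the three sets do not intersect.
 */
const char *pick_device_model(const char *const *os, size_t n_os,
                              const char *const *hv, size_t n_hv,
                              const char *const *lv, size_t n_lv)
{
    for (size_t i = 0; i < n_os; i++)
        if (contains(hv, n_hv, os[i]) && contains(lv, n_lv, os[i]))
            return os[i];
    return NULL;
}
```

The more inputs the decision needs (arch, hypervisor version, ...), the more lists a function like this has to take, which is the inflexibility described above.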
Any feedback appreciated.
Thanks, Cole
On Sun, Jun 14, 2009 at 06:50:02PM -0400, Cole Robinson wrote:
I'm not convinced by the "arbitrary hierarchy" thing: I just don't see how that could possibly be useful. Surely it's either an entirely flat namespace, or a shallow structured one like you have.
The flat namespace would be a set of keys and multiple values for each key. One value would be "os type" (linux, windows, etc.). You'd most likely have a "generic" entry still for fallback.
An alternative is to represent the hierarchy in the UNIX way, via the filesystem:
/var/lib/osinfo/
/var/lib/osinfo/linux/info.xml
/var/lib/osinfo/linux/fedora/info.xml
/var/lib/osinfo/linux/fedora/8/info.xml
The fallback is pretty obvious then. Maybe it's over the top. The idea of a single delivered XML file that users edit does trouble me though. Maybe we at least have two files, one for customisations.
int os_init(); void os_close();
Always, always, always, pass back an opaque identifier in an API - you never know when you'll need to track per-thread state. It's generally a good idea to pass in a version define too.
The XML parsing itself should happen lazily. We might want to let the user specify a file, for example.
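A minimal sketch of what an opaque handle with lazy loading might look like. All names here (oi_handle_t, oi_open, oi_close, the default path) are assumptions for illustration, and the XML parsing itself is stubbed out.

```c
#include <stdlib.h>
#include <string.h>

/* A public header would expose only this forward declaration. */
typedef struct oi_handle oi_handle_t;

/* Private definition: callers never touch these fields directly. */
struct oi_handle {
    char *path;    /* data file to load */
    int   loaded;  /* set once the data has been parsed */
};

/* Create a handle; 'path' may be NULL to use a (hypothetical) default. */
oi_handle_t *oi_open(const char *path)
{
    const char *src = path ? path : "/usr/share/osinfo/osinfo.xml";
    oi_handle_t *h = calloc(1, sizeof(*h));
    if (!h)
        return NULL;
    h->path = malloc(strlen(src) + 1);
    if (!h->path) {
        free(h);
        return NULL;
    }
    strcpy(h->path, src);
    return h;
}

/* Parse on first use. Real XML parsing is stubbed out in this sketch. */
static int oi_ensure_loaded(oi_handle_t *h)
{
    if (!h->loaded) {
        /* a real implementation would parse h->path here */
        h->loaded = 1;
    }
    return 0;
}

/* Stand-in for any query: each public entry point triggers the load. */
int oi_is_loaded_after_query(oi_handle_t *h)
{
    if (oi_ensure_loaded(h) < 0)
        return -1;
    return h->loaded;
}

void oi_close(oi_handle_t *h)
{
    if (h) {
        free(h->path);
        free(h);
    }
}
```

Because all state hangs off the handle, per-thread or per-file state can be added later without changing any function signature.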
int os_find_families (char ***list);
int os_find_distros  (const char *parent_id, char ***list);
int os_find_releases (const char *parent_id, char ***list);
int os_find_updates  (const char *parent_id, char ***list);
Regardless of the hierarchy question I hate the idea of exposing it in the API like this.
There's really two (hopefully three eventually) things this library needs to do: provide a list of everything it knows about so the user can select it in a GUI or whatever, and provide configuration recommendations given a particular set of values.
I don't think the library can make any assumptions about how the former might look. It just needs to return something like:

struct osinfo {
    const char *id;
    nvpair_list_t values;
};
Either allocated as an array (probably easiest) or in a list. It's then up to the client to decide what hierarchy to actually use, and it's all dependent on what values they pick out of the nv list.
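To make the flat-array idea concrete, here is a rough sketch. nvpair_list_t is not defined anywhere in this thread, so a plain array of name/value pairs stands in for it, and the property names used with it ("name", "os-type") are invented for the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name/value pair; stands in for the undefined nvpair_list_t. */
struct nvpair {
    const char *name;
    const char *value;
};

struct osinfo {
    const char *id;
    const struct nvpair *values;
    size_t n_values;
};

/* Return the value stored under 'name' for one OS, or NULL if absent. */
const char *osinfo_get(const struct osinfo *os, const char *name)
{
    for (size_t i = 0; i < os->n_values; i++)
        if (strcmp(os->values[i].name, name) == 0)
            return os->values[i].value;
    return NULL;
}

/*
 * Client-side grouping: count the entries whose 'name' property equals
 * 'value'. A GUI could use the same test to fill one branch of its tree.
 */
size_t osinfo_count_matching(const struct osinfo *list, size_t n,
                             const char *name, const char *value)
{
    size_t hits = 0;
    for (size_t i = 0; i < n; i++) {
        const char *v = osinfo_get(&list[i], name);
        if (v && strcmp(v, value) == 0)
            hits++;
    }
    return hits;
}
```

The library only hands back the flat array; which property the client groups by is entirely its own choice.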
(The third thing is to identify OS types based upon installation media when possible.)
If we do away with the family/distro/... distinction, the user won't have much choice in the matter, but the 'family' concept (e.g. value of 'Red Hat') isn't very useful to expose to a user.
If you mean API user, think "icon".
- How should we handle derivatives like Scientific Linux + CentOS: should we expect users to understand they are based on RHEL, or give them explicit IDs?
Explicit IDs. You'd be surprised.
We would need to find the intersection of what the OS, the hypervisor, and libvirt support, and return what we decide is the best choice.
How to expose this in the API? We could simply have one long function
os_lookup_device_value(char *os_id, char *virt_type, char *arch, ...)
The API user should pass in an nvlist, where a set of the names are defined and known about. The response needs to indicate whether it's a preferred setting ("would like virtio") or a required one.
regards john
On Mon, Jun 15, 2009 at 07:16:56AM -0400, John Levon wrote:
On Sun, Jun 14, 2009 at 06:50:02PM -0400, Cole Robinson wrote:
I'm not convinced by the "arbitrary hierarchy" thing: I just don't see how that could possibly be useful. Surely it's either an entirely flat namespace, or a shallow structured one like you have.
The flat namespace would be a set of keys and multiple values for each key. One value would be "os type" (linux, windows, etc.). You'd most likely have a "generic" entry still for fallback.
I think this is the way to go. Flat list + properties, and allow apps to build hierarchies on the fly as needed, based off properties.
An alternative is to represent the hierarchy in the UNIX way, via the filesystem:
/var/lib/osinfo/
/var/lib/osinfo/linux/info.xml
/var/lib/osinfo/linux/fedora/info.xml
/var/lib/osinfo/linux/fedora/8/info.xml
The fallback is pretty obvious then. Maybe it's over the top. The idea of a single delivered XML file that users edit does trouble me though. Maybe we at least have two files, one for customisations.
A single XML file would be pretty horrible for package upgrades. With a flat list we can have 1 XML file per distro. If we allow multiple search paths for XML files, we can have the standard shipped ones in /usr/share/osinfo, and customized ones in /etc/osinfo, the latter overriding (or inheriting from) the former.
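The override behaviour could be sketched as a first-match search over an ordered list of directories. This is only an illustration: osinfo_resolve(), the per-distro file name and the only_shipped() probe are hypothetical, and the "inheriting from" variant is not covered.

```c
#include <stdio.h>
#include <stddef.h>
#include <string.h>

/*
 * Try each directory in order and write the first existing "dir/name"
 * into 'out'. 'exists' is passed in so the caller decides how to probe
 * the filesystem (e.g. a wrapper around access()). Candidates that do
 * not fit in 'out' are skipped. Returns 0 on success, -1 if not found.
 */
int osinfo_resolve(const char *const *dirs, size_t n_dirs, const char *name,
                   int (*exists)(const char *path),
                   char *out, size_t out_len)
{
    for (size_t i = 0; i < n_dirs; i++) {
        int len = snprintf(out, out_len, "%s/%s", dirs[i], name);
        if (len < 0 || (size_t)len >= out_len)
            continue;
        if (exists(out))
            return 0;
    }
    if (out_len > 0)
        out[0] = '\0';
    return -1;
}

/* Stand-in probe for illustration: pretends only shipped files exist. */
int only_shipped(const char *path)
{
    return strncmp(path, "/usr/share/", 11) == 0;
}
```

Listing /etc/osinfo before /usr/share/osinfo in 'dirs' makes a customized file win whenever it exists, with the shipped copy as the fallback.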
- How should we handle derivatives like Scientific Linux + CentOS: should we expect users to understand they are based on RHEL, or give them explicit IDs?
Explicit IDs. You'd be surprised.
Download/install URLs already require that we do separate IDs :-)
Daniel
John Levon wrote:
On Sun, Jun 14, 2009 at 06:50:02PM -0400, Cole Robinson wrote:
I'm not convinced by the "arbitrary hierarchy" thing: I just don't see how that could possibly be useful. Surely it's either an entirely flat namespace, or a shallow structured one like you have.
The flat namespace would be a set of keys and multiple values for each key. One value would be "os type" (linux, windows, etc.). You'd most likely have a "generic" entry still for fallback.
An alternative is to represent the hierarchy in the UNIX way, via the filesystem:
/var/lib/osinfo/
/var/lib/osinfo/linux/info.xml
/var/lib/osinfo/linux/fedora/info.xml
/var/lib/osinfo/linux/fedora/8/info.xml
The fallback is pretty obvious then. Maybe it's over the top. The idea of a single delivered XML file that users edit does trouble me though. Maybe we at least have two files, one for customisations.
The single XML file approach certainly isn't the way forward. There are numerous ways we can solve the 'let admin customize osinfo' problem, but that can come after pinning down the API I think.
int os_init(); void os_close();
Always, always, always, pass back an opaque identifier in an API - you never know when you'll need to track per-thread state. It's generally a good idea to pass in a version define too.
Sounds good, but I'm not sure what you mean by passing in a version define?
The XML parsing itself should happen lazily. We might want to let the user specify a file, for example.
int os_find_families (char ***list);
int os_find_distros  (const char *parent_id, char ***list);
int os_find_releases (const char *parent_id, char ***list);
int os_find_updates  (const char *parent_id, char ***list);
Regardless of the hierarchy question I hate the idea of exposing it in the API like this.
There's really two (hopefully three eventually) things this library needs to do: provide a list of everything it knows about so the user can select it in a GUI or whatever, and provide configuration recommendations given a particular set of values.
I don't think the library can make any assumptions about how the former might look. It just needs to return some thing like:
struct osinfo {
    const char *id;
    nvpair_list_t values;
};
Either allocated as an array (probably easiest) or in a list. It's then up to the client to decide what hierarchy to actually use, and it's all dependent on what values they pick out of the nv list.
Agreed, I think Dan's mail covered this pretty well.
(The third thing is to identify OS types based upon installation media when possible.)
Agreed, I'd like to do all detection here in the future.
If we do away with the family/distro/... distinction, the user won't have much choice in the matter, but the 'family' concept (e.g. value of 'Red Hat') isn't very useful to expose to a user.
If you mean API user, think "icon".
- How should we handle derivatives like Scientific Linux + CentOS: should we expect users to understand they are based on RHEL, or give them explicit IDs?
Explicit IDs. You'd be surprised.
We would need to find the intersection of what the OS, the hypervisor, and libvirt support, and return what we decide is the best choice.
How to expose this in the API? We could simply have one long function
os_lookup_device_value(char *os_id, char *virt_type, char *arch, ...)
The API user should pass in an nvlist, where a set of the names are defined and known about. The response needs to indicate whether it's a preferred setting ("would like virtio") or a required one.
I can see doing something like
os_info_set_install_prop(os_info_t info, int prop, char *propval)
So the API user might do:
os_info_set_install_prop(myinfo, OS_INSTALL_VIRT_TYPE, "hvm");
os_info_set_install_prop(myinfo, OS_INSTALL_ARCH, "x86_64");
os_info_set_install_prop(myinfo, OS_INSTALL_HV_TYPE, "kvm");
Then lookup device properties like you would any other prop.
Thanks! - Cole
On Mon, Jun 15, 2009 at 11:28:09AM -0400, Cole Robinson wrote:
int os_init(); void os_close();
Always, always, always, pass back an opaque identifier in an API - you never know when you'll need to track per-thread state. It's generally a good idea to pass in a version define too.
Sounds good, but I'm not sure what you mean by passing in a version define?
osinfo.h:
...
#define OSINFO_VERSION 1
...
client.c:
#include <osinfo.h>
oi_handle_t os_init(OSINFO_VERSION);
This allows certain incompatible changes without having to rev the soversion.
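Spelled out a little further, the version define might be used like this. It is a sketch under assumptions: the handle layout and the accept/reject policy are invented here, and os_init()/os_close() take different arguments than in the originally posted API.

```c
#include <stdlib.h>

/* osinfo.h (sketch): the interface version this header describes. */
#define OSINFO_VERSION 1

typedef struct oi_handle oi_handle_t;

struct oi_handle {
    int client_version;  /* version the caller was compiled against */
};

/*
 * The caller passes the OSINFO_VERSION it was built with. The library can
 * refuse versions it does not know, or switch behaviour for older clients.
 */
oi_handle_t *os_init(int client_version)
{
    if (client_version < 1 || client_version > OSINFO_VERSION)
        return NULL;

    oi_handle_t *h = malloc(sizeof(*h));
    if (h)
        h->client_version = client_version;
    return h;
}

void os_close(oi_handle_t *h)
{
    free(h);
}
```

Since the constant is baked into the client at compile time, a newer library can tell which semantics an older binary expects.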
regards john
On Mon, Jun 15, 2009 at 11:28:09AM -0400, Cole Robinson wrote:
The API user should pass in an nvlist, where a set of the names are defined and known about. The response needs to indicate whether it's a preferred setting ("would like virtio") or a required one.
I can see doing something like
os_info_set_install_prop(os_info_t info, int prop, char *propval)
So the API user might do:
os_info_set_install_prop(myinfo, OS_INSTALL_VIRT_TYPE, "hvm");
os_info_set_install_prop(myinfo, OS_INSTALL_ARCH, "x86_64");
os_info_set_install_prop(myinfo, OS_INSTALL_HV_TYPE, "kvm");
This isn't going to work as we most definitely have more than one value of all of these settings.
Instead we need to pass in a list of "environments". Each one would specify a particular combination of the values above (along with a 'preferred' setting methinks).
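As a rough illustration of the "list of environments" idea: struct virtenv, its fields, virtenv_choose() and the hvm_only() predicate are all hypothetical, and the selection policy (first preferred, else first supported) is just one possibility.

```c
#include <stddef.h>
#include <string.h>

/* One candidate combination; field names are assumptions for this sketch. */
struct virtenv {
    const char *arch;       /* e.g. "x86_64" */
    const char *hv_type;    /* e.g. "kvm" */
    const char *virt_type;  /* e.g. "hvm" */
    int preferred;          /* non-zero if the caller favours this combo */
};

/*
 * Choose from the candidates the OS supports: the first preferred one if
 * any, otherwise the first supported one. 'supported' is a caller-supplied
 * predicate standing in for the library's real OS metadata lookup.
 */
const struct virtenv *virtenv_choose(const struct virtenv *envs, size_t n,
                                     int (*supported)(const struct virtenv *))
{
    const struct virtenv *fallback = NULL;
    for (size_t i = 0; i < n; i++) {
        if (!supported(&envs[i]))
            continue;
        if (envs[i].preferred)
            return &envs[i];
        if (!fallback)
            fallback = &envs[i];
    }
    return fallback;
}

/* Stand-in predicate for illustration: an OS that only runs fully virt. */
int hvm_only(const struct virtenv *e)
{
    return strcmp(e->virt_type, "hvm") == 0;
}
```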
regards john
John Levon wrote:
On Mon, Jun 15, 2009 at 11:28:09AM -0400, Cole Robinson wrote:
The API user should pass in an nvlist, where a set of the names are defined and known about. The response needs to indicate whether it's a preferred setting ("would like virtio") or a required one.
I can see doing something like
os_info_set_install_prop(os_info_t info, int prop, char *propval)
So the API user might do:
os_info_set_install_prop(myinfo, OS_INSTALL_VIRT_TYPE, "hvm");
os_info_set_install_prop(myinfo, OS_INSTALL_ARCH, "x86_64");
os_info_set_install_prop(myinfo, OS_INSTALL_HV_TYPE, "kvm");
This isn't going to work as we most definitely have more than one value of all of these settings.
The values the user sets are for what kind of guest they are installing at that moment (x86_64 kvm in this case, i686 xen PV in another). That teaches osinfo about our current setup, so it can give us valid recommendations for device properties and whatnot. This could take place after we already chose an OS ID.
Instead we need to pass in a list of "environments". Each one would specify a particular combination of the values above (along with a 'preferred' setting methinks).
Maybe we can avoid an explicit 'preferred' concept, and just return a list of supported values to the user. osinfo will put its 'preferred' choice as the first in the list, but if the user wants to differ, they can choose from the other values in the list.
I guess this also raises the question of how we indicate 'return a single value' vs. 'return a list' when fetching info.
Thanks, Cole
On Mon, Jun 15, 2009 at 12:11:18PM -0400, Cole Robinson wrote:
The API user should pass in an nvlist, where a set of the names are defined and known about. The response needs to indicate whether it's a preferred setting ("would like virtio") or a required one.
I can see doing something like
os_info_set_install_prop(os_info_t info, int prop, char *propval)
So the API user might do:
os_info_set_install_prop(myinfo, OS_INSTALL_VIRT_TYPE, "hvm");
os_info_set_install_prop(myinfo, OS_INSTALL_ARCH, "x86_64");
os_info_set_install_prop(myinfo, OS_INSTALL_HV_TYPE, "kvm");
This isn't going to work as we most definitely have more than one value of all of these settings.
The values the user sets are for what kind of guest they are installing at that moment (x86_64 kvm in this case, i686 xen PV in another).
That's backwards, though. I don't care about kvm or xen. I care about installing a particular guest type, and want the library to tell me the best method. To do that it needs to match guest needs against host capabilities, and that implies the above properties need to be multi-valued. There is no one "golden setup" even on a single system and it would be a major mistake to presume there ever will be.
Instead we need to pass in a list of "environments". Each one would specify a particular combination of the values above (along with a 'preferred' setting methinks).
Maybe we can avoid an explicit 'preferred' concept, and just return a list of supported values to the user. osinfo will put its 'preferred' choice as the first in the list, but if the user wants to differ, they can choose from the other values in the list.
There may be more than one preferred setting ('kvm or xenpv, but I'd avoid xenhvm'). Possibly we need a more nuanced notion.
regards john
John Levon wrote:
On Mon, Jun 15, 2009 at 12:11:18PM -0400, Cole Robinson wrote:
The API user should pass in an nvlist, where a set of the names are defined and known about. The response needs to indicate whether it's a preferred setting ("would like virtio") or a required one.
I can see doing something like
os_info_set_install_prop(os_info_t info, int prop, char *propval)
So the API user might do:
os_info_set_install_prop(myinfo, OS_INSTALL_VIRT_TYPE, "hvm");
os_info_set_install_prop(myinfo, OS_INSTALL_ARCH, "x86_64");
os_info_set_install_prop(myinfo, OS_INSTALL_HV_TYPE, "kvm");
This isn't going to work as we most definitely have more than one value of all of these settings.
The values the user sets are for what kind of guest they are installing at that moment (x86_64 kvm in this case, i686 xen PV in another).
That's backwards, though. I don't care about kvm or xen. I care about installing a particular guest type, and want the library to tell me the best method. To do that it needs to match guest needs against host capabilities, and that implies the above properties need to be multi-valued. There is no one "golden setup" even on a single system and it would be a major mistake to presume there ever will be.
No presumption here. In virt-manager, those above values are chosen by the user (qemu vs. kvm vs. xenner, arch, xenpv vs. xenfv). I'm not saying those above API calls would be hard coded, it would be the result of:
./virt-install --connect qemu:///system --arch x86_64 --virt-type kvm --os-variant foobar ...
I hear you that it would be nice if the user could say 'here's the OS I want, here's my host config, DO IT!', and to some degree virt-manager/virt-install already plays that role, but at the osinfo library it can come later and isn't a big priority at the moment. I'm interested in just reaching parity with the current virtinst osdict solution for now.
Instead we need to pass in a list of "environments". Each one would specify a particular combination of the values above (along with a 'preferred' setting methinks).
Maybe we can avoid an explicit 'preferred' concept, and just return a list of supported values to the user. osinfo will put its 'preferred' choice as the first in the list, but if the user wants to differ, they can choose from the other values in the list.
There may be more than one preferred setting ('kvm or xenpv, but I'd avoid xenhvm'). Possibly we need a more nuanced notion.
regards john
Thanks, Cole
On Mon, Jun 15, 2009 at 12:51:49PM -0400, Cole Robinson wrote:
The values the user sets are for what kind of guest they are installing at that moment (x86_64 kvm in this case, i686 xen PV in another).
That's backwards, though. I don't care about kvm or xen. I care about installing a particular guest type, and want the library to tell me the best method. To do that it needs to match guest needs against host capabilities, and that implies the above properties need to be multi-valued. There is no one "golden setup" even on a single system and it would be a major mistake to presume there ever will be.
No presumption here. In virt-manager, those above values are chosen by the user (qemu vs. kvm vs. xenner, arch, xenpv vs. xenfv).
We aren't writing libvirtmanager though.
I'm not saying those above API calls would be hard coded, it would be the result of:
./virt-install --connect qemu:///system --arch x86_64 --virt-type kvm --os-variant foobar ...
I hear you that it would be nice if the user could say 'here's the OS I want, here's my host config, DO IT!', and to some degree virt-manager/virt-install already plays that role, but at the osinfo library it can come later
^^^^^^^^^^^^^^^^^
No, you're proposing an API which prevents it, that is, one value per key (one hypervisor type, one arch, etc.). That's precisely my complaint.
By all means make the current /implementation/ throw its hands up if given more than one virtenv[1]. Just don't encode it into the API.
regards john
[1] guest-arch+hypervisor-type+virt-type+... combo
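One way to read this suggestion, sketched with made-up names (os_recommend() and struct virtenv are not part of any posted API): the signature accepts a list of virtenvs from day one, while the current implementation simply refuses more than one.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical virtenv: one guest-arch + hypervisor-type + virt-type combo. */
struct virtenv {
    const char *arch;
    const char *hv_type;
    const char *virt_type;
};

/*
 * The API takes a list, so callers are already written for the general
 * case. This first implementation gives up (ENOTSUP) when handed more
 * than one entry, which can be lifted later without an API change.
 */
int os_recommend(const struct virtenv *envs, size_t n_envs,
                 const struct virtenv **chosen)
{
    if (!envs || n_envs == 0 || !chosen) {
        errno = EINVAL;
        return -1;
    }
    if (n_envs > 1) {
        errno = ENOTSUP;
        return -1;
    }
    *chosen = &envs[0];
    return 0;
}
```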
On Sun, Jun 14, 2009 at 06:50:02PM -0400, Cole Robinson wrote:
The public API looks like:
/**
 * Values stored in the OS dictionary
 */
enum _os_value_type {
    OS_VALUE_NAME = 1,            /** Human readable family/distro... name */
    OS_VALUE_MEDIA_INSTALL_URL,   /** URL to an install tree */
};
typedef enum _os_value_type os_value_t;

int os_init();
void os_close();

int os_find_families (char ***list);
int os_find_distros  (const char *parent_id, char ***list);
int os_find_releases (const char *parent_id, char ***list);
int os_find_updates  (const char *parent_id, char ***list);

int os_lookup_value (os_value_t value_type, const char *os_id, char **value);
There's a (little) overlap and a possible user in virt-inspector.
In virt-inspector we do things the other way around - we look inside the guest for files like /etc/redhat-release and /etc/debian_version, and parse those to determine the OS distro and release (also we parse the registry to do the same for Windows). The code for Linux is here:
http://git.et.redhat.com/?p=libguestfs.git;a=blob;f=inspector/virt-inspector...
It would be nice for virt-inspector to output ID strings which are compatible with osinfo.
'Course we'll need Perl bindings. Did you see this?
http://rwmj.wordpress.com/2009/04/20/generating-code/
Rich.
On 06/15/2009 08:15 AM, Richard W.M. Jones wrote:
On Sun, Jun 14, 2009 at 06:50:02PM -0400, Cole Robinson wrote:
The public API looks like:
/**
 * Values stored in the OS dictionary
 */
enum _os_value_type {
    OS_VALUE_NAME = 1,            /** Human readable family/distro... name */
    OS_VALUE_MEDIA_INSTALL_URL,   /** URL to an install tree */
};
typedef enum _os_value_type os_value_t;

int os_init();
void os_close();

int os_find_families (char ***list);
int os_find_distros  (const char *parent_id, char ***list);
int os_find_releases (const char *parent_id, char ***list);
int os_find_updates  (const char *parent_id, char ***list);

int os_lookup_value (os_value_t value_type, const char *os_id, char **value);
There's a (little) overlap and a possible user in virt-inspector.
In virt-inspector we do things the other way around - we look inside the guest for files like /etc/redhat-release and /etc/debian_version, and parse those to determine the OS distro and release (also we parse the registry to do the same for Windows). The code for Linux is here:
http://git.et.redhat.com/?p=libguestfs.git;a=blob;f=inspector/virt-inspector...
It would be nice for virt-inspector to output ID strings which are compatible with osinfo.
Agreed, I was thinking something along the same lines.
'Course we'll need Perl bindings. Did you see this?
Thanks, I'll take a look.
- Cole
On Mon, Jun 15, 2009 at 08:22:16AM -0400, Cole Robinson wrote:
On 06/15/2009 08:15 AM, Richard W.M. Jones wrote:
On Sun, Jun 14, 2009 at 06:50:02PM -0400, Cole Robinson wrote:
The public API looks like:
/**
 * Values stored in the OS dictionary
 */
enum _os_value_type {
    OS_VALUE_NAME = 1,            /** Human readable family/distro... name */
    OS_VALUE_MEDIA_INSTALL_URL,   /** URL to an install tree */
};
typedef enum _os_value_type os_value_t;

int os_init();
void os_close();

int os_find_families (char ***list);
int os_find_distros  (const char *parent_id, char ***list);
int os_find_releases (const char *parent_id, char ***list);
int os_find_updates  (const char *parent_id, char ***list);

int os_lookup_value (os_value_t value_type, const char *os_id, char **value);
There's a (little) overlap and a possible user in virt-inspector.
In virt-inspector we do things the other way around - we look inside the guest for files like /etc/redhat-release and /etc/debian_version, and parse those to determine the OS distro and release (also we parse the registry to do the same for Windows). The code for Linux is here:
http://git.et.redhat.com/?p=libguestfs.git;a=blob;f=inspector/virt-inspector...
It would be nice for virt-inspector to output ID strings which are compatible with osinfo.
Agreed, I was thinking something along the same lines.
If we keep to a flat list, then the trivial answer here is to just use the official distro name given by the vendor/distributor.
Daniel
On Sun, Jun 14, 2009 at 06:50:02PM -0400, Cole Robinson wrote:
The public API looks like:
/**
 * Values stored in the OS dictionary
 */
enum _os_value_type {
    OS_VALUE_NAME = 1,            /** Human readable family/distro... name */
    OS_VALUE_MEDIA_INSTALL_URL,   /** URL to an install tree */
};
typedef enum _os_value_type os_value_t;

int os_init();
void os_close();

int os_find_families (char ***list);
int os_find_distros  (const char *parent_id, char ***list);
int os_find_releases (const char *parent_id, char ***list);
int os_find_updates  (const char *parent_id, char ***list);

int os_lookup_value (os_value_t value_type, const char *os_id, char **value);
The unique identifier for each distro is its 'id', which is a simple human readable string, similar to values we use for virt-install --os-variant today.
As John suggested, I think we'd be safer having opaque structs for the conceptual objects. One for the library itself, and another for an OS distro.
Perhaps have an
'os_info_t' as a handle for a library itself returned by os_init
'os_distro_t' as a handle for a single OS distro instance

os_info_t os_info_new()
os_info_init(os_info_t *info, char *uri); /* loads the XML data */
For OS distros I think we need APIs to:
- List all OS distros
- Find OS distros, matching a specific set of properties
- Read a property from an OS distro
- Read all properties from an OS distro
- List unique values for a property across all distros
The user will ask the API for available families/distros/releases/updates, which will return a list of ids. We then pass an id to os_lookup_value to actually retrieve data. The family/distro/... separation will likely be removed pretty soon, in favor of an arbitrary hierarchy, where every OS can have child OSes: no doubt hardcoding the family/distro/... split would come back to bite us in the ass.
I agree, the fixed hierarchy I described really doesn't seem very nice looking back on it. The names I gave them are rather contrived and only really map nicely onto the RHEL/Fedora release process. I think we're better off being more flexible and allowing for arbitrary relationships in the data files and API. I don't think we necessarily want to force a single rooted tree structure here. The key factor with the hierarchy is the concept of sharing metadata.
I think we should take a hint from the way RDF works and define the API and XML format as a flat list, but allow relationships to be defined, and also allow tagging.
- Flat list of OS distros with their full name, as defined by their vendor/distributor
"Red Hat Enterprise Linux 4.7"
"Red Hat Enterprise Linux 5.0"
"Fedora 10"
"Fedora 10"
"Debian Sarge"

- A 'derived' property. Allows derived distros to declare they should inherit metadata (e.g. Scientific Linux derives from RHEL)
- A 'clone' property. Allows functionally identical rebuilds to declare they use exactly the same metadata (e.g. CentOS / RHEL)
- An 'upgrades' property. Allows us to indicate that 'Fedora 11' is the release following on from 'Fedora 10'.
- A 'publisher' property to give the name of the entity producing the distro, e.g. 'Fedora Project', 'Red Hat', 'Microsoft'
- A 'kernel type' and 'kernel version' property, e.g. 'linux' and '2.6.26'.
Application UI might simulate a hierarchy by using the 'publisher' property at first level, and then filtering the flat list of OS distros at the 2nd level according to selected publisher. This satisfies the key 'UI' reason for the hierarchy. The 'derived' and 'clone' allow for inheritance of metadata.
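A small sketch of how an application might build that two-level view from the flat list. The struct layout and function names are assumptions made for the example; only the 'publisher' property itself comes from the proposal above.

```c
#include <stddef.h>
#include <string.h>

/* Flat list entry; a real entry would carry many more properties. */
struct distro {
    const char *name;
    const char *publisher;
};

/*
 * First level: copy each distinct publisher into 'out' (at most 'max'
 * entries), in first-seen order, and return how many were stored.
 */
size_t unique_publishers(const struct distro *d, size_t n,
                         const char **out, size_t max)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        size_t j;
        for (j = 0; j < count; j++)
            if (strcmp(out[j], d[i].publisher) == 0)
                break;
        if (j == count && count < max)
            out[count++] = d[i].publisher;
    }
    return count;
}

/*
 * Second level: copy the names of the distros from one publisher into
 * 'out' (at most 'max' entries) and return how many were stored.
 */
size_t distros_by_publisher(const struct distro *d, size_t n,
                            const char *publisher,
                            const char **out, size_t max)
{
    size_t count = 0;
    for (size_t i = 0; i < n && count < max; i++)
        if (strcmp(d[i].publisher, publisher) == 0)
            out[count++] = d[i].name;
    return count;
}
```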
So, things that I'm interested in feedback on:
How do we expect apps to list OS choices? Currently, virt-manager lists type (linux, windows, unix, etc.) and associated distros (Fedora 8, RHEL4, Debian Lenny, etc.). The linux/windows/unix info isn't represented in the xml (should it be?) so the best way seems to be:
Distro
  |
  --> Release
         |
         --> Update

Ex.

RHEL
  |
  -> RHEL5
       |
       -> 5.0
          5.1
          5.2
If we do away with the family/distro/... distinction, the user won't have much choice in the matter, but the 'family' concept (e.g. value of 'Red Hat') isn't very useful to expose to a user.
We should try to avoid forcing one representation onto apps. I think the flat OS list + sets of properties will allow apps to build a variety of UI models for this, either search based, tree based or filter based.
- How should we handle derivatives like Scientific Linux + CentOS: should we expect users to understand they are based on RHEL, or give them explicit IDs?
They need explicit IDs, since they have unique download URLs that have to be stored. The 'clone' and 'derived' properties will allow us to avoid duplicating other metadata, and also allow apps to show/hide clones as needed.
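The metadata sharing could work roughly like the lookup below: a clone or derived distro stores only what differs (such as its download URL) and falls back to its base for everything else. The struct layout, property names and URLs are invented for this sketch, and 'clone' and 'derived' are treated identically here.

```c
#include <stddef.h>
#include <string.h>

struct prop {
    const char *name;
    const char *value;
};

struct distro {
    const char *id;
    const struct distro *base;   /* 'derived' or 'clone' target, or NULL */
    const struct prop *props;
    size_t n_props;
};

/*
 * Look up 'name' on the distro itself first, then on each base in turn.
 * Assumes the base chain has no cycles.
 */
const char *distro_get(const struct distro *d, const char *name)
{
    for (; d != NULL; d = d->base)
        for (size_t i = 0; i < d->n_props; i++)
            if (strcmp(d->props[i].name, name) == 0)
                return d->props[i].value;
    return NULL;
}
```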
Querying for device values (supported buses, models, etc.). Dan's original proposal talks about this; to recommend a default with the best chance of actually working, we need to know:
- OS being installed
- Virt type ('hvm' vs. 'xen')
- Guest Architecture (i386, x86_64, ...)
- Hypervisor (kvm, qemu, xen, vbox, ...)
- Hypervisor version
- Libvirt version
We would need to find the intersection of what the OS, the hypervisor, and libvirt support, and return what we decide is the best choice.
How to expose this in the API? We could simply have one long function
os_lookup_device_value(char *os_id, char *virt_type, char *arch, ...)
It works, but it's pretty tedious, and I'm afraid that we would need even more info to make a correct choice in the future, and the above isn't flexible. We may also need some of the above info for other values (ACPI/APIC settings, returning a proper install url may depend on arch). Any suggestions?
The more I think about this, the more I think we should avoid any specific named attributes in the API. Supported devices are just another type of property we can associate with a distro, in addition to the ones I already listed earlier. This could be useful in the UI too: for example, if you know the hypervisor requires support for 'Xen paravirt disk', then when browsing OSes, you can filter on this property just as you would with the others.
- os_init and os_close: Any better ideas for this? os_init just parses the xml document, os_close frees it. We could run os_init with the first API call, but I think that makes it less clear that the user would then need to call os_close().
I think it's good to keep the initializer explicit, and if you add an opaque type representing a handle to the library, this will force apps to call it and track it.
Regards, Daniel
Daniel P. Berrange wrote:
On Sun, Jun 14, 2009 at 06:50:02PM -0400, Cole Robinson wrote:
The public API looks like:
/**
 * Values stored in the OS dictionary
 */
enum _os_value_type {
    OS_VALUE_NAME = 1,          /** Human readable family/distro... name */
    OS_VALUE_MEDIA_INSTALL_URL, /** URL to an install tree */
};
typedef enum _os_value_type os_value_t;

int os_init();
void os_close();

int os_find_families (char ***list);
int os_find_distros (const char *parent_id, char ***list);
int os_find_releases (const char *parent_id, char ***list);
int os_find_updates (const char *parent_id, char ***list);

int os_lookup_value (os_value_t value_type, const char *os_id, char **value);
The unique identifier for each distro is its 'id', which is a simple human readable string, similar to values we use for virt-install --os-variant today.
As John suggested, I think we'd be safer having opaque structs for the conceptual objects. One for the library itself, and another for an OS distro.
Perhaps have an
'os_info_t' as a handle for the library itself, returned by os_init
'os_distro_t' as a handle for a single OS distro instance

os_info_t os_info_new()
os_info_init(os_info_t *info, char *uri); /* loads the XML data */
Sounds good, though why have a separate os_info_new()? And I'd rather have a separate API for initializing from a file, since that should be uncommon enough that we don't need to force uri=NULL on most users.
int os_info_init(os_info_t **info)
int os_info_init_from_uri(os_info_t **info, char *uri)
Though it's a minor distinction for now.
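To make the opaque-handle idea concrete, here is a minimal sketch of how the two initializers proposed above could sit on top of one handle type. The struct layout, the os_info_close() name, the const on uri, and the 0 / -1 return convention are assumptions made for illustration; nothing here is the library's actual implementation, and the XML loading is left as a comment.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* A public header would carry only this typedef, keeping the handle opaque. */
typedef struct os_info os_info_t;

/* Private definition (hypothetical layout). */
struct os_info {
    char *uri;   /* data location; NULL means the built-in default */
};

/* Assumed convention: returns 0 on success, -1 on allocation failure. */
int os_info_init_from_uri(os_info_t **info, const char *uri)
{
    os_info_t *ret;

    assert(info != NULL);   /* passing no out-parameter is a caller bug */
    *info = NULL;

    ret = calloc(1, sizeof(*ret));
    if (ret == NULL)
        return -1;

    if (uri != NULL) {
        ret->uri = malloc(strlen(uri) + 1);
        if (ret->uri == NULL) {
            free(ret);
            return -1;
        }
        strcpy(ret->uri, uri);
    }

    /* A real implementation would parse the XML data here. */

    *info = ret;
    return 0;
}

/* Common case: no location given, so callers never pass uri=NULL themselves. */
int os_info_init(os_info_t **info)
{
    return os_info_init_from_uri(info, NULL);
}

/* Hypothetical counterpart to os_close(); safe to call with NULL. */
void os_info_close(os_info_t *info)
{
    if (info == NULL)
        return;
    free(info->uri);
    free(info);
}
```

Because every other call would take the os_info_t pointer, an app cannot use the library without first calling one of the initializers, which is the point made about forcing apps to track the handle.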
For OS distros I think we need APIs to:
- List all OS distros
- Find OS distros, matching a specific set of properties
What uses do you have in mind for this? Being able to say e.g. 'all distros that support xen PV for IA64'?
How do you think an API call would look?
- Read a property from an OS distro
- Read all properties from an OS distro
Why would an API user want to do this? I don't see why we would need to enable this specifically, rather than make the user do this iteratively.
- List unique values for a property across all distros
Not sure I fully understand this. Why would a user want this?
The user will ask the API for available families/distros/releases/updates, which will return a list of ids. We then pass an id to os_lookup_value to actually retrieve data. The family/distro/... separation will likely be removed pretty soon, in favor of an arbitrary hierarchy, where every OS can have child OSes: no doubt hardcoding the family/distro/... split would come back to bite us in the ass.
I agree, the fixed hierarchy I describe really doesn't seem very nice looking back on it. The names I gave them are rather contrived and only really map nicely onto the RHEL/Fedora release process. I think we're better off being more flexible and allowing for arbitrary relationships in the data files and API. I don't think we necessarily want to force a single rooted tree structure here. The key factor with the hierarchy is the concept of sharing metadata.
I think we should take a hint from the way RDF works and define the API and XML format as a flat list, but allow relationships to be defined, and also allow tagging.
Flat list of OS distros with their full name, as defined by their vendor/distributor
"Red Hat Enterprise Linux 4.7" "Red Hat Enterprise Linux 5.0" "Fedora 10" "Fedora 10" "Debian Sarge"
A 'derived' property. Allows derived distros to declare they should inherit metadata (eg Scientific Linux derives from RHEL)
A 'clone' property. Allows functionally identical rebuilds to declare they use exactly same metadata. (eg CentOS / RHEL)
An 'upgrades' property. Allows a distro to indicate, eg, that 'Fedora 11' is the release following on from 'Fedora 10'.
A 'publisher' property to give name of entity producing the distro eg 'Fedora Project', 'Red Hat', 'Microsoft'
A 'kernel type' and 'kernel version' property, eg 'linux' and '2.6.26'.
Application UI might simulate a hierarchy by using the 'publisher' property at first level, and then filtering the flat list of OS distros at the 2nd level according to selected publisher. This satisfies the key 'UI' reason for the hierarchy. The 'derived' and 'clone' allow for inheritance of metadata.
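A rough sketch of how metadata sharing through 'derived' and 'clone' could work on top of a flat list: a lookup first checks the distro's own properties, then follows its 'clone' or 'derived' link. The struct layout, the ids, the property keys, and the sample values are all invented for illustration, as is the choice to treat both links the same way and to let a distro's own value win over an inherited one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PROPS 8
#define MAX_INHERIT_DEPTH 16   /* guards against cycles in the data */

struct os_prop {
    const char *key;
    const char *value;
};

struct os_distro {
    const char *id;
    struct os_prop props[MAX_PROPS];   /* terminated by key == NULL */
};

/* Flat list: no nesting, relationships are just properties (sample data). */
static const struct os_distro distros[] = {
    { "rhel5.3",   { { "name", "Red Hat Enterprise Linux 5.3" },
                     { "publisher", "Red Hat" },
                     { "kernel-type", "linux" },
                     { NULL, NULL } } },
    { "centos5.3", { { "name", "CentOS 5.3" },
                     { "clone", "rhel5.3" },
                     { NULL, NULL } } },
    { "sl5.3",     { { "name", "Scientific Linux 5.3" },
                     { "derived", "rhel5.3" },
                     { NULL, NULL } } },
};

static const struct os_distro *find_distro(const char *id)
{
    size_t i;

    for (i = 0; i < sizeof(distros) / sizeof(distros[0]); i++)
        if (strcmp(distros[i].id, id) == 0)
            return &distros[i];
    return NULL;
}

/* Returns the distro's own value if set, else the value inherited through
 * its 'clone' or 'derived' link, else NULL. */
const char *os_distro_get_prop(const char *id, const char *key)
{
    int depth;

    assert(key != NULL);
    for (depth = 0; id != NULL && depth < MAX_INHERIT_DEPTH; depth++) {
        const struct os_distro *d = find_distro(id);
        const char *parent = NULL;
        size_t i;

        if (d == NULL)
            return NULL;
        for (i = 0; i < MAX_PROPS && d->props[i].key != NULL; i++) {
            if (strcmp(d->props[i].key, key) == 0)
                return d->props[i].value;
            if (strcmp(d->props[i].key, "clone") == 0 ||
                strcmp(d->props[i].key, "derived") == 0)
                parent = d->props[i].value;
        }
        id = parent;   /* not found locally: try the parent, if any */
    }
    return NULL;
}
```

With this sample data, asking for "kernel-type" on "centos5.3" falls through to the "rhel5.3" entry, while "name" is answered locally, so the clone keeps its own identity without duplicating the rest.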
I like this idea: certainly will give more flexibility.
So, things that I'm interested in feedback on:
How do we expect apps to list OS choices? Currently, virt-manager lists type (linux, windows, unix, etc.) and associated distros (Fedora 8, RHEL4, Debian Lenny, etc.). The linux/windows/unix info isn't represented in the xml (should it be?) so the best way seems to be:
Distro | --> Release | --> Update
Ex.
RHEL | -> RHEL5 | -> 5.0 5.1 5.2
If we do away with the family/distro/... distinction, the user won't have much choice in the matter, but the 'family' concept (e.g. value of 'Red Hat') isn't very useful to expose to a user.
We should try to avoid forcing one representation onto apps. I think the flat OS list + sets of properties will allow apps to build a variety of UI models for this, either search based, tree based or filter based.
- How should we handle derivatives like Scientific Linux + CentOS: should we expect users to understand they are based on RHEL, or give them explicit IDs?
They need explicit IDs, since they have unique download URLs that have to be stored. The 'clone' and 'derived' properties will allow us to avoid duplicating other metadata, and also allow apps to show/hide clones as needed.
Querying for device values (supported buses, models, etc.). Dan's original proposal talks about this; to recommend a default with the best chance of actually working, we need to know:
- OS being installed
- Virt type ('hvm' vs. 'xen')
- Guest Architecture (i386, x86_64, ...)
- Hypervisor (kvm, qemu, xen, vbox, ...)
- Hypervisor version
- Libvirt version
We would need to find the intersection of what the OS, the hypervisor, and libvirt support, and return what we decide is the best choice.
How to expose this in the API? We could simply have one long function
os_lookup_device_value(char *os_id, char *virt_type, char *arch, ...)
It works, but it's pretty tedious, and I'm afraid that we would need even more info to make a correct choice in the future, and the above isn't flexible. We may also need some of the above info for other values (ACPI/APIC settings, returning a proper install url may depend on arch). Any suggestions?
The more I think about this, the more I think we should avoid any specific named attributes in the API. Supported devices are just other types of property we can associate with a distro, in addition to the ones I already listed earlier. This could be useful in the UI too: for example, if you know the hypervisor requires support for 'Xen paravirt disk', then when browsing OSes, you can filter on this property just as you would with the others.
If the app already knows their hypervisor requires 'Xen paravirt disk', the above is fine, and we should facilitate filtering like that. However I would like osinfo to save the user from having to know those details: they should just be able to say 'I'm using xenpv on i386 for xen 1.2.3 and libvirt 4.5.6 with distro fedora10' and osinfo can return the required info. So I still don't see how to solve the above problem.
Let's say we are installing winxp on qemu via libvirt: we want to know the recommended sound device model:
winxp prefers ac97, es1370
qemu supports es1370, if >= 0.10.0, supports ac97
libvirt supports es1370, if >= 0.6.0 supports ac97
How do we solve this via the API?
- os_init and os_close: Any better ideas for this? os_init just parses the xml document, os_close frees it. We could run os_init with the first API call, but I think that makes it less clear that the user would then need to call os_close().
I think it's good to keep the initializer explicit, and if you add an opaque type representing a handle to the library, this will force apps to call it and track it.
Sounds good.
Thanks a bunch!
- Cole
On Mon, Jun 15, 2009 at 11:09:56AM -0400, Cole Robinson wrote:
Daniel P. Berrange wrote:
On Sun, Jun 14, 2009 at 06:50:02PM -0400, Cole Robinson wrote:
For OS distros I think we need APIs to:
- List all OS distros
- Find OS distros, matching a specific set of properties
What uses do you have in mind for this? Being able to say e.g. 'all distros that support xen PV for IA64'?
How do you think an API call would look?
Just a function callback for doing the filtering. If the callback returned 1 the distro object is kept, otherwise it is discarded.
typedef int (*osfilter_t)(osdistro_t *distro, void *opaque);
osdistro_t **os_distro_find(osinfo_t, osfilter_t filter, void *opaque);
- Read a property from an OS distro
- Read all properties from an OS distro
Why would an API user want to do this? I don't see why we would need to enable this specifically, rather than make the user do this iteratively.
- List unique values for a property across all distros
Not sure I fully understand this. Why would a user want this?
Imagine virt-manager building a 2 level hierarchy. For the first level it wants to use this method
- Get a list of unique vendor names among all known OS distros
eg
char **vendors = os_info_unique_properties(osinfo, "vendor");
Upon selecting a vendor, it then wants the earlier method
- Get a list of all OS distros for that vendor
eg
int vendorfilter(osdistro_t *distro, void *opaque)
{
    char *wantvendor = opaque;
    char *gotvendor = os_distro_get_prop(distro, "vendor");

    /* Keep the distro only when its vendor matches the one requested */
    if (gotvendor && STREQ(wantvendor, gotvendor))
        return 1;
    return 0;
}

osdistro_t **distros = os_distro_find(osinfo, vendorfilter, "Red Hat");
- OS being installed
- Virt type ('hvm' vs. 'xen')
- Guest Architecture (i386, x86_64, ...)
- Hypervisor (kvm, qemu, xen, vbox, ...)
- Hypervisor version
- Libvirt version
We would need to find the intersection of what the OS, the hypervisor, and libvirt support, and return what we decide is the best choice.
How to expose this in the API? We could simply have one long function
os_lookup_device_value(char *os_id, char *virt_type, char *arch, ...)
It works, but it's pretty tedious, and I'm afraid that we would need even more info to make a correct choice in the future, and the above isn't flexible. We may also need some of the above info for other values (ACPI/APIC settings, returning a proper install url may depend on arch). Any suggestions?
The more I think about this, the more I think we should avoid any specific named attributes in the API. Supported devices are just other types of property we can associate with a distro, in addition to the ones I already listed earlier. This could be useful in the UI too: for example, if you know the hypervisor requires support for 'Xen paravirt disk', then when browsing OSes, you can filter on this property just as you would with the others.
If the app already knows their hypervisor requires 'Xen paravirt disk', the above is fine, and we should facilitate filtering like that. However I would like osinfo to save the user from having to know those details: they should just be able to say 'I'm using xenpv on i386 for xen 1.2.3 and libvirt 4.5.6 with distro fedora10' and osinfo can return the required info. So I still don't see how to solve the above problem.
Let's say we are installing winxp on qemu via libvirt: we want to know the recommended sound device model:
winxp prefers ac97, es1370
qemu supports es1370, if >= 0.10.0, supports ac97
libvirt supports es1370, if >= 0.6.0 supports ac97
How do we solve this via the API?
The latter 2 questions really say that libvirt needs to export more info about what a driver has.
For the former, the OS info database needs to somehow provide a list of drivers available for each OS. Obviously you can take the intersection, but perhaps also define that they are listed in preferred order, "best" first.
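The "intersection, best first" idea can be sketched as follows for the winxp sound example: walk the OS's preference list in order and return the first model that both the hypervisor and libvirt support at the given versions. The support tables are transcribed from the example in this thread and are not checked against real release notes; the struct names and pick_sound_model() are hypothetical, and how the versions would be obtained from libvirt is left open.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct version {
    int major, minor, micro;
};

/* One row per model: supported from version 'since' onwards. */
struct model_support {
    const char *model;
    struct version since;
};

/* Tables taken from the thread's example, not from real release data. */
static const struct model_support qemu_sound[] = {
    { "es1370", { 0, 0, 0 } },
    { "ac97",   { 0, 10, 0 } },
};
static const struct model_support libvirt_sound[] = {
    { "es1370", { 0, 0, 0 } },
    { "ac97",   { 0, 6, 0 } },
};

/* OS side: models listed in preferred order, "best" first. */
static const char *const winxp_sound_prefs[] = { "ac97", "es1370" };

#define N_ELEMS(a) (sizeof(a) / sizeof((a)[0]))

static int version_cmp(struct version a, struct version b)
{
    if (a.major != b.major) return a.major < b.major ? -1 : 1;
    if (a.minor != b.minor) return a.minor < b.minor ? -1 : 1;
    if (a.micro != b.micro) return a.micro < b.micro ? -1 : 1;
    return 0;
}

/* Nonzero if 'model' is in the table and 'have' is new enough for it. */
static int supports(const struct model_support *table, size_t n,
                    const char *model, struct version have)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (strcmp(table[i].model, model) == 0)
            return version_cmp(have, table[i].since) >= 0;
    return 0;
}

/* First OS-preferred model supported by both layers, or NULL if none. */
const char *pick_sound_model(const char *const *prefs, size_t nprefs,
                             struct version qemu, struct version libvirt)
{
    size_t i;

    assert(prefs != NULL);
    for (i = 0; i < nprefs; i++) {
        if (supports(qemu_sound, N_ELEMS(qemu_sound), prefs[i], qemu) &&
            supports(libvirt_sound, N_ELEMS(libvirt_sound), prefs[i], libvirt))
            return prefs[i];
    }
    return NULL;
}
```

Under these tables, qemu 0.10.0 with libvirt 0.6.0 yields ac97, while an older qemu or libvirt falls back to es1370. The same shape would apply to other device classes if the OS database lists them in preference order.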
Perhaps we should be really ambitious and for each OS distro allow for a full list of PCI device IDs it supports. For linux you can auto-generate that from the kernel module metadata. You could even imagine that distros provide an XML file in our format with this info in their release media/trees.
Daniel
On Mon, Jun 15, 2009 at 04:27:49PM +0100, Daniel P. Berrange wrote:
- Find OS distros, matching a specific set of properties
What uses do you have in mind for this? Being able to say e.g. 'all distros that support xen PV for IA64'?
How do you think an API call would look?
Just a function callback for doing the filtering. If the callback returned 1 the distro object is kept, otherwise it is discarded.
typedef int (*osfilter_t)(osdistro_t *distro, void *opaque);
osdistro_t **os_distro_find(osinfo_t, osfilter_t filter, void *opaque);
I don't really like the idea of filtering. Instead it should just be a callback API full stop. It's up to the client if they want to filter, or find a specific entry, or whatever.
The lifetime rules are such that an entry exists whilst the handle is open, so there's no issues with borrowing pointers in the callback etc.
Imagine virt-manager building a 2 level hierarchy. For the first level it wants to use this method
- Get a list of unique vendor names among all known OS distros
Pruning duplicates is surely something done in the client. It really doesn't need library help...
regards john
On Mon, Jun 15, 2009 at 11:50:18AM -0400, John Levon wrote:
On Mon, Jun 15, 2009 at 04:27:49PM +0100, Daniel P. Berrange wrote:
- Find OS distros, matching a specific set of properties
What uses do you have in mind for this? Being able to say e.g. 'all distros that support xen PV for IA64'?
How do you think an API call would look?
Just a function callback for doing the filtering. If the callback returned 1 the distro object is kept, otherwise it is discarded.
typedef int (*osfilter_t)(osdistro_t *distro, void *opaque);
osdistro_t **os_distro_find(osinfo_t, osfilter_t filter, void *opaque);
I don't really like the idea of filtering. Instead it should just be a callback API full stop. It's up to the client if they want to filter, or find a specific entry, or whatever.
So you mean that it can just be a list iterator pattern
typedef int (*osdistro_iter)(osdistro_t *distro, void *opaque);
int os_distro_iterate(osinfo_t, osdistro_iter iter, void *opaque);
Imagine virt-manager building a 2 level hierarchy. For the first level it wants to use this method
- Get a list of unique vendor names among all known OS distros
Pruning duplicates is surely something done in the client. It really doesn't need library help...
I was thinking that it may well be more efficient to let the library do it. That said, this is an optimization we can add later if observed to be necessary in the real world.
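For comparison, this is roughly what the client-side version looks like with the plain iterator: the library only walks the list, and the app's callback collects unique vendor names. The struct layouts, the pointer form of the osinfo_t argument, the sample data, and the "negative return stops iteration" convention are assumptions made for this sketch, not part of the proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layouts; in the proposal both types would be opaque. */
typedef struct osdistro {
    const char *id;
    const char *vendor;
} osdistro_t;

typedef struct osinfo {
    osdistro_t *distros;
    size_t ndistros;
} osinfo_t;

typedef int (*osdistro_iter)(osdistro_t *distro, void *opaque);

/* Calls iter for every distro. Assumed convention: a negative return
 * from the callback stops the walk and is reported as -1. */
int os_distro_iterate(osinfo_t *info, osdistro_iter iter, void *opaque)
{
    size_t i;

    assert(info != NULL && iter != NULL);
    for (i = 0; i < info->ndistros; i++)
        if (iter(&info->distros[i], opaque) < 0)
            return -1;
    return 0;
}

/* ---- client side ---- */

#define MAX_VENDORS 32

struct vendor_set {
    const char *names[MAX_VENDORS];
    size_t count;
};

/* Callback: add the distro's vendor to the set unless already present.
 * The pointers are borrowed, which is fine while the handle stays open. */
static int collect_vendor(osdistro_t *distro, void *opaque)
{
    struct vendor_set *set = opaque;
    size_t i;

    if (distro->vendor == NULL)
        return 0;
    for (i = 0; i < set->count; i++)
        if (strcmp(set->names[i], distro->vendor) == 0)
            return 0;              /* duplicate, skip it */
    if (set->count == MAX_VENDORS)
        return -1;                 /* set is full, stop iterating */
    set->names[set->count++] = distro->vendor;
    return 0;
}

/* Sample data standing in for the parsed XML. */
static osdistro_t sample_distros[] = {
    { "rhel4.7",  "Red Hat" },
    { "rhel5.0",  "Red Hat" },
    { "fedora10", "Fedora Project" },
    { "fedora11", "Fedora Project" },
};
static osinfo_t sample_info = {
    sample_distros, sizeof(sample_distros) / sizeof(sample_distros[0])
};
```

The linear duplicate check is quadratic in the number of vendors, which is the kind of cost a library-side helper could avoid later if it turns out to matter.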
Daniel
On Mon, Jun 15, 2009 at 05:00:00PM +0100, Daniel P. Berrange wrote:
I don't really like the idea of filtering. Instead it should just be a callback API full stop. It's up to the client if they want to filter, or find a specific entry, or whatever.
So you mean that it can just be a list iterator pattern
typedef int (*osdistro_iter)(osdistro_t *distro, void *opaque);
int os_distro_iterate(osinfo_t, osdistro_iter iter, void *opaque);
Yep.
regards john