(I'm thinking of appying this patch to libvirt in the F15 and F16 branches, as well as Rawhide. The intent is to prepare everyone for qemu's removal of support for the "fedora-13" machine type, which will happen in F17 (already done in Rawhide)).
This addresses https://bugzilla.redhat.com/show_bug.cgi?id=754772 . It should only be applied to Fedora builds of libvirt (it seems appropriate to apply it to all versions currently in support).
Background:
During the lifetime of Fedora 13, some features were backported into the F13 build of qemu-kvm from upstream. These features were part of the functionality of machine type "pc-0.13" in upstream qemu-kvm, so a special "fedora-13" machine type was created for the F13 qemu-kvm. Since this fedora-13 became the new "canonical machine type", all new domains created with F13 libvirt tools by default contained that machine type in their configuration file.
With the release of Fedora 16, qemu-kvm initially removed support for this machine type, which caused failure of many guest configurations to start. qemu-kvm subsequently re-added the patch to support fedora-13, but with the promise that they could remove it with the release of Fedora 17. (see https://bugzilla.redhat.com/show_bug.cgi?id=748218 ).
Solution:
In order to create a repeat of the recent problems, prior to F17 existing guest configurations need to be updated to change fedora-13 to pc-0.13 (which has been determined to be equivalent for all practical purposes). That's what this patch does:
Each time libvirtd is started, it calls virDomainLoadAllConfigs() which calls virDomainLoadConfig(); this function has been modified to check for os.machine == "fedora-13", and change it to "pc-0.13" then write the updated config back to disk.
Also, any other time a domain definition is parsed, the parsed version in memory is changed to turn "fedora-13" into "pc-0.13". This handles domains that had been saved to disk prior to the upgrade, and are subsequently restarted.
Finally, whenever a domain definition is formatted into a string, any occurence of fedora-13 is replaced with pc-0.13. This should deal with those cases where a domain was running at the time of upgrade, and is later saved/snapshotted.
I had considered doing this with some sed commands in the specfile, but that wouldn't do anything to help the xml saved in image files.
(Also, one of the xml tests was using the machine type "fedora-13", and since that machine type is treated specially by the rest of this patch, it was failing. It has been changed to use machine type pc-0.13 instead.) --- src/conf/domain_conf.c | 52 ++++++++++++++++++- .../qemuxml2argv-encrypted-disk.args | 2 +- .../qemuxml2argv-encrypted-disk.xml | 2 +- 3 files changed, 51 insertions(+), 5 deletions(-)
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 75e51a0..bf3dd88 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -8061,7 +8061,25 @@ virDomainDefPtr virDomainDefParseString(virCapsPtr caps, unsigned int expectedVirtTypes, unsigned int flags) { - return virDomainDefParse(xmlStr, NULL, caps, expectedVirtTypes, flags); + virDomainDefPtr def + = virDomainDefParse(xmlStr, NULL, caps, expectedVirtTypes, flags); + + /* Fedora-specific HACK - treat fedora-13 and pc-0.13 as equivalent. + * This handles the case of domains that had been saved to an image file + * prior to upgrade (save or snapshot), then restarted/reverted. + */ + if (def && STREQ_NULLABLE(def->os.machine, "fedora-13")) { + VIR_FREE(def->os.machine); + if (!(def->os.machine = strdup("pc-0.13"))) { + virReportOOMError(); + virDomainDefFree(def); + def = NULL; + } else { + VIR_WARN("Replacing deprecated 'fedora-13' machine type " + "with equivalent 'pc-0.13' in domain %s xml", def->name); + } + } + return def; }
virDomainDefPtr virDomainDefParseFile(virCapsPtr caps, @@ -11343,8 +11361,20 @@ virDomainDefFormatInternal(virDomainDefPtr def, virBufferAddLit(buf, " <type"); if (def->os.arch) virBufferAsprintf(buf, " arch='%s'", def->os.arch); - if (def->os.machine) - virBufferAsprintf(buf, " machine='%s'", def->os.machine); + if (def->os.machine) { + /* Fedora-specific HACK - replace "fedora-13" with "pc-0.13". + * This will catch XML being written to save/migration images + * of domains that were running when libvirtd was restarted at + * the time of upgrade. + */ + if (STREQ_NULLABLE(def->os.machine, "fedora-13")) { + virBufferAddLit(buf, " machine='pc-0.13'"); + VIR_WARN("substituting machine type 'fedora-13' with 'pc-0.13' " + "in domain %s", def->name); + } else { + virBufferAsprintf(buf, " machine='%s'", def->os.machine); + } + } /* * HACK: For xen driver we previously used bogus 'linux' as the * os type for paravirt, whereas capabilities declare it to @@ -11779,6 +11809,22 @@ static virDomainObjPtr virDomainLoadConfig(virCapsPtr caps, VIR_DOMAIN_XML_INACTIVE))) goto error;
+ /* Fedora-specific HACK - replace "fedora-13" with "pc-0.13". + * This updates all config files at the first restart of libvirt + * after upgrade. + */ + if (STREQ_NULLABLE(def->os.machine, "fedora-13")) { + VIR_FREE(def->os.machine); + if (!(def->os.machine = strdup("pc-0.13"))) { + virReportOOMError(); + goto error; + } + VIR_WARN("Replacing deprecated 'fedora-13' machine type " + "with equivalent 'pc-0.13' in domain %s configuration file", name); + if (virDomainSaveConfig(configDir, def) < 0) + goto error; + } + if ((autostartLink = virDomainConfigFile(autostartDir, name)) == NULL) goto error;
diff --git a/tests/qemuxml2argvdata/qemuxml2argv-encrypted-disk.args b/tests/qemuxml2argvdata/qemuxml2argv-encrypted-disk.args index 1da0073..c6634fd 100644 --- a/tests/qemuxml2argvdata/qemuxml2argv-encrypted-disk.args +++ b/tests/qemuxml2argvdata/qemuxml2argv-encrypted-disk.args @@ -1,5 +1,5 @@ LC_ALL=C PATH=/sbin:/usr/sbin:/bin:/usr/bin HOME=/root USER=root LOGNAME=root \ -/usr/bin/qemu -S -M fedora-13 -m 1024 -smp 1,sockets=1,cores=1,threads=1 -name \ +/usr/bin/qemu -S -M pc-0.13 -m 1024 -smp 1,sockets=1,cores=1,threads=1 -name \ encryptdisk -uuid 496898a6-e6ff-f7c8-5dc2-3cf410945ee9 -nographic -nodefconfig \ -nodefaults -chardev socket,id=monitor,\ path=//var/lib/libvirt/qemu/encryptdisk.monitor,server,nowait -mon \ diff --git a/tests/qemuxml2argvdata/qemuxml2argv-encrypted-disk.xml b/tests/qemuxml2argvdata/qemuxml2argv-encrypted-disk.xml index f5e5d74..fdcf624 100644 --- a/tests/qemuxml2argvdata/qemuxml2argv-encrypted-disk.xml +++ b/tests/qemuxml2argvdata/qemuxml2argv-encrypted-disk.xml @@ -5,7 +5,7 @@ <currentMemory>524288</currentMemory> <vcpu>1</vcpu> <os> - <type arch='i686' machine='fedora-13'>hvm</type> + <type arch='i686' machine='pc-0.13'>hvm</type> <boot dev='hd'/> </os> <clock offset='utc'/>
On 12/08/2011 03:08 PM, Laine Stump wrote:
(I'm thinking of appying this patch to libvirt in the F15 and F16 branches, as well as Rawhide. The intent is to prepare everyone for qemu's removal of support for the "fedora-13" machine type, which will happen in F17 (already done in Rawhide)).
This addresses https://bugzilla.redhat.com/show_bug.cgi?id=754772 . It should only be applied to Fedora builds of libvirt (it seems appropriate to apply it to all versions currently in support).
src/conf/domain_conf.c | 52 ++++++++++++++++++- .../qemuxml2argv-encrypted-disk.args | 2 +- .../qemuxml2argv-encrypted-disk.xml | 2 +-
The 4 line change to the testsuite is probably worth proposing upstream (upstream should try to favor upstream qemu names, rather than a Fedora-specific name), but I agree with your decision to make the rest of this patch Fedora-specific, and hope that by F18 we will have automatically phased out enough of the offenders to not have to carry the patch around any longer.
+++ b/src/conf/domain_conf.c @@ -8061,7 +8061,25 @@ virDomainDefPtr virDomainDefParseString(virCapsPtr caps, unsigned int expectedVirtTypes, unsigned int flags) {
- return virDomainDefParse(xmlStr, NULL, caps, expectedVirtTypes, flags);
- virDomainDefPtr def
= virDomainDefParse(xmlStr, NULL, caps, expectedVirtTypes, flags);
- /* Fedora-specific HACK - treat fedora-13 and pc-0.13 as equivalent.
* This handles the case of domains that had been saved to an image file
* prior to upgrade (save or snapshot), then restarted/reverted.
*/
- if (def && STREQ_NULLABLE(def->os.machine, "fedora-13")) {
We want the hack to only apply to qemu guests. At first, I wondered if maybe we should move this into files in src/qemu/qemu_*.c; but on thinking about it more, that would make the problem get bigger, as there are multiple code paths that call this function. For the purposes of a hack that we don't plan to support in F18, I can live with doing it here. But doing it here means we have to worry about this code being shared by other drivers. Therefore, I think you need one additional piece of logic gating this hack:
if (def->virtType == VIR_DOMAIN_VIRT_QEMU || def->virtType == VIR_DOMAIN_VIRT_KQEMU || def->virtType == VIR_DOMAIN_VIRT_KVM)
@@ -11343,8 +11361,20 @@ virDomainDefFormatInternal(virDomainDefPtr def, virBufferAddLit(buf, " <type"); if (def->os.arch) virBufferAsprintf(buf, " arch='%s'", def->os.arch);
- if (def->os.machine)
virBufferAsprintf(buf, " machine='%s'", def->os.machine);
- if (def->os.machine) {
/* Fedora-specific HACK - replace "fedora-13" with "pc-0.13".
* This will catch XML being written to save/migration images
* of domains that were running when libvirtd was restarted at
* the time of upgrade.
*/
if (STREQ_NULLABLE(def->os.machine, "fedora-13")) {
virBufferAddLit(buf, " machine='pc-0.13'");
VIR_WARN("substituting machine type 'fedora-13' with 'pc-0.13' "
"in domain %s", def->name);
} else {
virBufferAsprintf(buf, " machine='%s'", def->os.machine);
Same comment about gating this to just qemu. Also, do we want to print the VIR_WARN on every dumpxml, or do we want to modify def->os.machine in-place so that the remaining uses of the def are already at pc-0.13?
@@ -11779,6 +11809,22 @@ static virDomainObjPtr virDomainLoadConfig(virCapsPtr caps, VIR_DOMAIN_XML_INACTIVE))) goto error;
- /* Fedora-specific HACK - replace "fedora-13" with "pc-0.13".
* This updates all config files at the first restart of libvirt
s/libvirt/libvirtd/
* after upgrade.
*/
- if (STREQ_NULLABLE(def->os.machine, "fedora-13")) {
Again, this should be gated to just qemu.
Except for the missing gating, it looks reasonable; I'll take some time to test a domain snapshot taken before the patch and loaded after and report back with that (but at first glance, it should work).
On 12/08/2011 05:43 PM, Eric Blake wrote:
On 12/08/2011 03:08 PM, Laine Stump wrote:
(I'm thinking of appying this patch to libvirt in the F15 and F16 branches, as well as Rawhide. The intent is to prepare everyone for qemu's removal of support for the "fedora-13" machine type, which will happen in F17 (already done in Rawhide)).
This addresses https://bugzilla.redhat.com/show_bug.cgi?id=754772 . It should only be applied to Fedora builds of libvirt (it seems appropriate to apply it to all versions currently in support).
src/conf/domain_conf.c | 52 ++++++++++++++++++- .../qemuxml2argv-encrypted-disk.args | 2 +- .../qemuxml2argv-encrypted-disk.xml | 2 +-
The 4 line change to the testsuite is probably worth proposing upstream (upstream should try to favor upstream qemu names, rather than a Fedora-specific name),
I thought of that, but then realized that it's irrelevant, since that config / commandline is never passed to an actual qemu, and the tests still pass.
I can still send it separately upstream, but it needs to stay here, since make check will otherwise fail once the rest of the patch is applied. (I suppose I can put it in a separate patch, so that the main patch can be the same for rawhide and F16.)
but I agree with your decision to make the rest of this patch Fedora-specific, and hope that by F18 we will have automatically phased out enough of the offenders to not have to carry the patch around any longer.
+++ b/src/conf/domain_conf.c @@ -8061,7 +8061,25 @@ virDomainDefPtr virDomainDefParseString(virCapsPtr caps, unsigned int expectedVirtTypes, unsigned int flags) {
- return virDomainDefParse(xmlStr, NULL, caps, expectedVirtTypes, flags);
- virDomainDefPtr def
= virDomainDefParse(xmlStr, NULL, caps, expectedVirtTypes, flags);
- /* Fedora-specific HACK - treat fedora-13 and pc-0.13 as equivalent.
* This handles the case of domains that had been saved to an image file
* prior to upgrade (save or snapshot), then restarted/reverted.
*/
- if (def&& STREQ_NULLABLE(def->os.machine, "fedora-13")) {
We want the hack to only apply to qemu guests. At first, I wondered if maybe we should move this into files in src/qemu/qemu_*.c; but on thinking about it more, that would make the problem get bigger, as there are multiple code paths that call this function. For the purposes of a hack that we don't plan to support in F18, I can live with doing it here. But doing it here means we have to worry about this code being shared by other drivers. Therefore, I think you need one additional piece of logic gating this hack:
if (def->virtType == VIR_DOMAIN_VIRT_QEMU || def->virtType == VIR_DOMAIN_VIRT_KQEMU || def->virtType == VIR_DOMAIN_VIRT_KVM)
This looks really ugly to me. Is it really necessary, seeing that none of the other hypervisor types uses the machine name "fedora-13"? As a matter of fact it looks like os.machine is only used in 2 places outside of domain_conf.c and the qemu directory:
*** src/xenxs/xen_xm.c: xenParseXM[271] if (!(def->os.machine = strdup(defaultMachine)))
*** src/vbox/vbox_tmpl.c: <global>[3746] VIR_DEBUG("def->os.machine %s", def->os.machine);
@@ -11343,8 +11361,20 @@ virDomainDefFormatInternal(virDomainDefPtr def, virBufferAddLit(buf, "<type"); if (def->os.arch) virBufferAsprintf(buf, " arch='%s'", def->os.arch);
- if (def->os.machine)
virBufferAsprintf(buf, " machine='%s'", def->os.machine);
- if (def->os.machine) {
/* Fedora-specific HACK - replace "fedora-13" with "pc-0.13".
* This will catch XML being written to save/migration images
* of domains that were running when libvirtd was restarted at
* the time of upgrade.
*/
if (STREQ_NULLABLE(def->os.machine, "fedora-13")) {
virBufferAddLit(buf, " machine='pc-0.13'");
VIR_WARN("substituting machine type 'fedora-13' with 'pc-0.13' "
"in domain %s", def->name);
} else {
virBufferAsprintf(buf, " machine='%s'", def->os.machine);
Same comment about gating this to just qemu. Also, do we want to print the VIR_WARN on every dumpxml, or do we want to modify def->os.machine in-place so that the remaining uses of the def are already at pc-0.13?
I had thought about that, but the idea of modifying an object that the caller is expecting to not change wasn't very appealing to me. I could be persuaded otherwise, though.
(Actually I thought I'd just removed that VIR_WARN(), because I'd discovered that during the managedsave of a single 1GB domain, virDomainFormatDef was called 234 times!)
@@ -11779,6 +11809,22 @@ static virDomainObjPtr virDomainLoadConfig(virCapsPtr caps, VIR_DOMAIN_XML_INACTIVE))) goto error;
- /* Fedora-specific HACK - replace "fedora-13" with "pc-0.13".
* This updates all config files at the first restart of libvirt
s/libvirt/libvirtd/
* after upgrade.
*/
- if (STREQ_NULLABLE(def->os.machine, "fedora-13")) {
Again, this should be gated to just qemu.
Same comment as above. Does anyone else have an opinion on whether or not this is necessary/desirable?
If there's one other vote in your suggested direction (or even if there's no other votes against it), I'll do it.
Except for the missing gating, it looks reasonable; I'll take some time to test a domain snapshot taken before the patch and loaded after and report back with that (but at first glance, it should work).
It may also be useful to start a domain running prior to the upgrade, then do a snapshot after the upgrade is finished and see if it gets the right machine type in the xml that's stored in the snapshot file.
On 12/08/2011 06:56 PM, Laine Stump wrote:
On 12/08/2011 05:43 PM, Eric Blake wrote:
On 12/08/2011 03:08 PM, Laine Stump wrote:
(I'm thinking of appying this patch to libvirt in the F15 and F16 branches, as well as Rawhide. The intent is to prepare everyone for qemu's removal of support for the "fedora-13" machine type, which will happen in F17 (already done in Rawhide)).
This addresses https://bugzilla.redhat.com/show_bug.cgi?id=754772 . It should only be applied to Fedora builds of libvirt (it seems appropriate to apply it to all versions currently in support).
src/conf/domain_conf.c | 52 ++++++++++++++++++- .../qemuxml2argv-encrypted-disk.args | 2 +- .../qemuxml2argv-encrypted-disk.xml | 2 +-
The 4 line change to the testsuite is probably worth proposing upstream (upstream should try to favor upstream qemu names, rather than a Fedora-specific name),
I thought of that, but then realized that it's irrelevant, since that config / commandline is never passed to an actual qemu, and the tests still pass.
I can still send it separately upstream, but it needs to stay here, since make check will otherwise fail once the rest of the patch is applied. (I suppose I can put it in a separate patch, so that the main patch can be the same for rawhide and F16.)
Yes, the tests fix needs to be present in Fedora, whether bundled in this Fedora-specific patch or as a backport of the upstream tweak.
This looks really ugly to me. Is it really necessary, seeing that none of the other hypervisor types uses the machine name "fedora-13"? As a matter of fact it looks like os.machine is only used in 2 places outside of domain_conf.c and the qemu directory:
*** src/xenxs/xen_xm.c: xenParseXM[271] if (!(def->os.machine = strdup(defaultMachine)))
*** src/vbox/vbox_tmpl.c: <global>[3746] VIR_DEBUG("def->os.machine %s", def->os.machine);
Okay, you're starting to convince me that it probably won't hurt xen or vbox, and no one else cared. If no one else speaks up in the next day or so, then I won't mind the simpler patch that skips the gating.
Same comment about gating this to just qemu. Also, do we want to print the VIR_WARN on every dumpxml, or do we want to modify def->os.machine in-place so that the remaining uses of the def are already at pc-0.13?
I had thought about that, but the idea of modifying an object that the caller is expecting to not change wasn't very appealing to me. I could be persuaded otherwise, though.
(Actually I thought I'd just removed that VIR_WARN(), because I'd discovered that during the managedsave of a single 1GB domain, virDomainFormatDef was called 234 times!)
I'm okay with not doing the in-place modification, but then you are correct that we shouldn't VIR_WARN in that case.
Except for the missing gating, it looks reasonable; I'll take some time to test a domain snapshot taken before the patch and loaded after and report back with that (but at first glance, it should work).
It may also be useful to start a domain running prior to the upgrade, then do a snapshot after the upgrade is finished and see if it gets the right machine type in the xml that's stored in the snapshot file.
Yep, I'll include that in my testing results.
On 12/09/2011 09:33 AM, Eric Blake wrote:
Except for the missing gating, it looks reasonable; I'll take some time to test a domain snapshot taken before the patch and loaded after and report back with that (but at first glance, it should work).
It may also be useful to start a domain running prior to the upgrade, then do a snapshot after the upgrade is finished and see if it gets the right machine type in the xml that's stored in the snapshot file.
Yep, I'll include that in my testing results.
Testing results - upgrading libvirtd with a running domain of type fedora-13 passed, and I noticed all persistent configurations were properly upgraded as part of the new libvirtd. All attempts to modify a running fedora-13 machine were converted to pc-0.13, as expected by the code.
HOWEVER, pc-0.13 is wrong. I had one rather significant test failure:
I did this on a running domain: old libvirt: virsh save dom file.sav upgrade libvirtd new libvirt: virsh restore file.sav
and got a cryptic failure:
error: Failed to restore domain from file.sav error: internal error process exited while connecting to monitor: char device redirected to /dev/pts/5 Unknown savevm section or instance 'kvmclock' 0 load of migration failed
But I then hand-hacked the save file (well, actually, I did:
sed "s/='fedora-13'/='pc-0.14' /" file.sav > file.sav2
at which point: new libvirt: virsh restore file.sav2
succeeded. My conclusion? Qemu must be saving the machine type somewhere in the migration file format (virsh save is a migration to file, which means you'd see the same behavior when migrating between two machines); and pc-0.13 is missing something related to kvmclock, and is thus unable to reparse the migration data. But pc-0.14 works just fine, which means your patch is _almost_ right - just change fedora-13 to pc-0.14, and I think we're set.
On (Mon) 12 Dec 2011 [17:00:57], Eric Blake wrote:
Testing results - upgrading libvirtd with a running domain of type fedora-13 passed, and I noticed all persistent configurations were properly upgraded as part of the new libvirtd. All attempts to modify a running fedora-13 machine were converted to pc-0.13, as expected by the code.
HOWEVER, pc-0.13 is wrong. I had one rather significant test failure:
I did this on a running domain: old libvirt: virsh save dom file.sav upgrade libvirtd new libvirt: virsh restore file.sav
and got a cryptic failure:
error: Failed to restore domain from file.sav error: internal error process exited while connecting to monitor: char device redirected to /dev/pts/5 Unknown savevm section or instance 'kvmclock' 0 load of migration failed
But I then hand-hacked the save file (well, actually, I did:
sed "s/='fedora-13'/='pc-0.14' /" file.sav > file.sav2
at which point: new libvirt: virsh restore file.sav2
succeeded. My conclusion? Qemu must be saving the machine type somewhere in the migration file format (virsh save is a migration to file, which means you'd see the same behavior when migrating between two machines); and pc-0.13 is missing something related to kvmclock, and is thus unable to reparse the migration data. But pc-0.14 works just fine, which means your patch is _almost_ right - just change fedora-13 to pc-0.14, and I think we're set.
The fedora-13 machine type was just a slightly modified pc-0.12 machine type, so using pc-0.12 should be the safest.
Amit
On 12/12/2011 11:35 PM, Amit Shah wrote:
On (Mon) 12 Dec 2011 [17:00:57], Eric Blake wrote:
Testing results - upgrading libvirtd with a running domain of type fedora-13 passed, and I noticed all persistent configurations were properly upgraded as part of the new libvirtd. All attempts to modify a running fedora-13 machine were converted to pc-0.13, as expected by the code.
HOWEVER, pc-0.13 is wrong. I had one rather significant test failure:
succeeded. My conclusion? Qemu must be saving the machine type somewhere in the migration file format (virsh save is a migration to file, which means you'd see the same behavior when migrating between two machines); and pc-0.13 is missing something related to kvmclock, and is thus unable to reparse the migration data. But pc-0.14 works just fine, which means your patch is _almost_ right - just change fedora-13 to pc-0.14, and I think we're set.
The fedora-13 machine type was just a slightly modified pc-0.12 machine type, so using pc-0.12 should be the safest.
That's wrong. The whole reason fedora-13 was introduced was because it exposed MORE features than stock pc-0.12, so it is closer either to pc-0.13 or to pc-0.14. I already proved that we can't do migration from fedora-13 to pc-0.13, but can do migration to pc-0.14, which means that, at least in F16, fedora-13 is a synonym to pc-0.14, not pc-0.12.
On (Tue) 13 Dec 2011 [06:09:01], Eric Blake wrote:
On 12/12/2011 11:35 PM, Amit Shah wrote:
On (Mon) 12 Dec 2011 [17:00:57], Eric Blake wrote:
Testing results - upgrading libvirtd with a running domain of type fedora-13 passed, and I noticed all persistent configurations were properly upgraded as part of the new libvirtd. All attempts to modify a running fedora-13 machine were converted to pc-0.13, as expected by the code.
HOWEVER, pc-0.13 is wrong. I had one rather significant test failure:
succeeded. My conclusion? Qemu must be saving the machine type somewhere in the migration file format (virsh save is a migration to file, which means you'd see the same behavior when migrating between two machines); and pc-0.13 is missing something related to kvmclock, and is thus unable to reparse the migration data. But pc-0.14 works just fine, which means your patch is _almost_ right - just change fedora-13 to pc-0.14, and I think we're set.
The fedora-13 machine type was just a slightly modified pc-0.12 machine type, so using pc-0.12 should be the safest.
That's wrong. The whole reason fedora-13 was introduced was because it exposed MORE features than stock pc-0.12, so it is closer either to pc-0.13 or to pc-0.14.
I wonder how you say that. It's closer to pc-0.12 since there was exactly one feature additional in fedora-13 since pc-0.12. The features added in pc-0.13 and pc-0.14 are not assessed for their impact in all the cases. The fact that we saw kvmclock-related changes break things should be a red flag. For the test that failed, loading from a saved image, did you try using pc-0.12 and see if that breaks anything? If it doesn't, won't you say it's better to use pc-0.12?
I already proved that we can't do migration from fedora-13 to pc-0.13, but can do migration to pc-0.14, which means that, at least in F16, fedora-13 is a synonym to pc-0.14, not pc-0.12.
These really are things that we can't exhaustively test. Eg., do Windows guests complain of device changes? Do they need re-activation?
Amit
On 12/13/2011 10:15 AM, Amit Shah wrote:
The fedora-13 machine type was just a slightly modified pc-0.12 machine type, so using pc-0.12 should be the safest.
That's wrong. The whole reason fedora-13 was introduced was because it exposed MORE features than stock pc-0.12, so it is closer either to pc-0.13 or to pc-0.14.
I wonder how you say that. It's closer to pc-0.12 since there was exactly one feature additional in fedora-13 since pc-0.12.
Maybe in F13 and F14 that was the case, but we're talking about F15 and F16. In those builds, qemu ALREADY modified (gasp!) the 'fedora-13' machine name to be an alias for 'pc-0.14', not 'pc-0.12' or 'pc-0.13'. If Windows required reactivation when upgrading from F14 to F15, no one has mentioned it. But the point remains that _any_ VM state saved of a running guest, while using F15 or F16 as the host and using 'fedora-13' as the guest machine type, whether by migration (including migration to file in 'virsh save' and 'virsh managedsave' or migration across hosts in 'virsh migrate') or by snapshots (via the savevm monitor command in 'virsh snapshot-create'), contains subsections which qemu refuses to parse if the destination qemu is running 'pc-0.12' or 'pc-0.13', but which work just fine if the destination qemu is running 'pc-0.14'.
The features added in pc-0.13 and pc-0.14 are not assessed for their impact in all the cases.
Yes, and that is the unfortunate baggage of creating 'fedora-13' which we are trying to undo, rather than carry it around forever.
The fact that we saw kvmclock-related changes break things should be a red flag.
Yes, it IS a red flag - it means that between F14 and F15, when we added the 'fedora-13' machine hack into F15 qemu, we did it wrong, and made 'fedora-13' an alias for 'pc-0.14' where used to be an alias closer to 'pc-0.12' or 'pc-0.13'. But the damage is already done, and no one has complained about it. Rather, what they are complaining about is that any saved machine state is getting lost if qemu is not started with a machine type that can understand the saved state file; silently converting 'fedora-13' to 'pc-0.13' creates this situation, which is bad for the end user, while silently converting 'fedora-13' to 'pc-0.14' allows the end user to still load their VM state, even if we drop the 'fedora-13' hack from qemu.
For the test that failed, loading from a saved image, did you try using pc-0.12 and see if that breaks anything? If it doesn't, won't you say it's better to use pc-0.12?
On your suggestion, I repeated my testing. Both 'pc-0.12' and 'pc-0.13' fail miserably; 'virsh restore' (which maps to qemu -incoming migration) complains about kvmclock, and 'virsh snapshot-revert' (which maps to loadvm) booted fresh rather than restoring machine state.
I already proved that we can't do migration from fedora-13 to pc-0.13, but can do migration to pc-0.14, which means that, at least in F16, fedora-13 is a synonym to pc-0.14, not pc-0.12.
These really are things that we can't exhaustively test. Eg., do Windows guests complain of device changes? Do they need re-activation?
We've already lost that battle. Anyone who upgraded from F14 to F15 has already forced their guest to see new machine characteristics, because 'fedora-13' accidentally changed meaning from 'pc-0.13' to 'pc-0.14'. At this point, the best we can do is prevent further problems by converting to 'pc-0.14', which won't cause any reactivation for a guest whose state was saved by F16, and probably won't cause any reactivation for a guest whose state was saved by F14 (although we don't know that, we do know that no one complained about reactivation being required when they upgraded from F14 to F15).
On (Tue) 13 Dec 2011 [10:44:48], Eric Blake wrote:
On 12/13/2011 10:15 AM, Amit Shah wrote:
The fedora-13 machine type was just a slightly modified pc-0.12 machine type, so using pc-0.12 should be the safest.
That's wrong. The whole reason fedora-13 was introduced was because it exposed MORE features than stock pc-0.12, so it is closer either to pc-0.13 or to pc-0.14.
I wonder how you say that. It's closer to pc-0.12 since there was exactly one feature additional in fedora-13 since pc-0.12.
Maybe in F13 and F14 that was the case, but we're talking about F15 and F16. In those builds, qemu ALREADY modified (gasp!) the 'fedora-13' machine name to be an alias for 'pc-0.14', not 'pc-0.12' or 'pc-0.13'.
[snip more info about the disaster]
OK, I agree pc-0.14 seems to be the best fit we have given all the info you've collected...
Thanks! Amit
On Tue, 2011-12-13 at 10:44 -0700, Eric Blake wrote:
On 12/13/2011 10:15 AM, Amit Shah wrote:
The fedora-13 machine type was just a slightly modified pc-0.12 machine type, so using pc-0.12 should be the safest.
That's wrong. The whole reason fedora-13 was introduced was because it exposed MORE features than stock pc-0.12, so it is closer either to pc-0.13 or to pc-0.14.
I wonder how you say that. It's closer to pc-0.12 since there was exactly one feature additional in fedora-13 since pc-0.12.
Maybe in F13 and F14 that was the case, but we're talking about F15 and F16. In those builds, qemu ALREADY modified (gasp!) the 'fedora-13' machine name to be an alias for 'pc-0.14', not 'pc-0.12' or 'pc-0.13'. If Windows required reactivation when upgrading from F14 to F15, no one has mentioned it. But the point remains that _any_ VM state saved of a running guest, while using F15 or F16 as the host and using 'fedora-13' as the guest machine type, whether by migration (including migration to file in 'virsh save' and 'virsh managedsave' or migration across hosts in 'virsh migrate') or by snapshots (via the savevm monitor command in 'virsh snapshot-create'), contains subsections which qemu refuses to parse if the destination qemu is running 'pc-0.12' or 'pc-0.13', but which work just fine if the destination qemu is running 'pc-0.14'.
The features added in pc-0.13 and pc-0.14 are not assessed for their impact in all the cases.
Yes, and that is the unfortunate baggage of creating 'fedora-13' which we are trying to undo, rather than carry it around forever.
The fact that we saw kvmclock-related changes break things should be a red flag.
Yes, it IS a red flag - it means that between F14 and F15, when we added the 'fedora-13' machine hack into F15 qemu, we did it wrong, and made 'fedora-13' an alias for 'pc-0.14' where used to be an alias closer to 'pc-0.12' or 'pc-0.13'. But the damage is already done, and no one has complained about it. Rather, what they are complaining about is that any saved machine state is getting lost if qemu is not started with a machine type that can understand the saved state file; silently converting 'fedora-13' to 'pc-0.13' creates this situation, which is bad for the end user, while silently converting 'fedora-13' to 'pc-0.14' allows the end user to still load their VM state, even if we drop the 'fedora-13' hack from qemu.
For the test that failed, loading from a saved image, did you try using pc-0.12 and see if that breaks anything? If it doesn't, won't you say it's better to use pc-0.12?
On your suggestion, I repeated my testing. Both 'pc-0.12' and 'pc-0.13' fail miserably; 'virsh restore' (which maps to qemu -incoming migration) complains about kvmclock, and 'virsh snapshot-revert' (which maps to loadvm) booted fresh rather than restoring machine state.
I already proved that we can't do migration from fedora-13 to pc-0.13, but can do migration to pc-0.14, which means that, at least in F16, fedora-13 is a synonym to pc-0.14, not pc-0.12.
These really are things that we can't exhaustively test. Eg., do Windows guests complain of device changes? Do they need re-activation?
We've already lost that battle. Anyone who upgraded from F14 to F15 has already forced their guest to see new machine characteristics, because 'fedora-13' accidentally changed meaning from 'pc-0.13' to 'pc-0.14'. At this point, the best we can do is prevent further problems by converting to 'pc-0.14', which won't cause any reactivation for a guest whose state was saved by F16, and probably won't cause any reactivation for a guest whose state was saved by F14 (although we don't know that, we do know that no one complained about reactivation being required when they upgraded from F14 to F15).
Indeed, and this is my fault for not following up on the patch which was a quick hack once F14 came out. I should have followed up, but in the meantime, anyone who has used F15 with that patch is already running the guest as a pc-0.14 machine type. Now that the patch is in F16, this applies there as well. Strangely no one seems to complain at all in this case, but we do get complaints when the guest will not start because fedora-13 is not a supported machine type. Judging from how quickly we saw these after F16 launch, I would be willing to bet that most of these cases were running F15 as well, so have been operating as a pc-0.14 guest for some time.
Justin
(V2 changes "pc-0.13" to "pc-0.14" per the discussion in email. Also, the change to the test files was moved to a separate patch which was also pushed upstream.)
This addresses https://bugzilla.redhat.com/show_bug.cgi?id=754772 . It should only be applied to Fedora builds of libvirt, F15 and later.
Background:
During the lifetime of Fedora 13, some features were backported into the F13 build of qemu-kvm from upstream. These features were part of the functionality of machine type "pc-0.13" in upstream qemu-kvm, so a special "fedora-13" machine type was created for the F13 qemu-kvm. Since "fedora-13" became the new "canonical machine type", all new domains created with F13 libvirt tools by default contained that machine type in their configuration file.
In Fedora 14, a patch was made to qemu to treat the fedora-13 machine type as equivalent to "pc-0.13". When Fedora 15 was released, this was inadvertantly changed to make it equivalent to "pc-0.14".
With the release of Fedora 16, qemu-kvm initially removed support for this machine type, which caused failure of many guest configurations to start. qemu-kvm subsequently re-added the patch to support fedora-13 (as equivalent to pc-0.14), but with the promise that they could remove it with the release of Fedora 17. (see https://bugzilla.redhat.com/show_bug.cgi?id=748218 ).
Solution:
In order to create a repeat of the recent problems, prior to F17 existing guest configurations need to be updated to change fedora-13 to pc-0.14 (which has been determined to be equivalent for all practical purposes in both F15 and F16). That's what this patch does:
1) Each time libvirtd is started, it calls virDomainLoadAllConfigs() which calls virDomainLoadConfig(); this function has been modified to check for os.machine == "fedora-13", and change it to "pc-0.14" then write the updated config back to disk.
2) Also, any other time a domain definition is parsed, the parsed version in memory is changed to turn "fedora-13" into "pc-0.14". This handles domains that had been saved to disk prior to the upgrade, and are subsequently restarted.
3) Finally, whenever a domain definition is formatted into a string, any occurence of fedora-13 is replaced with pc-0.14 *directly in the virDomainDef* (to avoid multiple warning messages for the same object when it's formatted multiple times). This should deal with those cases where a domain was running at the time of upgrade, and is later saved/snapshotted.
I had considered doing this with some sed commands in the specfile, but that wouldn't do anything to help the xml saved in image files.
(Also, one of the xml tests was using the machine type "fedora-13", and since that machine type is treated specially by the rest of this patch, it was failing. That has been changed in a separate patch, which must be applied with this patch, and which *is* also upstream). --- src/conf/domain_conf.c | 52 +++++++++++++++++++++++++++++++++++++++++++++-- 1 files changed, 49 insertions(+), 3 deletions(-)
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 318f523..4f4163e 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -7500,7 +7500,25 @@ virDomainDefPtr virDomainDefParseString(virCapsPtr caps, unsigned int expectedVirtTypes, unsigned int flags) { - return virDomainDefParse(xmlStr, NULL, caps, expectedVirtTypes, flags); + virDomainDefPtr def + = virDomainDefParse(xmlStr, NULL, caps, expectedVirtTypes, flags); + + /* Fedora-specific HACK - treat fedora-13 and pc-0.14 as equivalent. + * This handles the case of domains that had been saved to an image file + * prior to upgrade (save or snapshot), then restarted/reverted. + */ + if (def && STREQ_NULLABLE(def->os.machine, "fedora-13")) { + VIR_FREE(def->os.machine); + if (!(def->os.machine = strdup("pc-0.14"))) { + virReportOOMError(); + virDomainDefFree(def); + def = NULL; + } else { + VIR_WARN("Replacing deprecated 'fedora-13' machine type " + "with equivalent 'pc-0.14' in domain %s xml", def->name); + } + } + return def; }
virDomainDefPtr virDomainDefParseFile(virCapsPtr caps, @@ -10648,8 +10666,20 @@ virDomainDefFormatInternal(virDomainDefPtr def, virBufferAddLit(buf, " <type"); if (def->os.arch) virBufferAsprintf(buf, " arch='%s'", def->os.arch); - if (def->os.machine) - virBufferAsprintf(buf, " machine='%s'", def->os.machine); + if (def->os.machine) { + /* Fedora-specific HACK - replace "fedora-13" with "pc-0.14". + * This will catch XML being written to save/migration images + * of domains that were running when libvirtd was restarted at + * the time of upgrade. + */ + if (STREQ_NULLABLE(def->os.machine, "fedora-13")) { + virBufferAddLit(buf, " machine='pc-0.14'"); + VIR_WARN("substituting machine type 'fedora-13' with 'pc-0.14' " + "in domain %s", def->name); + } else { + virBufferAsprintf(buf, " machine='%s'", def->os.machine); + } + } /* * HACK: For xen driver we previously used bogus 'linux' as the * os type for paravirt, whereas capabilities declare it to @@ -11100,6 +11130,22 @@ static virDomainObjPtr virDomainLoadConfig(virCapsPtr caps, VIR_DOMAIN_XML_INACTIVE))) goto error;
+ /* Fedora-specific HACK - replace "fedora-13" with "pc-0.14". + * This updates all config files at the first restart of libvirt + * after upgrade. + */ + if (STREQ_NULLABLE(def->os.machine, "fedora-13")) { + VIR_FREE(def->os.machine); + if (!(def->os.machine = strdup("pc-0.14"))) { + virReportOOMError(); + goto error; + } + VIR_WARN("Replacing deprecated 'fedora-13' machine type " + "with equivalent 'pc-0.14' in domain %s configuration file", name); + if (virDomainSaveConfig(configDir, def) < 0) + goto error; + } + if ((autostartLink = virDomainConfigFile(autostartDir, name)) == NULL) goto error;
On 12/16/2011 12:20 PM, Laine Stump wrote:
(V2 changes "pc-0.13" to "pc-0.14" per the discussion in email. Also, the change to the test files was moved to a separate patch which was also pushed upstream.)
This addresses https://bugzilla.redhat.com/show_bug.cgi?id=754772 . It should only be applied to Fedora builds of libvirt, F15 and later.
ACK.
Background:
During the lifetime of Fedora 13, some features were backported into the F13 build of qemu-kvm from upstream. These features were part of the functionality of machine type "pc-0.13" in upstream qemu-kvm, so a special "fedora-13" machine type was created for the F13 qemu-kvm. Since "fedora-13" became the new "canonical machine type", all new domains created with F13 libvirt tools by default contained that machine type in their configuration file.
In Fedora 14, a patch was made to qemu to treat the fedora-13 machine type as equivalent to "pc-0.13". When Fedora 15 was released, this was inadvertantly changed to make it equivalent to "pc-0.14".
s/inadvertantly/inadvertently/
With the release of Fedora 16, qemu-kvm initially removed support for this machine type, which caused failure of many guest configurations to start. qemu-kvm subsequently re-added the patch to support fedora-13 (as equivalent to pc-0.14), but with the promise that they could remove it with the release of Fedora 17. (see https://bugzilla.redhat.com/show_bug.cgi?id=748218 ).
Solution:
In order to create a repeat of the recent problems, prior to F17 existing guest configurations need to be updated to change fedora-13 to pc-0.14 (which has been determined to be equivalent for all practical purposes in both F15 and F16). That's what this patch does:
- Each time libvirtd is started, it calls virDomainLoadAllConfigs()
which calls virDomainLoadConfig(); this function has been modified to check for os.machine == "fedora-13", and change it to "pc-0.14" then write the updated config back to disk.
- Also, any other time a domain definition is parsed, the parsed
version in memory is changed to turn "fedora-13" into "pc-0.14". This handles domains that had been saved to disk prior to the upgrade, and are subsequently restarted.
- Finally, whenever a domain definition is formatted into a string,
any occurence of fedora-13 is replaced with pc-0.14 *directly in the
s/occurence/occurrence/