[fedora-virt] live migration from CentOS 6.2 to 6.1 fails

Gianluca Cecchi gianluca.cecchi at gmail.com
Sat Jan 14 14:00:09 UTC 2012


Hello,
I'm posting here because the same problem could occur in Fedora too
when mixing versions of components, and because I also test qemu/kvm
and libvirt on Fedora.
I hope someone can give me some hints.

I have 3 CentOS nodes: two at version 6.1 plus some updates (but below
6.2) and one at 6.2.

I have a VM on a 6.1 hypervisor and I'm able to live migrate it to the
6.2 host, but then I am not able to migrate it back from 6.2 to either
of the 6.1 hosts.

relevant components:

6.1 hosts
qemu-kvm-0.12.1.2-2.160.el6_1.8.x86_64
libvirt-0.8.7-18.el6_1.1.x86_64
kernel-2.6.32-131.17.1.el6.x86_64

6.2 host
qemu-kvm-0.12.1.2-2.209.el6_2.1.x86_64
libvirt-0.9.4-23.el6_2.1.x86_64
kernel-2.6.32-220.2.1.el6.x86_64
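
To make the version gap explicit, here is a minimal shell sketch that
pulls the upstream version out of the package strings listed above and
compares them. pkg_version and version_lt are hypothetical helper names
of mine (not part of rpm), and the comparison assumes a GNU sort that
supports -V.

```shell
#!/bin/sh
# Hypothetical helper: extract the upstream version from a
# name-version-release.arch string, given the package name.
pkg_version() {
    # $1 = package name, $2 = full package string
    rest=${2#"$1"-}               # strip the leading "name-"
    printf '%s\n' "${rest%%-*}"   # keep what precedes the release part
}

# Hypothetical helper: succeed when version $1 is strictly older than $2.
# Assumes GNU sort with version sorting (-V).
version_lt() {
    [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Package strings copied from the lists above.
src=$(pkg_version libvirt libvirt-0.9.4-23.el6_2.1.x86_64)
dst=$(pkg_version libvirt libvirt-0.8.7-18.el6_1.1.x86_64)

if version_lt "$dst" "$src"; then
    echo "destination libvirt $dst is older than source $src"
fi
```

On a live host the package strings would come from something like
"rpm -q libvirt qemu-kvm kernel" instead of being hard-coded.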

The guest is named dacsmaster and runs RHEL 5.4.
The migration from 6.2 to 6.1 fails with:

# clusvcadm -M vm:dacsmaster -m intrarhev2
Trying to migrate vm:dacsmaster to intrarhev2...Failed; service running on original owner

/var/log/messages information:
on source hypervisor:
Jan 12 15:45:17 rhev1 rgmanager[9267]: Migrating vm:dacsmaster to intrarhev2
Jan 12 15:45:18 rhev1 rgmanager[22193]: [vm] Migrate dacsmaster to intrarhev2 failed:
Jan 12 15:45:18 rhev1 rgmanager[22215]: [vm] error: internal error missing hostuuid element in migration data
Jan 12 15:45:18 rhev1 rgmanager[9267]: migrate on vm "dacsmaster" returned 150 (unspecified)
Jan 12 15:45:18 rhev1 rgmanager[9267]: Migration of vm:dacsmaster to intrarhev2 failed; return code 150

on target hypervisor:
Jan 12 15:45:18 rhev2 kernel: device vnet4 entered promiscuous mode
Jan 12 15:45:18 rhev2 kernel: brvlan65: topology change detected, propagating
Jan 12 15:45:18 rhev2 kernel: brvlan65: port 3(vnet4) entering forwarding state
Jan 12 15:45:18 rhev2 libvirtd: 15:45:18.113: 31182: warning : qemudStartVMDaemon:3336 : Executing /usr/libexec/qemu-kvm
Jan 12 15:45:18 rhev2 libvirtd: 15:45:18.119: 31182: warning : qemudStartVMDaemon:3346 : Executing done /usr/libexec/qemu-kvm
Jan 12 15:45:18 rhev2 qemu-kvm: Could not find keytab file: /etc/qemu/krb5.tab: No such file or directory
Jan 12 15:45:18 rhev2 libvirtd: 15:45:18.332: 8787: error : virCgroupRemoveRecursively:679 : Unable to remove /cgroup/cpu/libvirt/qemu/dacsmaster/ (16)
Jan 12 15:45:18 rhev2 libvirtd: 15:45:18.332: 8787: error : virCgroupRemoveRecursively:679 : Unable to remove /cgroup/cpuacct/libvirt/qemu/dacsmaster/ (16)
Jan 12 15:45:18 rhev2 libvirtd: 15:45:18.332: 8787: error : virCgroupRemoveRecursively:679 : Unable to remove /cgroup/cpuset/libvirt/qemu/dacsmaster/ (16)
Jan 12 15:45:18 rhev2 libvirtd: 15:45:18.332: 8787: error : virCgroupRemoveRecursively:679 : Unable to remove /cgroup/memory/libvirt/qemu/dacsmaster/ (16)
Jan 12 15:45:18 rhev2 libvirtd: 15:45:18.332: 8787: error : virCgroupRemoveRecursively:679 : Unable to remove /cgroup/devices/libvirt/qemu/dacsmaster/ (16)
Jan 12 15:45:18 rhev2 libvirtd: 15:45:18.332: 8787: error : virCgroupRemoveRecursively:679 : Unable to remove /cgroup/freezer/libvirt/qemu/dacsmaster/ (16)
Jan 12 15:45:18 rhev2 libvirtd: 15:45:18.332: 8787: error : virCgroupRemoveRecursively:679 : Unable to remove /cgroup/blkio/libvirt/qemu/dacsmaster/ (16)
Jan 12 15:45:18 rhev2 kernel: brvlan65: port 3(vnet4) entering disabled state
Jan 12 15:45:18 rhev2 kernel: device vnet4 left promiscuous mode
Jan 12 15:45:18 rhev2 kernel: brvlan65: port 3(vnet4) entering disabled state

If I power off the guest (through the "disable" action of the related
rhcs service) and power it on on a 6.1 host, I'm able to live migrate
it both to the other 6.1 host and to the 6.2 one.

/etc/libvirt/libvirtd.conf doesn't contain any setting for host_uuid,
which I mention because of the hostuuid error in the messages of the
source 6.2 hypervisor, and the dmidecode command succeeds on all three
systems.

They are identical servers from a hardware point of view; for example,
the output on one of them is:
# dmidecode -s system-uuid
34353439-3036-435A-4A38-303330393332
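
The host_uuid check described above can be sketched in shell as
follows. conf_host_uuid is a hypothetical helper name of mine. The
fallback to the SMBIOS UUID when host_uuid is unset is my understanding
of libvirt's behaviour, not something I have verified on these
versions.

```shell
#!/bin/sh
# Hypothetical helper: print the value of an uncommented host_uuid line
# in a libvirtd.conf-style file, or nothing when it is not set.
conf_host_uuid() {
    sed -n 's/^[[:space:]]*host_uuid[[:space:]]*=[[:space:]]*"\([^"]*\)".*/\1/p' "$1" |
        tail -n 1
}

conf=/etc/libvirt/libvirtd.conf
if [ -r "$conf" ]; then
    uuid=$(conf_host_uuid "$conf")
    if [ -n "$uuid" ]; then
        echo "host_uuid override in $conf: $uuid"
    else
        # Assumption: with no override, libvirt uses the SMBIOS UUID.
        echo "no host_uuid override in $conf"
    fi
fi
```

Where an override is printed, it can be compared by hand with the
output of "dmidecode -s system-uuid" on the same host.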

Some additional notes:
If I freeze the VM-related cluster service and, on the source 6.2 host,
manually run the same virsh command that succeeded on 6.1, it fails
(as expected). This lets me rule out the different release versions of
the cluster-related components between 6.1 and 6.2 and concentrate on
the different releases of kernel/libvirt/qemu-kvm (at least I think
so...)

intrarhevN is the host name on the cluster LAN, which is also used for
migration (the migration also fails using the service network, the only
difference being that it asks for a password, as there is no ssh host
equivalence on the service network)

# virsh migrate --live dacsmaster qemu+ssh://intrarhev2/system tcp:intrarhev2
error: internal error missing hostuuid element in migration data

It seems I'm able to connect through virsh from rhev1 (6.2) to rhev2
(6.1) as I can run

rhev1 prompt--  # virsh
Welcome to virsh, the virtualization interactive terminal.

Type:  'help' for help with commands
       'quit' to quit

virsh # connect qemu+ssh://intrarhev2/system

virsh # list

and this command shows me the domains currently running on rhev2 as expected

So I presume the problem is related to the migration itself, possibly
to different parameters required by the libvirt version shipped in 6.2
(0.9.4 vs 0.8.7).

Any hints on this and/or on possible debug options?
The error seems quite obscure, to me at least.
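
The only debug option I know of on the client side is the LIBVIRT_DEBUG
environment variable, so a minimal sketch of what I could try is below.
debug_run is a hypothetical wrapper name and the log path is only an
example; I have not yet run this against the failing migration.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command with libvirt client debugging
# enabled and save its stderr, where the debug messages go, to a file.
debug_run() {
    log=$1
    shift
    LIBVIRT_DEBUG=1 "$@" 2>"$log"
}

# Intended use on the 6.2 source host, with the same command as above:
#   debug_run /tmp/migrate-debug.log \
#       virsh migrate --live dacsmaster qemu+ssh://intrarhev2/system tcp:intrarhev2
#   grep -i uuid /tmp/migrate-debug.log
```

On the daemon side, I believe log_level and log_outputs in
/etc/libvirt/libvirtd.conf serve a similar purpose, but I have not
changed them yet.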

Gianluca

