Francesco Romani has uploaded a new change for review.
Change subject: virt: sampling: more cautious disk stats check ......................................................................
virt: sampling: more cautious disk stats check
In commit d6d0eb763e623 we changed the disk stats collection to deal with the benign case of known-missing disk stats samples, to avoid add bogus error on the logs.
Unfortunately a case was missing in the live storage migration flow, which causes
TypeError: argument of type 'NoneType' is not iterable
in the logs defeating the purposes of the originating change. This patch fixes this issue by adding the missing check.
Change-Id: I059b69c33d45950f8377597ee8c6e7824e1ec223 Signed-off-by: Francesco Romani fromani@redhat.com --- M vdsm/virt/vm.py 1 file changed, 3 insertions(+), 2 deletions(-)
git pull ssh://gerrit.ovirt.org:29418/vdsm refs/changes/82/33482/1
diff --git a/vdsm/virt/vm.py b/vdsm/virt/vm.py index ad8cb0e..0f8845e 100644 --- a/vdsm/virt/vm.py +++ b/vdsm/virt/vm.py @@ -539,7 +539,7 @@ dStats['imageID'] = vmDrive.imageID elif "GUID" in vmDrive: dStats['lunGUID'] = vmDrive.GUID - if sInfo is not None and ( + if sInfo is not None and eInfo is not None and ( vmDrive.name in sInfo and vmDrive.name in eInfo): # will be None if sampled during recovery dStats.update(self._calcDiskRate(vmDrive, sInfo, eInfo, @@ -594,7 +594,8 @@ 'writeLatency': '0', 'flushLatency': '0'}
- if sInfo is not None and (dName in sInfo and dName in eInfo): + if sInfo is not None and eInfo is not None and ( + dName in sInfo and dName in eInfo): # will be None if sampled during recovery # in case of hotplugged disk, start samples will # be missed until the sampling code catches up.
oVirt Jenkins CI Server has posted comments on this change.
Change subject: virt: sampling: more cautious disk stats check ......................................................................
Patch Set 1:
Build Failed
http://jenkins.ovirt.org/job/vdsm_master_unit_tests_gerrit_el/11700/ : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_unit-tests_created/12644/ : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_virt_functional_tests_gerrit/1695/ : There was an infra issue, please contact infra@ovirt.org
http://jenkins.ovirt.org/job/vdsm_master_pep8_gerrit/12489/ : SUCCESS
Nir Soffer has posted comments on this change.
Change subject: virt: sampling: more cautious disk stats check ......................................................................
Patch Set 1:
(2 comments)
http://gerrit.ovirt.org/#/c/33482/1/vdsm/virt/vm.py File vdsm/virt/vm.py:
Line 538: if isVdsmImage(vmDrive): Line 539: dStats['imageID'] = vmDrive.imageID Line 540: elif "GUID" in vmDrive: Line 541: dStats['lunGUID'] = vmDrive.GUID Line 542: if sInfo is not None and eInfo is not None and ( Just to make sure we are not hiding real errors - is eInfo expected to be None in normal flow? Line 543: vmDrive.name in sInfo and vmDrive.name in eInfo): Line 544: # will be None if sampled during recovery Line 545: dStats.update(self._calcDiskRate(vmDrive, sInfo, eInfo, Line 546: sampleInterval))
Line 539: dStats['imageID'] = vmDrive.imageID Line 540: elif "GUID" in vmDrive: Line 541: dStats['lunGUID'] = vmDrive.GUID Line 542: if sInfo is not None and eInfo is not None and ( Line 543: vmDrive.name in sInfo and vmDrive.name in eInfo): The original code had unneeded parenthesis around the last two checks - this will be more clear as:
if (sInfo is not None and vmDrive.name in sInfo and eInfo is not None and vmDrive.name in eInfo):
And since {} is ok with this code, we can make much nicer:
if (sInfo and vmDrive.name in sInfo and eInfo and vmDrive.name in eInfo):
Since we do treat {} just like None, I would check if we can return {} instead of None which will make this condition:
if vmDrive.name in sInfo and vmDrive.name in eInfo: ...
I think this is good enough as is since we must have this in the next build, but please consider the first two cleanups for this patch. Line 544: # will be None if sampled during recovery Line 545: dStats.update(self._calcDiskRate(vmDrive, sInfo, eInfo, Line 546: sampleInterval)) Line 547: except (AttributeError, TypeError, ZeroDivisionError):
Francesco Romani has posted comments on this change.
Change subject: virt: sampling: more cautious disk stats check ......................................................................
Patch Set 1:
(2 comments)
http://gerrit.ovirt.org/#/c/33482/1/vdsm/virt/vm.py File vdsm/virt/vm.py:
Line 538: if isVdsmImage(vmDrive): Line 539: dStats['imageID'] = vmDrive.imageID Line 540: elif "GUID" in vmDrive: Line 541: dStats['lunGUID'] = vmDrive.GUID Line 542: if sInfo is not None and eInfo is not None and (
Just to make sure we are not hiding real errors - is eInfo expected to be N
Yes, if a disk disappears - hotunplug being the easiest case I think. A disk disappearing is most likely the cause of the issue you faced - can you describe the flow so I can test this fix? Line 543: vmDrive.name in sInfo and vmDrive.name in eInfo): Line 544: # will be None if sampled during recovery Line 545: dStats.update(self._calcDiskRate(vmDrive, sInfo, eInfo, Line 546: sampleInterval))
Line 539: dStats['imageID'] = vmDrive.imageID Line 540: elif "GUID" in vmDrive: Line 541: dStats['lunGUID'] = vmDrive.GUID Line 542: if sInfo is not None and eInfo is not None and ( Line 543: vmDrive.name in sInfo and vmDrive.name in eInfo):
The original code had unneeded parenthesis around the last two checks - thi
No big deal aout the first two cleanups, will push them. Line 544: # will be None if sampled during recovery Line 545: dStats.update(self._calcDiskRate(vmDrive, sInfo, eInfo, Line 546: sampleInterval)) Line 547: except (AttributeError, TypeError, ZeroDivisionError):
oVirt Jenkins CI Server has posted comments on this change.
Change subject: virt: sampling: more cautious disk stats check ......................................................................
Patch Set 2:
Build Failed
http://jenkins.ovirt.org/job/vdsm_master_unit_tests_gerrit_el/11730/ : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_unit-tests_created/12674/ : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_virt_functional_tests_gerrit/1700/ : There was an infra issue, please contact infra@ovirt.org
http://jenkins.ovirt.org/job/vdsm_master_pep8_gerrit/12519/ : SUCCESS
Nir Soffer has posted comments on this change.
Change subject: virt: sampling: more cautious disk stats check ......................................................................
Patch Set 1:
(1 comment)
http://gerrit.ovirt.org/#/c/33482/1/vdsm/virt/vm.py File vdsm/virt/vm.py:
Line 538: if isVdsmImage(vmDrive): Line 539: dStats['imageID'] = vmDrive.imageID Line 540: elif "GUID" in vmDrive: Line 541: dStats['lunGUID'] = vmDrive.GUID Line 542: if sInfo is not None and eInfo is not None and (
Yes, if a disk disappears - hotunplug being the easiest case I think. A dis
I don't remember what was the flow. I saw this when testing live storage migration (move disk while vm is running) with iscsi storage domains. Line 543: vmDrive.name in sInfo and vmDrive.name in eInfo): Line 544: # will be None if sampled during recovery Line 545: dStats.update(self._calcDiskRate(vmDrive, sInfo, eInfo, Line 546: sampleInterval))
Nir Soffer has posted comments on this change.
Change subject: virt: sampling: more cautious disk stats check ......................................................................
Patch Set 2: Code-Review+1
(1 comment)
http://gerrit.ovirt.org/#/c/33482/2/vdsm/virt/vm.py File vdsm/virt/vm.py:
Line 539: dStats['imageID'] = vmDrive.imageID Line 540: elif "GUID" in vmDrive: Line 541: dStats['lunGUID'] = vmDrive.GUID Line 542: if (sInfo and vmDrive.name in sInfo and Line 543: eInfo and vmDrive.name in eInfo): I think pep8 would not like this, eInfo should be indented one extra space. Line 544: # will be None if sampled during recovery Line 545: dStats.update(self._calcDiskRate(vmDrive, sInfo, eInfo, Line 546: sampleInterval)) Line 547: except (AttributeError, TypeError, ZeroDivisionError):
Francesco Romani has posted comments on this change.
Change subject: virt: sampling: more cautious disk stats check ......................................................................
Patch Set 2:
(1 comment)
http://gerrit.ovirt.org/#/c/33482/2/vdsm/virt/vm.py File vdsm/virt/vm.py:
Line 539: dStats['imageID'] = vmDrive.imageID Line 540: elif "GUID" in vmDrive: Line 541: dStats['lunGUID'] = vmDrive.GUID Line 542: if (sInfo and vmDrive.name in sInfo and Line 543: eInfo and vmDrive.name in eInfo):
I think pep8 would not like this, eInfo should be indented one extra space.
I thought the same, but seems like pep8 thinks in reverse and complains if I *add* the extra space:
1009 23:32:50 fromani@musashi ~/Projects/upstream/vdsm $ git diff diff --git a/vdsm/virt/vm.py b/vdsm/virt/vm.py index 7f21085..10cd21e 100644 --- a/vdsm/virt/vm.py +++ b/vdsm/virt/vm.py @@ -540,7 +540,7 @@ class VmStatsThread(AdvancedStatsThread): elif "GUID" in vmDrive: dStats['lunGUID'] = vmDrive.GUID if (sInfo and vmDrive.name in sInfo and - eInfo and vmDrive.name in eInfo): + eInfo and vmDrive.name in eInfo): # will be None if sampled during recovery dStats.update(self._calcDiskRate(vmDrive, sInfo, eInfo, sampleInterval)) 1010 23:32:52 fromani@musashi ~/Projects/upstream/vdsm $ make pep8 /usr/bin/pep8 --version 1.5.6 for x in config.py constants.py crossImportsTests.py vdsm.py ; do \ exclude="${exclude},${x}" ; \ done ; \ /usr/bin/pep8 --exclude="${exclude}" --filename '*.py,*.py.in' . ./vdsm/virt/vm.py:543:21: E129 visually indented line with same indent as next logical line make: *** [pep8] Error 1 Line 544: # will be None if sampled during recovery Line 545: dStats.update(self._calcDiskRate(vmDrive, sInfo, eInfo, Line 546: sampleInterval)) Line 547: except (AttributeError, TypeError, ZeroDivisionError):
Nir Soffer has posted comments on this change.
Change subject: virt: sampling: more cautious disk stats check ......................................................................
Patch Set 2:
(1 comment)
http://gerrit.ovirt.org/#/c/33482/2/vdsm/virt/vm.py File vdsm/virt/vm.py:
Line 539: dStats['imageID'] = vmDrive.imageID Line 540: elif "GUID" in vmDrive: Line 541: dStats['lunGUID'] = vmDrive.GUID Line 542: if (sInfo and vmDrive.name in sInfo and Line 543: eInfo and vmDrive.name in eInfo):
I thought the same, but seems like pep8 thinks in reverse and complains if
:-) Line 544: # will be None if sampled during recovery Line 545: dStats.update(self._calcDiskRate(vmDrive, sInfo, eInfo, Line 546: sampleInterval)) Line 547: except (AttributeError, TypeError, ZeroDivisionError):
Francesco Romani has posted comments on this change.
Change subject: virt: sampling: more cautious disk stats check ......................................................................
Patch Set 2: Verified+1
Verification (against locally patched vdsm-4.16.5-5.gitb16b036.el6.x86_64 , but relevant code path are identical)
- live storage migration flow is broken (https://bugzilla.redhat.com/show_bug.cgi?id=1147971#c0) - live merge not available due to libvirt limitations - so I had to play with disk hotplug/hotunplug, which is good enough as soon as a diks disappears under VDSM's nose. - tuned VDSM to do more frequent polling: vm_sample_disk_interval = 1
vm_sample_disk_latency_interval = 1 - added watch to exercise the code path: - watch vdsClient -s 0 getAllVmStats
the two above will significantly increase the chance to trigger the error being addressed by this change.
On a VM with three disks (main plus two auxiliary) - plugged/unplugged at random sequence/times the two auxiliary disk - snooped the vdsm logs for errors - verified the entries appears and disappears from vdsClient output
did the above for ~10 cycles
Dan Kenigsberg has posted comments on this change.
Change subject: virt: sampling: more cautious disk stats check ......................................................................
Patch Set 2: Code-Review+2
Dan Kenigsberg has submitted this change and it was merged.
Change subject: virt: sampling: more cautious disk stats check ......................................................................
virt: sampling: more cautious disk stats check
In commit d6d0eb763e623 we changed the disk stats collection to deal with the benign case of known-missing disk stats samples, to avoid add bogus error on the logs.
Unfortunately a case was missing in the live storage migration flow, which causes
TypeError: argument of type 'NoneType' is not iterable
in the logs defeating the purposes of the originating change. This patch fixes this issue by adding the missing check.
Change-Id: I059b69c33d45950f8377597ee8c6e7824e1ec223 Signed-off-by: Francesco Romani fromani@redhat.com Reviewed-on: http://gerrit.ovirt.org/33482 Reviewed-by: Nir Soffer nsoffer@redhat.com Reviewed-by: Dan Kenigsberg danken@redhat.com --- M vdsm/virt/vm.py 1 file changed, 3 insertions(+), 3 deletions(-)
Approvals: Nir Soffer: Looks good to me, but someone else must approve Dan Kenigsberg: Looks good to me, approved Francesco Romani: Verified
oVirt Jenkins CI Server has posted comments on this change.
Change subject: virt: sampling: more cautious disk stats check ......................................................................
Patch Set 3:
Build Failed
http://jenkins.ovirt.org/job/vdsm_master_create-rpms_merged_test_debug/245/ : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_verify-error-codes_merged/5878/ : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_unit-tests_merged/4038/ : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_create-rpms-el7-x86_64_merged/48/ : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_create-rpms-fc20-x86_64_merged/44/ : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_create-rpms-el6-x86_64_merged/50/ : FAILURE
http://jenkins.ovirt.org/job/vdsm_master_create-rpms-fc21-x86_64_merged/24/ : SUCCESS
vdsm-patches@lists.fedorahosted.org