Nir Soffer has uploaded a new change for review.
Change subject: vm: Require format attribute for drives
......................................................................
vm: Require format attribute for drives
We have seen sampling errors where a drive has no "format" attribute.
These errors spam the system logs, and does not help to debug the real
issue - why the required format attribute is not set?
This patch ensure that a drive cannot be created without a format
attribute, avoding log spam by sampling errors, and hopefully revealing
the real reason for this bug.
One broken test creating a drive without a format was fixed and new test
ensure that creating drive without a format raises.
Change-Id: I01ab1e071ecb76f383cc6dc7d99782e10cc90136
Relates-To: http://gerrit.ovirt.org/22551
Relates-To: https://bugzilla.redhat.com/994534
Relates-To: http://lists.ovirt.org/pipermail/users/2014-February/021007.html
Bug-Url: https://bugzilla.redhat.com/1055437
Signed-off-by: Nir Soffer <nsoffer(a)redhat.com>
---
M tests/vmTests.py
M vdsm/vm.py
2 files changed, 28 insertions(+), 3 deletions(-)
git pull ssh://gerrit.ovirt.org:29418/vdsm refs/changes/34/24234/1
diff --git a/tests/vmTests.py b/tests/vmTests.py
index 1e0a3f6..724d458 100644
--- a/tests/vmTests.py
+++ b/tests/vmTests.py
@@ -449,6 +449,18 @@
redir = vm.RedirDevice(self.conf, self.log, **dev)
self.assertXML(redir.getXML(), redirXML)
+ def testDriveRequiredParameters(self):
+ # TODO: It is not clear what are the other required parameters, and it
+ # the parameters depend on the type of drive. We probbaly need bigger
+ # test here that test many combinations.
+ # Currently this test only missing "format" attribute.
+ conf = {'index': '2', 'propagateErrors': 'off', 'iface': 'ide',
+ 'name': 'hdc', 'device': 'cdrom', 'path': '/tmp/fedora.iso',
+ 'type': 'disk', 'readonly': 'True', 'shared': 'none',
+ 'serial': '54-a672-23e5b495a9ea'}
+ self.assertRaises(vm.InvalidDeviceParameters, vm.Drive, {}, self.log,
+ **conf)
+
def testDriveSharedStatus(self):
sharedConfigs = [
# Backward compatibility
@@ -470,7 +482,8 @@
'exclusive', 'shared', 'none', 'transient',
]
- driveConfig = {'index': '0', 'iface': 'virtio', 'device': 'disk'}
+ driveConfig = {'index': '0', 'iface': 'virtio', 'device': 'disk',
+ 'format': 'raw'}
for driveInput, driveOutput in zip(sharedConfigs, expectedStates):
driveInput.update(driveConfig)
diff --git a/vdsm/vm.py b/vdsm/vm.py
index aae8bd6..b7151b1 100644
--- a/vdsm/vm.py
+++ b/vdsm/vm.py
@@ -80,6 +80,10 @@
SMARTCARD_DEVICES = 'smartcard'
+class InvalidDeviceParameters(Exception):
+ """ Raised when creating device with invalid parameters """
+
+
def isVdsmImage(drive):
"""
Tell if drive looks like a vdsm image
@@ -1438,6 +1442,9 @@
VOLWM_CHUNK_REPLICATE_MULT = 2 # Chunk multiplier during replication
def __init__(self, conf, log, **kwargs):
+ if 'format' not in kwargs:
+ raise InvalidDeviceParameters('"format" attribute is required:'
+ ' %r' % kwargs)
if not kwargs.get('serial'):
self.serial = kwargs.get('imageID'[-20:]) or ''
VmDevice.__init__(self, conf, log, **kwargs)
@@ -3108,8 +3115,13 @@
for devType, devClass in self.DeviceMapping:
for dev in devices[devType]:
- self._devices[devType].append(devClass(self.conf, self.log,
- **dev))
+ try:
+ drive = devClass(self.conf, self.log, **dev)
+ except InvalidDeviceParameters as e:
+ self.log.error('Ignoring device with invalid parameters:'
+ ' %s', e, exc_info=True)
+ else:
+ self._devices[devType].append(drive)
# We should set this event as a last part of drives initialization
self._pathsPreparedEvent.set()
--
To view, visit http://gerrit.ovirt.org/24234
To unsubscribe, visit http://gerrit.ovirt.org/settings
Gerrit-MessageType: newchange
Gerrit-Change-Id: I01ab1e071ecb76f383cc6dc7d99782e10cc90136
Gerrit-PatchSet: 1
Gerrit-Project: vdsm
Gerrit-Branch: master
Gerrit-Owner: Nir Soffer <nsoffer(a)redhat.com>
Nir Soffer has uploaded a new change for review.
Change subject: resourceManager: Keep resource state if registerResource fails
......................................................................
resourceManager: Keep resource state if registerResource fails
Previous code was increasing resource activeUsers counter, but
exceptions raised after that caused the method to fail, leaving a locked
resources that nobody can release. Such locked resource may lead to
failure of any pool operation, making the host non-operational, and
requiring a restart of vdsm.
The failure in the field was caused by Python logging bug, raising
OSError when message was logged when log file was rotated. However, such
failure can happen everywhere, and locking code must be written in such
way that failure would never leave a resource locked.
This patch ensure that resource is added and activeUsers counter is
increased only if everything else was fine.
Since simulating logging error is hard, the tests monkeypatch the
RequestRef class to simulate a failure. This is little ugly, depending
on internal implementation detail, but I could not find a cleaner way.
Change-Id: I16abf41ebc8a8a99b292d38c945074752254a34b
Relates-To: https://bugzilla.redhat.com/1065650
Signed-off-by: Nir Soffer <nsoffer(a)redhat.com>
---
M tests/resourceManagerTests.py
M vdsm/storage/resourceManager.py
2 files changed, 50 insertions(+), 9 deletions(-)
git pull ssh://gerrit.ovirt.org:29418/vdsm refs/changes/84/25284/1
diff --git a/tests/resourceManagerTests.py b/tests/resourceManagerTests.py
index 01b0669..e2b6461 100644
--- a/tests/resourceManagerTests.py
+++ b/tests/resourceManagerTests.py
@@ -29,6 +29,7 @@
import storage.resourceManager as resourceManager
from testrunner import VdsmTestCase as TestCaseBase
from testValidation import slowtest, stresstest
+import monkeypatch
class NullResourceFactory(resourceManager.SimpleResourceFactory):
@@ -209,6 +210,32 @@
ex.__class__.__name__)
self.fail("Managed to access an attribute not exposed by wrapper")
+
+ def testRegisterResourceFailureExclusive(self):
+ # This regeisterion must fail
+ with monkeypatch.MonkeyPatchScope(
+ [(resourceManager, 'RequestRef', FakeRequestRef)]):
+ self.assertRaises(Failure, self.manager.registerResource, "string",
+ "test", resourceManager.LockType.exclusive, None)
+
+ # And it should not leave a locked resource
+ with self.manager.acquireResource("string", "test",
+ resourceManager.LockType.exclusive,
+ 0):
+ pass
+
+ def testRegisterResourceFailureShared(self):
+ # This regeisterion must fail
+ with monkeypatch.MonkeyPatchScope(
+ [(resourceManager, 'RequestRef', FakeRequestRef)]):
+ self.assertRaises(Failure, self.manager.registerResource, "string",
+ "test", resourceManager.LockType.shared, None)
+
+ # And it should not leave a locked resource
+ with self.manager.acquireResource("string", "test",
+ resourceManager.LockType.exclusive,
+ 0):
+ pass
def testAccessAttributeNotExposedByRequestRef(self):
resources = []
@@ -705,3 +732,14 @@
except:
resourceManager.ResourceManager._instance = None
raise
+
+# Helpers
+
+
+class Failure(Exception):
+ """ Unique exception for testing """
+
+
+def FakeRequestRef(*a, **kw):
+ """ Used to simulate failures when registering a resource """
+ raise Failure()
diff --git a/vdsm/storage/resourceManager.py b/vdsm/storage/resourceManager.py
index 1be1450..ce144cf 100644
--- a/vdsm/storage/resourceManager.py
+++ b/vdsm/storage/resourceManager.py
@@ -559,23 +559,25 @@
if len(resource.queue) == 0 and \
resource.currentLock == LockType.shared and \
request.lockType == LockType.shared:
- resource.activeUsers += 1
self._log.debug("Resource '%s' found in shared state "
"and queue is empty, Joining current "
"shared lock (%d active users)",
- fullName, resource.activeUsers)
+ fullName, resource.activeUsers + 1)
request.grant()
+ ref = RequestRef(request)
contextCleanup.defer(request.emit,
ResourceRef(namespace, name,
resource.realObj,
request.reqID))
- return RequestRef(request)
+ resource.activeUsers += 1
+ return ref
- resource.queue.insert(0, request)
self._log.debug("Resource '%s' is currently locked, "
"Entering queue (%d in queue)",
- fullName, len(resource.queue))
- return RequestRef(request)
+ fullName, len(resource.queue) + 1)
+ ref = RequestRef(request)
+ resource.queue.insert(0, request)
+ return ref
# TODO : Creating the object inside the namespace lock causes
# the entire namespace to lock and might cause
@@ -592,19 +594,20 @@
contextCleanup.defer(request.cancel)
return RequestRef(request)
- resource = resources[name] = ResourceManager.ResourceInfo(
- obj, namespace, name)
+ resource = ResourceManager.ResourceInfo(obj, namespace, name)
resource.currentLock = request.lockType
resource.activeUsers += 1
self._log.debug("Resource '%s' is free. Now locking as '%s' "
"(1 active user)", fullName, request.lockType)
request.grant()
+ ref = RequestRef(request)
contextCleanup.defer(request.emit,
ResourceRef(namespace, name,
resource.realObj,
request.reqID))
- return RequestRef(request)
+ resources[name] = resource
+ return ref
def releaseResource(self, namespace, name):
# WARN : unlike in resource acquire the user now has the request
--
To view, visit http://gerrit.ovirt.org/25284
To unsubscribe, visit http://gerrit.ovirt.org/settings
Gerrit-MessageType: newchange
Gerrit-Change-Id: I16abf41ebc8a8a99b292d38c945074752254a34b
Gerrit-PatchSet: 1
Gerrit-Project: vdsm
Gerrit-Branch: master
Gerrit-Owner: Nir Soffer <nsoffer(a)redhat.com>
Nir Soffer has uploaded a new change for review.
Change subject: vm: Continue to sample after errors
......................................................................
vm: Continue to sample after errors
When vm is running, we monitor disk usage, and if the disk becomes too
full, we extend the disk. This avoid pausing of the vm after io errors.
However, when sampling vm with multiple disks, an error when sampling
one disk exit the sampling function and skip the next disks, making this
machnisim useless.
This patch logs exceptions raised when sampling one disk and continue to
sample others.
This patch is for ovirt-3.3.1 only - master patch must be different
because of recent refactoring in this area.
Change-Id: I8dbe60a4d3b216a5cd998d163407c09b12f2f28c
Signed-off-by: Nir Soffer <nsoffer(a)redhat.com>
---
M vdsm/vm.py
1 file changed, 14 insertions(+), 10 deletions(-)
git pull ssh://gerrit.ovirt.org:29418/vdsm refs/changes/75/22575/1
diff --git a/vdsm/vm.py b/vdsm/vm.py
index 90ab5a7..2d0cece 100644
--- a/vdsm/vm.py
+++ b/vdsm/vm.py
@@ -506,20 +506,24 @@
return
for vmDrive in self._vm._devices[DISK_DEVICES]:
- if not vmDrive.isExtendable():
- continue
+ try:
+ if not vmDrive.isExtendable():
+ continue
- capacity, alloc, physical = \
- self._vm._dom.blockInfo(vmDrive.path, 0)
+ capacity, alloc, physical = \
+ self._vm._dom.blockInfo(vmDrive.path, 0)
- if physical - alloc >= vmDrive.watermarkLimit:
- continue
+ if physical - alloc >= vmDrive.watermarkLimit:
+ continue
- self._log.info('%s/%s apparent: %s capacity: %s, alloc: %s, '
- 'phys: %s', vmDrive.domainID, vmDrive.volumeID,
- vmDrive.apparentsize, capacity, alloc, physical)
+ self._log.info('%s/%s apparent: %s capacity: %s, alloc: %s, '
+ 'phys: %s', vmDrive.domainID, vmDrive.volumeID,
+ vmDrive.apparentsize, capacity, alloc, physical)
- self._vm.extendDriveVolume(vmDrive)
+ self._vm.extendDriveVolume(vmDrive)
+ except Exception:
+ self._log.exception("%s/%s", vmDrive.domainID,
+ vmDrive.volumeID)
def _updateVolumes(self):
if not self._vm.isDisksStatsCollectionEnabled():
--
To view, visit http://gerrit.ovirt.org/22575
To unsubscribe, visit http://gerrit.ovirt.org/settings
Gerrit-MessageType: newchange
Gerrit-Change-Id: I8dbe60a4d3b216a5cd998d163407c09b12f2f28c
Gerrit-PatchSet: 1
Gerrit-Project: vdsm
Gerrit-Branch: ovirt-3.3
Gerrit-Owner: Nir Soffer <nsoffer(a)redhat.com>
Nir Soffer has uploaded a new change for review.
Change subject: vm: Fix attribute error when accessing drive in sampling method
......................................................................
vm: Fix attribute error when accessing drive in sampling method
Du to race when migration is finished and monitoring, drive may not have
a format attribute when accessing it from the monitor. This patch use
getattr to log spam.
Change-Id: Ia50e8af94b9c9b54332066a3f30999ce73e7a56f
Signed-off-by: Nir Soffer <nsoffer(a)redhat.com>
---
M vdsm/vm.py
1 file changed, 2 insertions(+), 1 deletion(-)
git pull ssh://gerrit.ovirt.org:29418/vdsm refs/changes/18/22518/1
diff --git a/vdsm/vm.py b/vdsm/vm.py
index bb4a7ec..7e2d220 100644
--- a/vdsm/vm.py
+++ b/vdsm/vm.py
@@ -506,7 +506,8 @@
return
for vmDrive in self._vm._devices[DISK_DEVICES]:
- if not vmDrive.blockDev or vmDrive.format != 'cow':
+ # Note: drive may not have a format attribute during migration
+ if not vmDrive.blockDev or getattr(vmDrive, 'format', None) != 'cow':
continue
capacity, alloc, physical = \
--
To view, visit http://gerrit.ovirt.org/22518
To unsubscribe, visit http://gerrit.ovirt.org/settings
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia50e8af94b9c9b54332066a3f30999ce73e7a56f
Gerrit-PatchSet: 1
Gerrit-Project: vdsm
Gerrit-Branch: ovirt-3.3
Gerrit-Owner: Nir Soffer <nsoffer(a)redhat.com>