Nir Soffer has uploaded a new change for review.
Change subject: ipwrapper: Use zombiereaper to wait for process
......................................................................
ipwrapper: Use zombiereaper to wait for process
Commit aea8aab95f eliminated ip process zombies by waiting for ip
process. This is may introduce unlimited wait if the process is not
responsive. We can avoid this issue by using zombiereaper.
Note that zombiereaper requires explicit initialization (registering
signal handler for SIGCHLD), so this make ipwrapper less useful as
generic utility.
Change-Id: I26263c5b4811d7f46f7c65d1d1b00bd96af02eb8
Signed-off-by: Nir Soffer <nsoffer(a)redhat.com>
---
M lib/vdsm/ipwrapper.py
1 file changed, 2 insertions(+), 1 deletion(-)
git pull ssh://gerrit.ovirt.org:29418/vdsm refs/changes/16/32216/1
diff --git a/lib/vdsm/ipwrapper.py b/lib/vdsm/ipwrapper.py
index af31159..781e95b 100644
--- a/lib/vdsm/ipwrapper.py
+++ b/lib/vdsm/ipwrapper.py
@@ -27,6 +27,7 @@
import signal
import socket
import struct
+import zombiereaper
from netaddr.core import AddrFormatError
from netaddr import IPAddress
@@ -638,7 +639,7 @@
def stop(self):
self.proc.kill()
- self.proc.wait()
+ zombiereaper.autoReapPID(self.proc.pid)
@classmethod
def _parseLine(cls, line):
--
To view, visit http://gerrit.ovirt.org/32216
To unsubscribe, visit http://gerrit.ovirt.org/settings
Gerrit-MessageType: newchange
Gerrit-Change-Id: I26263c5b4811d7f46f7c65d1d1b00bd96af02eb8
Gerrit-PatchSet: 1
Gerrit-Project: vdsm
Gerrit-Branch: master
Gerrit-Owner: Nir Soffer <nsoffer(a)redhat.com>
Nir Soffer has uploaded a new change for review.
Change subject: schema: Fix asserts that had side effect
......................................................................
schema: Fix asserts that had side effect
Asserts should never have side effects, so optimizing them out will not
change the semantics of the code. Running with optimization would break
parsing, eliminating line.pop(0) operations. Now the only difference is
the warning if the line is not empty.
Change-Id: I59dbb1c8d3760af7d3f3049227f6ad8ceeaa402a
Signed-off-by: Nir Soffer <nsoffer(a)redhat.com>
---
M vdsm/rpc/process-schema.py
1 file changed, 4 insertions(+), 2 deletions(-)
git pull ssh://gerrit.ovirt.org:29418/vdsm refs/changes/62/34362/1
diff --git a/vdsm/rpc/process-schema.py b/vdsm/rpc/process-schema.py
index 4b1c3cf..8e56a1a 100755
--- a/vdsm/rpc/process-schema.py
+++ b/vdsm/rpc/process-schema.py
@@ -103,7 +103,8 @@
'xxx': []})
# Pop a blank line
- assert('' == lines.pop(0))
+ line = lines.pop(0)
+ assert line == '', "Expected empty line: %r" % line
# Grab the entity description. It might span multiple lines.
symbol['desc'] = lines.pop(0)
@@ -111,7 +112,8 @@
symbol['desc'] += lines.pop(0)
# Pop a blank line
- assert ('' == lines.pop(0))
+ line = lines.pop(0)
+ assert line == '', "Expected empty line: %r" % line
# Populate the rest of the human-readable data.
# First try to read the parameters/members information. We are finished
--
To view, visit http://gerrit.ovirt.org/34362
To unsubscribe, visit http://gerrit.ovirt.org/settings
Gerrit-MessageType: newchange
Gerrit-Change-Id: I59dbb1c8d3760af7d3f3049227f6ad8ceeaa402a
Gerrit-PatchSet: 1
Gerrit-Project: vdsm
Gerrit-Branch: master
Gerrit-Owner: Nir Soffer <nsoffer(a)redhat.com>
Nir Soffer has uploaded a new change for review.
Change subject: misc: Replace assert with AssetionError
......................................................................
misc: Replace assert with AssetionError
The code was wrongly assumed that asserts are always available. When
running in optimized mode, the check would be skipped, leading to
disastrous results.
The assert was correct in that this is something that should never
happen, unless someone is using the barrier incorrectly, or there is
a bug in the code.
Change-Id: I68f57e393423dc4faf58763a036220849a968e70
Signed-off-by: Nir Soffer <nsoffer(a)redhat.com>
---
M vdsm/storage/misc.py
1 file changed, 3 insertions(+), 1 deletion(-)
git pull ssh://gerrit.ovirt.org:29418/vdsm refs/changes/63/34363/1
diff --git a/vdsm/storage/misc.py b/vdsm/storage/misc.py
index ab2b4fd..5abcdcb 100644
--- a/vdsm/storage/misc.py
+++ b/vdsm/storage/misc.py
@@ -696,7 +696,9 @@
def exit(self):
with self._cond:
- assert self._busy, "Attempt to exit a barrier without entering"
+ if not self._busy:
+ raise AssertionError("Attempt to exit a barrier without "
+ "entering")
self._busy = False
self._cond.notifyAll()
--
To view, visit http://gerrit.ovirt.org/34363
To unsubscribe, visit http://gerrit.ovirt.org/settings
Gerrit-MessageType: newchange
Gerrit-Change-Id: I68f57e393423dc4faf58763a036220849a968e70
Gerrit-PatchSet: 1
Gerrit-Project: vdsm
Gerrit-Branch: master
Gerrit-Owner: Nir Soffer <nsoffer(a)redhat.com>
Nir Soffer has uploaded a new change for review.
Change subject: Configure iscsi_session recovery_tmo for multipath devices
......................................................................
Configure iscsi_session recovery_tmo for multipath devices
This configuration is done by device-mapper-multipath in version
0.4.9-76 and later. However, device looses configuration after recovery
from network errors.
This patch adds the recovery_tmo configuration to vdsm-multipath udev
rules. Since setting the timeout requires a script, the dmsetup command
was moved into a new multipath-configure-device script.
When we require a multipath version fixing this issue, we can remove the
timeout handling code.
Change-Id: Iaaa40fa5e6dc86beef501ef4aaefa17c4c1574c1
Signed-off-by: Nir Soffer <nsoffer(a)redhat.com>
---
M .gitignore
M configure.ac
M debian/vdsm.install
M vdsm.spec.in
M vdsm/storage/Makefile.am
A vdsm/storage/multipath-configure-device.in
A vdsm/storage/vdsm-multipath.rules
D vdsm/storage/vdsm-multipath.rules.in
8 files changed, 53 insertions(+), 22 deletions(-)
git pull ssh://gerrit.ovirt.org:29418/vdsm refs/changes/82/32582/1
diff --git a/.gitignore b/.gitignore
index fca4f2e..5890806 100644
--- a/.gitignore
+++ b/.gitignore
@@ -61,7 +61,6 @@
vdsm/storage/protect/safelease
vdsm/storage/lvm.env
vdsm/storage/vdsm-lvm.rules
-vdsm/storage/vdsm-multipath.rules
vdsm/sudoers.vdsm
vdsm/svdsm.logger.conf
vdsm/vdscli.py
diff --git a/configure.ac b/configure.ac
index f40d57a..c467c37 100644
--- a/configure.ac
+++ b/configure.ac
@@ -335,10 +335,10 @@
vdsm/rpc/Makefile
vdsm/sos/Makefile
vdsm/storage/Makefile
+ vdsm/storage/multipath-configure-device
vdsm/storage/imageRepository/Makefile
vdsm/storage/protect/Makefile
vdsm/storage/vdsm-lvm.rules
- vdsm/storage/vdsm-multipath.rules
vdsm/virt/Makefile
vdsm_hooks/Makefile
vdsm_hooks/checkimages/Makefile
diff --git a/debian/vdsm.install b/debian/vdsm.install
index bd63c72..759383d 100644
--- a/debian/vdsm.install
+++ b/debian/vdsm.install
@@ -25,6 +25,7 @@
./usr/lib/python2.7/dist-packages/sos/plugins/vdsm.py
./usr/lib/python2.7/dist-packages/vdsmapi.py
./usr/libexec/vdsm/curl-img-wrap
+./usr/libexec/vdsm/multiapth-configure-device
./usr/libexec/vdsm/ovirt_functions.sh
./usr/libexec/vdsm/persist-vdsm-hooks
./usr/libexec/vdsm/safelease
diff --git a/vdsm.spec.in b/vdsm.spec.in
index a5b354c..c912150 100644
--- a/vdsm.spec.in
+++ b/vdsm.spec.in
@@ -993,6 +993,7 @@
%{_libexecdir}/%{vdsm_name}/persist-vdsm-hooks
%{_libexecdir}/%{vdsm_name}/unpersist-vdsm-hook
%{_libexecdir}/%{vdsm_name}/ovirt_functions.sh
+%{_libexecdir}/%{vdsm_name}/multipath-configure-device
%{_libexecdir}/%{vdsm_name}/vdsm-gencerts.sh
%{_libexecdir}/%{vdsm_name}/vdsmd_init_common.sh
%{_datadir}/%{vdsm_name}/network/__init__.py*
diff --git a/vdsm/storage/Makefile.am b/vdsm/storage/Makefile.am
index 99b1460..91c2f9c 100644
--- a/vdsm/storage/Makefile.am
+++ b/vdsm/storage/Makefile.am
@@ -72,7 +72,8 @@
volume.py
dist_vdsmexec_SCRIPTS = \
- curl-img-wrap
+ curl-img-wrap \
+ multipath-configure-device
nodist_vdsmstorage_DATA = \
lvm.env \
diff --git a/vdsm/storage/multipath-configure-device.in b/vdsm/storage/multipath-configure-device.in
new file mode 100644
index 0000000..dcdfea5
--- /dev/null
+++ b/vdsm/storage/multipath-configure-device.in
@@ -0,0 +1,33 @@
+#!/bin/sh
+#
+# Copyright 2014 Red Hat, Inc. and/or its affiliates.
+#
+# Licensed to you under the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version. See the files README and
+# LICENSE_GPL_v2 which accompany this distribution.
+#
+
+set -e
+
+# Ensure that multipath devices use no_path_retry fail, instead of
+# no_path_retry queue, which is hardcoded in multipath configuration for many
+# devices.
+#
+# See https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/ht…
+
+@DMSETUP_PATH@ message $DM_NAME 0 fail_if_no_path
+
+# Set iscsi_session recovery_tmo. This parameter is configured by multipath on
+# startup, but after a device fail and become active again, the configuration
+# is lost and revert back to iscsid.conf default (120).
+
+timeout=5
+
+for slave in /sys${DEVPATH}/slaves/*; do
+ path=$(@UDEVADM_PATH@ info --query=path --path="$slave")
+ session=$(echo "$path" | @SED_PATH@ -rn s'|^/devices/platform/host[0-9]+/(session[0-9]+)/.+$|\1|p')
+ if [ -n "$session" ]; then
+ echo $timeout > "/sys/class/iscsi_session/${session}/recovery_tmo"
+ fi
+done
diff --git a/vdsm/storage/vdsm-multipath.rules b/vdsm/storage/vdsm-multipath.rules
new file mode 100644
index 0000000..d4d0dd7
--- /dev/null
+++ b/vdsm/storage/vdsm-multipath.rules
@@ -0,0 +1,15 @@
+#
+# Copyright 2014 Red Hat, Inc. and/or its affiliates.
+#
+# Licensed to you under the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version. See the files README and
+# LICENSE_GPL_v2 which accompany this distribution.
+#
+
+# Vdsm rules for multipath devices.
+#
+# This rule runs vdsm-mutipath-configure tool to configure multiapth devices to
+# work with vdsm.
+
+ACTION=="add|change", ENV{DM_UUID}=="mpath-?*", RUN+="/usr/libexec/vdsm/multipath-configure-device"
diff --git a/vdsm/storage/vdsm-multipath.rules.in b/vdsm/storage/vdsm-multipath.rules.in
deleted file mode 100644
index cc2a878..0000000
--- a/vdsm/storage/vdsm-multipath.rules.in
+++ /dev/null
@@ -1,19 +0,0 @@
-#
-# Copyright 2014 Red Hat, Inc. and/or its affiliates.
-#
-# Licensed to you under the GNU General Public License as published by
-# the Free Software Foundation; either version 2 of the License, or
-# (at your option) any later version. See the files README and
-# LICENSE_GPL_v2 which accompany this distribution.
-#
-
-# Vdsm rules for multipath devices.
-#
-# Ensure that multipath devices use no_path_retry fail, instead of
-# no_path_retry queue, which is hardcoded in multipath configuration for many
-# devices.
-#
-# See https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/ht…
-#
-
-ACTION=="add|change", ENV{DM_UUID}=="mpath-?*", RUN+="@DMSETUP_PATH@ message $env{DM_NAME} 0 fail_if_no_path"
--
To view, visit http://gerrit.ovirt.org/32582
To unsubscribe, visit http://gerrit.ovirt.org/settings
Gerrit-MessageType: newchange
Gerrit-Change-Id: Iaaa40fa5e6dc86beef501ef4aaefa17c4c1574c1
Gerrit-PatchSet: 1
Gerrit-Project: vdsm
Gerrit-Branch: master
Gerrit-Owner: Nir Soffer <nsoffer(a)redhat.com>
Nir Soffer has uploaded a new change for review.
Change subject: vm: Require format attribute for drives
......................................................................
vm: Require format attribute for drives
We have seen sampling errors where a drive has no "format" attribute.
These errors spam the system logs, and does not help to debug the real
issue - why the required format attribute is not set?
This patch ensure that a drive cannot be created without a format
attribute, avoding log spam by sampling errors, and hopefully revealing
the real reason for this bug.
One broken test creating a drive without a format was fixed and new test
ensure that creating drive without a format raises.
Change-Id: I01ab1e071ecb76f383cc6dc7d99782e10cc90136
Relates-To: http://gerrit.ovirt.org/22551
Relates-To: https://bugzilla.redhat.com/994534
Relates-To: http://lists.ovirt.org/pipermail/users/2014-February/021007.html
Bug-Url: https://bugzilla.redhat.com/1055437
Signed-off-by: Nir Soffer <nsoffer(a)redhat.com>
---
M tests/vmTests.py
M vdsm/vm.py
2 files changed, 28 insertions(+), 3 deletions(-)
git pull ssh://gerrit.ovirt.org:29418/vdsm refs/changes/34/24234/1
diff --git a/tests/vmTests.py b/tests/vmTests.py
index 1e0a3f6..724d458 100644
--- a/tests/vmTests.py
+++ b/tests/vmTests.py
@@ -449,6 +449,18 @@
redir = vm.RedirDevice(self.conf, self.log, **dev)
self.assertXML(redir.getXML(), redirXML)
+ def testDriveRequiredParameters(self):
+ # TODO: It is not clear what are the other required parameters, and it
+ # the parameters depend on the type of drive. We probbaly need bigger
+ # test here that test many combinations.
+ # Currently this test only missing "format" attribute.
+ conf = {'index': '2', 'propagateErrors': 'off', 'iface': 'ide',
+ 'name': 'hdc', 'device': 'cdrom', 'path': '/tmp/fedora.iso',
+ 'type': 'disk', 'readonly': 'True', 'shared': 'none',
+ 'serial': '54-a672-23e5b495a9ea'}
+ self.assertRaises(vm.InvalidDeviceParameters, vm.Drive, {}, self.log,
+ **conf)
+
def testDriveSharedStatus(self):
sharedConfigs = [
# Backward compatibility
@@ -470,7 +482,8 @@
'exclusive', 'shared', 'none', 'transient',
]
- driveConfig = {'index': '0', 'iface': 'virtio', 'device': 'disk'}
+ driveConfig = {'index': '0', 'iface': 'virtio', 'device': 'disk',
+ 'format': 'raw'}
for driveInput, driveOutput in zip(sharedConfigs, expectedStates):
driveInput.update(driveConfig)
diff --git a/vdsm/vm.py b/vdsm/vm.py
index aae8bd6..b7151b1 100644
--- a/vdsm/vm.py
+++ b/vdsm/vm.py
@@ -80,6 +80,10 @@
SMARTCARD_DEVICES = 'smartcard'
+class InvalidDeviceParameters(Exception):
+ """ Raised when creating device with invalid parameters """
+
+
def isVdsmImage(drive):
"""
Tell if drive looks like a vdsm image
@@ -1438,6 +1442,9 @@
VOLWM_CHUNK_REPLICATE_MULT = 2 # Chunk multiplier during replication
def __init__(self, conf, log, **kwargs):
+ if 'format' not in kwargs:
+ raise InvalidDeviceParameters('"format" attribute is required:'
+ ' %r' % kwargs)
if not kwargs.get('serial'):
self.serial = kwargs.get('imageID'[-20:]) or ''
VmDevice.__init__(self, conf, log, **kwargs)
@@ -3108,8 +3115,13 @@
for devType, devClass in self.DeviceMapping:
for dev in devices[devType]:
- self._devices[devType].append(devClass(self.conf, self.log,
- **dev))
+ try:
+ drive = devClass(self.conf, self.log, **dev)
+ except InvalidDeviceParameters as e:
+ self.log.error('Ignoring device with invalid parameters:'
+ ' %s', e, exc_info=True)
+ else:
+ self._devices[devType].append(drive)
# We should set this event as a last part of drives initialization
self._pathsPreparedEvent.set()
--
To view, visit http://gerrit.ovirt.org/24234
To unsubscribe, visit http://gerrit.ovirt.org/settings
Gerrit-MessageType: newchange
Gerrit-Change-Id: I01ab1e071ecb76f383cc6dc7d99782e10cc90136
Gerrit-PatchSet: 1
Gerrit-Project: vdsm
Gerrit-Branch: master
Gerrit-Owner: Nir Soffer <nsoffer(a)redhat.com>
Nir Soffer has uploaded a new change for review.
Change subject: vm: Prevent multiple threads blocking on same libvirt domain
......................................................................
vm: Prevent multiple threads blocking on same libvirt domain
If a libvirt call get stuck because a vm is not responding, we could
have multiple threads blocked on same vm without any limit, using
precious libvirt resources that could be used to run other vms.
This patch adds a new TimedLock lock, that raise a LockTimeout if the
lock cannot be acquired after configured timeout. Using this lock, a vm
allow now only one concurrent libvirt call. If a libvirt call get stuck,
and other threads are tyring to invoke libvirt calls on the same
vm, they will wait until the current call finish, or fail with a
timeout.
This should slow down calls for single vm, since each call is invoked
only when the previous call returns. However, when using many vms, this
creates natural round-robin scheduling, giving each vm equal chance to
make progress, and limiting the load on libvirt.
Change-Id: Ib459697b8688ebcba987cd6b9e11815826e92990
Signed-off-by: Nir Soffer <nsoffer(a)redhat.com>
---
M debian/vdsm-python.install
M lib/vdsm/Makefile.am
A lib/vdsm/locking.py
M tests/Makefile.am
A tests/lockingTests.py
M vdsm.spec.in
M vdsm/virt/vm.py
7 files changed, 172 insertions(+), 8 deletions(-)
git pull ssh://gerrit.ovirt.org:29418/vdsm refs/changes/72/30772/1
diff --git a/debian/vdsm-python.install b/debian/vdsm-python.install
index 9d6a99c..b80fd4f 100644
--- a/debian/vdsm-python.install
+++ b/debian/vdsm-python.install
@@ -8,6 +8,7 @@
./usr/lib/python2.7/dist-packages/vdsm/exception.py
./usr/lib/python2.7/dist-packages/vdsm/ipwrapper.py
./usr/lib/python2.7/dist-packages/vdsm/libvirtconnection.py
+./usr/lib/python2.7/dist-packages/vdsm/locking.py
./usr/lib/python2.7/dist-packages/vdsm/netconfpersistence.py
./usr/lib/python2.7/dist-packages/vdsm/netinfo.py
./usr/lib/python2.7/dist-packages/vdsm/netlink/__init__.py
diff --git a/lib/vdsm/Makefile.am b/lib/vdsm/Makefile.am
index 8e90e6e..ffb642c 100644
--- a/lib/vdsm/Makefile.am
+++ b/lib/vdsm/Makefile.am
@@ -28,6 +28,7 @@
exception.py \
ipwrapper.py \
libvirtconnection.py \
+ locking.py \
netconfpersistence.py \
netinfo.py \
profile.py \
diff --git a/lib/vdsm/locking.py b/lib/vdsm/locking.py
new file mode 100644
index 0000000..9e92ca3
--- /dev/null
+++ b/lib/vdsm/locking.py
@@ -0,0 +1,54 @@
+#
+# Copyright 2014 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+#
+# Refer to the README and COPYING files for full details of the license
+#
+
+import threading
+import time
+
+
+class LockTimeout(Exception):
+ """ Timeout acquiring a TimedLock """
+
+
+class TimedLock(object):
+ """
+ A lock raising a LockTimeout if it cannot be acquired within timeout.
+ """
+
+ def __init__(self, name, timeout):
+ self._name = name
+ self._timeout = timeout
+ self._cond = threading.Condition(threading.Lock())
+ self._busy = False
+
+ def __enter__(self):
+ deadline = time.time() + self._timeout
+ with self._cond:
+ while self._busy:
+ now = time.time()
+ if now >= deadline:
+ raise LockTimeout("Timeout acquiring lock %r" % self._name)
+ self._cond.wait(deadline - now)
+ self._busy = True
+ return self
+
+ def __exit__(self, *args):
+ with self._cond:
+ self._busy = False
+ self._cond.notify()
diff --git a/tests/Makefile.am b/tests/Makefile.am
index aa4a45e..10a2727 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -44,6 +44,7 @@
jsonRpcTests.py \
ksmTests.py \
libvirtconnectionTests.py \
+ lockingTests.py \
lvmTests.py \
main.py \
miscTests.py \
diff --git a/tests/lockingTests.py b/tests/lockingTests.py
new file mode 100644
index 0000000..9ac8a05
--- /dev/null
+++ b/tests/lockingTests.py
@@ -0,0 +1,89 @@
+#
+# Copyright 2014 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+#
+# Refer to the README and COPYING files for full details of the license
+#
+
+import time
+import threading
+from vdsm import locking
+from testlib import VdsmTestCase
+
+
+class TimedLockTests(VdsmTestCase):
+
+ def test_free(self):
+ lock = locking.TimedLock("xxx-yyy-zzz", 0)
+ with self.assertNotRaises():
+ with lock:
+ pass
+
+ def test_busy(self):
+ lock = locking.TimedLock("xxx-yyy-zzz", 0)
+ with self.assertRaises(locking.LockTimeout):
+ with lock:
+ with lock:
+ pass
+
+ def test_serialize(self):
+ lock = locking.TimedLock("xxx-yyy-zzz", 0.4)
+ single_thread = threading.BoundedSemaphore(1)
+ passed = [0]
+ timedout = [0]
+
+ def run():
+ try:
+ with lock:
+ with single_thread:
+ time.sleep(0.1)
+ passed[0] += 1
+ except locking.LockTimeout:
+ timedout[0] += 1
+
+ self.run_threads(3, run)
+ self.assertEquals(passed[0], 3)
+ self.assertEquals(timedout[0], 0)
+
+ def test_timeout(self):
+ lock = locking.TimedLock("xxx-yyy-zzz", 0.1)
+ single_thread = threading.BoundedSemaphore(1)
+ passed = [0]
+ timedout = [0]
+
+ def run():
+ try:
+ with lock:
+ with single_thread:
+ time.sleep(0.2)
+ passed[0] += 1
+ except locking.LockTimeout:
+ timedout[0] += 1
+
+ self.run_threads(3, run)
+ self.assertEquals(passed[0], 1)
+ self.assertEquals(timedout[0], 2)
+
+ def run_threads(self, count, target):
+ threads = []
+ for i in range(count):
+ t = threading.Thread(target=target)
+ t.daemon = True
+ threads.append(t)
+ for t in threads:
+ t.start()
+ for t in threads:
+ t.join()
diff --git a/vdsm.spec.in b/vdsm.spec.in
index 2d371b1..c7a2b0b 100644
--- a/vdsm.spec.in
+++ b/vdsm.spec.in
@@ -1159,6 +1159,7 @@
%{python_sitearch}/%{vdsm_name}/exception.py*
%{python_sitearch}/%{vdsm_name}/ipwrapper.py*
%{python_sitearch}/%{vdsm_name}/libvirtconnection.py*
+%{python_sitearch}/%{vdsm_name}/locking.py*
%{python_sitearch}/%{vdsm_name}/netinfo.py*
%{python_sitearch}/%{vdsm_name}/netlink/__init__.py*
%{python_sitearch}/%{vdsm_name}/netlink/addr.py*
diff --git a/vdsm/virt/vm.py b/vdsm/virt/vm.py
index 25eec71..9e9a844 100644
--- a/vdsm/virt/vm.py
+++ b/vdsm/virt/vm.py
@@ -37,6 +37,7 @@
# vdsm imports
from vdsm import constants
from vdsm import libvirtconnection
+from vdsm import locking
from vdsm import netinfo
from vdsm import qemuimg
from vdsm import utils
@@ -619,12 +620,24 @@
class NotifyingVirDomain:
- # virDomain wrapper that notifies vm when a method raises an exception with
- # get_error_code() = VIR_ERR_OPERATION_TIMEOUT
+ """
+ Wrap virDomain object for limiting concurrent calls and reporting timeouts.
- def __init__(self, dom, tocb):
+ The wrapper allows only one concurrent call per vm, to prevent blocking of
+ multiple threads when underlying libvirt call get stuck. Invoking a domain
+ method will block if the domain is busy with another call. If the domain is
+ not available after timeout seconds, a timeout is reported and a
+ TimeoutError is raised.
+
+ If a domain method was invoked, and the libvirt call failed with with
+ VIR_ERR_OPERATION_TIMEOUT error code, the timeout is reported, and
+ TimeoutError is raised.
+ """
+
+ def __init__(self, dom, tocb, vmid, timeout=30):
self._dom = dom
self._cb = tocb
+ self._timedlock = locking.TimedLock(vmid, timeout)
def __getattr__(self, name):
attr = getattr(self._dom, name)
@@ -633,7 +646,8 @@
def f(*args, **kwargs):
try:
- ret = attr(*args, **kwargs)
+ with self._timedlock:
+ ret = attr(*args, **kwargs)
self._cb(False)
return ret
except libvirt.libvirtError as e:
@@ -643,6 +657,9 @@
toe.err = e.err
raise toe
raise
+ except locking.LockTimeout as e:
+ self._cb(True)
+ raise TimeoutError(e)
return f
@@ -2892,7 +2909,7 @@
if self.recovering:
self._dom = NotifyingVirDomain(
self._connection.lookupByUUIDString(self.id),
- self._timeoutExperienced)
+ self._timeoutExperienced, self.id)
elif 'restoreState' in self.conf:
fromSnapshot = self.conf.get('restoreFromSnapshot', False)
srcDomXML = self.conf.pop('_srcDomXML')
@@ -2912,7 +2929,7 @@
self._dom = NotifyingVirDomain(
self._connection.lookupByUUIDString(self.id),
- self._timeoutExperienced)
+ self._timeoutExperienced, self.id)
else:
flags = libvirt.VIR_DOMAIN_NONE
if 'launchPaused' in self.conf:
@@ -2921,7 +2938,7 @@
del self.conf['launchPaused']
self._dom = NotifyingVirDomain(
self._connection.createXML(domxml, flags),
- self._timeoutExperienced)
+ self._timeoutExperienced, self.id)
hooks.after_vm_start(self._dom.XMLDesc(0), self.conf)
for dev in self._customDevices():
hooks.after_device_create(dev._deviceXML, self.conf,
@@ -3690,7 +3707,7 @@
# or restart vdsm if connection to libvirt was lost
self._dom = NotifyingVirDomain(
self._connection.lookupByUUIDString(self.id),
- self._timeoutExperienced)
+ self._timeoutExperienced, self.id)
if not self._incomingMigrationFinished.isSet():
state = self._dom.state(0)
--
To view, visit http://gerrit.ovirt.org/30772
To unsubscribe, visit http://gerrit.ovirt.org/settings
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib459697b8688ebcba987cd6b9e11815826e92990
Gerrit-PatchSet: 1
Gerrit-Project: vdsm
Gerrit-Branch: master
Gerrit-Owner: Nir Soffer <nsoffer(a)redhat.com>
Nir Soffer has uploaded a new change for review.
Change subject: resourceManager: Keep resource state if registerResource fails
......................................................................
resourceManager: Keep resource state if registerResource fails
Previous code was increasing resource activeUsers counter, but
exceptions raised after that caused the method to fail, leaving a locked
resources that nobody can release. Such locked resource may lead to
failure of any pool operation, making the host non-operational, and
requiring a restart of vdsm.
The failure in the field was caused by Python logging bug, raising
OSError when message was logged when log file was rotated. However, such
failure can happen everywhere, and locking code must be written in such
way that failure would never leave a resource locked.
This patch ensure that resource is added and activeUsers counter is
increased only if everything else was fine.
Since simulating logging error is hard, the tests monkeypatch the
RequestRef class to simulate a failure. This is little ugly, depending
on internal implementation detail, but I could not find a cleaner way.
Change-Id: I16abf41ebc8a8a99b292d38c945074752254a34b
Relates-To: https://bugzilla.redhat.com/1065650
Signed-off-by: Nir Soffer <nsoffer(a)redhat.com>
---
M tests/resourceManagerTests.py
M vdsm/storage/resourceManager.py
2 files changed, 50 insertions(+), 9 deletions(-)
git pull ssh://gerrit.ovirt.org:29418/vdsm refs/changes/84/25284/1
diff --git a/tests/resourceManagerTests.py b/tests/resourceManagerTests.py
index 01b0669..e2b6461 100644
--- a/tests/resourceManagerTests.py
+++ b/tests/resourceManagerTests.py
@@ -29,6 +29,7 @@
import storage.resourceManager as resourceManager
from testrunner import VdsmTestCase as TestCaseBase
from testValidation import slowtest, stresstest
+import monkeypatch
class NullResourceFactory(resourceManager.SimpleResourceFactory):
@@ -209,6 +210,32 @@
ex.__class__.__name__)
self.fail("Managed to access an attribute not exposed by wrapper")
+
+ def testRegisterResourceFailureExclusive(self):
+ # This regeisterion must fail
+ with monkeypatch.MonkeyPatchScope(
+ [(resourceManager, 'RequestRef', FakeRequestRef)]):
+ self.assertRaises(Failure, self.manager.registerResource, "string",
+ "test", resourceManager.LockType.exclusive, None)
+
+ # And it should not leave a locked resource
+ with self.manager.acquireResource("string", "test",
+ resourceManager.LockType.exclusive,
+ 0):
+ pass
+
+ def testRegisterResourceFailureShared(self):
+ # This regeisterion must fail
+ with monkeypatch.MonkeyPatchScope(
+ [(resourceManager, 'RequestRef', FakeRequestRef)]):
+ self.assertRaises(Failure, self.manager.registerResource, "string",
+ "test", resourceManager.LockType.shared, None)
+
+ # And it should not leave a locked resource
+ with self.manager.acquireResource("string", "test",
+ resourceManager.LockType.exclusive,
+ 0):
+ pass
def testAccessAttributeNotExposedByRequestRef(self):
resources = []
@@ -705,3 +732,14 @@
except:
resourceManager.ResourceManager._instance = None
raise
+
+# Helpers
+
+
+class Failure(Exception):
+ """ Unique exception for testing """
+
+
+def FakeRequestRef(*a, **kw):
+ """ Used to simulate failures when registering a resource """
+ raise Failure()
diff --git a/vdsm/storage/resourceManager.py b/vdsm/storage/resourceManager.py
index 1be1450..ce144cf 100644
--- a/vdsm/storage/resourceManager.py
+++ b/vdsm/storage/resourceManager.py
@@ -559,23 +559,25 @@
if len(resource.queue) == 0 and \
resource.currentLock == LockType.shared and \
request.lockType == LockType.shared:
- resource.activeUsers += 1
self._log.debug("Resource '%s' found in shared state "
"and queue is empty, Joining current "
"shared lock (%d active users)",
- fullName, resource.activeUsers)
+ fullName, resource.activeUsers + 1)
request.grant()
+ ref = RequestRef(request)
contextCleanup.defer(request.emit,
ResourceRef(namespace, name,
resource.realObj,
request.reqID))
- return RequestRef(request)
+ resource.activeUsers += 1
+ return ref
- resource.queue.insert(0, request)
self._log.debug("Resource '%s' is currently locked, "
"Entering queue (%d in queue)",
- fullName, len(resource.queue))
- return RequestRef(request)
+ fullName, len(resource.queue) + 1)
+ ref = RequestRef(request)
+ resource.queue.insert(0, request)
+ return ref
# TODO : Creating the object inside the namespace lock causes
# the entire namespace to lock and might cause
@@ -592,19 +594,20 @@
contextCleanup.defer(request.cancel)
return RequestRef(request)
- resource = resources[name] = ResourceManager.ResourceInfo(
- obj, namespace, name)
+ resource = ResourceManager.ResourceInfo(obj, namespace, name)
resource.currentLock = request.lockType
resource.activeUsers += 1
self._log.debug("Resource '%s' is free. Now locking as '%s' "
"(1 active user)", fullName, request.lockType)
request.grant()
+ ref = RequestRef(request)
contextCleanup.defer(request.emit,
ResourceRef(namespace, name,
resource.realObj,
request.reqID))
- return RequestRef(request)
+ resources[name] = resource
+ return ref
def releaseResource(self, namespace, name):
# WARN : unlike in resource acquire the user now has the request
--
To view, visit http://gerrit.ovirt.org/25284
To unsubscribe, visit http://gerrit.ovirt.org/settings
Gerrit-MessageType: newchange
Gerrit-Change-Id: I16abf41ebc8a8a99b292d38c945074752254a34b
Gerrit-PatchSet: 1
Gerrit-Project: vdsm
Gerrit-Branch: master
Gerrit-Owner: Nir Soffer <nsoffer(a)redhat.com>
Dan Kenigsberg has uploaded a new change for review.
Change subject: call stop_event_loop upon exit
......................................................................
call stop_event_loop upon exit
For cleanliness, whomever starts a thread should stop it when it is no
longer needed.
Change-Id: I9ab0d9b7be976e37a89a96d2f09a353186008731
Signed-off-by: Dan Kenigsberg <danken(a)redhat.com>
---
M vdsm/vdsm
1 file changed, 1 insertion(+), 0 deletions(-)
git pull ssh://gerrit.ovirt.org:29418/vdsm refs/changes/32/26532/1
diff --git a/vdsm/vdsm b/vdsm/vdsm
index 652797c..fd9b3f8 100755
--- a/vdsm/vdsm
+++ b/vdsm/vdsm
@@ -81,6 +81,7 @@
signal.pause()
finally:
cif.prepareForShutdown()
+ libvirtconnection.stop_event_loop()
def run(pidfile=None):
--
To view, visit http://gerrit.ovirt.org/26532
To unsubscribe, visit http://gerrit.ovirt.org/settings
Gerrit-MessageType: newchange
Gerrit-Change-Id: I9ab0d9b7be976e37a89a96d2f09a353186008731
Gerrit-PatchSet: 1
Gerrit-Project: vdsm
Gerrit-Branch: master
Gerrit-Owner: Dan Kenigsberg <danken(a)redhat.com>