Nir Soffer has uploaded a new change for review.
Change subject: udevadm: More precise error handling ......................................................................
udevadm: More precise error handling
udevadm provides a --timeout option, but there is no robust way to detect a timeout in EL6, EL7, and Fedora 20. In Fedora 21 and upstream, udevadm ignores the timeout option. This patch improves error handling by using our own timeout.
udevadm.settle() raises now udevadm.Failure or udevadm.Timeout, and the caller is responsible to handle the error.
In both multipath.rescan() and IscsiConnection.connect(), we warn about timeout but do not handle other errors, so real errors in udevadm will fail loudly.
Change-Id: Ia0a7380b1b181ec93399ea741122cfa2e98086fb Relates-To: https://bugzilla.redhat.com/1209474 Signed-off-by: Nir Soffer nsoffer@redhat.com --- A tests/udevadmTests.py M vdsm/storage/multipath.py M vdsm/storage/storageServer.py M vdsm/storage/udevadm.py 4 files changed, 106 insertions(+), 21 deletions(-)
git pull ssh://gerrit.ovirt.org:29418/vdsm refs/changes/40/39740/1
diff --git a/tests/udevadmTests.py b/tests/udevadmTests.py new file mode 100644 index 0000000..90841b2 --- /dev/null +++ b/tests/udevadmTests.py @@ -0,0 +1,52 @@ +# +# Copyright 2015 Red Hat, Inc. +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 2 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program; if not, write to the Free Software +# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA +# +# Refer to the README and COPYING files for full details of the license +# + + +from monkeypatch import MonkeyPatch +from testlib import VdsmTestCase + +from vdsm import utils +from storage import udevadm + +TRUE = utils.CommandPath("true", "/bin/true", "/usr/bin/true") +FALSE = utils.CommandPath("false", "/bin/false", "/usr/bin/false") +READ = utils.CommandPath("read", "/bin/read", "/usr/bin/read") + + +class UdevadmSettleTests(VdsmTestCase): + + @MonkeyPatch(udevadm, "_UDEVADM", TRUE) + def test_success(self): + udevadm.settle(5) + + @MonkeyPatch(udevadm, "_UDEVADM", FALSE) + def test_error(self): + try: + udevadm.settle(5) + except udevadm.Failure as e: + self.assertEqual(e.rc, 1) + self.assertEqual(e.out, "") + self.assertEqual(e.err, "") + else: + self.fail("Failure not raised") + + @MonkeyPatch(udevadm, "_UDEVADM", READ) + def test_timeout(self): + self.assertRaises(udevadm.Timeout, udevadm.settle, 1) diff --git a/vdsm/storage/multipath.py b/vdsm/storage/multipath.py index a1c42b3..925c411 100644 --- a/vdsm/storage/multipath.py +++ b/vdsm/storage/multipath.py @@ -73,7 +73,10 @@ # events are processed, ensuring detection of new devices and creation or # update of multipath devices. timeout = config.getint('irs', 'scsi_settle_timeout') - udevadm.settle(timeout) + try: + udevadm.settle(timeout) + except udevadm.Timeout as e: + log.warning("Timeout waiting for udev events: %s", e)
def deduceType(a, b): diff --git a/vdsm/storage/storageServer.py b/vdsm/storage/storageServer.py index 22a90d1..c19fb8d 100644 --- a/vdsm/storage/storageServer.py +++ b/vdsm/storage/storageServer.py @@ -382,7 +382,10 @@ def connect(self): iscsi.addIscsiNode(self._iface, self._target, self._cred) timeout = config.getint("irs", "scsi_settle_timeout") - udevadm.settle(timeout) + try: + udevadm.settle(timeout) + except udevadm.Timeout as e: + self.log.warning("Timeout waiting for udev events: %s", e)
def _match(self, session): target = session.target diff --git a/vdsm/storage/udevadm.py b/vdsm/storage/udevadm.py index 4b4b54a..a2afd04 100644 --- a/vdsm/storage/udevadm.py +++ b/vdsm/storage/udevadm.py @@ -18,22 +18,39 @@ # Refer to the README and COPYING files for full details of the license #
-import logging +import errno +import signal + from vdsm import utils +from vdsm.infra import zombiereaper
_UDEVADM = utils.CommandPath("udevadm", "/sbin/udevadm", "/usr/sbin/udevadm")
class Error(Exception): + message = None
- def __init__(self, rc, out, err): + def __str__(self): + return self.message.format(self=self) + + +class Failure(Error): + message = ("udevadm failed cmd={self.cmd} rc={self.rc} out={self.out!r} " + "err={self.err!r}") + + def __init__(self, cmd, rc, out, err): + self.cmd = cmd self.rc = rc self.out = out self.err = err
- def __str__(self): - return "Process failed with rc=%d out=%r err=%r" % ( - self.rc, self.out, self.err) + +class Timeout(Error): + message = ("udevadm timed out cmd={self.cmd} timeout={self.timeout}") + + def __init__(self, cmd, timeout): + self.cmd = cmd + self.timeout = timeout
def settle(timeout, exit_if_exists=None): @@ -44,25 +61,35 @@ Arguments:
timeout Maximum number of seconds to wait for the event queue to - become empty. A value of 0 will check if the queue is empty - and always return immediately. + become empty.
exit_if_exists Stop waiting if file exists. + + Raises Failure if udevadm failed, or Timeout if udevadm did not terminate + within the requested timeout. """ - args = ["settle", "--timeout=%s" % timeout] + cmd = [_UDEVADM.cmd, "settle"]
if exit_if_exists: - args.append("--exit-if-exists=%s" % exit_if_exists) + cmd.append("--exit-if-exists=%s" % exit_if_exists)
- try: - _run_command(args) - except Error as e: - logging.error("%s", e) + _run_command(cmd, timeout)
-def _run_command(args): - cmd = [_UDEVADM.cmd] - cmd.extend(args) - rc, out, err = utils.execCmd(cmd, raw=True) - if rc != 0: - raise Error(rc, out, err) +def _run_command(cmd, timeout=None): + proc = utils.execCmd(cmd, sync=False, deathSignal=signal.SIGKILL) + + if not proc.wait(timeout): + try: + proc.kill() + except OSError as e: + if e.errno != errno.ESRCH: + raise + finally: + zombiereaper.autoReapPID(proc.pid) + raise Timeout(cmd, timeout) + + if proc.returncode != 0: + out = "".join(proc.stdout) + err = "".join(proc.stderr) + raise Failure(cmd, proc.returncode, out, err)
automation@ovirt.org has posted comments on this change.
Change subject: udevadm: More precise error handling ......................................................................
Patch Set 1:
* Update tracker::IGNORE, no Bug-Url found * Check Bug-Url::WARN, no bug url found, make sure header matches 'Bug-Url: ' and is a valid url. * Check merged to previous::IGNORE, Not in stable branch (['ovirt-3.5', 'ovirt-3.4', 'ovirt-3.3'])
oVirt Jenkins CI Server has posted comments on this change.
Change subject: udevadm: More precise error handling ......................................................................
Patch Set 1:
Build Started (1/2) -> http://jenkins.ovirt.org/job/vdsm_master_unit-tests_created/17765/
oVirt Jenkins CI Server has posted comments on this change.
Change subject: udevadm: More precise error handling ......................................................................
Patch Set 1:
Build Started (2/2) -> http://jenkins.ovirt.org/job/vdsm_master_pep8_gerrit/17591/
Nir Soffer has posted comments on this change.
Change subject: udevadm: More precise error handling ......................................................................
Patch Set 1: Verified+1
Tested on Fedora 21
oVirt Jenkins CI Server has posted comments on this change.
Change subject: udevadm: More precise error handling ......................................................................
Patch Set 1:
Build Successful
http://jenkins.ovirt.org/job/vdsm_master_pep8_gerrit/17591/ : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_unit-tests_created/17765/ : SUCCESS
Nir Soffer has posted comments on this change.
Change subject: udevadm: More precise error handling ......................................................................
Patch Set 1:
(1 comment)
https://gerrit.ovirt.org/#/c/39740/1//COMMIT_MSG Commit Message:
Line 15: caller is responsible to handle the error. Line 16: Line 17: In both multipath.rescan() and IscsiConnection.connect(), we warn about Line 18: timeout but do not handle other errors, so real errors in udevadm will Line 19: fail loudly. Failing loudly when udevadm settle fails is not very useful, maybe we should instead log the error and wait couple of seconds.
So I think we should handle all errors inside settle like this:
deadline = monotonic_time + timeout try: _run_until(cmd, deadline) except Timeout: log warning except Failure: log error time.sleep(deadline - monotonic_time()) Line 20: Line 21: Change-Id: Ia0a7380b1b181ec93399ea741122cfa2e98086fb Line 22: Relates-To: https://bugzilla.redhat.com/1209474
Allon Mureinik has posted comments on this change.
Change subject: udevadm: More precise error handling ......................................................................
Patch Set 1:
Are we moving forwards with this patch?
Jenkins CI RO has abandoned this change.
Change subject: udevadm: More precise error handling ......................................................................
Abandoned
Abandoned due to no activity - please restore if still relevant
gerrit-hooks has posted comments on this change.
Change subject: udevadm: More precise error handling ......................................................................
Patch Set 1:
* Update tracker: IGNORE, no Bug-Url found
Nir Soffer has restored this change.
Change subject: udevadm: More precise error handling ......................................................................
Restored
vdsm-patches@lists.fedorahosted.org