From Dan Kenigsberg <danken(a)redhat.com>:
Dan Kenigsberg has submitted this change and it was merged.
Change subject: virt: host: stats: do not replace HostMonitor
......................................................................
virt: host: stats: do not replace HostMonitor
Some time ago we moved the HostMonitor operation inside the periodic
framework. This was done for convenience, to get rid of one extra thread,
under the assumption that HostMonitor never blocks.
But from
https://bugzilla.redhat.com/show_bug.cgi?id=1419856#c15
we learn that HostMonitor can indeed block under rare, still
not completely clear, circumstances.
If more than two threads enter the netlink interface, all of them block.
There is perhaps a bug in libnl, but we should never access this
interface concurrently in the first place. So, if one HostMonitor is
slow, we must never replace it, so that we never access the same
interface more than once at a time.
This is a partial fix. It is still unclear why the first HostMonitor is
slow, and hence why we end up with a timeout in the first place; but
still, given
1. the original intentions for moving HostMonitor into the periodic
   framework
2. the fact that we don't need the host stats in the critical flow
3. the fact that we should never access the netlink interface
   concurrently
it is safe to never replace the HostMonitor operation. If it is
blocked, it will stay stuck until it unblocks, or Vdsm is restarted.
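The "never replace" policy described above can be illustrated with a
minimal, self-contained sketch. This is not the code from
lib/vdsm/virt/periodic.py: the class name NonReplaceableOperation and
its skipped counter are hypothetical, chosen only to show the idea that
a cycle is skipped while the previous run is still in flight, so the
wrapped function is never entered by two threads at once.

```python
import threading


class NonReplaceableOperation:
    """Hypothetical wrapper: run func periodically, never concurrently.

    Illustration only; this is not the Vdsm periodic.Operation class.
    """

    def __init__(self, func):
        self._func = func
        self._busy = threading.Lock()
        self.skipped = 0  # cycles dropped because a run was in flight

    def __call__(self):
        # Non-blocking acquire: if the previous run still holds the
        # lock, skip this cycle instead of starting a second run.
        if not self._busy.acquire(blocking=False):
            self.skipped += 1
            return False
        try:
            self._func()
        finally:
            self._busy.release()
        return True
```

A scheduler would call the wrapper once per sampling interval. If the
wrapped function blocks, later calls return False right away and only
the first caller stays stuck, which matches the behaviour described
above: the operation stays blocked until it unblocks or the process is
restarted.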
Change-Id: I60ccd4e0e239ce5dfa2c90947bd6cd59a23e51b3
Related-To: https://bugzilla.redhat.com/1419856
Backport-To: 4.1
Backport-To: 4.0
Signed-off-by: Francesco Romani <fromani(a)redhat.com>
---
M lib/vdsm/virt/periodic.py
1 file changed, 3 insertions(+), 1 deletion(-)
Approvals:
Piotr Kliczewski: Looks good to me, but someone else must approve
Yaniv Bronhaim: Looks good to me, but someone else must approve
Jenkins CI: Passed CI tests
Dan Kenigsberg: Looks good to me, approved
Francesco Romani: Verified
Milan Zamazal: Looks good to me, but someone else must approve
--
To view, visit https://gerrit.ovirt.org/73133
Gerrit-MessageType: merged
Gerrit-Change-Id: I60ccd4e0e239ce5dfa2c90947bd6cd59a23e51b3
Gerrit-PatchSet: 12
Gerrit-Project: vdsm
Gerrit-Branch: master
Gerrit-Owner: Francesco Romani <fromani(a)redhat.com>
Gerrit-Reviewer: Dan Kenigsberg <danken(a)redhat.com>
Gerrit-Reviewer: Edward Haas <edwardh(a)redhat.com>
Gerrit-Reviewer: Francesco Romani <fromani(a)redhat.com>
Gerrit-Reviewer: Jenkins CI
Gerrit-Reviewer: Martin Polednik <mpolednik(a)redhat.com>
Gerrit-Reviewer: Milan Zamazal <mzamazal(a)redhat.com>
Gerrit-Reviewer: Piotr Kliczewski <piotr.kliczewski(a)gmail.com>
Gerrit-Reviewer: Yaniv Bronhaim <ybronhei(a)redhat.com>
Gerrit-Reviewer: gerrit-hooks <automation(a)ovirt.org>