Francesco Romani has posted comments on this change.
Change subject: schedule: Introduce scheduling libary ......................................................................
Patch Set 4:
(4 comments)
This could be an useful building block, but I have a couple of initial concerns, one minor and one major.
The minor one is this API, albeit nice, is reminiscent of the concurrent.futures (https://docs.python.org/3/library/concurrent.futures.html - backport for py2 available). Can be probably considered a subset of it. Maybe I'm biased towards this API, but I don't see the need to have a different one, and to carry more code on our shoulders; then, I'd like either to build on that API/code, maybe depending on the backport for the time being, or ot build a compatibility layer to behave like the futures (see http://gerrit.ovirt.org/#/c/29424/)
However, this is a minor concern. The major one is sadly we have to deal with possibly blocking calls, as I said inline, and this very code is built on the assumption that Scheduled code executes fast and do not block. I think it is hard to fix a blocked call from the outside without support of the executor code, this is why I added the supervisor and the thread replacement on my threadpool. Maybe this was a wrong approach, but this problem is real and I don't see yet how to tackle it with this code.
Please, let's discuss those major design goals/problems on devel@ovirt.org, as mailing lists is a more friendly to articulate thoughts.
http://gerrit.ovirt.org/#/c/29607/4/lib/vdsm/schedule.py File lib/vdsm/schedule.py:
Line 61: """ Line 62: Schedule calls for future execution in a background thread. Line 63: Line 64: This class is thread safe; multiple threads can schedule calls or cancel Line 65: the scheudler. typo: scheduler Line 66: """ Line 67: Line 68: DEFAULT_DELAY = 30.0 # Used if no call are scheduled Line 69:
Line 77: t = threading.Thread(target=self._run, name=name) Line 78: t.daemon = True Line 79: t.start() Line 80: Line 81: def schedule(self, delay, callable): 'callable' shadows a builtin, that is usually a thing to avoid Line 82: """ Line 83: Schedule callable to be called after delay seconds on the scheduler Line 84: thread. Line 85:
Line 82: """ Line 83: Schedule callable to be called after delay seconds on the scheduler Line 84: thread. Line 85: Line 86: Callable must not block or take excessive time to complete. It it does Still, in sampling we can have calls that block, because some calls may need to access storage and/or the QEMU monitor and we cannot known in advance which one will block
Actually we can get useful information from a blocked call, this can be an useful additional indicator that the QEMU process is blocked/not responding, and this information must be reported
Even considering a single scheduler per sampling type -to minimize interferences between samplings- the fist storage-blocking call will hurt all the others. Line 87: not finish quickly, it may delay other scheduled calls on the scheduler Line 88: thread. Line 89: Line 90: Returns a ScheduledCall that may be canceled if callable was not called
Line 91: yet. Line 92: """ Line 93: deadline = time.time() + delay Line 94: call = _Call(deadline, callable) Line 95: self._log.debug("Schedulng %s", call) typo: scheduling Line 96: with self._cond: Line 97: if self._running: Line 98: heapq.heappush(self._calls, call) Line 99: self._cond.notify()