Francesco Romani has uploaded a new change for review.
Change subject: vm: safe(r) Vm.conf update. ......................................................................
vm: safe(r) Vm.conf update.
Vm.conf is updated in unsafe in quite too many creation flows. The effect is that we can get RuntimeError due to the concurrent access of two threads, like
Traceback (most recent call last): File "/usr/lib/python2.6/site-packages/yajsonrpc/__init__.py", line 242, in sendReply encodedObjects.append(response.encode()) File "/usr/lib/python2.6/site-packages/yajsonrpc/__init__.py", line 140, in encode return json.dumps(res, 'utf-8') File "/usr/lib64/python2.6/json/__init__.py", line 237, in dumps **kw).encode(obj) File "/usr/lib64/python2.6/json/encoder.py", line 367, in encode chunks = list(self.iterencode(o)) File "/usr/lib64/python2.6/json/encoder.py", line 309, in _iterencode for chunk in self._iterencode_dict(o, markers): File "/usr/lib64/python2.6/json/encoder.py", line 275, in _iterencode_dict for chunk in self._iterencode(value, markers): File "/usr/lib64/python2.6/json/encoder.py", line 309, in _iterencode for chunk in self._iterencode_dict(o, markers): File "/usr/lib64/python2.6/json/encoder.py", line 275, in _iterencode_dict for chunk in self._iterencode(value, markers): File "/usr/lib64/python2.6/json/encoder.py", line 306, in _iterencode for chunk in self._iterencode_list(o, markers): File "/usr/lib64/python2.6/json/encoder.py", line 204, in _iterencode_list for chunk in self._iterencode(value, markers): File "/usr/lib64/python2.6/json/encoder.py", line 309, in _iterencode for chunk in self._iterencode_dict(o, markers): File "/usr/lib64/python2.6/json/encoder.py", line 247, in _iterencode_dict for key, value in items: RuntimeError: dictionary changed size during iteration
One of the most prominent issues is that the Device hierarchy (Vm.conf['devices']) is updated in the Vm.run() method, after the Vm object is registered into vmContainer, thus is queryable.
This is performed in an unsafe way, by doing a shallow copy of the device container. This patch fixes this problem by doing a deepcopy in the _run() flow. Doing so, the device conf appears to be atomically updated from the perspective of a client of the Vm class.
There are other, more rare, known failures like the one being fixed here, still caused by uncareful update of the Vm.conf data, in this flow on in other creation flows, but they will be addressed by separate patches once pinpointed.
Finally, this issue is yet another reminder that an overhaul of the handling of the conf data and of the device tree is long due and really needed, but there is still a long road ahead.
Bug-Url: https://bugzilla.redhat.com/1143968 Change-Id: Ie9dee2aa01b2c231b99e02a879dcfbd7ecc7f70a Signed-off-by: Francesco Romani fromani@redhat.com --- M vdsm/virt/vm.py 1 file changed, 5 insertions(+), 2 deletions(-)
git pull ssh://gerrit.ovirt.org:29418/vdsm refs/changes/13/34813/1
diff --git a/vdsm/virt/vm.py b/vdsm/virt/vm.py index 9740ab3..c17327a 100644 --- a/vdsm/virt/vm.py +++ b/vdsm/virt/vm.py @@ -1395,7 +1395,10 @@ # For BC we need to save previous behaviour for old type parameters. # The new/old type parameter will be distinguished # by existence/absence of the 'devices' key - if self.conf.get('devices') is None: + with self._confLock: + devConf = deepcopy(self.conf.get('devices')) + + if devConf is None: devices[DISK_DEVICES] = self.getConfDrives() devices[NIC_DEVICES] = self.getConfNetworkInterfaces() devices[SOUND_DEVICES] = self.getConfSound() @@ -1403,7 +1406,7 @@ devices[GRAPHICS_DEVICES] = self.getConfGraphics() devices[CONTROLLER_DEVICES] = self.getConfController() else: - for dev in self.conf.get('devices'): + for dev in devConf: try: devices[dev['type']].append(dev) except KeyError: