Dan Kenigsberg has posted comments on this change.
Change subject: VDSM: implement nodeDeviceMapper
......................................................................
Patch Set 15: Code-Review-1
(15 comments)
http://gerrit.ovirt.org/#/c/30209/15//COMMIT_MSG
Commit Message:
Line 8:
Line 9: NodeDeviceMapper is structure to keep track of available node devices
Line 10: and provide easy access to querying and managing them. The mapper
Line 11: is needed in order to allow VDSM to meaningfully report the usage
Line 12: of these devices, along with managing their availability.
This patch introduces APIs. It should first introduce them. And only later describe the
implementation.
Please discuss this design. It inherently duplicates information that Vdsm and libvirt has
elsewhere. Why not extract info just when it's needed?
Consider dropping the mapper object, and first implement your API without a cache.
Line 13:
Line 14: Change-Id: I6f21d465a90cfe2eb16ba70943e5fcf1683f1656
http://gerrit.ovirt.org/#/c/30209/15/vdsm/clientIF.py
File vdsm/clientIF.py:
Line 90: else:
Line 91: self.gluster = None
Line 92: try:
Line 93: self.vmContainer = {}
Line 94: self.hostDeviceMapper = hostdev.HostDeviceMapper(self,
why is it useful to keep a cached instance here? why should the mapper object use
clientIF's logger?
Line 95: log)
Line 96: self._hostStats = sampling.HostStatsThread(log=log)
Line 97: self._hostStats.start()
Line 98: self.lastRemoteAccess = 0
http://gerrit.ovirt.org/#/c/30209/15/vdsm/hostdev.py
File vdsm/hostdev.py:
Line 1: #
Line 2: # Copyright 2008-2014 Red Hat, Inc.
this code is brand new.
Line 3: #
Line 4: # This program is free software; you can redistribute it and/or modify
Line 5: # it under the terms of the GNU General Public License as published by
Line 6: # the Free Software Foundation; either version 2 of the License, or
Line 40:
Line 41: class HostDevice(object):
Line 42:
Line 43: def __init__(self, node):
Line 44: self.node = node
that's not a "node", but a nodedevice, or nodedev.
Line 45: self.capability = ''
Line 46: self.vendor = ''
Line 47: self.vendor_id = ''
Line 48: self.product = ''
Line 93: vendorXML = caps.getElementsByTagName('vendor')[0]
Line 94: if vendorXML.hasAttribute('id'):
Line 95: self.vendor_id = vendorXML.getAttribute('id')
Line 96: self.vendor = vendorXML.childNodes[0].data
Line 97: except IndexError:
The try-block is too wide - it's not clear which IndexError you are expecting.
Line 98: pass
Line 99:
Line 100: try:
Line 101: productXML = caps.getElementsByTagName('product')[0]
Line 110: try:
Line 111: return os.path.basename(
Line 112: os.readlink('/sys/bus/pci/devices/%s/iommu_group' %
Line 113: self._getSysAddress(self.node.name())))
Line 114: except OSError:
I guess you should be more specific, and ignore only ENOENT.
If no group is found, returning None makes more sense.
Line 115: return ''
Line 116:
Line 117: def _getSysAddress(self, deviceName):
Line 118: try:
Line 153: self._vmContainer = cif.vmContainer
Line 154: self._instance = self
Line 155: self.populate()
Line 156:
Line 157: def populate(self):
do you plan to call populate() outside of this class? if not, define it as _private.
Line 158: """
Line 159: Method that scans all VMs in supported vmContainer, locates node
Line 160: devices and pairs them with VM in internal mapping, removing VMs or
Line 161: devices that no longer exist
Line 165:
Line 166: # add previously unknown devices
Line 167: for (deviceName, device) in hostDevices:
Line 168: if deviceName not in self._map:
Line 169: self._log.info('Found new host device: %s', deviceName)
do you plan to support hot-unplugging of host devices? when are they going to be dropped
from this map?
another problem is that when libvirtd dies, all the objects created using it become stale
and cannot be used. When we find this condition with dom objects, Vdsm commits suicide.
This may be another reason not to keep a long-living set of node devices. Instead, a new
list should be gathered in each call to hostdevListByCaps.
Line 170: self._map[deviceName] = [HostDevice(device), '']
Line 171:
Line 172: # remove devices that are no longer found on host and remove no longer
Line 173: # existing hosts from devices
Line 169: self._log.info('Found new host device: %s', deviceName)
Line 170: self._map[deviceName] = [HostDevice(device), '']
Line 171:
Line 172: # remove devices that are no longer found on host and remove no longer
Line 173: # existing hosts from devices
/me does not understand the English of the comment.
Line 174: for deviceName, assignment in self._map.items():
Line 175: if assignment[1] and assignment[1] not in self._vmContainer:
Line 176: self._log.info('Removing host %s from host device: %s',
Line 177: assignment[1], deviceName)
Line 170: self._map[deviceName] = [HostDevice(device), '']
Line 171:
Line 172: # remove devices that are no longer found on host and remove no longer
Line 173: # existing hosts from devices
Line 174: for deviceName, assignment in self._map.items():
"assignment" is a very odd name for a list of nodedevice and its owning VM.
Line 175: if assignment[1] and assignment[1] not in self._vmContainer:
Line 176: self._log.info('Removing host %s from host device: %s',
Line 177: assignment[1], deviceName)
Line 178: self.release(deviceName)
Line 186: try:
Line 187: for device in params.conf['devices']:
Line 188: if device['device'] == 'hostdev':
Line 189: try:
Line 190: self._map[device['name']][1] = vmId
each VM holds the name of the devices it owns. why do we need the reverse mapping here?
The only reason I can think of is optimization - but I suggest to have a simple
implementation first, that has no data duplication. If we find it slow, we can add a
cache.
Line 191: except KeyError:
Line 192: # VM has device which is no (longer) present
Line 193: # on the host - althought this should be handled by
Line 194: # user, warning can help debugging libvirt errors
Line 198: # needed as hostdev was not supported
Line 199: except KeyError:
Line 200: pass
Line 201:
Line 202: def acquire(self, deviceName, vmId):
when is this function going to be called? is it related to the currently-inttoduced API?
Line 203: """
Line 204: Method to account ownership of device to specified vmId and detach
Line 205: it from the host if it isn't already acquired
Line 206: """
Line 225: self._map[deviceName][1]))
Line 226: except KeyError:
Line 227: raise NotPresent('The device %s is not present on the '
Line 228: 'host!' % deviceName)
Line 229: except:
except:
raise
does nothign. please drop.
Line 230: raise
Line 231:
Line 232: return self._map[deviceName][0]
Line 233:
Line 262: 'product_id': '',
'iommu_group': ''},
Line 263: 'vmId': vmId]}
Line 264: """
Line 265: # device configuration may have changed: repopulate the map
Line 266: self.populate()
populate must not be called concurrently. Pleasae use something like _networkSemaphore
(but don't place it in clientIF, a module-local variable is much better)
Line 267:
Line 268: devices = {}
Line 269:
Line 270: for deviceName, assignment in self._map.items():
http://gerrit.ovirt.org/#/c/30209/15/vdsm/supervdsmServer
File vdsm/supervdsmServer:
Line 199: def prepareIommuGroup(self, iommu_group):
Line 200: """
Line 201: Sets iommu group in /dev/vfio/$GROUP to root:qemu 660 permissions
Line 202: """
Line 203: path = ''.join((_VFIO_DIR, iommu_group))
use udev rules instead.
Line 204: os.chown(path, 0, grp.getgrnam('qemu')[2])
Line 205: os.chmod(path, stat.S_IRUSR | stat.S_IWUSR |
Line 206: stat.S_IRGRP | stat.S_IWGRP)
Line 207:
--
To view, visit
http://gerrit.ovirt.org/30209
To unsubscribe, visit
http://gerrit.ovirt.org/settings
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f21d465a90cfe2eb16ba70943e5fcf1683f1656
Gerrit-PatchSet: 15
Gerrit-Project: vdsm
Gerrit-Branch: master
Gerrit-Owner: Martin Polednik <mpolednik(a)redhat.com>
Gerrit-Reviewer: Antoni Segura Puimedon <asegurap(a)redhat.com>
Gerrit-Reviewer: Dan Kenigsberg <danken(a)redhat.com>
Gerrit-Reviewer: Francesco Romani <fromani(a)redhat.com>
Gerrit-Reviewer: Martin Betak <mbetak(a)redhat.com>
Gerrit-Reviewer: Martin Polednik <mpolednik(a)redhat.com>
Gerrit-Reviewer: Michal Skrivanek <michal.skrivanek(a)redhat.com>
Gerrit-Reviewer: Vinzenz Feenstra <vfeenstr(a)redhat.com>
Gerrit-Reviewer: automation(a)ovirt.org
Gerrit-Reviewer: jian wang <wjian84(a)gmail.com>
Gerrit-Reviewer: oVirt Jenkins CI Server
Gerrit-HasComments: Yes