Dan Kenigsberg has posted comments on this change.
Change subject: sp: fix spm start when failing to produce domain
......................................................................
Patch Set 2:
(1 comment)
http://gerrit.ovirt.org/#/c/25424/2/vdsm/storage/sp.py
File vdsm/storage/sp.py:

Line 209:
Line 210: self._backend.setDomainRegularRole(domain)
Line 211: except Exception:
Line 212: # log any exception, but keep going
Line 213: self.log.error("Error trying to check/update domain %s role",
Dan,
Comparing the masterVersion is the way to tell which is your most-recent master domain. That's not a good-enough reason to erase the master role. And note again that you do not always erase it: if there has been a temporary IO failure you say "never mind", and go on as if nothing happened.
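To make the masterVersion point concrete, here is a minimal sketch. It is not vdsm code: the Domain tuple, the role constants and pick_current_master are hypothetical names used only for illustration. It assumes each domain exposes a role and a masterVersion, and shows that a stale domain still carrying the master role does not prevent us from identifying the current one.

```python
from collections import namedtuple

# Hypothetical stand-in for a storage domain's metadata; not vdsm's real classes.
Domain = namedtuple("Domain", ["sdUUID", "role", "masterVersion"])

# Hypothetical role labels, used only in this sketch.
MASTER = "Master"
REGULAR = "Regular"


def pick_current_master(domains):
    """Return the domain claiming the master role with the highest
    masterVersion, or None if no domain claims the role."""
    claimants = [d for d in domains if d.role == MASTER]
    if not claimants:
        return None
    return max(claimants, key=lambda d: d.masterVersion)
```

With two domains claiming the master role at versions 3 and 5, the sketch picks the one at version 5 and leaves the stale role untouched.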
Why do we want to assume the spm role if we cannot access our storage domains? Why is such an spm useful? Engine should select another spm in such a condition, or reconstruct the master domain without the inaccessible domains.
It boils down to a simple condition:

* If we must have a single master role, the suggested code gets us into a bad state.
* If we may live with multiple master roles, we do not need _updateDomainsRole.

Line 214: sdUUID, exc_info=True)
Line 215:
Line 216: @unsecured
Line 217: def startSpm(self, prevID, prevLVER, maxHostID, expectedDomVersion=None):