Nir Soffer has uploaded a new change for review.
Change subject: utils: Add changehash function for change detection ......................................................................
utils: Add changehash function for change detection
We use Python built-in hash to detect changes in vm state without sending the state in each response. This function is not suitable for this usage. Now we use generic utils.changehash(), implemented using md5 hexdigest.
Change-Id: I2242a594383e2d2fe64e3a581f18b8ac662648b0 Signed-off-by: Nir Soffer nsoffer@redhat.com --- M lib/vdsm/utils.py M vdsm/virt/vm.py 2 files changed, 13 insertions(+), 2 deletions(-)
git pull ssh://gerrit.ovirt.org:29418/vdsm refs/changes/45/33045/1
diff --git a/lib/vdsm/utils.py b/lib/vdsm/utils.py index 23c63e8..1b4a9d5 100644 --- a/lib/vdsm/utils.py +++ b/lib/vdsm/utils.py @@ -37,6 +37,7 @@ import glob import io import itertools +import hashlib import logging import re import sys @@ -1133,3 +1134,13 @@ flags = fcntl.fcntl(fd, fcntl.F_GETFL) flags |= os.O_NONBLOCK fcntl.fcntl(fd, fcntl.F_SETFL, flags) + + +def changehash(s): + """ + Returns a hash of string s, suitable for change detection. + + Tipically changehash(s) is sent to client frequently. When a client detect + that changehash(s) changed, it ask for s itself, which may be much bigger. + """ + return hashlib.md5(s).hexdigest() diff --git a/vdsm/virt/vm.py b/vdsm/virt/vm.py index 941f283..b1567f9 100644 --- a/vdsm/virt/vm.py +++ b/vdsm/virt/vm.py @@ -1500,7 +1500,7 @@ self.guestAgent = guestagent.GuestAgent( self._guestSocketFile, self.cif.channelListener, self.log) self._lastXMLDesc = '<domain><uuid>%s</uuid></domain>' % self.id - self._devXmlHash = '0' + self._devXmlHash = utils.changehash('') self._released = False self._releaseLock = threading.Lock() self.saveState() @@ -4495,7 +4495,7 @@ self._lastXMLDesc = self._dom.XMLDesc(0) devxml = _domParseStr(self._lastXMLDesc).childNodes[0]. \ getElementsByTagName('devices')[0] - self._devXmlHash = str(hash(devxml.toxml())) + self._devXmlHash = utils.changehash(devxml.toxml())
return self._lastXMLDesc
oVirt Jenkins CI Server has posted comments on this change.
Change subject: utils: Add changehash function for change detection ......................................................................
Patch Set 1:
Build Failed
http://jenkins.ovirt.org/job/vdsm_master_unit_tests_gerrit_el/11582/ : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_unit-tests_created/12526/ : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_virt_functional_tests_gerrit/1690/ : There was an infra issue, please contact infra@ovirt.org
http://jenkins.ovirt.org/job/vdsm_master_pep8_gerrit/12371/ : SUCCESS
Francesco Romani has posted comments on this change.
Change subject: utils: Add changehash function for change detection ......................................................................
Patch Set 1:
neat! The only thing that worries me is how Engine is supposed to use this value. I mean, it is considered just as an opaque string that it is stored and checked for equality? If so it doesn't really matter how long it is and how it looks like, and we can happily go ahead.
As for the change itself solid +1.
Nir Soffer has posted comments on this change.
Change subject: utils: Add changehash function for change detection ......................................................................
Patch Set 1:
I hope that engine treat this as opaque string, hopefully one of the reviewers can confirm that.
Dan Kenigsberg has posted comments on this change.
Change subject: utils: Add changehash function for change detection ......................................................................
Patch Set 1:
(1 comment)
http://gerrit.ovirt.org/#/c/33045/1//COMMIT_MSG Commit Message:
Line 6: Line 7: utils: Add changehash function for change detection Line 8: Line 9: We use Python built-in hash to detect changes in vm state without sending Line 10: the state in each response. This function is not suitable for this Could you explain why the lightweight hash() is not good enough for our use case, and why an expensive crypto hash is favorable? Line 11: usage. Now we use generic utils.changehash(), implemented using md5 Line 12: hexdigest. Line 13: Line 14: Change-Id: I2242a594383e2d2fe64e3a581f18b8ac662648b0
Nir Soffer has posted comments on this change.
Change subject: utils: Add changehash function for change detection ......................................................................
Patch Set 1:
(1 comment)
http://gerrit.ovirt.org/#/c/33045/1//COMMIT_MSG Commit Message:
Line 6: Line 7: utils: Add changehash function for change detection Line 8: Line 9: We use Python built-in hash to detect changes in vm state without sending Line 10: the state in each response. This function is not suitable for this
Could you explain why the lightweight hash() is not good enough for our use
Using the built-in hash may be good enough, specially on 64 bit systems. The reason I suggest to use md5 instead is that this is the right tool for this job.
Why md5 is favorable:
- Output is stable, will not change between versions of Python, or even between runs. For example, when using the -R option, Python will randomize the hash on each run, invalidating engine hashes without change in vm state. - Easier to compose multiple hashes. For example, when we hash multiple hashes using a tuple (http://gerrit.ovirt.org/#/c/31701/13/vdsm/virt/vm.py), we are using different hash, is is good enough as the string hash? - Standard - Documented
The built-in hash is much faster then md5, but we don't use it in a tight loop, and md5 should be fast enough to use when devices change on a vm. Computing the digest will probably be hidden by the cost of parsing and formatting the xml anyway.
For example, on a minidell (9020): >>> timeit.timeit('hashlib.md5(s).hexdigest()', 'import hashlib; s = open("domain.xml").read()', number=1000) 0.007387876510620117 Line 11: usage. Now we use generic utils.changehash(), implemented using md5 Line 12: hexdigest. Line 13: Line 14: Change-Id: I2242a594383e2d2fe64e3a581f18b8ac662648b0
Roy Golan has posted comments on this change.
Change subject: utils: Add changehash function for change detection ......................................................................
Patch Set 1:
engine checks if the string has changed. agnostic to how it is generated of course
Vinzenz Feenstra has posted comments on this change.
Change subject: utils: Add changehash function for change detection ......................................................................
Patch Set 1: Code-Review+1
Francesco Romani has posted comments on this change.
Change subject: utils: Add changehash function for change detection ......................................................................
Patch Set 1: Code-Review+1
Francesco Romani has posted comments on this change.
Change subject: utils: Add changehash function for change detection ......................................................................
Patch Set 1:
(1 comment)
http://gerrit.ovirt.org/#/c/33045/1//COMMIT_MSG Commit Message:
Line 6: Line 7: utils: Add changehash function for change detection Line 8: Line 9: We use Python built-in hash to detect changes in vm state without sending Line 10: the state in each response. This function is not suitable for this
Using the built-in hash may be good enough, specially on 64 bit systems. Th
I suggest to augment the commit message with this comment to make it obvious at first glance why this change is good. Line 11: usage. Now we use generic utils.changehash(), implemented using md5 Line 12: hexdigest. Line 13: Line 14: Change-Id: I2242a594383e2d2fe64e3a581f18b8ac662648b0
oVirt Jenkins CI Server has posted comments on this change.
Change subject: utils: Add changehash function for change detection ......................................................................
Patch Set 2:
Build Failed
http://jenkins.ovirt.org/job/vdsm_master_unit-tests_created/13268/ : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_pep8_gerrit/13108/ : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_unit_tests_gerrit_el/12318/ : FAILURE
http://jenkins.ovirt.org/job/vdsm_master_virt_functional_tests_gerrit/1869/ : There was an infra issue, please contact infra@ovirt.org
Nir Soffer has posted comments on this change.
Change subject: utils: Add changehash function for change detection ......................................................................
Patch Set 2:
This version is rebased and supports hashing of multiple values.
Vinzenz Feenstra has posted comments on this change.
Change subject: utils: Add changehash function for change detection ......................................................................
Patch Set 2: Code-Review+1
Nir Soffer has posted comments on this change.
Change subject: utils: Add changehash function for change detection ......................................................................
Patch Set 2: Verified-1
It does not work - fucking engine wants up to 30 characters for the hash, and md5 in hex is 32 :-)
2014-10-30 11:19:49,778 ERROR [org.ovirt.engine.core.dal.dbbroker.BatchProcedureExecutionConnectionCallback] (DefaultQuartzScheduler_Worker-92) Can't execute batch. Next exception is: : org.postgresql.util.PSQLException: ERROR: value too long for type character varying(30)
Nir Soffer has posted comments on this change.
Change subject: utils: Add changehash function for change detection ......................................................................
Patch Set 3: Verified+1
Version 3 fix hash limit to 30 characters, required to work with Engine.
Verified on rhel6 and rhel7 hosts, by creating and provisioning vms and migration between hosts.
oVirt Jenkins CI Server has posted comments on this change.
Change subject: utils: Add changehash function for change detection ......................................................................
Patch Set 3:
Build Failed
http://jenkins.ovirt.org/job/vdsm_master_unit-tests_created/13273/ : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_pep8_gerrit/13113/ : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_unit_tests_gerrit_el/12323/ : FAILURE
http://jenkins.ovirt.org/job/vdsm_master_virt_functional_tests_gerrit/1872/ : There was an infra issue, please contact infra@ovirt.org
Nir Soffer has posted comments on this change.
Change subject: utils: Add changehash function for change detection ......................................................................
Patch Set 3:
Verified on rhel6 and rhel7 hosts, by creating and provisioning vms and migration between hosts.
Francesco Romani has posted comments on this change.
Change subject: utils: Add changehash function for change detection ......................................................................
Patch Set 3: Code-Review+1
(1 comment)
http://gerrit.ovirt.org/#/c/33045/3/lib/vdsm/utils.py File lib/vdsm/utils.py:
Line 1169: h = hashlib.md5() Line 1170: for s in strings: Line 1171: h.update(s) Line 1172: # Engine stores this in varchar(30) Line 1173: return h.hexdigest()[:30] I wonder if this alter the properties of the hash like the distribution. IIRC it should not, and a quick google search seems to confirm http://stackoverflow.com/questions/8184941/uniform-distribution-of-truncated...
Vinzenz Feenstra has posted comments on this change.
Change subject: utils: Add changehash function for change detection ......................................................................
Patch Set 3: Code-Review+1
Adam Litke has posted comments on this change.
Change subject: utils: Add changehash function for change detection ......................................................................
Patch Set 3: Code-Review+1
Federico Simoncelli has posted comments on this change.
Change subject: utils: Add changehash function for change detection ......................................................................
Patch Set 3: Code-Review+2
Nir Soffer has posted comments on this change.
Change subject: utils: Add changehash function for change detection ......................................................................
Patch Set 3: Code-Review-1
Replace md5 with faster sha1.
Nir Soffer has posted comments on this change.
Change subject: utils: Add changehash function for change detection ......................................................................
Patch Set 4:
This version replaces md5 with sha1, which is little bit faster.
oVirt Jenkins CI Server has posted comments on this change.
Change subject: utils: Add changehash function for change detection ......................................................................
Patch Set 4:
Build Failed
http://jenkins.ovirt.org/job/vdsm_master_unit_tests_gerrit_el/12862/ : FAILURE
http://jenkins.ovirt.org/job/vdsm_master_unit-tests_created/13814/ : FAILURE
http://jenkins.ovirt.org/job/vdsm_master_virt_functional_tests_gerrit/1970/ : There was an infra issue, please contact infra@ovirt.org
http://jenkins.ovirt.org/job/vdsm_master_pep8_gerrit/13651/ : SUCCESS
Nir Soffer has posted comments on this change.
Change subject: utils: Add changehash function for change detection ......................................................................
Patch Set 4:
ping?
Jenkins CI RO has abandoned this change.
Change subject: utils: Add changehash function for change detection ......................................................................
Abandoned
Abandoned due to no activity - please restore if still relevant
gerrit-hooks has posted comments on this change.
Change subject: utils: Add changehash function for change detection ......................................................................
Patch Set 4:
* Update tracker: IGNORE, no Bug-Url found
vdsm-patches@lists.fedorahosted.org