Nir Soffer has uploaded a new change for review.
Change subject: utils: Faster version of list2cmdline ......................................................................
utils: Faster version of list2cmdline
The original version from the subprocess module show on the top of a profile on idle vdsm, taking more then 10% of cpu time. This patch replaces it with a much simpler and faster version, which does not try to be correct for all inputs on all platforms, but should be good enough for vdsm commands on Linux.
Before this change, list2cmdline is the second most expensive function in the profile:
$ profile-stats -c vdsmd.prof Fri May 9 20:53:16 2014 vdsmd.prof
14005829 function calls (14064289 primitive calls) in 83.586 CPU seconds
Ordered by: internal time List reduced from 1701 to 20 due to restriction <20>
ncalls tottime percall cumtime percall filename:lineno(function) 73/103 12.781 0.175 17.255 0.168 threading.py:481(Thread.run) 3230 6.309 0.002 8.907 0.003 subprocess.py:509(list2cmdline)
Using this version, list2cmdline dropped to 91th place:
2435 0.155 0.000 0.171 0.000 utils.py:646(list2cmdline)
Normalizing the cumtime by number of calls, this change saved 8.6 cpu seconds of 83 seconds, more then 10% saving.
Change-Id: Ibfc7819c126fe51cc068cc6f9f85b1002878385b Signed-off-by: Nir Soffer nsoffer@redhat.com --- M lib/vdsm/utils.py 1 file changed, 21 insertions(+), 2 deletions(-)
git pull ssh://gerrit.ovirt.org:29418/vdsm refs/changes/48/27548/1
diff --git a/lib/vdsm/utils.py b/lib/vdsm/utils.py index 59b021d..bacfb8b 100644 --- a/lib/vdsm/utils.py +++ b/lib/vdsm/utils.py @@ -49,7 +49,6 @@ import signal import socket import stat -import subprocess import threading import time import zombiereaper @@ -644,6 +643,26 @@ self._poller.close()
+def list2cmdline(seq): + """ + Convert argument list for exeCmd to string that can be used in the command + line to perform the same command. + + The builtin Python version in the subprocess module is slow, responsible + for 10% or cpu usage on an idle vdsm. This version may not be correct for + all inputs on all platforms, but it should be ok with all commands we use + on vdsm on Linux. + """ + parts = [] + for arg in seq: + if "'" in arg: + arg = arg.replace("'", "'") + if ' ' in arg: + arg = "'" + arg + "'" + parts.append(arg) + return ' '.join(parts) + + def execCmd(command, sudo=False, cwd=None, data=None, raw=False, logErr=True, printable=None, env=None, sync=True, nice=None, ioclass=None, ioclassdata=None, setsid=False, execCmdLogger=logging.root, @@ -680,7 +699,7 @@ if not printable: printable = command
- cmdline = repr(subprocess.list2cmdline(printable)) + cmdline = repr(list2cmdline(printable)) execCmdLogger.debug("%s (cwd %s)", cmdline, cwd)
p = CPopen(command, close_fds=True, cwd=cwd, env=env,
oVirt Jenkins CI Server has posted comments on this change.
Change subject: utils: Faster version of list2cmdline ......................................................................
Patch Set 1:
Build Failed
http://jenkins.ovirt.org/job/vdsm_master_pep8_gerrit/8709/ : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_unit_tests_gerrit_el/7919/ : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_unit_tests_gerrit/8840/ : FAILURE
Nir Soffer has posted comments on this change.
Change subject: utils: Faster version of list2cmdline ......................................................................
Patch Set 1:
This is draft because the implementation is maybe too simple and may not be correct for some inputs. I wrote it in few minutes just to see the effect on the profile. And of course it has no tests. Otherwise, looking at the log it seems to work.
oVirt Jenkins CI Server has posted comments on this change.
Change subject: utils: Faster version of list2cmdline ......................................................................
Patch Set 1:
Build Failed
http://jenkins.ovirt.org/job/vdsm_master_pep8_gerrit/8709/ : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_unit_tests_gerrit_el/7919/ : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_unit_tests_gerrit/8841/ : FAILURE
Nir Soffer has posted comments on this change.
Change subject: utils: Faster version of list2cmdline ......................................................................
Patch Set 1:
jenkins is sick again.
Dan Kenigsberg has posted comments on this change.
Change subject: utils: Faster version of list2cmdline ......................................................................
Patch Set 1:
(2 comments)
http://gerrit.ovirt.org/#/c/27548/1/lib/vdsm/utils.py File lib/vdsm/utils.py:
Line 656: parts = [] Line 657: for arg in seq: Line 658: if "'" in arg: Line 659: arg = arg.replace("'", "'") Line 660: if ' ' in arg: doublequotes is intentionaly ignored? If so, pls mention in docstring Line 661: arg = "'" + arg + "'" Line 662: parts.append(arg) Line 663: return ' '.join(parts) Line 664:
Line 699: if not printable: Line 700: printable = command Line 701: Line 702: cmdline = repr(list2cmdline(printable)) Line 703: execCmdLogger.debug("%s (cwd %s)", cmdline, cwd) Using %r here (and elsewhere) would save a couple of cycles if non-debug log level is set. (Another patch?) Line 704: Line 705: p = CPopen(command, close_fds=True, cwd=cwd, env=env, Line 706: deathSignal=deathSignal, childUmask=childUmask) Line 707: p = AsyncProc(p)
Nir Soffer has posted comments on this change.
Change subject: utils: Faster version of list2cmdline ......................................................................
Patch Set 1:
(2 comments)
http://gerrit.ovirt.org/#/c/27548/1/lib/vdsm/utils.py File lib/vdsm/utils.py:
Line 656: parts = [] Line 657: for arg in seq: Line 658: if "'" in arg: Line 659: arg = arg.replace("'", "'") Line 660: if ' ' in arg:
doublequotes is intentionaly ignored? If so, pls mention in docstring
I assume that quoting with single quote is the simplest and safest way, so I can ignore anything "special" inside the string, except another single quote. Line 661: arg = "'" + arg + "'" Line 662: parts.append(arg) Line 663: return ' '.join(parts) Line 664:
Line 699: if not printable: Line 700: printable = command Line 701: Line 702: cmdline = repr(list2cmdline(printable)) Line 703: execCmdLogger.debug("%s (cwd %s)", cmdline, cwd)
Using %r here (and elsewhere) would save a couple of cycles if non-debug lo
How %r would be less expensive then %s? Line 704: Line 705: p = CPopen(command, close_fds=True, cwd=cwd, env=env, Line 706: deathSignal=deathSignal, childUmask=childUmask) Line 707: p = AsyncProc(p)
oVirt Jenkins CI Server has posted comments on this change.
Change subject: utils: Faster version of list2cmdline ......................................................................
Patch Set 2:
Build Failed
http://jenkins.ovirt.org/job/vdsm_master_pep8_gerrit/8719/ : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_unit_tests_gerrit/8851/ : FAILURE
http://jenkins.ovirt.org/job/vdsm_master_unit_tests_gerrit_el/7929/ : SUCCESS
Nir Soffer has posted comments on this change.
Change subject: utils: Faster version of list2cmdline ......................................................................
Patch Set 2:
No code change, I think this is only a branch name change.
oVirt Jenkins CI Server has posted comments on this change.
Change subject: utils: Faster version of list2cmdline ......................................................................
Patch Set 2:
Build Successful
http://jenkins.ovirt.org/job/vdsm_master_pep8_gerrit/8719/ : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_unit_tests_gerrit_el/7929/ : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_unit_tests_gerrit/8854/ : SUCCESS
Yaniv Bronhaim has posted comments on this change.
Change subject: utils: Faster version of list2cmdline ......................................................................
Patch Set 2:
(2 comments)
http://gerrit.ovirt.org/#/c/27548/2/lib/vdsm/utils.py File lib/vdsm/utils.py:
Line 652: or s/or/of
Line 659: arg = arg.replace("'", "'") Line 660: if ' ' in arg: Line 661: arg = "'" + arg + "'" Line 662: parts.append(arg) Line 663: return ' '.join(parts) did you compare the cpu usage with this implementation ? Line 664: Line 665: Line 666: def execCmd(command, sudo=False, cwd=None, data=None, raw=False, logErr=True, Line 667: printable=None, env=None, sync=True, nice=None, ioclass=None,
Yaniv Bronhaim has posted comments on this change.
Change subject: utils: Faster version of list2cmdline ......................................................................
Patch Set 2:
(1 comment)
http://gerrit.ovirt.org/#/c/27548/2/lib/vdsm/utils.py File lib/vdsm/utils.py:
Line 659: arg = arg.replace("'", "'") Line 660: if ' ' in arg: Line 661: arg = "'" + arg + "'" Line 662: parts.append(arg) Line 663: return ' '.join(parts)
did you compare the cpu usage with this implementation ?
oh.. forgot the commit message part after reading your long description here Line 664: Line 665: Line 666: def execCmd(command, sudo=False, cwd=None, data=None, raw=False, logErr=True, Line 667: printable=None, env=None, sync=True, nice=None, ioclass=None,
Alon Bar-Lev has posted comments on this change.
Change subject: utils: Faster version of list2cmdline ......................................................................
Patch Set 2: Code-Review-1
any attempt to convert arguments to string and escape will result in security issues.
I am not sure why we do not use the args=() parameter of subprocess which is the safest approach.
if any solution should be applied is to call the subprocess with explicit list, if not already done so.
Nir Soffer has posted comments on this change.
Change subject: utils: Faster version of list2cmdline ......................................................................
Patch Set 2:
Alon, we do use args to run subprocess and we do not use shell=True, so there is no security issue here.
We use subprocess.list2cmdline to write the commands we run to the log. This make it very easy to debug the system by coying commands that vdsm run from the log and running them in the shell.
The original function is little expensive, and this patch suggest to to replace it with a quicker version.
I hope it is more clear now.
Alon Bar-Lev has posted comments on this change.
Change subject: utils: Faster version of list2cmdline ......................................................................
Patch Set 2: -Code-Review
ok, if it is only for log.
Alon Bar-Lev has posted comments on this change.
Change subject: utils: Faster version of list2cmdline ......................................................................
Patch Set 2:
Provided this is only for[1]
Maybe a solution of a parameter to just not log cmdline is enough?
[1] http://gerrit.ovirt.org/#/c/27553/
oVirt Jenkins CI Server has posted comments on this change.
Change subject: utils: Faster version of list2cmdline ......................................................................
Patch Set 3:
Build Successful
http://jenkins.ovirt.org/job/vdsm_master_pep8_gerrit/8768/ : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_unit_tests_gerrit/8904/ : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_unit_tests_gerrit_el/7978/ : SUCCESS
Dan Kenigsberg has posted comments on this change.
Change subject: utils: Faster version of list2cmdline ......................................................................
Patch Set 1:
(2 comments)
http://gerrit.ovirt.org/#/c/27548/1/lib/vdsm/utils.py File lib/vdsm/utils.py:
Line 656: parts = [] Line 657: for arg in seq: Line 658: if "'" in arg: Line 659: arg = arg.replace("'", "'") Line 660: if ' ' in arg:
I assume that quoting with single quote is the simplest and safest way, so
doh, my mixup. Line 661: arg = "'" + arg + "'" Line 662: parts.append(arg) Line 663: return ' '.join(parts) Line 664:
Line 699: if not printable: Line 700: printable = command Line 701: Line 702: cmdline = repr(list2cmdline(printable)) Line 703: execCmdLogger.debug("%s (cwd %s)", cmdline, cwd)
How %r would be less expensive then %s?
with %r, repr() would not be called at all (if log level is higher then DEBUG. Line 704: Line 705: p = CPopen(command, close_fds=True, cwd=cwd, env=env, Line 706: deathSignal=deathSignal, childUmask=childUmask) Line 707: p = AsyncProc(p)
Nir Soffer has posted comments on this change.
Change subject: utils: Faster version of list2cmdline ......................................................................
Patch Set 1:
(1 comment)
http://gerrit.ovirt.org/#/c/27548/1/lib/vdsm/utils.py File lib/vdsm/utils.py:
Line 699: if not printable: Line 700: printable = command Line 701: Line 702: cmdline = repr(list2cmdline(printable)) Line 703: execCmdLogger.debug("%s (cwd %s)", cmdline, cwd)
with %r, repr() would not be called at all (if log level is higher then DEB
Correct, I missed that line with the repr(). I don't think this will make any difference but it does not make sense to use repr in this context. Line 704: Line 705: p = CPopen(command, close_fds=True, cwd=cwd, env=env, Line 706: deathSignal=deathSignal, childUmask=childUmask) Line 707: p = AsyncProc(p)
Nir Soffer has posted comments on this change.
Change subject: utils: Faster version of list2cmdline ......................................................................
Patch Set 4: Verified+1
This version fixes the naive implementation and I believe it is correct for all inputs on Linux, so any command that vdsm runs will never require modification for copying the command from the log and running it in the shell.
Verified by profiling as described in the commit message.
oVirt Jenkins CI Server has posted comments on this change.
Change subject: utils: Faster version of list2cmdline ......................................................................
Patch Set 4:
Build Successful
http://jenkins.ovirt.org/job/vdsm_master_pep8_gerrit/8807/ : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_unit_tests_gerrit/8943/ : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_unit_tests_gerrit_el/8017/ : SUCCESS
Dan Kenigsberg has posted comments on this change.
Change subject: utils: Faster version of list2cmdline ......................................................................
Patch Set 4: Code-Review-1
(2 comments)
http://gerrit.ovirt.org/#/c/27548/4/lib/vdsm/utils.py File lib/vdsm/utils.py:
Line 644: self._poller.close() Line 645: Line 646: Line 647: # ASCII alphanumeric characters and some other safe characters do not need Line 648: # quoting. Anything else will be quoted in _list2cmdline. pipes.quote has
_safechars = frozenset(string.ascii_letters + string.digits + '@%_-+=:,./')
I trust that re is faster, but please match the set of chars. Line 649: _unsafe = re.compile(r'[^a-zA-Z0-9_-=/:,]').search Line 650: Line 651: Line 652: def _list2cmdline(args):
Line 645: Line 646: Line 647: # ASCII alphanumeric characters and some other safe characters do not need Line 648: # quoting. Anything else will be quoted in _list2cmdline. Line 649: _unsafe = re.compile(r'[^a-zA-Z0-9_-=/:,]').search utils is growing big; please use a more specific function name (_requires_quote ?) so as not to confuse future reviewers. Line 650: Line 651: Line 652: def _list2cmdline(args): Line 653: """
Nir Soffer has posted comments on this change.
Change subject: utils: Faster version of list2cmdline ......................................................................
Patch Set 4:
(3 comments)
http://gerrit.ovirt.org/#/c/27548/4/lib/vdsm/utils.py File lib/vdsm/utils.py:
Line 644: self._poller.close() Line 645: Line 646: Line 647: # ASCII alphanumeric characters and some other safe characters do not need Line 648: # quoting. Anything else will be quoted in _list2cmdline.
pipes.quote has
Nice, I'll update the list. Line 649: _unsafe = re.compile(r'[^a-zA-Z0-9_-=/:,]').search Line 650: Line 651: Line 652: def _list2cmdline(args):
Line 645: Line 646: Line 647: # ASCII alphanumeric characters and some other safe characters do not need Line 648: # quoting. Anything else will be quoted in _list2cmdline. Line 649: _unsafe = re.compile(r'[^a-zA-Z0-9_-=/:,]').search
utils is growing big; please use a more specific function name (_requires_q
Good idea. Line 650: Line 651: Line 652: def _list2cmdline(args): Line 653: """
Line 703: Line 704: if not printable: Line 705: printable = command Line 706: Line 707: cmdline = repr(_list2cmdline(printable)) Will replace repr with %r in a separate patch Line 708: execCmdLogger.debug("%s (cwd %s)", cmdline, cwd) Line 709: Line 710: p = CPopen(command, close_fds=True, cwd=cwd, env=env, Line 711: deathSignal=deathSignal, childUmask=childUmask)
Nir Soffer has posted comments on this change.
Change subject: utils: Faster version of list2cmdline ......................................................................
Patch Set 4: Verified-1
Testing copy and paste complex commands from the log reveal that the quoting is wrong.
Nir Soffer has posted comments on this change.
Change subject: utils: Correct and faster list2cmdline ......................................................................
Patch Set 5: Verified+1 Code-Review+1
Changes:
- Address Dan comments - Fix quoting of single quote - Remove the repr() which render the output of the new function useless
Verified on Fedora 19 system by activating/deactivating the host and inspecting the log. I tested that lvm commamnds can be copy and run in the shell correctly.
Nir Soffer has posted comments on this change.
Change subject: utils: Correct and faster list2cmdline ......................................................................
Patch Set 5: -Code-Review
oVirt Jenkins CI Server has posted comments on this change.
Change subject: utils: Correct and faster list2cmdline ......................................................................
Patch Set 5:
Build Successful
http://jenkins.ovirt.org/job/vdsm_master_pep8_gerrit/8903/ : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_unit_tests_gerrit/9039/ : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_unit_tests_gerrit_el/8113/ : SUCCESS
Nir Soffer has posted comments on this change.
Change subject: utils: Correct and faster list2cmdline ......................................................................
Patch Set 6: Verified+1
Rebase and update profiling results with current master.
Verified by the tests and by inspecting the logs and copying complext lvm commands from the log and running them in the shell.
Here is an example complex lvm command that use correct quoting now:
/usr/bin/sudo -n /usr/sbin/lvm pvs --config ' devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 obtain_device_list_from_udev=0 filter = [ '''a|/dev/mapper/1IET_00010001|/dev/mapper/1IET_00010002|/dev/mapper/1IET_00010003|/dev/mapper/1IET_00010004|/dev/mapper/1IET_00010005|/dev/mapper/1IET_00010006|/dev/mapper/1IET_00010007|/dev/mapper/1IET_00010008|/dev/mapper/1IET_00010009|/dev/mapper/1IET_0001000a|/dev/mapper/1IET_00020001|/dev/mapper/1IET_00020002|/dev/mapper/1IET_00020003|/dev/mapper/1IET_00020004|/dev/mapper/1IET_00020005|/dev/mapper/1IET_00020006|/dev/mapper/1IET_00020007|/dev/mapper/1IET_00020008|/dev/mapper/1IET_00020009|/dev/mapper/1IET_0002000a|/dev/mapper/3600144f09dbd05000000522dde040011|/dev/mapper/3600144f09dbd05000000522dde050012|/dev/mapper/3600144f09dbd05000000522dde070013|/dev/mapper/3600144f09dbd05000000522dde090014|/dev/mapper/3600144f09dbd05000000522dde0a0015|/dev/mapper/3600144f! 09dbd05000000522dde0c0016|/dev/mapper/3600144f09dbd05000000522dde0d0017|/dev/mapper/3600144f09dbd05000000522dde0f0018|/dev/mapper/3600144f09dbd05000000522dde100019|/dev/mapper/3600144f09dbd05000000522dde12001a|''', '''r|.*|''' ] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 use_lvmetad=0 } backup { retain_min = 50 retain_days = 0 } ' --noheadings --units b --nosuffix --separator '|' --ignoreskippedcluster -o uuid,name,size,vg_name,vg_uuid,pe_start,pe_count,pe_alloc_count,mda_count,dev_size
oVirt Jenkins CI Server has posted comments on this change.
Change subject: utils: Correct and faster list2cmdline ......................................................................
Patch Set 6:
Build Failed
http://jenkins.ovirt.org/job/vdsm_master_unit_tests_gerrit/9743/ : FAILURE
http://jenkins.ovirt.org/job/vdsm_master_unit_tests_gerrit_el/8810/ : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_pep8_gerrit/9596/ : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_storage_functional_tests_localfs_ge... : FAILURE
oVirt Jenkins CI Server has posted comments on this change.
Change subject: utils: Correct and faster list2cmdline ......................................................................
Patch Set 6:
Build Failed
http://jenkins.ovirt.org/job/vdsm_master_unit_tests_gerrit_el/8810/ : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_pep8_gerrit/9596/ : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_storage_functional_tests_localfs_ge... : FAILURE
http://jenkins.ovirt.org/job/vdsm_master_unit_tests_gerrit/9744/ : FAILURE
oVirt Jenkins CI Server has posted comments on this change.
Change subject: utils: Correct and faster list2cmdline ......................................................................
Patch Set 6:
Build Successful
http://jenkins.ovirt.org/job/vdsm_master_unit_tests_gerrit/9746/ : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_unit_tests_gerrit_el/8810/ : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_pep8_gerrit/9596/ : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_storage_functional_tests_localfs_ge... : SUCCESS
Federico Simoncelli has posted comments on this change.
Change subject: utils: Correct and faster list2cmdline ......................................................................
Patch Set 6: Code-Review-1
(1 comment)
Marking -1 for visibility.
http://gerrit.ovirt.org/#/c/27548/6/lib/vdsm/utils.py File lib/vdsm/utils.py:
Line 657: parts = [] Line 658: for arg in args: Line 659: if _needs_quoting(arg) or arg == '': Line 660: quoted = ("'" + s + "'" for s in arg.split("'")) Line 661: arg = r"'".join(quoted) I suggest:
def _list2cmdline(args): """...""" parts = [] for arg in args: if _needs_quoting(arg) or arg == '': arg = "'" + arg.replace("'", r"'''") + "'" parts.append(arg) return ' '.join(parts)
It's clearer, it passes the tests and profiling it seems also faster (-45%). Line 662: parts.append(arg) Line 663: return ' '.join(parts) Line 664: Line 665:
Nir Soffer has posted comments on this change.
Change subject: utils: Correct and faster list2cmdline ......................................................................
Patch Set 6:
(1 comment)
http://gerrit.ovirt.org/#/c/27548/6/lib/vdsm/utils.py File lib/vdsm/utils.py:
Line 657: parts = [] Line 658: for arg in args: Line 659: if _needs_quoting(arg) or arg == '': Line 660: quoted = ("'" + s + "'" for s in arg.split("'")) Line 661: arg = r"'".join(quoted)
I suggest:
Nice!
Since you already tested and profiled it, I suggest that you upload the next patch with this change and updated commit message and mark it as verified. Line 662: parts.append(arg) Line 663: return ' '.join(parts) Line 664: Line 665:
Federico Simoncelli has posted comments on this change.
Change subject: utils: Correct and faster list2cmdline ......................................................................
Patch Set 6: -Code-Review
(1 comment)
http://gerrit.ovirt.org/#/c/27548/6/lib/vdsm/utils.py File lib/vdsm/utils.py:
Line 657: parts = [] Line 658: for arg in args: Line 659: if _needs_quoting(arg) or arg == '': Line 660: quoted = ("'" + s + "'" for s in arg.split("'")) Line 661: arg = r"'".join(quoted)
Nice!
Babysitting other people's patches obviously doesn't scale. Everyone should be proud of trying and improve his own patches and must take responsibility for them. I'll remove my -1 since it is obvious that the "visibility" part worked. Line 662: parts.append(arg) Line 663: return ' '.join(parts) Line 664: Line 665:
Yaniv Bronhaim has posted comments on this change.
Change subject: utils: Correct and faster list2cmdline ......................................................................
Patch Set 6: Code-Review+1
good and better as is, but updating it with federico's suggestion could be even better
Nir Soffer has posted comments on this change.
Change subject: utils: Correct and faster list2cmdline ......................................................................
Patch Set 7: Verified+1
This version use simpler and faster quoting suggested by Federico. I did not profiled again since the difference is negligible.
The original version was verified during profiling, this version was verified by running the tests.
oVirt Jenkins CI Server has posted comments on this change.
Change subject: utils: Correct and faster list2cmdline ......................................................................
Patch Set 7:
Build Successful
http://jenkins.ovirt.org/job/vdsm_master_unit_tests_gerrit_el/9261/ : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_pep8_gerrit/10045/ : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_unit_tests_gerrit/10200/ : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_verify-error-codes_merged/5127/ : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_storage-functional-tests-localfs_ge... : SUCCESS
http://jenkins.ovirt.org/job/vdsm_master_unit-tests_merged/3284/ : SUCCESS
Saggi Mizrahi has posted comments on this change.
Change subject: utils: Correct and faster list2cmdline ......................................................................
Patch Set 7:
(1 comment)
Minor suggestion
http://gerrit.ovirt.org/#/c/27548/7/lib/vdsm/utils.py File lib/vdsm/utils.py:
Line 731: Line 732: if not printable: Line 733: printable = command Line 734: Line 735: execCmdLogger.debug("%s (cwd %s)", _list2cmdline(printable), cwd) Even though the new function is a significant improvement I suggest we add:
if execCmdLogger.isEnabledFor(logging.DEBUG):
So that we don't even run _list2cmdline() if debug logging is turned off. Line 736: Line 737: p = CPopen(command, close_fds=True, cwd=cwd, env=env, Line 738: deathSignal=deathSignal, childUmask=childUmask) Line 739: if not sync:
Nir Soffer has posted comments on this change.
Change subject: utils: Correct and faster list2cmdline ......................................................................
Patch Set 7:
(1 comment)
http://gerrit.ovirt.org/#/c/27548/7/lib/vdsm/utils.py File lib/vdsm/utils.py:
Line 731: Line 732: if not printable: Line 733: printable = command Line 734: Line 735: execCmdLogger.debug("%s (cwd %s)", _list2cmdline(printable), cwd)
Even though the new function is a significant improvement I suggest we add:
Sure we can, but using this all over the code is little ugly.
How about something more general that we can reuse a lot:
class LazyString(object):
__slots__ = ("func", "args", "kwargs")
def __init__(self, func, *args, **kwargs): self.func = func self.args = args self.kwargs = kwargs
def __str__(self): return self.func(*args, **kwargs)
So now this becomes:
cmdline = LazyString(_list2cmdline, printable) execCmdLogger.debug("%s (cwd %s)", cmdline, cwd)
Not sure about the LazyString, just the first name that I could think about. Line 736: Line 737: p = CPopen(command, close_fds=True, cwd=cwd, env=env, Line 738: deathSignal=deathSignal, childUmask=childUmask) Line 739: if not sync:
Saggi Mizrahi has posted comments on this change.
Change subject: utils: Correct and faster list2cmdline ......................................................................
Patch Set 7:
(1 comment)
http://gerrit.ovirt.org/#/c/27548/7/lib/vdsm/utils.py File lib/vdsm/utils.py:
Line 731: Line 732: if not printable: Line 733: printable = command Line 734: Line 735: execCmdLogger.debug("%s (cwd %s)", _list2cmdline(printable), cwd)
Sure we can, but using this all over the code is little ugly.
Seems a bit too clever. Assumes a lot about the inner workings of the formatting command. Also, I'd much rather people avoid using LazyString.
We could just push a LogAdapter that process %(cmd)s with _list2cmdline.
But I don't think we need either. We rarely need to do this kind of stuff anyway. Line 736: Line 737: p = CPopen(command, close_fds=True, cwd=cwd, env=env, Line 738: deathSignal=deathSignal, childUmask=childUmask) Line 739: if not sync:
Nir Soffer has posted comments on this change.
Change subject: utils: Correct and faster list2cmdline ......................................................................
Patch Set 7:
(1 comment)
http://gerrit.ovirt.org/#/c/27548/7/lib/vdsm/utils.py File lib/vdsm/utils.py:
Line 731: Line 732: if not printable: Line 733: printable = command Line 734: Line 735: execCmdLogger.debug("%s (cwd %s)", _list2cmdline(printable), cwd)
Seems a bit too clever. Assumes a lot about the inner workings of the forma
Ok, I',ll handle this in a following patch - this is a general issue and we have more places where we like to log debug message only if logger is enabled for debug. Line 736: Line 737: p = CPopen(command, close_fds=True, cwd=cwd, env=env, Line 738: deathSignal=deathSignal, childUmask=childUmask) Line 739: if not sync:
Federico Simoncelli has posted comments on this change.
Change subject: utils: Correct and faster list2cmdline ......................................................................
Patch Set 7: Code-Review+2
(1 comment)
http://gerrit.ovirt.org/#/c/27548/7/lib/vdsm/utils.py File lib/vdsm/utils.py:
Line 731: Line 732: if not printable: Line 733: printable = command Line 734: Line 735: execCmdLogger.debug("%s (cwd %s)", _list2cmdline(printable), cwd)
Ok, I',ll handle this in a following patch - this is a general issue and we
Yes it seems something that belongs to another patch. Line 736: Line 737: p = CPopen(command, close_fds=True, cwd=cwd, env=env, Line 738: deathSignal=deathSignal, childUmask=childUmask) Line 739: if not sync:
Dan Kenigsberg has submitted this change and it was merged.
Change subject: utils: Correct and faster list2cmdline ......................................................................
utils: Correct and faster list2cmdline
subprocess.list2cmdline dominates the profile on idle vdsm with many storage domains, using more than 10% of cpu time. We use this function to format commands in the log, so it is possible to copy the command to the shell and run for debugging.
Turns out that this is undocumented function used on Windows to convert argument list to string compatible with the MS C runtime. Using repr(), it works on Linux for simple commands, but does not get quoting right with complex commands.
This patch replaces the function with a correct, simple and efficient implementation compatible with Linux shells. The new function should not be used with repr(), which render its output useless.
I tested this patch on Fedora 19 machine connected to 30 storage domains, running 10 minutes profiling session as spm. Without this patch subprocess.list2cmdline() is the most expensive function responsible for 12.5% of cpu time:
Thu Jun 5 23:15:18 2014 vdsmd.prof
2139104 function calls (2202966 primitive calls) in 16.652 seconds
Ordered by: internal time List reduced from 1785 to 10 due to restriction <10>
ncalls tottime percall cumtime percall filename:lineno(function) 2151 1.816 0.001 2.140 0.001 subprocess.py:579(list2cmdline) 97/128 1.407 0.015 11.004 0.086 threading.py:753(Thread.run) 10221 0.494 0.000 1.028 0.000 __init__.py:242(LogRecord.__init__) 5397 0.478 0.000 1.575 0.000 misc.py:125(findCaller) 8508 0.422 0.000 0.641 0.000 spark.py:211(Parser.buildState) 50336 0.404 0.000 0.463 0.000 genericpath.py:85(_splitext) 32657/32664 0.273 0.000 0.815 0.000 misc.py:110(_shouldLogSkip) 50095 0.271 0.000 0.271 0.000 pthread.py:95(Lock.lock) 2147 0.266 0.000 0.517 0.000 subprocess.py:1406(CPopen._communicate_with_poll) 50331/50336 0.238 0.000 0.708 0.000 posixpath.py:104(splitext)
With this patch, utils._list2cmdline() uses only fraction of the cpu time:
2158 0.085 0.000 0.118 0.000 utils.py:652(_list2cmdline)
Note: the profile was performed with older version of this patch, the current version is simpler more efficient.
Change-Id: Ibfc7819c126fe51cc068cc6f9f85b1002878385b Signed-off-by: Nir Soffer nsoffer@redhat.com Reviewed-on: http://gerrit.ovirt.org/27548 Reviewed-by: Federico Simoncelli fsimonce@redhat.com --- M lib/vdsm/utils.py M tests/utilsTests.py 2 files changed, 59 insertions(+), 3 deletions(-)
Approvals: Nir Soffer: Verified Federico Simoncelli: Looks good to me, approved
oVirt Jenkins CI Server has posted comments on this change.
Change subject: utils: Correct and faster list2cmdline ......................................................................
Patch Set 8:
Build Successful
http://jenkins.ovirt.org/job/vdsm_master_create-rpms_merged/1452/ : SUCCESS
vdsm-patches@lists.fedorahosted.org