Wed, Jan 13, 2021 at 11:31:24AM CET, olichtne(a)redhat.com wrote:
On Wed, Jan 13, 2021 at 11:13:35AM +0100, Jan Tluka wrote:
> Tue, Jan 12, 2021 at 02:52:52PM CET, olichtne(a)redhat.com wrote:
> >On Thu, Jan 07, 2021 at 02:52:29PM +0100, Jan Tluka wrote:
> >> The align_data method creates a copy of FlowMeasurementResults and
> >> based on the provided start and end timestamps copies only those
> >> measurement samples that fit the timestamp interval.
> >>
> >> Signed-off-by: Jan Tluka <jtluka(a)redhat.com>
> >> ---
> >> .../Perf/Measurements/BaseFlowMeasurement.py | 38 +++++++++++++++++++
> >> 1 file changed, 38 insertions(+)
> >>
> >> diff --git a/lnst/RecipeCommon/Perf/Measurements/BaseFlowMeasurement.py b/lnst/RecipeCommon/Perf/Measurements/BaseFlowMeasurement.py
> >> index bbf19a2e..27148a09 100644
> >> --- a/lnst/RecipeCommon/Perf/Measurements/BaseFlowMeasurement.py
> >> +++ b/lnst/RecipeCommon/Perf/Measurements/BaseFlowMeasurement.py
> >> @@ -4,6 +4,7 @@ from lnst.RecipeCommon.Perf.Measurements.MeasurementError import MeasurementErro
> >> from lnst.RecipeCommon.Perf.Measurements.BaseMeasurement import BaseMeasurement
> >> from lnst.RecipeCommon.Perf.Measurements.BaseMeasurement import BaseMeasurementResults
> >> from lnst.RecipeCommon.Perf.Results import SequentialPerfResult
> >> +from lnst.RecipeCommon.Perf.Results import ParallelPerfResult
> >>
> >> class Flow(object):
> >> def __init__(self,
> >> @@ -170,6 +171,43 @@ class FlowMeasurementResults(BaseMeasurementResults):
> >> def end_timestamp(self):
> >> return min([seq_result[-1].timestamp for seq_result in self.generator_results])
> >>
> >> + def _copy(self):
> >> + copy = FlowMeasurementResults(self.measurement, self.flow)
> >> +
> >> + return copy
> >
> >Not sure what this _copy method is for?
> >
>
> So, a bit longer story here.
>
> I originally started with deepcopy on the *Results objects, but it turned
> out that a deepcopy of such an object takes a significant amount of time
> (~2 minutes for a 20-second run). The deepcopy did not work for other
> reasons as well, e.g. the copies ended up with different Flow.Host objects
> when aggregating iterations.
Yeah, I thought it was something like that. It's very interesting that a
deepcopy takes 2 minutes. I'll make a note to look at optimizing the data
structures at some point, as that may also explain some memory
inefficiencies that I've indirectly seen sometimes.

I'm not quite sure if something can be optimized here. But yeah,
everything can be optimized.

The problem is mostly with CPUStatMeasurement, which generates a lot of
PerfInterval objects. For example, on a 24-CPU machine with five 60-second
iterations that is:

24 * 5 * 60 * ~6 (cpu_state metrics) = 43200 objects (!!!) to deepcopy

and that is for just one machine, so multiply that by two again.

With FlowMeasurements this is much faster, simply because we measure just 4
metrics (TX tput/cpu, RX tput/cpu), each with just 60 PerfInterval
samples.
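
The arithmetic above can be reproduced with a small sketch. The helper name and its parameters are hypothetical (not LNST API), the "~6" metrics per CPU is the approximate figure quoted above, and doubling for a second host is an assumed reading of the remark about multiplying again:

```python
# Hypothetical helper for illustration only; not part of LNST.
def sample_count(cpus, iterations, seconds, metrics_per_cpu, hosts=1):
    """Number of per-second samples, assuming one sample per metric per second."""
    return cpus * iterations * seconds * metrics_per_cpu * hosts


# CPU stats: 24 CPUs, five 60-second iterations, roughly 6 cpu_state metrics.
cpu_samples_one_host = sample_count(cpus=24, iterations=5, seconds=60, metrics_per_cpu=6)

# Assumption: two machines are measured, so the count doubles.
cpu_samples_two_hosts = sample_count(cpus=24, iterations=5, seconds=60, metrics_per_cpu=6, hosts=2)

# Flow results: 4 metrics, each with 60 samples (figures quoted above).
flow_samples = 4 * 60

print(cpu_samples_one_host, cpu_samples_two_hosts, flow_samples)
```

The gap between the CPU and flow counts is consistent with the flow results being much cheaper to copy.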
>
> Later I realized that I simply do not want to copy the whole object and then
> remove the individual iteration samples, but rather use a much simpler list
> comprehension.
>
> So I simply want to create an incomplete copy of the object, but with the
> same references to the Measurement, Flow and other objects. That solved the
> aggregation issue.
>
> There's no real value in the method other than as a hint to anyone who
> reads the code that the intention is to create a copy of the object.
>
> If you think it's not necessary I can remove this and use it inline,
> that is:
>
> result_copy = FlowMeasurementResults(self.measurement, self.flow)
Right, so instead of a deepcopy you just need a shallow one - copy.copy :)
I think it's OK to remove the _copy method for now to shorten the code,
and either do what you proposed (the explicit new object creation)
or just use result_copy = copy.copy(result)
-Ondrej
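
The difference between the three options discussed here (deepcopy, copy.copy, and explicit construction) can be sketched with stand-in classes. `Measurement` and `Results` below are simplified, hypothetical substitutes for the LNST classes and only model the attributes mentioned in this thread:

```python
import copy


class Measurement:
    """Stand-in for an LNST measurement object (assumption)."""


class Results:
    """Stand-in for FlowMeasurementResults; only a few attributes are modelled."""

    def __init__(self, measurement, flow):
        self.measurement = measurement
        self.flow = flow
        self.generator_results = []


original = Results(Measurement(), flow="flow-0")
original.generator_results.append([1, 2, 3])

shallow = copy.copy(original)      # new object, attributes shared by reference
deep = copy.deepcopy(original)     # new object, every attribute recursively copied
fresh = Results(original.measurement, original.flow)  # explicit construction

shallow_shares_measurement = shallow.measurement is original.measurement
deep_shares_measurement = deep.measurement is original.measurement
fresh_shares_measurement = fresh.measurement is original.measurement

# A shallow copy also shares the result list until the attribute is rebound.
shallow_shares_list = shallow.generator_results is original.generator_results
shallow.generator_results = []  # rebinding leaves the original untouched
original_len_after_rebind = len(original.generator_results)
```

One thing to keep in mind with copy.copy: the copy shares its result containers with the original until they are reassigned, so appending to them before rebinding would also modify the original. Explicit construction starts with empty containers instead.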
>
> >> +
> >> + def align_data(self, start, end):
> >> + result_copy = self._copy()
> >> +
> >> + # NOTE: iperf reports the cpu utilization for the whole test
> >> + # period, not each second, so the CPU samples cannot be aligned
> >> + result_copy.generator_cpu_stats = self.generator_cpu_stats
> >> + result_copy.receiver_cpu_stats = self.receiver_cpu_stats
> >> +
> >> + result_copy.generator_results = ParallelPerfResult()
> >> + result_copy.receiver_results = ParallelPerfResult()
> >> +
> >> + for stream in self.generator_results:
> >> + aligned_intervals = [
> >> + interval
> >> + for interval in stream
> >> + if interval.timestamp >= start and interval.timestamp <= end
> >> + ]
> >> +
> >> + result_copy.generator_results.append(SequentialPerfResult(aligned_intervals))
> >> +
> >> + for stream in self.receiver_results:
> >> + aligned_intervals = [
> >> + interval
> >> + for interval in stream
> >> + if interval.timestamp >= start and interval.timestamp <= end
> >> + ]
> >> +
> >> + result_copy.receiver_results.append(SequentialPerfResult(aligned_intervals))
> >> +
> >> + return result_copy
> >> +
> >> +
> >> class AggregatedFlowMeasurementResults(FlowMeasurementResults):
> >> def __init__(self, measurement, flow):
> >> super(FlowMeasurementResults, self).__init__(measurement)
> >> --
> >> 2.26.2
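
The filtering done by align_data in the patch above can be illustrated in isolation. This is a minimal sketch, not the LNST implementation: `Interval` is a hypothetical stand-in for PerfInterval, plain lists replace SequentialPerfResult and ParallelPerfResult, and `align_streams` is an illustrative helper that covers what the two near-identical loops do:

```python
from collections import namedtuple

# Stand-in for PerfInterval; only the timestamp attribute matters here.
Interval = namedtuple("Interval", ["timestamp", "value"])


def align_streams(streams, start, end):
    """Return new lists holding only samples with start <= timestamp <= end."""
    return [
        [interval for interval in stream if start <= interval.timestamp <= end]
        for stream in streams
    ]


# Two streams whose sampling windows only partially overlap.
streams = [
    [Interval(t, 10 * t) for t in range(0, 6)],  # timestamps 0..5
    [Interval(t, 20 * t) for t in range(2, 8)],  # timestamps 2..7
]

aligned = align_streams(streams, start=2, end=5)
aligned_timestamps = [[i.timestamp for i in stream] for stream in aligned]
print(aligned_timestamps)
```

Both bounds are inclusive, matching the `>= start` and `<= end` comparisons in the patch, and the input streams are left unmodified.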