On Mon, Jan 20, 2020 at 1:38 AM Bohdan Khomutskyi <bkhomuts(a)redhat.com> wrote:
> In my previous message, I mentioned that CPU is underutilized during
> installation. I haven't investigated further why, but I suspect it's due to the
> inefficiency caused by the usage of the loop device and/or inefficiency in the rsync
In all installations with xz compression, I see the loop1 device
pegged at 100% CPU (single thread), and perf shows this is almost
entirely lzma decompression. Better utilization would happen with
parallelized decompression threads. But what about the use cases where
there's only one CPU? VMs may not assign multiple CPUs, and what about
the ARM boards we support?
With a zstd compressed squashfs image, loop1 uses 30% CPU or less, and
IO of either the source or target is near 100% utilization. Therefore
I'm not sure parallelization in this case would improve things by much.
In fact, I have an optimization to file next weekend on my to do
> > All of the Live installations use rsync.
> And that's what I propose to change: to use unsquashfs instead of rsync. Preliminary
> benchmarks show an 8x improvement in decompression speed on local media for XZ on local
I had not considered unsquashfs, so that's an interesting optimization.
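
A minimal sketch of how such a comparison could be timed, for anyone who wants to reproduce it. It runs an arbitrary command and reports wall time plus the CPU time charged to that child process. The `measure` helper and the example command line in the comment are illustrative assumptions, not the benchmark method used in this thread. It only sees CPU time accounted to the child itself, so kernel-side work such as the loop device thread seen in perf would likely not show up here.

```python
import os
import subprocess
import sys
import time


def measure(cmd):
    """Run cmd to completion; return (wall, child_user_cpu, child_sys_cpu) in seconds."""
    before = os.times()
    start = time.perf_counter()
    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
    wall = time.perf_counter() - start
    after = os.times()
    # children_* only counts children that have been waited for,
    # which subprocess.run() has done by the time it returns.
    user = after.children_user - before.children_user
    system = after.children_system - before.children_system
    return wall, user, system


if __name__ == "__main__":
    # Hypothetical usage (paths are placeholders):
    #   python measure.py unsquashfs -f -d /mnt/target squashfs.img
    #   python measure.py rsync -a /run/rootfs/ /mnt/target/
    wall, user, system = measure(sys.argv[1:])
    print(f"wall={wall:.2f}s user={user:.2f}s sys={system:.2f}s")
```

Comparing the user figure against wall time gives a rough idea of whether a run is CPU bound or IO bound.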
> Yes, Zstd consumes 12.24x less CPU user time during unsquashfs, but
> let's consider the practical application.
I am. There's an electricity cost when there's enough heat generated
by an installation that my computer sounds like a hair dryer.
Therefore, I'm still biased against the heavy CPU cost hit for xz for
an insignificant reduction in ISO size, multiplied by thousands of
installations per week (real, virtual and test), quite a lot of which
don't use USB sticks as sources.
> Will Zstd decrease the installation time, given the constraints and optimization above?
> That's what I plan to investigate in upcoming weekends.
> My proposal focuses on reducing the installation media size, and recommends using
> certain compression options. But I think the final decision is to be made by FESCo.
If image size is a significant consideration, then evaluation of erofs
seems indicated. It promises both significant compression and CPU
performance. The intended use case is for Android device read-only
partitions with both limited storage and CPU/power capacity.