Re: Fedora 31 System-Wide Change proposal: Switch RPMs to zstd compression

Wednesday, 5 June 2019

On 6/5/19 12:53 AM, Chris Murphy wrote:
 On Mon, Jun 3, 2019 at 7:01 PM Jason L Tibbitts III <tibbs(a)math.uh.edu> wrote:
>
>>>>>> "PM" == Panu Matilainen <pmatilai(a)redhat.com> writes:
>
> PM> Note that rpm doesn't support parallel zstd compression, and while
> PM> it does for xz, that's not even utilized in Fedora.
>
> Doing parallel xz compression has a surprising cost in compression ratio
> which gets worse as the thread count increases (because it just splits
> the input into independent blocks and compresses them separately).  I
> did start on a feature to have it enabled but then abandoned that after
> realizing that it didn't really work as I'd hoped.

 Which is also why parallel xz compression doesn't produce reproducible results.
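
The ratio cost described above can be illustrated with a minimal sketch. It assumes Python's standard lzma module and synthetic, invented sample data, and it compresses each piece as a separate .xz stream, which is a simplification of how threaded xz lays out blocks. The function names are hypothetical and the numbers it prints say nothing about real Fedora packages.

```python
import lzma


def make_sample(n_lines=10000):
    # Synthetic, repetitive text standing in for a package payload.
    # This data is invented for illustration, not taken from a real RPM.
    return "".join(
        f"/usr/share/doc/pkg{i % 500}/README: version {i % 37}, license MIT\n"
        for i in range(n_lines)
    ).encode()


def compressed_size_whole(data, preset=6):
    # One stream: later data can reference matches found earlier in the input.
    return len(lzma.compress(data, preset=preset))


def compressed_size_blocks(data, n_blocks, preset=6):
    # Split into n_blocks independent pieces, each compressed from scratch.
    # No piece can reference data from another, which is what costs ratio.
    block = -(-len(data) // n_blocks)  # ceiling division
    return sum(
        len(lzma.compress(data[i:i + block], preset=preset))
        for i in range(0, len(data), block)
    )


if __name__ == "__main__":
    data = make_sample()
    print("whole:", compressed_size_whole(data))
    for n in (2, 4, 8, 16):
        print(n, "blocks:", compressed_size_blocks(data, n))
```

On this sample the split total comes out larger than the single-stream size, since every piece has to relearn the same repeated strings.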

> That said, I do wonder how difficult it would be to do parallel zstd
> compression/decompression within RPM.  If it were possible then that
> might help to obviate some of the downsides.

 At least for small files, and there are many in any distribution,
 using a dictionary could very well improve compression/decompression
 time and compression ratio more than threads would. Adding dictionary
 support would help all the single-threaded hardware, and even the
 builders when the zstd -T0 option ends up with only 1 or 2 threads
 available. On the generic sample set, it's functionally like getting
 4 threads on speed, and even compression ratio goes up by ~3x. But I
 have no idea how that sample set compares to Fedora's files.

Yes, but as I mentioned in another email, rpm doesn't compress the files
individually; it compresses them as one big continuous archive. A
dictionary is unlikely to help with that (in my quick test yesterday it
actually made things worse).
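
To make the per-file versus one-archive distinction concrete, here is a rough sketch. It is not the test mentioned above and it does not use zstd: it assumes zlib's preset dictionary (the zdict argument in Python's standard zlib module) as a stand-in for a zstd dictionary, and the files and dictionary contents are invented. It compares the dictionary's effect on many small files compressed one by one with its effect on the same files concatenated into a single stream.

```python
import zlib


def deflate_size(data, zdict=None):
    # zlib's preset dictionary (zdict) is used here as a stand-in for a
    # zstd dictionary; the two are different formats with a similar idea.
    if zdict is None:
        comp = zlib.compressobj(level=9)
    else:
        comp = zlib.compressobj(level=9, zdict=zdict)
    return len(comp.compress(data) + comp.flush())


def make_files(count=200):
    # Hypothetical small, similar files (invented for illustration).
    return [
        (f"Name: pkg{i}\nVersion: 1.{i}\nLicense: MIT\n"
         f"Summary: Example package number {i}\n").encode()
        for i in range(count)
    ]


# Hypothetical dictionary holding the boilerplate the files share.
SHARED = b"Name: pkg\nVersion: 1.\nLicense: MIT\nSummary: Example package number \n"


def compare(files, zdict=SHARED):
    # Sizes for per-file compression and for one continuous archive,
    # each with and without the preset dictionary.
    archive = b"".join(files)
    return {
        "per_file_plain": sum(deflate_size(f) for f in files),
        "per_file_dict": sum(deflate_size(f, zdict) for f in files),
        "archive_plain": deflate_size(archive),
        "archive_dict": deflate_size(archive, zdict),
    }


if __name__ == "__main__":
    for name, size in compare(make_files()).items():
        print(f"{name}: {size} bytes")
```

In the per-file case every small file benefits from the shared boilerplate in the dictionary. In the archive case the stream quickly builds that context on its own, so the dictionary can only help with the first few hundred bytes.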

	- Panu -
