https://fedoraproject.org/wiki/Changes/RPMCoW
== Summary ==
RPM Copy on Write provides a better experience for Fedora users, as it reduces the amount of I/O and offsets the CPU cost of package decompression. RPM Copy on Write uses the reflinking capabilities in btrfs, which is the default filesystem in Fedora 33.
== Owners ==
* Name: [[User:malmond|Matthew Almond]], [[User:dcavalca|Davide Cavalca]]
* Email: malmond@fb.com, dcavalca@fb.com
== Detailed description ==
Installing and upgrading software packages is a standard part of managing the lifecycle of any operating system. For the entire lifecycle of Fedora, all software has been packaged and distributed using the RPM file format. This proposal changes how software is downloaded and installed, leaving the distribution process unmodified.
=== Current process ===
# Resolve packaging request into a list of packages and operations
# Download and verify new packages
# Install and/or upgrade packages sequentially using RPM files, decompressing, and writing a copy of the new files to storage.
=== New process ===
# Resolve packaging request into a list of packages and operations
# Download and '''decompress''' packages into a '''locally optimized''' rpm file
# Install and/or upgrade packages sequentially using RPM files, using '''reference linking''' (reflinking) to reuse data already on disk.
The outcome is intended to be the same, but the order of operations is different.
# Decompression happens inline with download. This has a positive effect on resource usage: downloads are typically limited by bandwidth, so decompressing and writing the full data into a single file per rpm is essentially free. Additionally, if there is more than one download at a time, a multi-CPU system can be better utilized. All compression types supported in RPM work because this uses the rpm I/O functions.
# RPMs are cached on local storage between download and installation time as normal. This allows DNF to defer actual RPM installation until all the RPMs are available. This is unchanged.
# The file format for RPMs is different with Copy on Write. The headers are identical, but the payload is different. There is also a footer.
## Files are converted (“transcoded”) locally during download using <code>/usr/bin/rpm2extents</code> (part of the rpm codebase). The format is not intended to be “portable” - i.e. copying the files from the cache is not supported.
## Regular RPMs use a compressed .cpio based payload. In contrast, extent based RPMs contain uncompressed data aligned to the fundamental page size of the architecture, e.g. 4KiB on x86_64. This alignment is required for <code>FICLONERANGE</code> to work. Only files are represented in the payload; other directory entries like symlinks, device nodes etc. are constructed entirely from rpm header information. Files are referenced by their digest, so identical files are de-duplicated.
## The footer currently has three sections:
### A table of original (rpm) file digests, used to validate the integrity of the download in dnf.
### A table of digest → offset entries, used when actually installing files.
### An 8-byte signature at the end of the file, used to differentiate between traditional and extent based RPMs (a detection sketch follows this list).
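To make the trailing signature concrete, here is a minimal sketch in C of how a consumer could distinguish a transcoded file. The magic value and function name are illustrative assumptions; the actual signature bytes are defined by the rpm2extents patches.
<pre>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Placeholder magic -- the real 8-byte signature comes from rpm2extents. */
static const unsigned char EXTENTS_MAGIC[8] = "EXTSRPM0";

/* Returns 1 for an extent based rpm, 0 for a traditional one, -1 on error. */
int is_extent_rpm(const char *path)
{
    unsigned char tail[8];
    int ret = -1;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    /* The signature occupies the last 8 bytes of the file. */
    if (lseek(fd, -8, SEEK_END) >= 0 && read(fd, tail, 8) == 8)
        ret = (memcmp(tail, EXTENTS_MAGIC, 8) == 0);
    close(fd);
    return ret;
}
</pre>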
=== Notes ===
# The headers are preserved bit for bit during transcoding. This preserves signatures. The signatures cover the main header blob, and the main header blob ensures the integrity of data in two ways:
## Each file with content has a digest. Originally this was md5, but today it’s usually sha256. In normal RPM this is only used to verify the integrity of files, e.g. <code>rpm -V</code>. With CoW we use this as a content key.
## There are one or two digests (<code>PAYLOADDIGEST</code> and <code>PAYLOADDIGESTALT</code>) covering the payload archive (compressed cpio). The header value is preserved, but transcoded RPMs do not preserve the original payload structure, so RPM’s pre-installation verification (controlled by <code>%_pkgverify_level</code>) will fail. <code>dnf-plugin-cow</code> disables this check in dnf because dnf verifies the whole-file digest, which is captured during download/transcoding. The second digest is likely used for delta rpm.
# This is untested, and possibly incompatible with delta RPM (drpm). The process for reconstructing an rpm to install from a delta is expensive from both a CPU and I/O perspective, while only providing marginal benefits on download size. It is expected that having delta rpm enabled (which is the default) will be handled gracefully.
# Disk space requirements are expected to be marginally higher than before: all new packages or updates will consume their installed size before installation instead of about half their size (regular rpms with payloads still cost space).
# <code>rpm-plugin-reflink</code> will fall back to simple file copying when the destination path is not on the same filesystem/subvolume. A common example is <code>/boot</code> and/or <code>/boot/efi</code>. A sketch of this fallback follows the list below.
# The system will still work on other filesystem types, but will ''always'' fall back to simple copying. This is expected to be slightly slower than not enabling CoW because the source for copying will be the decompressed data.
# For systems that enable transparent filesystem compression: every file will continue to be decompressed from the original rpm, and then transparently re-compressed by the filesystem. There is no effective change here. There is a future project to investigate alternate distribution mechanics to provide parallel versions of file content pre-compressed in a filesystem specific format, reducing both CPU costs and I/O. It is expected that this will result in slightly higher network utilization because filesystem compression is purposely restricted to allow random I/O.
# The current implementation of <code>dnf-plugin-cow</code> is in Python, but it looks possible to implement this in <code>libdnf</code> instead, which would make it work in <code>packagekit</code>.
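A minimal sketch in C of that reflink-or-copy fallback, assuming a page-aligned source range inside the transcoded rpm. The function name and error handling are illustrative assumptions, not the actual <code>rpm-plugin-reflink</code> code.
<pre>
#include <errno.h>
#include <linux/fs.h>       /* FICLONERANGE, struct file_clone_range */
#include <sys/ioctl.h>
#include <sys/sendfile.h>
#include <sys/types.h>
#include <unistd.h>

/* Clone one file's extent out of the transcoded rpm into dest_fd,
 * degrading to a plain copy when reflinks are unavailable. */
int place_file(int src_fd, off_t src_off, off_t len, int dest_fd)
{
    struct file_clone_range range = {
        .src_fd      = src_fd,
        .src_offset  = src_off,   /* page aligned, hence the payload format */
        .src_length  = len,
        .dest_offset = 0,
    };

    if (ioctl(dest_fd, FICLONERANGE, &range) == 0)
        return 0;                 /* data is now shared, copy-on-write */
    if (errno != EXDEV && errno != EOPNOTSUPP && errno != EINVAL)
        return -1;
    /* e.g. /boot on another filesystem: fall back to simple copying */
    while (len > 0) {
        ssize_t n = sendfile(dest_fd, src_fd, &src_off, (size_t)len);
        if (n <= 0)
            return -1;
        len -= n;
    }
    return 0;
}
</pre>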
=== Performance Metrics ===
The ballpark performance difference is about half the duration for file download+install time. A lot of rpms are very small, so the difference is difficult to see or measure; larger RPMs give a much clearer signal.
(Actual numbers/charts will be supplied in Jan 2021)
=== Terminology ===
* '''Copy on Write (CoW)''' is a broad description of any technology that reduces or eliminates data duplication by sharing the data behind the scenes until one of the references makes changes. This has been a cornerstone technology in memory management in Unix systems. Here we use it specifically to reference Copy on Write as supported in modern filesystems, e.g. btrfs, xfs and potentially others.
* '''Reflink''' is the verb for duplicating stored data on a filesystem. See [https://man7.org/linux/man-pages/man2/ioctl_ficlonerange.2.html ioctl_ficlonerange(2)] for the specific call we use on Linux.
* '''Extent''' (based RPMs) refers to how payload file data is stored within an RPM. Normal RPMs simply contain a compressed CPIO archive. Extent based RPMs contain the raw data uncompressed, which can be referenced with reflink.
== Benefit to Fedora ==
Faster package installs and upgrades
== Scope ==
* Proposal owners:
** Merge changes to rpm, librepo to enable capabilities
** Add dnf-plugin-cow to available packages
** Test days
** Aid with documentation
* Other developers:
** rpm, librepo: review PRs as needed
* Release engineering: https://pagure.io/releng/issue/9914
* Policies and guidelines: N/A
* Trademark approval: N/A
== Upgrade/compatibility impact ==
None; RPM with CoW is not enabled by default.
Upgrades with <code>keepcache</code> set in <code>dnf.conf</code> will be able to use existing cached packages, but those will not be converted: transcoding only happens at download time.
If a system is configured to keep packages in the cache (<code>keepcache</code> in <code>dnf.conf</code>) and <code>dnf-plugin-cow</code> is later removed, the transcoded packages will be unusable. Running <code>dnf clean packages</code> resolves this.
== How to test ==
Enable RPM with CoW with:
<pre>
$ sudo dnf install dnf-plugin-cow
...
$ sudo dnf install hello
...
$ hello
Hello, world!
</pre>
There should be no end user visible changes, except timing.
== User experience ==
No user-visible changes are anticipated from this change proposal. It makes the feature available, but does not enable it by default.
== Dependencies ==
# A copy-on-write filesystem; this Change is primarily targeting btrfs, but RPM with CoW should work with XFS as well (untested)
# Most package install paths and the dnf package cache on the same filesystem / subvolume
# <code>rpm</code> with Copy on Write patch set: https://github.com/malmond77/rpm/tree/cow
# <code>librepo</code> with transcoding support: https://github.com/malmond77/librepo/tree/transcode_cow
# dnf-plugin-reflink (a new package): https://github.com/facebookincubator/dnf-plugin-cow/
== Contingency plan ==
* Contingency mechanism: will not include PR patches if not merged upstream; skip <code>dnf-plugin-cow</code>
* Contingency deadline: Final freeze
* Blocks release? No
* Blocks product? No
== Documentation ==
Documentation will be available at https://github.com/facebookincubator/dnf-plugin-cow in the coming weeks.
== Release Notes ==
RPM with CoW is not enabled by default. To enable it:
<pre>$ sudo dnf install dnf-plugin-cow</pre>
On Mon, Dec 21, 2020 at 11:29 AM Ben Cotton bcotton@redhat.com wrote:
This is very exciting! There is one thing, though: we need a libdnf plugin for PackageKit to use too. "DNF plugins" are at the Python layer, and libdnf has its own plugin system that C/C++ consumers can use. So if both a libdnf and a dnf plugin exist, then the experience is consistent between PK and DNF.
But that leads to my other question: why not just integrate this into libdnf and turn it into an option that can be activated in /etc/dnf/dnf.conf? That seems to be the most straightforward way to do this.
On Mon, Dec 21, 2020 at 11:39 am, Neal Gompa ngompa13@gmail.com wrote:
From the change proposal:
# The current implementation of <code>dnf-plugin-cow</code> is in Python, but it looks possible to implement this in <code>libdnf</code> instead, which would make it work in <code>packagekit</code>
On Mon, Dec 21, 2020 at 11:58 AM Michael Catanzaro mcatanzaro@gnome.org wrote:
Gah! I missed that. :)
On 21. 12. 2020 at 17:39, Neal Gompa wrote:
On Mon, Dec 21, 2020 at 11:29 AM Ben Cotton bcotton@redhat.com wrote:
## Files are converted (“transcoded”) locally during download using <code>/usr/bin/rpm2extents</code> (part of rpm codebase). The format
I cannot find it anywhere in the rpm codebase.
# Disk space requirements are expected to be marginally higher than before: all new packages or updates will consume their installed size before installation instead of about half their size (regular rpms with payloads still cost space).
The size is already an issue (for me) on small cloud images. But I do not use BTRFS there, so in the end I do not care :)
The ballpark performance difference is about half the duration for file download+install time. A lot of rpms are very small, so the difference is difficult to see or measure; larger RPMs give a much clearer signal.
Hmm, I personally see much better performance (and storage) improvements in enabling %_minimize_writes; however, there is still https://bugzilla.redhat.com/show_bug.cgi?id=1872141 to be resolved before this gets enabled by default.
I cannot find it anywhere in the rpm codebase.
The current status section of the proposal describes this as pending two PRs, and they're enumerated in the dependencies list. Most of the code is in https://github.com/malmond77/rpm/tree/cow and enabled through work in https://github.com/malmond77/librepo/tree/transcode_cow
Hmm, I personally see much better performance (and storage) improvements in enabling %_minimize_writes; however, there is still https://bugzilla.redhat.com/show_bug.cgi?id=1872141 to be resolved before this gets enabled by default.
I'm curious about this so I'll look at it, but at first glance it seems tangential to this proposal.
Thanks, Matthew.
On 12/21/20 12:28 PM, Ben Cotton wrote:
...
=== New process ===
# Resolve packaging request into a list of packages and operations
# Download and '''decompress''' packages into a '''locally optimized''' rpm file
# Install and/or upgrade packages sequentially using RPM files, using '''reference linking''' (reflinking) to reuse data already on disk.
This sounds great because free space requirements can be reduced, especially when installing new packages.
I have experimented with building very small appliances using btrfs compression on things like /usr/share. I think this could disrupt that, because if I am correct the extents will first be downloaded to a temporary directory without compression enabled.
I would be happy with an option to disable this behavior.
On Mon, 2020-12-21 at 12:54 -0400, Robert Marcano via devel wrote:
This sounds great because free space requirements can be reduced, especially when installing new packages.
I have experimented with building very small appliances using btrfs compression on things like /usr/share. I think this could disrupt that, because if I am correct the extents will first be downloaded to a temporary directory without compression enabled.
For CoW to be beneficial, the package cache should be on the same filesystem used for the bulk of the system. In this scenario, compression should work just fine, as long as it's enabled on the appropriate subvolumes.
I would be happy with an option to disable this behavior.
To be clear, for this Change we do not plan to enable CoW by default. It would be a user opt-in via the dnf-plugin-cow package.
Cheers Davide
On Mon, Dec 21, 2020, 8:19 PM Davide Cavalca via devel <devel@lists.fedoraproject.org> wrote:
For CoW to be beneficial, the package cache should be on the same filesystem used for the bulk of the system. In this scenario, compression should work just fine, as long as it's enabled on the appropriate subvolumes.
On btrfs there is a per-file compression flag, so you can set compression on a directory without having compression on the DNF cache directory on the same volume.
To be clear, for this Change we do not plan to enable CoW by default. It would be a user opt-in via the dnf-plugin-cow package.
Good, thanks
=== New process ===
# Resolve packaging request into a list of packages and operations
# Download and '''decompress''' packages into a '''locally optimized''' rpm file
# Install and/or upgrade packages sequentially using RPM files, using '''reference linking''' (reflinking) to reuse data already on disk.
This sounds great because free space requirements can be reduced, especially when installing new packages.
I need to re-word this: the "reuse" of data is between the locally downloaded rpm and the installed destination. I do have a plan to investigate making rpm2extents enumerate the dnf/rpm cache (if you enable it) and reflink any shared data between rpms, saving writes.
Today this proposal explains that disk space requirements during updates are expected to be higher. See https://fedoraproject.org/wiki/Changes/RPMCoW#Notes item 3.
I have experimented with building very small appliances using btrfs compression on things like /usr/share. I think this could disrupt that, because if I am correct the extents will first be downloaded to a temporary directory without compression enabled.
There is also some confusion between compressed data in the rpm and the transcoded one, and filesystem level compression. This proposal affects the former, but not the latter. I'd caution against using btrfs specific attributes to disable compression on the dnf/rpm cache directory tree, because then the extents written/shared to the installed file locations will also not be compressed. (This is my interpretation of what I expect to see with the FICLONERANGE ioctl etc.: it'd be slower if it honored filesystem level compression because it'd need to re-write the data.)
I would be happy with an option to disable this behavior.
I'm unclear on which behavior you're referring to. This proposal adds support for Copy on Write in Fedora, but does not make it the default at this time.
Thanks, Matthew.
On Tue, Dec 22, 2020 at 12:58 PM Matthew Almond via devel devel@lists.fedoraproject.org wrote:
There is also some confusion between compressed data in the rpm and the transcoded one, and filesystem level compression. This proposal affects the former, but not the latter. I'd caution against using btrfs specific attributes to disable compression on the dnf/rpm cache directory tree, because then the extents written/shared to the installed file locations will also not be compressed. (This is my interpretation of what I expect to see with the FICLONERANGE ioctl etc.: it'd be slower if it honored filesystem level compression because it'd need to re-write the data.)
It shouldn't need to rewrite the data. The ficlonerange offset and length are based on the Btrfs logical address space, and this is uncompressed. That behind the scenes it happens to be compressed is a sort of "last mile" detail, similar to where the file is actually located. A Btrfs logical address for a file suggests there is exactly one copy of the file and one copy of its metadata, but via chunk tree lookup it may be that this file has two copies (raid1), or it may be located on any one of a number of devices. Yet ficlonerange still works as expected regardless of those details.
On Mon, Dec 21, 2020 at 11:28:51AM -0500, Ben Cotton wrote:
# dnf-plugin-reflink (a new package): https://github.com/facebookincubator/dnf-plugin-cow/
It does not exist, but I've just noticed it mentioned in Current Status on the wiki: 3.2 GitHub repo needs to be published
On Mon, 2020-12-21 at 18:00 +0100, Tomasz Torcz wrote:
Yeah, apologies for that, we wanted to get the Change proposal out asap to start the discussion and gather feedback, but a few of the pieces are still in the works. Specifically, the repo is currently pending internal review and should be out soon.
Cheers Davide
On Mon, Dec 21, 2020, at 11:28 AM, Ben Cotton wrote:
== Summary ==
RPM Copy on Write provides a better experience for Fedora users, as it reduces the amount of I/O and offsets the CPU cost of package decompression. RPM Copy on Write uses the reflinking capabilities in btrfs, which is the default filesystem in Fedora 33.
A bunch of points here:
- No, it's the default for one Edition. Others don't default to it. And even for Workstation we can't *require* it because it's definitely supported to use other filesystems and storage layouts.
- Orthogonal to this, I'd also note that xfs supports reflinks too.
Combining those I'd say instead e.g.: "Most Fedora Editions default to a filesystem that supports reflinks, e.g. btrfs or xfs" (actually I think IoT defaults to ext4 for...probably they didn't consider it?)
- When talking about RPMs we need to think about container images, which use overlayfs by default, which defers to the underlying filesystem for reflinks - so should be fine, but should be explicitly written down (and tested)
- Generally incompatible RPM payload changes cause pain proportional to how far they're "not backported", e.g. if support for this isn't in Fedora N-1 (e.g. Fedora 32) it will be harder for current Koji/mock model. Nowadays many more people use podman than mock, which e.g. if using a RHEL8 host will naturally avoid the dependency on an updated RPM. But
# Decompression happens inline with download.
rpm-ostree does this by default today BTW (rpms are unpacked into local ostree commits in parallel even).
## Regular RPMs use a compressed .cpio based payload. In contrast, extent based RPMs contain uncompressed data aligned to the fundamental page size of the architecture, e.g. 4KiB on x86_64. This alignment is required for <code>FICLONERANGE</code> to work. Only files are represented in the payload; other directory entries like symlinks, device nodes etc. are constructed entirely from rpm header information.
This is the core change; some interesting tradeoffs here. Python projects in particular ship a lot of files smaller than 4k (the classic example is `__init__.py`, which is zero sized). And ppc64le uses 64KiB pages, right? So there will be "zero space" to align, right? Would need some math to see how much this would add up to, although I guess the implementation could instead use holes?
Files are referenced by their digest, so identical files are de-duplicated.
But just inside a single RPM, right? It's interesting to compare with ostree which does this by default; conceptually this is using reflinks inside a single RPM to do what ostree does system wide with hardlinks.
BTW we learned a few things, notably zero sized files are tricky because there can be a *lot* of them - see e.g. https://github.com/ostreedev/ostree/pull/2197 That one was too many hardlinks, but how well do filesystems like btrfs/xfs handle thousands of reflinks instead? The Python __init__.py thing is such a pathological case...
# Disk space requirements are expected to be marginally higher than before: all new packages or updates will consume their installed size before installation instead of about half their size (regular rpms with payloads still cost space).
This won't matter much for small updates but could be quite noticeable for larger system upgrades.
This all said, the more I think about this, wouldn't it be way simpler to change rpm to support a "temporary root directory", e.g. `/usr/.rpmtemp` or whatever. Then dnf/zypper/etc can do the unpack-and-download model without any format changes to RPM - instead of reflinking it'd just be rename() into place. This is effectively what rpm-ostree is doing today, except with ostree commits instead of a temporary directory.
On Mon, Dec 21, 2020 at 12:49 PM Colin Walters walters@verbum.org wrote:
No, it's the default for one Edition. Others don't default to it. And even for Workstation we can't *require* it because it's definitely supported to use other filesystems and storage layouts.
Orthogonal to this, I'd also note that xfs supports reflinks too.
Combining those I'd say instead e.g.: "Most Fedora Editions default to a filesystem that supports reflinks, e.g. btrfs or xfs" (actually I think IoT defaults to ext4 for...probably they didn't consider it?)
It'd be more accurate to say most Fedora variants default to Btrfs. The only exceptions right now are Cloud, Server, and CoreOS. But yes, Fedora Server's current default of XFS on LVM means it also supports reflinks.
As an aside, I *really* hate this split of terminology we have among Editions, Spins, and Labs. It's confusing to everyone. :(
When talking about RPMs we need to think about container images, which use overlayfs by default, which defers to the underlying filesystem for reflinks - so should be fine, but should be explicitly written down (and tested)
Generally incompatible RPM payload changes cause pain proportional to how far they're "not backported", e.g. if support for this isn't in Fedora N-1 (e.g. Fedora 32) it will be harder for current Koji/mock model. Nowadays many more people use podman than mock, which e.g. if using a RHEL8 host will naturally avoid the dependency on an updated RPM. But
Incomplete statement here?
That said, we don't have a problem in the Koji/Mock model anymore, as bootstrap mode is now activated. Additionally, Mock uses systemd-nspawn by default for all cases except Koji (which overrides this because it can't handle nspawn mode at the moment).
This all said, the more I think about this, wouldn't it be way simpler to change rpm to support a "temporary root directory", e.g. `/usr/.rpmtemp` or whatever. Then dnf/zypper/etc can do the unpack-and-download model without any format changes to RPM - instead of reflinking it'd just be rename() into place. This is effectively what rpm-ostree is doing today, except with ostree commits instead of a temporary directory.
Sure, this makes some degree of sense, but it doesn't reduce the IOPS for actually *doing* the installation. My understanding is that this Change is intended to reduce the thrashing when doing package transactions.
This is also a flaw with RPM-OSTree, since you have to fetch everything individually and construct the root by shifting hardlinks or reflinks around.
-- 真実はいつも一つ!/ Always, there's only one truth!
On Mon, Dec 21, 2020 at 01:07:42PM -0500, Neal Gompa wrote:
As an aside, I *really* hate this split of terminology we have among Editions, Spins, and Labs. It's confusing to everyone. :(
The website hasn't been changed, but officially all of these are Fedora Solutions, with only Editions being a special case. Other outputs can call themselves Spin, Lab, Image, or whatever, as they like for their own marketing.
https://docs.fedoraproject.org/en-US/council/policy/guiding-policy/#_what_do...
On Mon, Dec 21, 2020, at 1:07 PM, Neal Gompa wrote:
Sure, this makes some degree of sense, but it doesn't reduce the IOPS for actually *doing* the installation.
Yes it does. It avoids writing the compressed data and then copying it back out uncompressed, which is the same amount of savings as the reflink approach.
(It's also equally incompatible with deltarpm)
This is also a flaw with RPM-OSTree, since you have to fetch everything individually
No - static deltas exist, plus layered RPMs work on the wire the same. But this isn't really relevant here.
and construct the root by shifting hardlinks or reflinks around.
Adding a hardlink indeed requires updating inodes proportional to the number of files, but that's more an implementation of the transactional update approach, not of the "download and unpack in parallel" part which is more what we're discussing here. (Though they are entangled a bit)
Anyways, I'd still stand by my summary that the much lower tech "files in temporary directory that get rename()d" approach would be all of *more* efficient on disk, simpler to implement and much less disruptive than an RPM format change. (The main cost would be a new temporary directory path that would need cleanup as part of e.g. `yum clean` etc.)
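A minimal sketch in C of this alternative, using the hypothetical `/usr/.rpmtemp` staging root named above; the function and path handling are illustrative, not an actual rpm or dnf API.
<pre>
#include <stdio.h>

/* Stage-then-activate: the file was already unpacked under the staging
 * root at download time, so installation is a metadata-only rename(). */
int activate_file(const char *relpath)
{
    char staged[4096], dest[4096];

    snprintf(staged, sizeof(staged), "/usr/.rpmtemp/%s", relpath);
    snprintf(dest, sizeof(dest), "/%s", relpath);
    /* rename() is atomic and rewrites no file data, but it only works
     * within a single filesystem -- the same constraint reflinking has. */
    return rename(staged, dest);
}
</pre>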
I'm replying to a bunch of topics in the same thread (via the web ui because I wasn't subscribed to the mailing list until today, yikes)
On editions: I wrote fedora-workstation because that's the one that has btrfs as root by default.
Zero byte files: I think reflinking is specifically fine here because reflinking is about contents, not inodes. A zero byte reflink should be a no-op at the filesystem level (but I should check; if it's not, I can special-case it easily enough). The process of installing files based on reflinks involves actually opening new files, then reflinking content.
On small files and alignment/waste: I believe most mutable filesystems do "waste some space". I call it out here because it's explicitly in the file format, the same as in .tar (without compression), and it's because FICLONERANGE and the filesystems demand it. I account for it as (number of files) x (native block size) / 2, i.e. I assume 50% usage of the tail block of every file. For example, a payload set of 100,000 files on 4KiB blocks would waste roughly 100,000 x 2KiB ≈ 200MiB. The block size of ppc64 is unfortunate, but I expect the same level of waste happens whether you're using reflinking or not.
Talking about the topic more broadly:
The hardlinking approach in rpm-ostree depends on either a completely read-only system, or the use of a layered filesystem like overlayfs. I think it's a completely valid approach, and to my understanding, is the technology that underpins Fedora CoreOS and Project Atomic. These are different distro builds and have specific use cases in mind. As I understand it, they also have very different management policies: they are intended to be managed in a specific way, and updates seem to require a reboot.
My hope for CoW for RPM is to bring a similar set of capabilities and benefits to Fedora, and eventually CentOS and RHEL, without requiring any changes to how the system works or is managed. The new requirements are fairly simple: one filesystem for the rootfs and dnf cache, and that this filesystem supports reflinking.
Today data deduplication is within a given rpm. Looking forwards, I would like to extend the rpm2extents processor to read and re-use other blocks from the dnf/rpm cache and then we get full system level de-duplication.
I am really grateful for all this feedback; hopefully what I write makes sense. - Matthew
On Tue, Dec 22, 2020 at 09:41:35PM -0000, Matthew Almond via devel wrote:
On Mon, Dec 21, 2020, at 1:07 PM, Neal Gompa wrote:
Yes it does. It avoids writing the compressed data and then copying it back out uncompressed, which is the same amount of savings as the reflink approach.
(It's also equally incompatible with deltarpm)
This part doesn't seem to have been answered...
I'll restate what Colin said (please chime in if I misunderstand the proposal):
During the download, packages are unpacked into a temporary root (/usr/.rpmtemp...), and the rpm headers are stored to disk in the normal download location. During the installation, files are rename()d from this temporary location to the final destination.
I fail to see why this would be significantly better... The logic to handle the split rpm contents would seem to be more complicated than the rewrite with /usr/bin/rpm2extents. Other comments?
Zbyszek
On Sat, Jan 2, 2021, at 10:03 AM, Zbigniew Jędrzejewski-Szmek wrote:
I fail to see why this would be significantly better...
I don't claim that the "separate temporary directory of unpacked content" is *better* - just that it's as easy to implement *and* doesn't require an RPM format change (with all the consequent pain) or support for reflinks from the underlying filesystem.
The logic to handle the split rpm contents would seem to be more complicated than the rewrite with /usr/bin/rpm2extents. Other comments?
Hard to really say for sure I guess without trying to write both. Probably the biggest impediment is that changes like that would end up needing to be split across the librpm + zypper/rpm-ostree/dnf tools. It wasn't an accident really that for rpm-ostree /usr/bin/rpm is read-only - we effectively squash those layers together and can thus make deep changes as a single unit.
Anyways, none of this really *requires* reflinks in any way, and so calling the Change "RPMCoW" is misleading from that perspective. "DnfParallelUnpack" would probably be a better title, with a dependency on "RPMFormatCowReady" or something. And then my point is that one could do "DnfParallelUnpack" without changing the RPM format and without much more complexity, if any.
On Sun, 2021-01-03 at 16:16 -0500, Colin Walters wrote:
Anyways, none of this really *requires* reflinks in any way, and so calling the Change "RPMCoW" is misleading from that perspective. "DnfParallelUnpack" would probably be a better title, with a dependency on "RPMFormatCowReady" or something. And then my point is that one could do "DnfParallelUnpack" without changing the RPM format and without much more complexity, if any.
Early on in this project I looked at creating all the files during download in a temporary directory. It would work, and it is more filesystem type agnostic. If moving the decompression to an earlier step were the sole goal, it would be reasonable.
The goal of RPMCoW is to write once, and re-use data multiple times. This comes up in a number of circumstances for this proposal:
1. Reflinking allows for de-duplication of file content. Today this is only within a single RPM. I am looking at changing rpm2extents to reuse data across (cached) rpms to achieve something kind of like delta rpm. That is: if you already have file X, you don't write it, you clone it from any other rpm.
2. Reflinking allows sharing of file contents, without side effects from the installed copy. Each copy is a real, distinct file, and can be deleted and/or modified. Only the differences cost something, and 99% of rpm files don't get modified. The net result is that the rpm cache costs very little.
3. If you can keep an rpm cache, you can reuse the data very quickly, either to build a new rootfs in a subdir/subvolume with the same or different packages, or to use those files for containers. This sounds similar to using snapshots, but with snapshots you're operating on a filesystem at a time, and you can only go backwards. Here you can decide what you want, and you get maximum reuse automatically.
By contrast "DnfParallelUnpack" by itself, without CoW, is less useful because you will need to re-fetch and re-decompress data.
Lastly, I'd like to emphasize that I'm not trying to change the "normal rpm format". Doing so would orphan every previously built and signed rpm, and would present a serious backward compatibility problem. I aim only to change how rpms are downloaded, stored locally in the cache, and consumed by rpm itself, within the confines of hosts that (can) enable this.
- Matthew
On Mon, 2020-12-21 at 12:48 -0500, Colin Walters wrote:
On Mon, Dec 21, 2020, at 11:28 AM, Ben Cotton wrote:
== Summary ==
RPM Copy on Write provides a better experience for Fedora Users as it reduces the amount of I/O and offsets CPU cost of package decompression. RPM Copy on Write uses reflinking capabilities in btrfs, which is the default filesystem in Fedora 33.
A bunch of points here:
- No, it's the default for one Edition. Others don't default to it. And even for Workstation we can't *require* it because it's definitely supported to use other filesystems and storage layouts.
- Orthogonal to this, I'd also note that xfs supports reflinks too.
Combining those I'd say instead e.g.: "Most Fedora Editions default to a filesystem that supports reflinks, e.g. btrfs or xfs" (actually I think IoT defaults to ext4 for...probably they didn't consider it?)
Thanks for surfacing this, we'll make the language clearer. About XFS: it should work, but we haven't tested it extensively, and this work has been developed primarily with btrfs in mind.
- When talking about RPMs we need to think about container images, which use overlayfs by default, which defers to the underlying filesystem for reflinks - so it should be fine, but it should be explicitly written down (and tested)
If reflinking isn't possible (which can also happen if e.g. the package cache and the system are on different filesystems) things work as normal, albeit with a performance penalty (because more I/O is required to install the package).
I'll let Matthew weigh in on the other points you raised. Thanks for the feedback!
Cheers Davide
On Mon, Dec 21, 2020 at 10:49 AM Colin Walters walters@verbum.org wrote:
On Mon, Dec 21, 2020, at 11:28 AM, Ben Cotton wrote:
## Regular RPMs use a compressed .cpio based payload. In contrast, extent based RPMs contain uncompressed data aligned to the fundamental page size of the architecture, e.g. 4KiB on x86_64. This alignment is required for <code>FICLONERANGE</code> to work. Only files are represented in the payload, other directory entries like symlinks, device nodes etc are constructed entirely from rpm header information.
This is the core change; some interesting tradeoffs here. Python projects in particular ship a lot of files smaller than 4k (the classic example is `__init__.py`, which is zero-sized). And ppc64le uses 64KiB pages, right? So there will be "zero space" padding to align, right? Would need some math to see how much this would add up to, although I guess the implementation could instead use holes?
I'm not sure about XFS or ext4 zero length file handling.
On Btrfs, it's a few hundred bytes. The file has no EXTENT_DATA item, therefore it's the same whether you write a new zero length file or reflink copy it.
Files bigger than 0 bytes but less than 2KiB will tend to result in inline extents, i.e. EXTENT_DATA item contains the data in the same metadata leaf as the inode rather than referencing some 4KiB data block elsewhere.
Hardlinks take around 100 bytes, so they are slightly more efficient space-wise. But they can't have separate selinux labels, acls, or permissions, can't be located in different subvolumes, and are limited to 65536 hardlinks per file. Reflinks don't have those limitations.
Files are referenced by their digest, so identical files are de-duplicated.
But just inside a single RPM, right? It's interesting to compare with ostree which does this by default; conceptually this is using reflinks inside a single RPM to do what ostree does system wide with hardlinks.
BTW we learned a few things, notably zero sized files are tricky because there can be a *lot* of them - see e.g. https://github.com/ostreedev/ostree/pull/2197 That one was too many hardlinks, but how well do filesystems like btrfs/xfs handle thousands of reflinks instead? The Python __init__.py thing is such a pathological case...
Thousands aren't a problem, nor are tens of thousands. A reflink is a normal file that just so happens to have extents shared with another file. It's the shared extent part that makes them sorta special, but there's nothing in the structure of the file that says it's a reflink. Whereas for a symlink or hard link, there is.
Shared extents are also produced by snapshots and dedup. It's the same on-disk manifestation in all three cases. And at least on Btrfs there are examples of millions of shared extents. But the workload will dictate the extent layout, to what degree extents are shared, become unshared, result in COW for modifications, and how much file and free space fragmentation ensues. Those can be much bigger issues than the number of reflinks.
-- Chris Murphy
Cool. A few questions inline...
On Mon, Dec 21, 2020 at 11:28:51AM -0500, Ben Cotton wrote:
https://fedoraproject.org/wiki/Changes/RPMCoW
== Summary ==
RPM Copy on Write provides a better experience for Fedora Users as it reduces the amount of I/O and offsets CPU cost of package decompression. RPM Copy on Write uses reflinking capabilities in btrfs, which is the default filesystem in Fedora 33.
What happens if you enable this on non btrfs installs? Does it just not work gracefully? Does it fail somehow? I think we need to be sure it doesn't actually do anything bad for other non CoW filesystems.
...snip...
### Signature 8 bytes at the end of the file, used to differentiate between traditional RPMs and extent based.
So, there's no change to rpm building or signing, as all that's done in transcoding them on download?
=== Notes ===
# The headers are preserved bit for bit during transcoding. This preserves signatures. The signatures cover the main header blob, and the main header blob ensures the integrity of data in two ways: ## Each file with content has a digest. Originally this was md5, but today it’s usually sha256. In normal RPM this is only used to verify the integrity of files, e.g. <code>rpm -V</code>. With CoW we use this as a content key. ## There is/are one or two digests (<code>PAYLOADDIGEST</code> and <code>PAYLOADDIGESTALT</code>) covering the payload archive (compressed cpio). The header value is preserved, but transcoded RPMs do not preserve the original structure so RPM’s pre-installation verification (controlled by <code>%_pkgverify_level</code>) will fail. <code>dnf-plugin-cow</code> disables this check in dnf because it verifies the whole file digest which is captured during
Could rpm learn about this and still do its verify in this case?
download/transcoding. The second one is likely used for delta rpm. # This is untested, and possibly incompatible with delta RPM (drpm). The process for reconstructing an rpm to install from a delta is expensive from both a CPU and I/O perspective, while only providing marginal benefits on download size. It is expected that having delta rpm enabled (which is the default) will be handled gracefully.
I imagine drpms could still be used: once you have constructed the final rpm, you transcode it as if you had just downloaded it?
But in general perhaps we should decide how much value drpms provide these days and either make sure we are making more of them, or drop them.
...snip...
=== Performance Metrics ===
Ballpark performance difference is about half the duration for file download+install time. A lot of rpms are very small, so it’s difficult to see/measure. Larger RPMs give a much clearer signal.
(Actual numbers/charts will be supplied in Jan 2021)
Nice!
kevin
Am 21.12.20 um 18:53 schrieb Kevin Fenzi:
But in general perhaps we should decide how much value drpms provide these days and either make sure we are making more of them, or drop them.
delta rpms save so much time in the form of bandwidth on the client side.
If something really needs to change, it is the 50+ MB repo database that gets downloaded. It takes ages on slow connections to download, and then you want to increase the size of the rpms too... That doesn't sound like a good idea.
best regards, Marius Schwarz
On Mon, Dec 21, 2020 at 1:14 PM Marius Schwarz fedoradev@cloud-foo.de wrote:
Am 21.12.20 um 18:53 schrieb Kevin Fenzi:
But in general perhaps we should decide how much value drpms provide these days and either make sure we are making more of them, or drop them.
delta rpms save so much time in the form of bandwidth on the client side.
If something really needs to change, it is the 50+ MB repo database that gets downloaded. It takes ages on slow connections to download, and then you want to increase the size of the rpms too... That doesn't sound like a good idea.
You should be getting delta fetching of repository metadata with zchunk metadata, which we've had enabled since Fedora 30: https://fedoraproject.org/wiki/Changes/Zchunk_Metadata
Is this not working for you or something?
Neal Gompa writes:
On Mon, Dec 21, 2020 at 1:14 PM Marius Schwarz fedoradev@cloud-foo.de wrote:
If something really needs to change, it is the 50+ MB repo database that gets downloaded. It takes ages on slow connections to download, and then you want to increase the size of the rpms too... That doesn't sound like a good idea.
You should be getting delta fetching of repository metadata with zchunk metadata, which we've had enabled since Fedora 30: https://fedoraproject.org/wiki/Changes/Zchunk_Metadata
Is this not working for you or something?
Well, I don't know what's working for me, or not working for me. All I know is that:
1) I'm rsyncing the updates repo to my LAN, and other machines in my LAN have the default updates repo disabled and a replacement repo pointing at my local copy.
2) Even on the LAN, an update downloads something from the local repo, giving me a zippy progress indication of the download. After the download it sits for a noticeable amount of time before it decides exactly what it's going to update and then gives me the list. This is especially noticeable for a Fedora guest that I'm running in a VM that's emulating an aarch64 platform. In the emulated aarch64 VM, downloading of RPMs goes a bit slow, but the subsequent pause after download is quite noticeable.
Except for rsyncing a mirror of the updates repo locally and then pointing everyone to my local mirror, I am not doing any other customization and that's the behavior I've seen.
Having said all that, I don't find the update process to be that much of a pain point right now, or in any dire need of improvement. It works. It is fairly reliable. A bit slow, but who cares. The important thing is that, except for a burst of segfaults downloading rpms earlier this year (haven't had any in a while), it's been rock stable and hiccups are very, very rare. I don't exactly see the big value-add from the described feature enhancement; I'd only want to make sure it's just as stable.
On Mon, Dec 21, 2020 at 07:14:08PM +0100, Marius Schwarz wrote:
delta rpms save so much time in the form of bandwidth on the client side. If something really needs to change, it is the 50+ MB repo database that gets downloaded. It takes ages on slow connections to download
This needs a followup. I didn't push on it because the DNF team was super-busy with modularity, but if someone wants to pick this up, it'd be a significant improvement:
https://pagure.io/packaging-committee/issue/714
In short, 95% of the dependency data is full filename paths. That's not hyperbole. It's literally 95% by count. Actually probably even more by _space_ since they tend to be long.
Only a tiny fraction of packages use these at all, and almost all of the packages using file deps outside of /usr/bin, /usr/sbin, or /etc could use something else — and of the few using something else, many are actually doing so only in error.
It remains convenient to be able to do
dnf install /usr/share/fonts/jetbrains-mono-fonts/JetBrainsMono-Regular.ttf
or whatever, but that seems like it could be covered by a DNF plugin.
Previously, there was a chicken-and-egg scenario where the DNF folks didn't want to touch this while people were still making packages relying on this feature, but since 2018 that's a "SHOULD NOT" in the guidelines. So, I think there's room to move forward, should anyone like to take this on.
https://docs.fedoraproject.org/en-US/packaging-guidelines/#_file_and_directo...
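As a rough illustration of the plugin idea (this is not an existing plugin): the public dnf Python API can already resolve a path to the packages that carry it, so such a plugin would mostly be glue. A hedged sketch, assuming the stock dnf API and reusing the font path from above:

    import dnf

    path = '/usr/share/fonts/jetbrains-mono-fonts/JetBrainsMono-Regular.ttf'

    base = dnf.Base()
    base.read_all_repos()
    base.fill_sack(load_system_repo=False)  # note: this pulls filelists too

    # Ask the sack which available packages own this file, newest only.
    for pkg in base.sack.query().available().filter(file=path).latest():
        print(pkg.name, pkg.evr)

The point being that "dnf install <path>" convenience doesn't strictly require file paths in the dependency metadata everyone downloads; it requires a lookup like this at the moment someone actually asks for a path.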
On Mon, Dec 21, 2020 at 1:42 PM Matthew Miller mattdm@fedoraproject.org wrote:
On Mon, Dec 21, 2020 at 07:14:08PM +0100, Marius Schwarz wrote:
delta rpms save so much time in the form of bandwidth on the client side. If something really needs to change, it is the 50+ MB repo database that gets downloaded. It takes ages on slow connections to download
This needs a followup. I didn't push on it because the DNF team was super-busy with modularity, but if someone wants to pick this up, it'd be a significant improvement:
https://pagure.io/packaging-committee/issue/714
In short, 95% of the dependency data is full filename paths. That's not hyperbole. It's literally 95% by count. Actually probably even more by _space_ since they tend to be long.
Only a tiny fraction of packages use these at all, and almost all of the packages using file deps outside of /usr/bin, /usr/sbin, or /etc could use something else — and of the few using something else, many are actually doing so only in error.
It remains convenient to be able to do
dnf install /usr/share/fonts/jetbrains-mono-fonts/JetBrainsMono-Regular.ttf
or whatever, but that seems like it could be covered by a DNF plugin.
Previously, there was a chicken-and-egg scenario where the DNF folks didn't want to touch this while people were still making packages relying on this feature, but since 2018 that's a "SHOULD NOT" in the guidelines. So, I think there's room to move forward, should anyone like to take this on.
https://docs.fedoraproject.org/en-US/packaging-guidelines/#_file_and_directo...
The main problem is that wiring libsolv to call back to opportunistically fetch and repopulate the solver cache has not been figured out for libdnf. Once we do that, we don't need to do any more work: in most cases only primary.xml will be fetched automatically, and filelists.xml will only be fetched as needed. This is the behavior that YUM v3 had, and it wasn't ported to DNF because we lacked a mechanism to do this. In *theory*, such a mechanism exists now in libsolv, though the API is sufficiently confusing that I'm not sure how to do it exactly.
As someone who has to package for multiple distributions, I would oppose any attempt to cripple DNF to stop supporting file dependencies properly. I *aggressively* use file dependencies to avoid having to litter my spec files with package name dependencies across RH/Fedora, SUSE, Mandriva/Mageia, and others.
On Mon, Dec 21, 2020 at 01:47:19PM -0500, Neal Gompa wrote:
As someone who has to package for multiple distributions, I would oppose any attempt to cripple DNF to stop supporting file dependencies properly. I *aggressively* use file dependencies to avoid having to litter my spec files with package name dependencies across RH/Fedora, SUSE, Mandriva/Mageia, and others.
Do you have examples outside of /etc, /usr/bin, /usr/sbin?
Also, if you _are_ using arbitrary file dependencies, that renders the other part about opportunistic download of these deps kind of moot, since they'll have to be downloaded frequently, right?
Again, I'm not kidding about 95% of the dep points being filenames. It's huge! I don't think that's a good price at all to make everyone pay constantly for packaging convenience. Better to convince packagers to put in cross-distro "Provides" or something.
On Mon, Dec 21, 2020 at 2:14 PM Matthew Miller mattdm@fedoraproject.org wrote:
On Mon, Dec 21, 2020 at 01:47:19PM -0500, Neal Gompa wrote:
As someone who has to package for multiple distributions, I would oppose any attempt to cripple DNF to stop supporting file dependencies properly. I *aggressively* use file dependencies to avoid having to litter my spec files with package name dependencies across RH/Fedora, SUSE, Mandriva/Mageia, and others.
Do you have examples outside of /etc, /usr/bin, /usr/sbin?
Mostly stuff in /usr/libexec and /usr/lib(64).
Also, if you _are_ using arbitrary file dependencies, that renders the other part about opportunistic download of these deps kind of moot, since they'll have to be downloaded frequently, right?
For packages I maintain in Fedora *itself*, I don't need to do this, but for packages I maintain *outside* of Fedora, I *must*.
Again, I'm not kidding about 95% of the dep points being filenames. It's huge! I don't think that's a good price at all to make everyone pay constantly for packaging convenience. Better to convince packagers to put in cross-distro "Provides" or something.
Yes, I know. I've looked at the metadata myself before...
The fact that I can't get openSUSE to properly fully enable the Python module dependency generator (that I maintain upstream in rpm!) after almost two years of trying should be indication enough of how difficult what you're asking really is.
-- 真実はいつも一つ!/ Always, there's only one truth!
On Mon, Dec 21, 2020 at 07:14:08PM +0100, Marius Schwarz wrote:
Am 21.12.20 um 18:53 schrieb Kevin Fenzi:
But in general perhaps we should decide how much value drpms provide these days and either make sure we are making more of them, or drop them.
delta rpms save so much time in the form of bandwidth on the client side.
Well, it's tradeoffs. They save bandwidth and download time on one side, but use lots of cpu cycles and disk space on the other. It just depends on what each person wants based on their situation and hardware.
Right now we are not making very many drpms at all, due to the way the compose process has changed over the years. If we keep drpms around we should really look into making more of them... as they are now, they seldom matter.
kevin
On Tue, Dec 22, 2020 at 02:02:13PM -0800, Kevin Fenzi wrote:
delta rpms save so much time in the form of bandwidth on the client side.
Well, it's tradeoffs. They save bandwidth and download time on one side, but use lots of cpu cycles and disk space on the other. It just depends on what each person wants based on their situation and hardware.
They actually use a lot of cpu cycles on _both_ sides, really.
On Tue, Dec 22, 2020 at 05:09:08PM -0500, Matthew Miller wrote:
On Tue, Dec 22, 2020 at 02:02:13PM -0800, Kevin Fenzi wrote:
delta rpms save so much time in the form of bandwidth on the client side.
Well, it's tradeoffs. They save bandwidth and download time on one side, but use lots of cpu cycles and disk space on the other. It just depends on what each person wants based on their situation and hardware.
They actually use a lot of cpu cycles on _both_ sides, really.
I thought that zchunk would obsolete drpm. What's the story here?
Also, in recent times, any dnf upgrade I did reported "savings" from drpm at the level of <1% [*]. Am I doing something wrong or is this expected? Is there some usage pattern where drpm provides real gain with current Fedora?
Maybe the time has come to just disable DRPM entirely for F34.
Zbyszek
[*] Today on F33:
Delta RPMs reduced 836.8 MB of updates to 836.7 MB (0.1% saved)
On Wed, Dec 30, 2020 at 01:18:38PM +0000, Zbigniew Jędrzejewski-Szmek wrote:
On Tue, Dec 22, 2020 at 05:09:08PM -0500, Matthew Miller wrote:
On Tue, Dec 22, 2020 at 02:02:13PM -0800, Kevin Fenzi wrote:
delta rpms save so much time in the form of bandwidth on the client side.
Well, it's tradeoffs. They save bandwidth and download time on one side, but use lots of cpu cycles and disk space on the other. It just depends on what each person wants based on their situation and hardware.
They actually use a lot of cpu cycles on _both_ sides, really.
I thought that zchunk would obsolete drpm. What's the story here?
Nope, they are different things.
zchunk = a way to only download changed chunks of repodata.
drpms = a way to only download changed chunks of rpms.
Also, in recent times, any dnf upgrade I did reported "savings" from drpm at the level of <1% [*]. Am I doing something wrong or is this expected? Is there some usage pattern where drpm provides real gain with current Fedora?
This is most likely because we are only making drpms against the most recent updates. So, we are making very few drpms and only against things that recently updated.
For example: https://dl.fedoraproject.org/pub/fedora/linux/updates/33/Everything/x86_64/d... (126 drpms for all of f33 updates).
Maybe the time has come to just disable DRPM entirely for F34.
We could. Or try and make them more useful again.
kevin
On Wed, Dec 30, 2020 at 10:10:27AM -0800, Kevin Fenzi wrote:
On Wed, Dec 30, 2020 at 01:18:38PM +0000, Zbigniew Jędrzejewski-Szmek wrote:
On Tue, Dec 22, 2020 at 05:09:08PM -0500, Matthew Miller wrote:
On Tue, Dec 22, 2020 at 02:02:13PM -0800, Kevin Fenzi wrote:
delta rpms save so much time in the form of bandwidth on the client side.
Well, it's tradeoffs. They save bandwidth and download time on one side, but use lots of cpu cycles and disk space on the other. It just depends on what each person wants based on their situation and hardware.
They actually use a lot of cpu cycles on _both_ sides, really.
I thought that zchunk would obsolete drpm. What's the story here?
Nope, they are different things.
zchunk = a way to only download changed chunks of repodata.
drpms = a way to only download changed chunks of rpms.
Right, it did feel a bit like I was missing some important chunk of the picture ;)
Also, in recent times, any dnf upgrade I did reported "savings" from drpm at the level of <1% [*]. Am I doing something wrong or is this expected? Is there some usage pattern where drpm provides real gain with current Fedora?
This is most likely because we are only making drpms against the most recent updates. So, we are making very few drpms and only against things that recently updated.
So... people who actually care about the total download are likely not to update all the time, which also means that drpms will not work for them.
For example: https://dl.fedoraproject.org/pub/fedora/linux/updates/33/Everything/x86_64/d... (126 drpms for all of f33 updates).
So... that means that drpms wouldn't even make a difference for people who update often.
...and the proposed Change would require additional contortions to allow drpms to work. It sounds like drpms are not worth the trouble anymore. The effort to make them work properly would be large. I think the crucial bit is that we have more packages and updates than ever, and at the same time more people update at custom schedules, so any reasonable subset of drpms will cover a shrinking subset of upgrades.
Zbyszek
Maybe the time has come to just disable DRPM entirely for F34.
We could. Or try and make them more useful again.
On Sat, 2021-01-02 at 13:42 +0000, Zbigniew Jędrzejewski-Szmek wrote:
On Wed, Dec 30, 2020 at 10:10:27AM -0800, Kevin Fenzi wrote:
This is most likely because we are only making drpms against the most recent updates. So, we are making very few drpms and only against things that recently updated.
So... people who actually care about the total download are likely not to update all the time, which also means that drpms will not work for them.
For example: https://dl.fedoraproject.org/pub/fedora/linux/updates/33/Everything/x86_64/d... (126 drpms for all of f33 updates).
So... that means that drpms wouldn't even make a difference for people who update often.
...and the proposed Change would require additional contortions to allow drpms to work. It sounds like drpms are not worth the trouble anymore. The effort to make them work properly would be large. I think the crucial bit is that we have more packages and updates than ever, and at the same time more people update at custom schedules, so any reasonable subset of drpms will cover a shrinking subset of upgrades.
Zbyszek
Maybe the time has come to just disable DRPM entirely for F34.
We could. Or try and make them more useful again.
FWIW, I also think it's time for drpms to go. Aside from any potential issues with the proposed change, they haven't been useful in Fedora for three years (see https://pagure.io/releng/issue/7215), and nobody's been able to put in the time to fix it yet. If that changed and someone was willing to step up and commit to fixing this, I'd feel very differently.
In addition, drpms aren't even working at the moment. Something has changed during the last week or so that's broken them (see https://bugzilla.redhat.com/show_bug.cgi?id=1911828). I'll take a look, but, being honest, there's not much motivation to investigate this when drpms are of such marginal use in Fedora at the moment.
Jonathan
On Sat, 2021-01-02 at 18:12 +0000, Jonathan Dieter wrote:
FWIW, I also think it's time for drpms to go. Aside from any potential issues with the proposed change, they haven't been useful in Fedora for three years (see https://pagure.io/releng/issue/7215), and nobody's been able to put in the time to fix it yet. If that changed and someone was willing to step up and commit to fixing this, I'd feel very differently.
In addition, drpms aren't even working at the moment. Something has changed during the last week or so that's broken them (see https://bugzilla.redhat.com/show_bug.cgi?id=1911828). I'll take a look, but, being honest, there's not much motivation to investigate this when drpms are of such marginal use in Fedora at the moment.
Jonathan
Apologies for the odd quoting in the previous email; Evolution decided that what you see isn't what you get. :) I've trimmed out everything but my response here.
Jonathan
On Sat, Jan 02, 2021 at 06:17:03PM +0000, Jonathan Dieter wrote:
On Sat, 2021-01-02 at 18:12 +0000, Jonathan Dieter wrote:
FWIW, I also think it's time for drpms to go. Aside from any potential issues with the proposed change, they haven't been useful in Fedora for three years (see https://pagure.io/releng/issue/7215), and nobody's been able to put in the time to fix it yet. If that changed and someone was willing to step up and commit to fixing this, I'd feel very differently.
It's not been something that's a priority. ;(
I think the way to do it would be to drop making drpms from the bodhi pungi run and set up a script to manage them: create them, make the repos, keep N days of old ones from the last repos, etc. I'd be happy to help interested folks with requirements and such, but I don't think I can commit to fixing it.
I remember when drpms landed I heard people say they chose Fedora because of them. That may have changed over the years I guess. :) And there have been only 2 or 3 reports about how few drpms exist in the last few years (i.e., most people didn't really notice).
In addition, drpms aren't even working at the moment. Something has changed during the last week or so that's broken them (see https://bugzilla.redhat.com/show_bug.cgi?id=1911828). I'll take a look, but, being honest, there's not much motivation to investigate this when drpms are of such marginal use in Fedora at the moment.
Yeah, understand...
kevin
On Sun, Jan 03, 2021 at 03:25:29PM -0800, Kevin Fenzi wrote:
I remember when drpms landed I heard people say they chose Fedora because of them. That may have changed over the years I guess. :) And there have been only 2 or 3 reports about how few drpms exist in the last few years (i.e., most people didn't really notice).
Hmmm, here's an idea: what if instead of nightly drpms, we made them only every two weeks, but always exactly two weeks, so that people updating on a specific cadence would get them?
On Mon, 2021-01-04 at 11:25 -0500, Matthew Miller wrote:
On Sun, Jan 03, 2021 at 03:25:29PM -0800, Kevin Fenzi wrote:
I remember when drpms landed I heard people say they chose Fedora because of them. That may have changed over the years I guess. :) And there have been only 2 or 3 reports about how few drpms exist in the last few years (i.e., most people didn't really notice).
Hmmm, here's an idea: what if instead of nightly drpms, we made them only every two weeks, but always exactly two weeks, so that people updating on a specific cadence would get them?
There's been a lot of interesting talk about the state and future of drpm. I'd like to propose we continue the conversation about that with a different subject line :)
- Matthew
On Mon, Jan 04, 2021 at 10:21:15PM +0000, Matthew Almond via devel wrote:
There's been a lot of interesting talk about the state and future of drpm. I'd like to propose we continue the conversation about that with a different subject line :)
Okay, fair. I have a proposal.
Right now, the problem is that making delta rpms is expensive, and therefore we aren't making very many, which makes them even less useful. Plus, we're only making them between updates and for packages where those updates are frequent, which means you need to keep on top of things; that may be best practice, but it is most difficult for the low-bandwidth users who might benefit the most in the first place.
So, the first thing we need to do to fix this is move deltarpm creation out of the updates process. Kevin Fenzi tells me this would mean we'd need a separate delta RPMs repo, which doesn't sound like a bad thing to me, but we're not sure offhand if DNF can handle that without modification.
This would let us make the delta RPMs asynchronously and not block updates. And, it would also give us the ability to roughly see how important they are to users, because we could see how popular that repository is compared to the updates repo.
I also remember when this was a killer feature for Fedora, and without any real way of judging use and demand, I'm hesitant to kill it off. But that's definitely plan B. We can point people who are in low-bandwidth situations at Silverblue, CoreOS, and Kinoite as the preferred approach.
On Mon, Jan 04, 2021 at 06:29:13PM -0500, Matthew Miller wrote:
On Mon, Jan 04, 2021 at 10:21:15PM +0000, Matthew Almond via devel wrote:
There's been a lot of interesting talk about the state and future of drpm. I'd like to propose we continue the conversation about that with a different subject line :)
Okay, fair. I have a proposal.
Right now, the problem is that making delta rpms is expensive, and therefore we aren't making very many, which makes them even less useful. Plus, we're only making them between updates and for packages where those updates are frequent, which means you need to keep on top of things; that may be best practice, but it is most difficult for the low-bandwidth users who might benefit the most in the first place.
So, the first thing we need to do to fix this is move deltarpm creation out of the updates process. Kevin Fenzi tells me this would mean we'd need a separate delta RPMs repo, which doesn't sound like a bad thing to me, but we're not sure offhand if DNF can handle that without modification.
Yeah, I don't recall how dnf looks for drpms. Right now they are in the same repo, using the same repodata.
If we moved them to a new repo would they get found correctly?
This would let us make the delta RPMs asynchronously and not block updates. And, it would also give us the ability to roughly see how important they are to users, because we could see how popular that repository is compared to the updates repo.
I also remember when this was a killer feature for Fedora, and without any real way of judging use and demand, I'm hesitant to kill it off. But that's definitely plan B. We can point people who are in low-bandwidth situations at Silverblue, CoreOS, and Kinoite as the preferred approach.
Yeah, I came up with one more possible way we could get more drpms with our current setup, but need to talk to pungi maintainers and see if it's doable. :) After that, it's either split things out or drop drpms I think.
kevin
On 05. 01. 21 at 0:50, Kevin Fenzi wrote:
On Mon, Jan 04, 2021 at 06:29:13PM -0500, Matthew Miller wrote:
On Mon, Jan 04, 2021 at 10:21:15PM +0000, Matthew Almond via devel wrote:
There's been a lot of interesting talk about the state and future of drpm. I'd like to propose we continue the conversation about that with a different subject line :)
Okay, fair. I have a proposal.
Right now, the problem is that making delta rpms is expensive, and therefore we aren't making very many, which makes them even less useful. Plus, we're only making them between updates and for packages where those updates are frequent, which means you need to keep on top of things; that may be best practice, but it is most difficult for the low-bandwidth users who might benefit the most in the first place.
So, the first thing we need to do to fix this is move deltarpm creation out of the updates process. Kevin Fenzi tells me this would mean we'd need a separate delta RPMs repo, which doesn't sound like a bad thing to me, but we're not sure offhand if DNF can handle that without modification.
Yeah, I don't recall how dnf looks for drpms. Right now they are in the same repo, using the same repodata.
If we moved them to a new repo would they get found correctly?
This would let us make the delta RPMs asynchronously and not block updates. And, it would also give us the ability to roughly see how important they are to users, because we could see how popular that repository is compared to the updates repo.
I also remember when this was a killer feature for Fedora, and without any real way of judging use and demand, I'm hesitant to kill it off. But that's definitely plan B. We can point people who are in low-bandwidth situations at Silverblue, CoreOS, and Kinoite as the preferred approach.
Yeah, I came up with one more possible way we could get more drpms with our current setup, but need to talk to pungi maintainers and see if it's doable. :) After that, it's either split things out or drop drpms I think.
To be honest, I don't understand why drpms are related to Pungi at all.
Deltas are optional: if they're not available, a normal RPM is used. They can be processed asynchronously (as mentioned earlier in this thread) and injected into repos once they're ready.
Please note that we're talking about 74 drpms in the F33 x86_64 updates repo: http://ftp.fi.muni.cz/pub/linux/fedora/linux/updates/33/Everything/x86_64/dr...
Sometimes I wonder if it's worth it and if Fedora shouldn't move away from drpms.
On 05. 01. 21 at 0:29, Matthew Miller wrote:
So, the first thing we need to do to fix this is move deltarpm creation out of the updates process.
Right.
Kevin Fenzi tells me this would mean we'd need a separate delta RPMs repo,
Why? You can do that in the same repo. You just need to run `createrepo_c --deltas --num-deltas X` once every X days/hours.
On Tue, 5 Jan 2021 at 03:50, Miroslav Suchý msuchy@redhat.com wrote:
On 05. 01. 21 at 0:29, Matthew Miller wrote:
So, the first thing we need to do to fix this is move deltarpm creation out of the updates process.
Right.
Kevin Fenzi tells me this would mean we'd need a separate delta RPMs repo,
Why? You can do that in the same repo. You just need to run `createrepo_c --deltas --num-deltas X` once every X days/hours.
<rant target="the Fedora build system in general" non-target="msuchy for giving a useful idea">
Get pungi and all the other tools in the build system which touch the repos and expect things to be done in a certain way to not break, corrupt or make releng's life a daily nightmare and you are golden.
Remember the Fedora build system is a Rube Goldberg machine[1] where every group who has a new idea about how to make Linux easier to compose/consume/etc has stuck something in. None of them have really been designed to work with each other, and various parts are completely running on luck and mad patching (hi PDC). Every time someone says 'just do this one thing' it turns into a cascade of broken bits where everyone spends a month or 2 blaming (1) releng for adding in one more thing and (2) every other group for messing with their perfect tool. Since we usually have 1-2 months to get something in place before the next release.. that means we have whatever time is left over from the poop-throwing festival to monkey-patch it for another release and then live with the system breaking daily for a couple of months. [Then watch Kevin and Mohan grow more Stockholm syndrome to say that the system is fine.. just like the previous release engineers who have departed from Fedora.]
Patches, ideas and fixes are indeed helpful, but what would be more helpful is getting everyone with a say on the build system and their one thing that they want from Release Engineering in a room to work out what the entire developer and build system experience should be with an idea of how to make it more manageable and more able to slot in things in and out versus 'aaaaah {'mass rebuild','beta release','final release','2 week holiday'} is in 2 days get that tool working' </rant>
[1] https://en.wikipedia.org/wiki/Rube_Goldberg_machine
On Tue, Jan 05, 2021 at 09:49:59AM +0100, Miroslav Suchý wrote:
On 05. 01. 21 at 0:29, Matthew Miller wrote:
So, the first thing we need to do to fix this is move deltarpm creation out of the updates process.
Right.
Kevin Fenzi tells me this would mean we'd need a separate delta RPMs repo,
Why? You can do that in the same repo. You just need to run `createrepo_c --deltas --num-deltas X` once every X days/hours.
On Tue, Jan 05, 2021 at 06:30:37PM +0100, Daniel Mach wrote:
To be honest, I don't understand why drpms are related to Pungi at all.
Deltas are optional: if they're not available, a normal RPM is used. They can be processed asynchronously (as mentioned earlier in this thread) and injected into repos once they're ready.
Please note that we're talking about 74 drpms in the F33 x86_64 updates repo: http://ftp.fi.muni.cz/pub/linux/fedora/linux/updates/33/Everything/x86_64/dr...
Sometimes I wonder if it's worth it and if Fedora shouldn't move away from drpms.
so, ok then. I guess people are still confused about this. Here's my attempt to explain in detail how it currently works:
When bodhi does an updates push (say for f33-updates), it does a lot of things. It checks the updates that are pending f33 stable, it locks them so no one can mess with them in the UI until it's done, it makes sure they are signed, it tells koji to move the packages into the f33-updates tag, then it calls pungi to actually do the heavy lifting.
This pungi process then talks to koji and says "hey, give me the latest tagged packages for the 'f33-updates' tag signed with key xyz". It puts them in directories for arch and type and such and runs createrepo_c to make the repodata. This is the point where it makes drpms. In order to make a drpm, createrepo_c needs to know what you want to make drpms for. It also has to have both the OLD and NEW versions available to make the drpm. createrepo_c also makes the normal repodata.
In our above case pungi has the current repos it's making and the f33-updates repo. Thus all the drpms it can make are ones where a new package version is being added to the repos and there is an older version available in f33-updates. It doesn't have access to all those versions before or after the ones it has. It only has those two.
Once pungi is done, bodhi then does more things (emailing people, updating notes, etc) and... importantly, updates the repodata with the security information (so you can know what are security updates, etc).
Then, that entire tree is synced to the master mirrors.
On the next f33-updates push the entire process runs again. It never _updates_ existing repos, it always creates them. This means that if you have foo-1.0-1 in f33-updates and foo-1.1-1 comes along, it will make a drpm between them, and that drpm will exist _only_ on the day it added foo-1.1-1. The next day it will be gone. This is why there are so few drpms: it's only generating them for the things that it could at the time of the last push. So if you happen to update on a day when things you have installed were updated, you would see the drpms. If you happen to update the next day, you would not.
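In other words, the candidate set on any given day is just the packages present in both repos with differing versions. A toy sketch of that pairing logic, with invented package names and versions:

    # What yesterday's push published vs. what today's push is about to.
    prev_repo = {'foo': '1.0-1', 'bar': '2.3-1', 'baz': '5-1'}
    new_repo  = {'foo': '1.1-1', 'bar': '2.3-1', 'baz': '5-2', 'qux': '1-1'}

    # createrepo_c can only pair what it sees: these two trees, nothing older.
    candidates = [(name, prev_repo[name], ver)
                  for name, ver in new_repo.items()
                  if name in prev_repo and prev_repo[name] != ver]
    print(candidates)  # [('foo', '1.0-1', '1.1-1'), ('baz', '5-1', '5-2')]
    # A user whose installed foo is older than 1.0-1 gets no usable delta,
    # and tomorrow's push forgets these pairs entirely.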
So, my last thought was to teach pungi about all the old updates trees (which are in the same directory as it makes the new one) and have it gather all the old drpms from those and expire them at some configurable time. This would not use more cycles to make them, and would make the chances of a user being able to use them much higher. But I am not sure this is possible/if pungi maintainers are willing to implement this. It would mean that createrepo_c would need to know about those old drpms to add them to metadata.
Failing that we could move the drpm creation to another process/repo, but... drpms have to be in the same repodata as the repo they are for, right? Or can they be in another one?
Hope that clarifies more than it confuses...
kevin
On 05. 01. 21 at 19:44, Kevin Fenzi wrote:
On the next f33-updates push the entire process runs again. It never _updates_ existing repos, it always creates them.
Ahh. So this all worked when we ran the process once per week. But because we run it every day now, the deltas are minimal. It just took us several months to notice.
On Tue, Jan 05, 2021 at 08:01:04PM +0100, Miroslav Suchý wrote:
On 05. 01. 21 at 19:44, Kevin Fenzi wrote:
On the next f33-updates push the entire process runs again. It never _updates_ existing repos, it always creates them.
Ahh. So this all worked when we ran the process once per week.
It worked differently back when we used mash to create updates, I think because we also did drpms from the GA/base repo. Which we could do now, but it would only help users on their initial 'dnf update'.
But because we run it every day now, the deltas are minimal. It just took us several months to notice.
It's been this way since we moved to pungi. 3+ years.
kevin
* Matthew Miller:
I also remember when this was a killer feature for Fedora, and without any real way of judging use and demand, I'm hesitant to kill it off.
Is it really saving bandwidth, though? The reported savings are generally very small for me. Downloading the metadata costs something as well.
Thanks, Florian
Is it really saving bandwidth, though? The reported savings are generally very small for me. Downloading the metadata costs something as well.
In F33, mostly so. I generally keep up to date (update once a week), but available deltarpms have been fewer compared to earlier versions. I used deltarpm from the day it was introduced in Fedora, and found it quite useful due to a limited internet connection (speed/bandwidth/data limit) — for instance TeXLive package updates (which, afaict, don't generate deltarpms in f33) etc.
On Tue, Jan 05, 2021 at 11:30:10AM +0100, Florian Weimer wrote:
I also remember when this was a killer feature for Fedora, and without any real way of judging use and demand, I'm hesitant to kill it off.
Is it really saving bandwidth, though? The reported savings are generally very small for me. Downloading the metadata costs something as well.
If we made a lot more of them, they could save significant bandwidth.
* Matthew Miller:
On Tue, Jan 05, 2021 at 11:30:10AM +0100, Florian Weimer wrote:
I also remember when this was a killer feature for Fedora, and without any real way of judging use and demand, I'm hesitant to kill it off.
Is it really saving bandwidth, though? The reported savings are generally very small for me. Downloading the metadata costs something as well.
If we made a lot more of them, they could save significant bandwidth.
The metadata would also be much larger, and so would be the battery usage to recompress the payload. 8-(
Thanks, Florian
On Tue, Jan 5, 2021 at 3:46 PM Florian Weimer fweimer@redhat.com wrote:
The metadata would also be much larger, and so would be the battery usage to recompress the payload. 8-(
And while the bandwidth reduction has value, cpu and wallclock time to rebuild the rpm is substantially increased for low-end devices such as ARM SoCs with slow (compared to recent gen x86) cpus and slow (compared to recent nvme) storage devices such as SD cards, compared to just downloading the entire rpm.
On Mon, Jan 4, 2021, at 6:29 PM, Matthew Miller wrote:
I also remember when this was a killer feature for Fedora, and without any real way of judging use and demand, I'm hesitant to kill it off. But that's definitely plan B. We can point people who are in low-bandwidth situations at Silverblue, CoreOS, and Kinoite as the preferred approach.
Please don't use phrasing like this that implies e.g. CoreOS is distinct from "Fedora".
A much technically clearer way to say this would be "traditional dnf Fedora" versus rpm-ostree.
But even then it's not fully distinct because rpm-ostree also links to libdnf and whenever you use package layering, it's all the same RPM tools on the wire. Though...ah right, the deltarpm implementation lives in the dnf Python code, not the libdnf library, so rpm-ostree doesn't do that.
Second - and this should be emphasized - a common case at least on Silverblue is that you run dnf inside a toolbox-style container (or more than one!). So all bandwidth improvements apply there too. In other words this (implicit) contrast between the two is false because in both cases there are hybrids.
Now speaking of deltas - really, delta implementations are going to benefit from a stronger "cadence" to releases, much like what we do for CoreOS (but not Silverblue/IoT) today. The relationship of such a system and Bodhi is...messy. ostree deltas are also *much* better than deltarpm in various ways (most notably the CPU-intensive part is bsdiff, which we only use selectively instead of on the whole thing).
On the other hand, we really want deltas too for containers; that's https://github.com/containers/image/pull/902
A very tricky case is the intersection of all of these; for my "dev container"/toolbox on my Silverblue workstation I use a custom container built on a server with all of my tools, but I do often `yum update` inside it since that works incrementally and online. (But I do periodically flush and re-pull) If we implemented container deltas I'd be a lot more likely to use `podman` to update it instead.
But anyways, please either explicitly spell out "Fedora CoreOS" to avoid an implicit contrast and making it seem like a separate thing from "Fedora", or go more technical and say "rpm-ostree variant" or so. Thanks!
On Tue, Jan 5, 2021 at 8:34 AM Colin Walters walters@verbum.org wrote:
On Mon, Jan 4, 2021, at 6:29 PM, Matthew Miller wrote:
I also remember when this was a killer feature for Fedora, and without any real way of judging use and demand, I'm hesitant to kill it off. But that's definitely plan B. We can point people who are in low-bandwidth situations at Silverblue, CoreOS, and Kinoite as the preferred approach.
Please don't use phrasing like this that implies e.g. CoreOS is distinct from "Fedora".
But as of right now, it *is*. Perhaps that will change if FCOS realigns with Fedora as part of being promoted to Edition status, but right now, it's different enough content-wise that it's less like Fedora than most of us would want it to be.
A much technically clearer way to say this would be "traditional dnf Fedora" versus rpm-ostree.
But even then it's not fully distinct because rpm-ostree also links to libdnf and whenever you use package layering, it's all the same RPM tools on the wire. Though...ah right, the deltarpm implementation lives in the dnf Python code, not the libdnf library, so rpm-ostree doesn't do that.
Second - and this should be emphasized - a common case at least on Silverblue is that you run dnf inside a toolbox-style container (or more than one!). So all bandwidth improvements apply there too. In other words this (implicit) contrast between the two is false because in both cases there are hybrids.
Now speaking of deltas - really, delta implementations are going to benefit from a stronger "cadence" to releases, much like what we do for CoreOS (but not Silverblue/IoT) today. The relationship of such a system and Bodhi is...messy. ostree deltas are also *much* better than deltarpm in various ways (most notably the CPU-intensive part is bsdiff, which we only use selectively instead of on the whole thing).
Cadence only matters if the infrastructure around delta fetching requires it to care. In the case of RPM-OSTree and Flatpak, this is not a problem as long as you're using native OSTree remotes. If you're using OCI image remotes instead, then you *do* have to care about cadence because you have to maintain images and generate deltas based on possible options. The latter option is how we deliver Flatpaks, and so we have the same problem we have with DeltaRPMs.
On the other hand, we really want deltas too for containers; that's https://github.com/containers/image/pull/902
A very tricky case is the intersection of all of these; for my "dev container"/toolbox on my Silverblue workstation I use a custom container built on a server with all of my tools, but I do often `yum update` inside it since that works incrementally and online. (But I do periodically flush and re-pull) If we implemented container deltas I'd be a lot more likely to use `podman` to update it instead.
Addressing the underlying issue here: container deltas and OSTree deltas are considerably worse for constrained bandwidth than RPM deltas. Outside of the USA (in the general case) and within the USA (in several parts of the country), it is extremely common to have extremely limited bandwidth availability and even more common to have low throughput. As that will basically never change, we have to work with that framework.
Regular Fedora variants offer users the ability to pick and choose updates based on their situation. If DeltaRPMs were implemented in our infrastructure correctly, those users would be very well-served: we could publish a sliding window of the last 30 days of delta RPM content. What's particularly galling here is that we have all the necessary inputs to do it, since Koji keeps everything and Bodhi knows everything that's ever been pushed. It's pretty much a Pungi restriction that we've never been able to do this properly.
To be blunt, I would have never done Zchunk metadata if it was going to be used as a tool to kill DeltaRPMs. I firmly believe we need both to have a comprehensive offering that accommodates the needs of Fedora users across the world.
-- 真実はいつも一つ!/ Always, there's only one truth!
On Tue, Jan 05, 2021 at 08:49:20AM -0500, Neal Gompa wrote:
On Tue, Jan 5, 2021 at 8:34 AM Colin Walters walters@verbum.org wrote:
Now speaking of deltas - really, delta implementations are going to benefit from a stronger "cadence" to releases, much like what we do for CoreOS (but not Silverblue/IoT) today. The relationship of such a system and Bodhi is...messy. ostree deltas are also *much* better than deltarpm in various ways (most notably the CPU-intensive part is bsdiff, which we only use selectively instead of on the whole thing).
Cadence only matters if the infrastructure around delta fetching requires it to care. In the case of RPM-OSTree and Flatpak, this is not a problem as long as you're using native OSTree remotes. If you're using OCI image remotes instead, then you *do* have to care about cadence because you have to maintain images and generate deltas based on possible options. The latter option is how we deliver Flatpaks, and so we have the same problem we have with DeltaRPMs.
On the other hand, we really want deltas too for containers; that's https://github.com/containers/image/pull/902
A very tricky case is the intersection of all of these; for my "dev container"/toolbox on my Silverblue workstation I use a custom container built on a server with all of my tools, but I do often `yum update` inside it since that works incrementally and online. (But I do periodically flush and re-pull) If we implemented container deltas I'd be a lot more likely to use `podman` to update it instead.
Addressing the underlying issue here: container deltas and OSTree deltas are considerably worse for constrained bandwidth than RPM deltas. Outside of the USA (in the general case) and within the USA (in several parts of the country), it is extremely common to have extremely limited bandwidth availability and even more common to have low throughput. As that will basically never change, we have to work with that framework.
Regular Fedora variants offer users the ability to pick and choose updates based on their situation. If DeltaRPMs were implemented in our infrastructure correctly, those users would be very well-served: we could publish a sliding window of the last 30 days of delta RPM content. What's particularly galling here is that we have all the necessary inputs to do it, since Koji keeps everything and Bodhi knows everything that's ever been pushed. It's pretty much a Pungi restriction that we've never been able to do this properly.
Another thought: we could use popcon-like information to generate delta rpms only for the N% most popular packages (10%?). This would significantly cut down on the cost of generation, without really affecting average user savings.
Yet another reason why popcon would be useful.
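A sketch of what that selection could look like, assuming popcon-style install counts existed (package names and numbers are invented for the example):

    def delta_worthy(updated_pkgs, install_counts, top_fraction=0.10):
        # Keep only the most-installed fraction of the updated packages.
        ranked = sorted(updated_pkgs,
                        key=lambda p: install_counts.get(p, 0), reverse=True)
        keep = max(1, int(len(ranked) * top_fraction))
        return set(ranked[:keep])

    counts = {'glibc': 950_000, 'kernel': 940_000, 'obscure-lib': 1_200}
    print(delta_worthy(['glibc', 'kernel', 'obscure-lib'], counts))
    # -> {'glibc'}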
To be blunt, I would have never done Zchunk metadata if it was going to be used as a tool to kill DeltaRPMs. I firmly believe we need both to have a comprehensive offering that accommodates the needs of Fedora users across the world.
Zbyszek
On 05. 01. 21 at 15:31, Zbigniew Jędrzejewski-Szmek wrote:
Yet another reason why popcon would be useful.
https://github.com/xsuchy/popcon-for-fedora-old Feel free to take it :)
On Tue, 2021-01-05 at 08:49 -0500, Neal Gompa wrote:
To be blunt, I would have never done Zchunk metadata if it was going to be used as a tool to kill DeltaRPMs. I firmly believe we need both to have a comprehensive offering that accommodates the needs of Fedora users across the world.
Hey Neal, I'm not sure where you're going with that first sentence, but I think it's pretty obvious that zchunk and deltarpms solve different problems and, following this thread, I don't think anyone has suggested that we should kill deltarpms *because* we have zchunk metadata.
When we first brought deltarpms into Fedora, a savings of 60-90% when doing updates was normal. Now that we're losing the deltarpms after each push (as we have been for the last three years), the savings is significantly lower (I normally see less than 10%) and that makes it hard to be motivated to fix the bugs that inevitably arise.
It sounds like there might be a plan to keep deltarpms beyond a single push, and, if that happens, I will be more than happy to keep on dealing with deltarpm bugs. :)
Thanks, Jonathan
Hi,
we aren't making very many, which makes them even less useful. Plus, we're only making them between updates and for packages where those updates are frequent, which means you need to keep on top of things; that may be best practice, but it is most difficult for the low-bandwidth users who might benefit the most in the first place.
I'm a low bandwidth user. And my setup basically is:
(1) route all updates through a squid caching proxy.
(2) configure all fedora machines to use the same fixed mirror.
(3) disable drpms.
(4) disable zchunk.
There are always cases where you need the full rpm anyway (for example fresh installs with the update repo enabled), so just loading (+caching!) the full rpms and not bothering with drpms works better overall.
The problem with zchunk is that it isn't cache-friendly. squid can't cache range requests. And even in the case of a full download (fresh install) I've seen zchunk metadata being re-downloaded when requested again ...
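For anyone who wants to see the cache-unfriendliness directly: a zchunk-style fetch is a plain HTTP range request, and squid in its default configuration won't cache the resulting 206 response. A small sketch (the mirror URL is hypothetical):

    import requests

    url = 'https://mirror.example.org/fedora/repodata/primary.xml.zck'
    r = requests.get(url, headers={'Range': 'bytes=0-32767'})
    print(r.status_code)                   # 206 on a range-capable mirror
    print(r.headers.get('Content-Range'))  # e.g. 'bytes 0-32767/52428800'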
take care, Gerd
On Mon, Jan 04, 2021 at 06:29:13PM -0500, Matthew Miller wrote:
I also remember when this was a killer feature for Fedora, and without any real way of judging use and demand, I'm hesitant to kill it off. But that's definitely plan B. We can point people who are in low-bandwidth situations at Silverblue, CoreOS, and Kinoite as the preferred approach.
It is also very difficult to measure - for my part I'm happy for every MB saved when downloading updates at home, because it directly translates into waiting time. At work I don't have a bandwidth problem, so it's way less of an issue there.
But I don't see anywhere how much the potential is - iirc it sometimes saves hundreds of MBs, which translates into something like 10-20 minutes of saved time.
Even though it's after-the-fact information, I'm still happy it's doing its job. (Interestingly, on the last update almost half of the md5 sums mismatched for the drpms, which increased the download size from 12.3MB to 14.0MB - but this seems to be a rare problem.)
Personally I'd like to stay on regular Fedora Workstation / Fedora Server - I'd probably decide to take increased download times instead of switching distros/editions if drpm gets removed.
All the best, Astra
On Mon, 2020-12-21 at 09:53 -0800, Kevin Fenzi wrote:
Cool. A few questions inline...
On Mon, Dec 21, 2020 at 11:28:51AM -0500, Ben Cotton wrote:
https://fedoraproject.org/wiki/Changes/RPMCoW
== Summary ==
RPM Copy on Write provides a better experience for Fedora Users as it reduces the amount of I/O and offsets CPU cost of package decompression. RPM Copy on Write uses reflinking capabilities in btrfs, which is the default filesystem in Fedora 33.
What happens if you enable this on non-btrfs installs? Does it just not work gracefully? Does it fail somehow?
It would be slower but still works; see note #5
Of note, even on systems with Btrfs/XFS that support reflinks, falling back to copying is still needed for e.g. files in /boot or /boot/EFI
On Monday, December 21, 2020 11:28:51 AM EST Ben Cotton wrote:
[snip]
# The file format for RPMs is different with Copy on Write. The headers are identical, but the payload is different. There is also a footer. ## Files are converted (“transcoded”) locally during download using <code>/usr/bin/rpm2extents</code> (part of rpm codebase). The format is not intended to be “portable” - i.e. copying the files from the cache is not supported.
I currently download once and upgrade three different systems by rsync-ing the cache.
Do I understand that this will no longer be supported or work?
I currently download once and upgrade three different systems by rsync-ing the cache.
Do I understand that this will no longer be supported or work?
That's an interesting question. Is the cache directory from a single host intended to be shared like this? I am guessing no, but it may still be common.
It should still work, with three caveats:
1. The files in the cache will be bigger, so a simple rsync will involve more I/O, and the destination filesystem will also need more space and I/O time.
2. The systems must be the same endianness (the transcoded format doesn't bother with network order, because it's not intended to be shared).
3. The page size must be the same for reflinking to work. This is actually worked out when the filesystem is created, and defaults to the system page size; if it's not the same as the current page size, the filesystem isn't even guaranteed to mount (see the --sectorsize option in the mkfs.btrfs man page).
In reality you're quite unlikely to share packages unless the architectures are the same, which would steer both endianness and page size to the same value. That said, I'm aware that aarch64 can be flexible in both ways. I'm covering my bases with my statement: I have thought about it, and I don't think I'm in any position to make promises.
For this proposal: we're talking about shipping the code that would allow this to be turned on. We're not talking about enabling it by default. We can't until we have good answers to questions like this.
Thanks, Matthew.
On Tuesday, December 22, 2020 4:54:34 PM EST Matthew Almond via devel wrote:
I currently download once and upgrade three different systems by rsync-ing the cache.
Do I understand that this will no longer be supported or work?
That's an interesting question. Is the cache directory from a single host intended to be shared like this? I am guessing no, but it may still be common.
It should still work, with three caveats:
1. The files in the cache will be bigger, so a simple rsync will involve more I/O, and the destination filesystem will also need more space and I/O time.
2. The systems must be the same endianness (the transcoded format doesn't bother with network order, because it's not intended to be shared).
3. The page size must be the same for reflinking to work. This is actually worked out when the filesystem is created, and defaults to the system page size; if it's not the same as the current page size, the filesystem isn't even guaranteed to mount (see the --sectorsize option in the mkfs.btrfs man page).
In reality you're quite unlikely to share packages unless the architectures are the same, which would steer both endianness and page size to the same value. That said, I'm aware that aarch64 can be flexible in both ways. I'm covering my bases with my statement: I have thought about it, and I don't think I'm in any position to make promises.
For this proposal: we're talking about shipping the code that would allow this to be turned on. We're not talking about enabling it by default. We can't until we have good answers to questions like this.
Understood.
To be clear, all three systems are x86_64, with identical endianness, architecture, and page size (as far as I know).
Also, this isn't a big deal, really. I just wanted to reduce network bandwidth without operating a local mirror.
_______________________

sudo rsync -a --password-file=/etc/rsync.password --delete rsync://rsync@vfr/dnf /var/cache/dnf; sudo dnf --enablerepo=updates-testing upgrade
On Mon, Dec 21, 2020, at 11:28 AM, Ben Cotton wrote:
https://fedoraproject.org/wiki/Changes/RPMCoW
== Summary ==
RPM Copy on Write provides a better experience for Fedora Users as it reduces the amount of I/O and offsets CPU cost of package decompression. RPM Copy on Write uses reflinking capabilities in btrfs, which is the default filesystem in Fedora 33.
== Owners ==
- Name: [[User:malmond|Matthew Almond]], [[User:dcavalca|Davide Cavalca]]
- Email: malmond@fb.com, dcavalca@fb.com
== Detailed description ==
Installing and upgrading software packages is a standard part of managing the lifecycle of any operating system. For the entire lifecycle of Fedora, all software is packaged and distributed using the RPM file format. This proposal changes how software is downloaded and installed, leaving the distribution process unmodified.
=== Current process ===
# Resolve packaging request into a list of packages and operations # Download and verify new packages # Install and/or upgrade packages sequentially using RPM files, decompressing, and writing a copy of the new files to storage.
=== New process ===
# Resolve packaging request into a list of packages and operations # Download and '''decompress''' packages into a '''locally optimized''' rpm file
Please verify the signature on the downloaded RPM before decompressing it. (Do we do this already?)
# Install and/or upgrade packages sequentially using RPM files, using '''reference linking''' (reflinking) to reuse data already on disk.
Sounds like a great improvement! Any real-world data on how much time it saves, how much it changes disk usage, or how much SSD writes it saves?
The outcome is intended to be the same, but the order of operations is different.
# Decompression happens inline with download. This has a positive effect on resource usage: downloads are typically limited by bandwidth. Decompression and writing the full data into a single file per rpm is essentially free. Additionally: if there is more than one download at a time, a multi-CPU system can be better utilized. All compression types supported in RPM work because this uses the rpm I/O functions.
As I referenced above, I think each chunk should also be verified before decompressing.
# RPMs are cached on local storage between downloading and installation time as normal. This allows DNF to defer actual RPM installation to when all the RPM are available. This is unchanged. # The file format for RPMs is different with Copy on Write. The headers are identical, but the payload is different. There is also a footer. ## Files are converted (“transcoded”) locally during download using <code>/usr/bin/rpm2extents</code> (part of rpm codebase). The format is not intended to be “portable” - i.e. copying the files from the cache is not supported.
I think these should be made to be portable. How many variants of these are there? Would it be difficult to make the transcoder also understand RPMs transcoded for a different platform/setup? Eventually, I'd like to see additional signatures added to the RPM for each of the variants so RPM itself can do the verification at install time, avoiding a transcode to the "canonical" format. (I suppose this might require a build-time or sign-time transcode to each of the other variants.) Until then, I'd like to ensure that the package signatures are being verified in a secure manner, which would be necessary for the plugin to be able to install packages not built with multiple signatures/digests.
Would it be practical to just have a single format aligned to the largest page size known, leaving fs holes as necessary on systems with smaller page sizes?
## Regular RPMs use a compressed .cpio based payload. In contrast, extent based RPMs contain uncompressed data aligned to the fundamental page size of the architecture, e.g. 4KiB on x86_64. This alignment is required for <code>FICLONERANGE</code> to work. Only files are represented in the payload, other directory entries like symlinks, device nodes etc are constructed entirely from rpm header information. Files are referenced by their digest, so identical files are de-duplicated.
How are hardlinks in an RPM handled? Do they stay as hardlinks or become reflinks only, losing the hardlink status? They should stay hardlinks, in my opinion.
## The footer currently has three sections ### Table of original (rpm) file digests, used to validate the integrity of the download in dnf. ### Table of digest → offset used when actually installing files. ### Signature 8 bytes at the end of the file, used to differentiate between traditional RPMs and extent based.
I think this magic number "signature" should vary based on the items that cause the format to change.
What happens if you try to use a transcoded RPM on a non-compatible system?
=== Notes ===
# The headers are preserved bit for bit during transcoding. This preserves signatures. The signatures cover the main header blob, and the main header blob ensures the integrity of data in two ways: ## Each file with content has a digest. Originally this was md5, but today it’s usually sha256. In normal RPM this is only used to verify the integrity of files, e.g. <code>rpm -V</code>. With CoW we use this as a content key. ## There is/are one or two digests (<code>PAYLOADDIGEST</code> and <code>PAYLOADDIGESTALT</code>) covering the payload archive (compressed cpio). The header value is preserved, but transcoded RPMs do not preserve the original structure so RPM’s pre-installation verification (controlled by <code>%_pkgverify_level</code>) will fail. <code>dnf-plugin-cow</code> disables this check in dnf because it verifies the whole file digest which is captured during download/transcoding. The second one is likely used for delta rpm. # This is untested, and possibly incompatible with delta RPM (drpm). The process for reconstructing an rpm to install from a delta is expensive from both a CPU and I/O perspective, while only providing marginal benefits on download size. It is expected that having delta rpm enabled (which is the default) will be handled gracefully.
https://github.com/rpm-software-management/rpm/pull/880 added DIGESTALT, apparently to help reduce this CPU usage problem. I don't know if it's actually used by anything, but it is much newer than I'd have guessed (October 2019).
# Disk space requirements are expected to be marginally higher than before: all new packages or updates will consume their installed size before installation instead of about half their size (regular rpms with payloads still cost space). # <code>rpm-plugin-reflink</code> will fall back to simple file copying when the destination path is not on the same filesystem/subvolume. A common example is <code>/boot</code> and/or <code>/boot/efi</code>. # The system will still work on other filesystem types, but will ''always'' fall back to simple copying. This is expected to be slightly slower than not enabling CoW because the source for copying will be the decompressed data.
Any testing to see the speed impact?
# For systems that enable transparent filesystem compression: every file will continue to be decompressed from the original rpm, and then transparently re-compressed by the filesystem. There is no effective change here. There is a future project to investigate alternate distribution mechanics to provide parallel versions of file content pre-compressed in a filesystem specific format, reducing both CPU costs and I/O. It is expected that this will result in slightly higher network utilization because filesystem compression is purposely restricted to allow random I/O. # Current implementation of <code>dnf-plugin-cow</code> is in Python, but it looks possible to implement this in <code>libdnf</code> instead which would make it work in <code>packagekit</code>.
=== Performance Metrics ===
Ballpark performance difference is about half the duration for file download+install time. A lot of rpms are very small, so it’s difficult to see/measure. Larger RPMs give much clearer signal.
(Actual numbers/charts will be supplied in Jan 2021)
Seems like a very nice optimization! Thanks for working on it!
V/r, James Cassell
On Wed, 2020-12-23 at 19:23 -0500, James Cassell wrote:
# Resolve packaging request into a list of packages and operations
# Download and '''decompress''' packages into a '''locally optimized''' rpm file
Please verify the signature on the downloaded RPM before decompressing it. (Do we do this already?)
We have an opportunity to do the verification during download, but I'm not keen on it for two major reasons:
1. The transcoder would need to open the rpmdb to perform the verification, adding a fair amount of complexity.
2. (More crucially) I observe that dnf downloads packages and signatures before asking whether to trust them. The order of events means we can't be confident that all signatures are in the rpmdb yet.
My proposal is for enabling CoW with dnf. The code change to do transcoding is in librepo as part of the generic file download mechanism. librepo does verify downloads relative to the repo's recorded digest.
The transcoder produces a different series of bits to be written to disk, so how could that verification work? Turns out the answer is easy: we see the original bits on the input to the transcoder, so we calculate the digest of the bits received from the yum server and record this in the footer of the transcoded rpm. I've modified lr_checksum_fd() in librepo to look for this before using the xattr cache or reading the whole file again. You can only locate that whole file digest if the footer itself is complete.
The digest is actually a list of digests. The default value in createrepo_c is SHA256 (irrespective of what digest algorithm is used to identify/verify files in each rpm) and for now the dnf plugin passes "SHA256" to the transcoder statically. This is ultimately repo specific. I hope to eliminate the hard coding later if there's signal within librepo to choose the right digest algo for the specific repo.
The job of actually verifying the signature falls through to rpm as it did before. As stated in the proposal: the headers (lead, signature, main header) are completely untouched, so the gpg based signature is still verified as before, and at the same point in time.
# Install and/or upgrade packages sequentially using RPM files, using '''reference linking''' (reflinking) to reuse data already on disk.
Sounds like a great improvement! Any real-world data on how much time it saves, how much it changes disk usage, or how much SSD writes it saves?
Forthcoming! I've got some numbers I've used internally at Facebook to talk about this. To do this I had to write another rpm plugin to measure how much time was spent on decompressing and writing data. I'm planning on improving this and open sourcing that too. The goal here is to produce some publicly reproducible numbers.
The outcome is intended to be the same, but the order of operations is different.
# Decompression happens inline with download. This has a positive effect on resource usage: downloads are typically limited by bandwidth. Decompression and writing the full data into a single file per rpm is essentially free. Additionally: if there is more than one download at a time, a multi-CPU system can be better utilized. All compression types supported in RPM work because this uses the rpm I/O functions.
As I referenced above, I think each chunk should also be verified before decompressing.
This is certainly possible, but not implemented. My thinking here is that the full rpm file digest enforced for files downloaded with dnf/librepo also covers this. The only optimization possible here is for a damaged rpm to fail faster during transcode. I consider this a pretty minor optimization.
# RPMs are cached on local storage between downloading and installation time as normal. This allows DNF to defer actual RPM installation to when all the RPM are available. This is unchanged. # The file format for RPMs is different with Copy on Write. The headers are identical, but the payload is different. There is also a footer. ## Files are converted (“transcoded”) locally during download using <code>/usr/bin/rpm2extents</code> (part of rpm codebase). The format is not intended to be “portable” - i.e. copying the files from the cache is not supported.
I think these should be made to be portable. How many variants of these are there? Would it be difficult to make the transcoder also understand RPMs transcoded for a different platform/setup? Eventually, I'd like to see additional signatures added to the RPM for each of the variants so RPM itself can do the verification at install time, avoiding a transcode to the "canonical" format. (I suppose this might require a build-time or sign-time transcode to each of the other variants.) Until then, I'd like to ensure that the package signatures are being verified in a secure manner, which would be necessary for the plugin to be able to install packages not built with multiple signatures/digests.
Would it be practical to just have a single format aligned to the largest page size known, leaving fs holes as necessary on systems with smaller page sizes?
I'm not keen on making the transcoded rpms portable because they're usually twice the size of the original/archive. If you want to share these between systems, I would think running a caching web proxy or an explicit internal mirror would be the more common way to do this.
Defaulting the alignment to 64k or some higher value would yield a lot of wasted space per file unless holes were used. I've not experimented with this, but it's an interesting idea.
I've covered the transcoded file digest(s) above. This is repo driven not rpm driven. RPM signatures get enforced as normal.
## Regular RPMs use a compressed .cpio based payload. In contrast, extent based RPMs contain uncompressed data aligned to the fundamental page size of the architecture, e.g. 4KiB on x86_64. This alignment is required for <code>FICLONERANGE</code> to work. Only files are represented in the payload, other directory entries like symlinks, device nodes etc are constructed entirely from rpm header information. Files are referenced by their digest, so identical files are de-duplicated.
How are hardlinks in an RPM handled? Do they stay as hardlinks or become reflinks only, losing the hardlink status? They should stay hardlinks, in my opinion.
This is a great question: Everything ends up being bit for bit identical to systems without this system enabled. Making this work was an interesting challenge, and I'm pretty happy with how it turned out.
## The footer currently has three sections ### Table of original (rpm) file digests, used to validate the integrity of the download in dnf. ### Table of digest → offset used when actually installing files. ### Signature 8 bytes at the end of the file, used to differentiate between traditional RPMs and extent based.
I think this magic number "signature" should vary based on the items that cause the format to change.
The footer contains a list of digests for the source file verification, a list of digest -> offset mappings, and the signature itself. Some kind of versioning is possible, but I've not encountered a need to cross that bridge yet. (I'm trying to avoid premature optimization when I don't have a good use case.)
What happens if you try to use a transcoded RPM on a non-compatible system?
Depends on how it got there, and what you asked for. Here's some examples:
1. cp foo.rpm /var/cache/dnf/<repo>/Packages/ && dnf install foo ...will fail the librepo full file check, and it'll be re-downloaded.
2. dnf install /root/foo.rpm || rpm -i /root/foo.rpm (not actually tested) will likely fail with a CPIO/payload error.
Note that tools like rpm2cpio and rpm2archive will also fail on transcoded rpms. I have an open task to make the dnf plugin not transcode with 'yumdownloader' or 'dnf download' (plugin) as those are reasonable commands to run. I will look at making error messages better and/or making some of these use cases work.
=== Notes ===
# The headers are preserved bit for bit during transcoding. This preserves signatures. The signatures cover the main header blob, and the main header blob ensures the integrity of data in two ways: ## Each file with content has a digest. Originally this was md5, but today it’s usually sha256. In normal RPM this is only used to verify the integrity of files, e.g. <code>rpm -V</code>. With CoW we use this as a content key. ## There is/are one or two digests (<code>PAYLOADDIGEST</code> and <code>PAYLOADDIGESTALT</code>) covering the payload archive (compressed cpio). The header value is preserved, but transcoded RPMs do not preserve the original structure so RPM’s pre-installation verification (controlled by <code>%_pkgverify_level</code>) will fail. <code>dnf-plugin-cow</code> disables this check in dnf because it verifies the whole file digest which is captured during download/transcoding. The second one is likely used for delta rpm. # This is untested, and possibly incompatible with delta RPM (drpm). The process for reconstructing an rpm to install from a delta is expensive from both a CPU and I/O perspective, while only providing marginal benefits on download size. It is expected that having delta rpm enabled (which is the default) will be handled gracefully.
https://github.com/rpm-software-management/rpm/pull/880 added DIGESTALT, apparently to help reduce this CPU usage problem. I don't know if it's actually used by anything, but it is much newer than I'd have guessed (October 2019).
I don't see a straightforward way to use DIGESTALT. I think the transcoded file-level digest is a decent way to falsify the file, and when the rpm is installed using dnf, you get a verification pass that checks the files. DIGESTALT helps provide a way to falsify a local rpm before trying to install it.
# Disk space requirements are expected to be marginally higher than before: all new packages or updates will consume their installed size before installation instead of about half their size (regular rpms with payloads still cost space). # <code>rpm-plugin-reflink</code> will fall back to simple file copying when the destination path is not on the same filesystem/subvolume. A common example is <code>/boot</code> and/or <code>/boot/efi</code>. # The system will still work on other filesystem types, but will ''always'' fall back to simple copying. This is expected to be slightly slower than not enabling CoW because the source for copying will be the decompressed data.
Any testing to see the speed impact?
Only accidentally ;) You're simply <moving> the decompression time to an earlier step, and then copying a lot more data bit by bit, so the full effect depends strongly on CPU speed relative to I/O speed. We found that overall, it was *slightly* faster.
# For systems that enable transparent filesystem compression: every file will continue to be decompressed from the original rpm, and then transparently re-compressed by the filesystem. There is no effective change here. There is a future project to investigate alternate distribution mechanics to provide parallel versions of file content pre-compressed in a filesystem specific format, reducing both CPU costs and I/O. It is expected that this will result in slightly higher network utilization because filesystem compression is purposely restricted to allow random I/O. # Current implementation of <code>dnf-plugin-cow</code> is in Python, but it looks possible to implement this in <code>libdnf</code> instead which would make it work in <code>packagekit</code>.
=== Performance Metrics ===
Ballpark performance difference is about half the duration for file download+install time. A lot of rpms are very small, so it’s difficult to see/measure. Larger RPMs give much clearer signal.
(Actual numbers/charts will be supplied in Jan 2021)
Seems like a very nice optimization! Thanks for working on it!
Thanks for the feedback! I'll try to incorporate these points into the wiki in the new year - Matthew.
On 24. 12. 20 at 22:54, Matthew Almond via devel wrote:
Depends on how it got there, and what you asked for. Here's some examples:
- cp foo.rpm /var/cache/dnf/<repo>/Packages/ && dnf install foo ...will fail the librepo full file check, and it'll be re-downloaded.
- dnf install /root/foo.rpm || rpm -i /root/foo.rpm (not actually tested) will likely fail with a CPIO/payload error
Note that tools like rpm2cpio and rpm2archive will also fail on transcoded rpms. I have an open task to make the dnf plugin not transcode with 'yumdownloader' or 'dnf download' (plugin) as those are reasonable commands to run. I will look at making error messages better and/or making some of these use cases work.
This concerns us (speaking for RPM and DNF people) a little bit. If the transcoded RPM cannot be used as a regular RPM, it probably should have a different identity, for example a different suffix than .rpm. RPM and DNF are designed for generic use cases. I see these transcoded packages as a "cache" tailored for btrfs based systems only. It would probably be good to draw a border between them.
If I understand it correctly, the transcoding happens on each host. Have you considered transcoding all RPMs in a repo on the server instead? Or would that be inefficient and increase network traffic too much?
On Tue, 2021-01-05 at 18:18 +0100, Daniel Mach wrote:
On 24. 12. 20 at 22:54, Matthew Almond via devel wrote:
Depends on how it got there, and what you asked for. Here's some examples:
- cp foo.rpm /var/cache/dnf/<repo>/Packages/ && dnf install foo ...will fail the librepo full file check, and it'll be re-downloaded.
- dnf install /root/foo.rpm || rpm -i /root/foo.rpm (not actually tested) will likely fail with a CPIO/payload error
Note that tools like rpm2cpio and rpm2archive will also fail on transcoded rpms. I have an open task to make the dnf plugin not transcode with 'yumdownloader' or 'dnf download' (plugin) as those are reasonable commands to run. I will look at making error messages better and/or making some of these use cases work.
This concerns us (speaking for RPM and DNF people) a little bit. If the transcoded RPM cannot be used as a regular RPM, it probably should have a different identity, for example a different suffix than .rpm. RPM and DNF are designed for generic use cases. I see these transcoded packages as a "cache" tailored for btrfs based systems only. It would probably be good to draw a border between them.
If I understand it correctly, the transcoding happens on each host. Have you considered transcoding all RPMs in a repo on the server instead? Or would that be inefficient and increase network traffic too much?
The transcoded RPM is a valid RPM for the rpm program and most use cases. It's got all the original headers, so querying it (-qp) works perfectly, and it's still signed; it's only produced in concert with dnf/librepo, which validates that the downloaded file is the one described in the repo.
Notably it doesn't work with rpm2cpio and rpm2archive (and yes, I'd like to have a better story there). You typically get these through 'dnf download'. When I implemented the plugin for yum, I was able to avoid transcoding, but on dnf it's not implemented yet.
Signature *verification* partially works. Everything to do with signatures on just the header works (and the header describes the payload digest). There is one specific area which needs to be fixed: regular RPMs are read, digested, and signature-verified before decompression. We need to guard against malicious compressed payloads that either perform a DoS on space/time, or worse (but more difficult) could exploit a bug in a decompression library. I am actively working on this.
The bottom line is that in every place you'd expect to see an rpm, and use it later, you have exactly the same number of things in the same place. If you want to reflink clone the dnf cache into a container, it'll work (really well). If you want to clean up the cache in some selective way, that works the same as before. The interface for the cache remains the same.
On the server side: no, this doesn't make sense: you'd save on decompression, but you'd use 2x the bandwidth and space on the server side. You'll also need to keep the original rpms for clients that don't or can't use reflinking, so it's more like 2.5x the space, which I think is unreasonable.
I do think there's some room in the future to think and talk about how repos could be changed to take better advantage of this. I got some feedback on RPM PR ( https://github.com/rpm-software-management/rpm/pull/1470#issuecomment-754025... ) which is somewhere along the lines of what I was already thinking. That said, I'm aware that I'm being ambitious with this change request, and I'm focused on trying to integrate things that have been written/exist and can be demonstrated first.
Hope these explanations help! Thanks for the feedback :)
Matthew.
On Tue, Jan 05, 2021 at 07:01:56PM +0000, Matthew Almond via devel wrote:
Signature *verification* partially works. Everything to do with signatures on just the header works (and the header describes the payload digest). There is one specific area which needs to be fixed: regular RPMs are read, digested, and signature-verified before decompression. We need to guard against malicious compressed payloads that either perform a DoS on space/time, or worse (but more difficult) could exploit a bug in a decompression library. I am actively working on this.
I just want to say, this is IMHO critical to even consider such a proposal. Signature verification should come before parsing whatever is under that signature; otherwise you risk exposing to attacks various processing code that previously assumed it is fed only trusted data. This applies to the decompression library, the actual transcoding code, and possibly much more. Even if _currently_ there are no known vulnerabilities in a particular part, it doesn't mean they won't be discovered later. Defence in depth is especially important for an update system: you don't want to end up in a situation like "oh, we've found a bug in the update system, so you need to execute the very part that is vulnerable to get it fixed".
On Mon, 2020-12-21 at 11:28 -0500, Ben Cotton wrote:
https://fedoraproject.org/wiki/Changes/RPMCoW
== Summary ==
RPM Copy on Write provides a better experience for Fedora Users as it reduces the amount of I/O and offsets CPU cost of package decompression. RPM Copy on Write uses reflinking capabilities in btrfs, which is the default filesystem in Fedora 33.
I've been communicating with the maintainer of RPM on the pull request[1] and it's become clear that this likely depends on the creation of a public, supportable API for RPM. This is not achievable within the window for Fedora 34, so I'm withdrawing the change for Fedora 34 at this time. I will continue to work on this, and expect to re-submit for Fedora 35.
Just a reminder for those interested: I'm giving a talk at CentOS Dojo on this topic on Friday at 17:00 CET[2]
Regards, Matthew.
[1] https://github.com/rpm-software-management/rpm/pull/1470#issuecomment-772410...
[2] https://hopin.com/events/centos-dojo-fosdem