sorting yum/dnf metadata and metadata diffs

Marcin Juszkiewicz mjuszkiewicz at redhat.com
Fri Feb 13 08:21:07 UTC 2015


On 13.02.2015 08:11, Casey Jao wrote:
> How feasible would it be to keep the listings in primary.xml and
> filelists.xml sorted by package name and arch? Doing so could open the door
> to simple and efficient diffs of repository metadata.

Something like pdiffs in Debian?
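
For illustration only, here is a minimal Python sketch of why a stable sort order would help with diffs. The package tuples and version strings are made up, and the one-line-per-package format is just a stand-in for real primary.xml entries, not what createrepo actually writes.

```python
import difflib

def sorted_listing(packages):
    """Render one line per package, ordered by (name, arch, version)."""
    return [f"{name}.{arch} {version}" for name, arch, version in sorted(packages)]

# Hypothetical repository snapshots; the input order differs on purpose.
old_repo = [
    ("zlib", "x86_64", "1.2.8-7"),
    ("bash", "x86_64", "4.3.33-1"),
    ("dnf", "noarch", "0.6.3-2"),
]
new_repo = [
    ("dnf", "noarch", "0.6.4-1"),   # the only package that changed
    ("zlib", "x86_64", "1.2.8-7"),
    ("bash", "x86_64", "4.3.33-1"),
]

# With both listings sorted, the diff only touches the changed entry.
diff = list(difflib.unified_diff(
    sorted_listing(old_repo), sorted_listing(new_repo), lineterm="", n=0))

for line in diff:
    print(line)
```

Because both sides are sorted the same way, the reordering in the input does not show up in the diff at all; only the updated dnf entry does.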

> Those two are by far the largest metadata files. If the observed
> improvements are typical, then keeping those files in order and hosting the
> diffs between the present and the previous few days (and modifying dnf to
> look for those diffs) could substantially reduce the amount of data that
> users must download every time a repository is updated, which for a
> fast-moving OS like Fedora could happen nearly every day.

If only the amount of downloaded data matters, then why not compress
primary.xml and filelists.xml with xz?

 11646147 primary.xml.gz
  8676976 primary.xml.xz
 30607019 filelists.xml.gz
 23661236 filelists.xml.xz
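
To put rough numbers on that, a small Python sketch that works out the savings. Only the four byte counts come from the listing above; the helper name is made up for this example.

```python
# Byte counts taken from the listing above: (gzip size, xz size).
sizes = {
    "primary.xml": (11646147, 8676976),
    "filelists.xml": (30607019, 23661236),
}

def saving_percent(gz_bytes, xz_bytes):
    """Percentage of the gzip size saved by switching to xz."""
    return 100.0 * (gz_bytes - xz_bytes) / gz_bytes

for name, (gz_bytes, xz_bytes) in sizes.items():
    saved = gz_bytes - xz_bytes
    print(f"{name}: {saved} bytes saved ({saving_percent(gz_bytes, xz_bytes):.1f}%)")

total_gz = sum(gz for gz, _ in sizes.values())
total_xz = sum(xz for _, xz in sizes.values())
print(f"total: {total_gz - total_xz} bytes saved "
      f"({saving_percent(total_gz, total_xz):.1f}%)")
```

That is about 25% less for primary.xml and about 23% less for filelists.xml, or a bit under 10 MB per full download of both files.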

But yeah, it can make dnf/yum use more CPU time to decompress them each
time they want to use that data.
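
A quick way to check that trade-off on your own machine is sketched below, using Python's standard gzip and lzma modules. The sample data is synthetic XML-like filler, not real repository metadata, so the absolute numbers will differ from what dnf/yum would see.

```python
import gzip
import lzma
import time

def compare(data):
    """Return {format: (compressed size, decompression seconds)} for gz and xz."""
    results = {}
    for name, module in (("gz", gzip), ("xz", lzma)):
        packed = module.compress(data)
        start = time.perf_counter()
        unpacked = module.decompress(packed)
        elapsed = time.perf_counter() - start
        assert unpacked == data  # sanity check: the round trip is lossless
        results[name] = (len(packed), elapsed)
    return results

# Synthetic stand-in for metadata; swap in the bytes of a real primary.xml.
sample = b"".join(
    b"<package><name>pkg%d</name><arch>x86_64</arch></package>\n" % i
    for i in range(20000)
)

results = compare(sample)
for name, (size, seconds) in results.items():
    print(f"{name}: {size} bytes, decompressed in {seconds * 1000:.2f} ms")
```

Reading the contents of an actual primary.xml into `sample` would give a more meaningful comparison than the synthetic data used here.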

