[announce] yum: parallel downloading

Zdenek Pavlas zpavlas at redhat.com
Wed May 16 16:07:32 UTC 2012


New yum and urlgrabber packages have just hit Rawhide.  These releases
add several features, including parallel downloading of packages and
metadata, and new mirror selection code.  As we plan to include these
features in RHEL7, I welcome any feedback or bug reports!

python-urlgrabber-3.9.1-12.fc18 supports a new API to urlgrab() files in
parallel, and yum-3.4.3-26.fc18 can use this.  Both packages are compatible
with older versions.

Feature list:

- parallel downloading of packages and metadata

If possible, multiple files are downloaded in parallel.  (see below for the
limitations that apply)

- configurable 'max_connections' limit in yum.conf

This is the maximum number of simultaneous connections Yum makes.  Its purpose
is to limit local resource usage (the number of processes forked).  The default
is to use urlgrabber's default value of 5.
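
As a rough illustration of what such a connection cap means, here is a minimal
Python sketch.  It is not yum's or urlgrabber's code: download_all() and the
fetch callable are hypothetical names, and it uses a thread pool for brevity,
whereas the downloader described here forks processes.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_CONNECTIONS = 5  # same value as the default mentioned above


def download_all(urls, fetch, max_connections=MAX_CONNECTIONS):
    """Fetch every URL, running at most max_connections fetches at once.

    `fetch` is a hypothetical callable that takes a URL and returns its data.
    Results are returned in the same order as `urls`.
    """
    with ThreadPoolExecutor(max_workers=max_connections) as pool:
        return list(pool.map(fetch, urls))
```

With max_connections=1 this degrades to plain sequential downloading.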

- mirror limits are honored, too.

Making many connections to the same mirror usually does not help much; it just
consumes more resources.  That's why Yum also uses mirror limits from
metalink.xml.  If no such limit is available, at most 3 simultaneous
connections are made to any single mirror.
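
The per-mirror cap can be pictured with the following sketch.  The function
name and the dict-based bookkeeping are assumptions made for illustration, not
the actual urlgrabber data structures; only the fallback value of 3 comes from
the text above.

```python
DEFAULT_MIRROR_LIMIT = 3  # fallback used when metalink.xml gives no limit


def pick_available_mirror(mirrors, active, limits, default=DEFAULT_MIRROR_LIMIT):
    """Return the first mirror that still has a free connection slot, or None.

    mirrors: mirror URLs in order of preference
    active:  dict mapping mirror -> number of downloads currently running
    limits:  dict mapping mirror -> connection limit (e.g. from metalink.xml)
    """
    for mirror in mirrors:
        if active.get(mirror, 0) < limits.get(mirror, default):
            return mirror
    return None  # every mirror is at its limit; the caller has to wait
```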

- new mirror selection algorithm

The real download speed is calculated after each download, and the mirror's
statistics get updated.  These are in turn used when selecting mirrors for
further downloads.  This should be more accurate than measuring latencies in
the fastestmirror plugin, but slow mirrors now have to be tried from time to
time, and the statistics need some time to build up.
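
To make the idea concrete, here is a small sketch of speed-based selection.
The exact formula is not described in this announcement, so the smoothing
factor, the "explore" probability used to retry slow mirrors, and all names
below are assumptions, not what urlgrabber actually does.

```python
import random


class MirrorStats:
    """Track a smoothed download speed (bytes per second) for each mirror."""

    def __init__(self, mirrors, smoothing=0.5):
        self.speed = {m: None for m in mirrors}  # None = not measured yet
        self.smoothing = smoothing

    def record(self, mirror, nbytes, seconds):
        """Update a mirror's estimate after a download (seconds must be > 0)."""
        measured = nbytes / seconds
        old = self.speed[mirror]
        if old is None:
            self.speed[mirror] = measured
        else:
            # Blend the new measurement with the previous estimate.
            self.speed[mirror] = (self.smoothing * measured
                                  + (1 - self.smoothing) * old)

    def choose(self, explore=0.1, rng=random):
        """Pick a mirror: unmeasured first, else usually the fastest one."""
        untested = [m for m, s in self.speed.items() if s is None]
        if untested:
            return untested[0]
        if rng.random() < explore:
            # Occasionally retry any mirror so stale estimates get refreshed.
            return rng.choice(list(self.speed))
        return max(self.speed, key=self.speed.get)
```

Without the occasional random pick, a mirror that was slow once would never be
measured again, which is the trade-off mentioned above.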

- ctrl-c handling

This is a long-standing problem in Yum.  Due to various shortcomings in rpm and
curl, it's impossible to react immediately to SIGINT.  But now the downloader
runs in a different process, so we can exit even if curl is still stuck.
The "skip to next mirror" feature is gone (we don't want to restart all
currently running downloads).
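
The general pattern looks roughly like the sketch below.  This is only an
illustration of "do the transfer in a child process, and kill it on ctrl-c";
run_in_child() is a hypothetical helper, not the code that ships in yum or
urlgrabber.

```python
import subprocess


def run_in_child(argv):
    """Run a downloader command in a child process and return its exit status.

    If the user presses ctrl-c, the child is killed so the parent can exit
    right away instead of waiting for a stuck transfer.
    """
    child = subprocess.Popen(argv)
    try:
        return child.wait()
    except KeyboardInterrupt:
        child.kill()
        child.wait()  # reap the child so no zombie process is left behind
        raise
```

Because the parent only waits on the child, it stays responsive to SIGINT no
matter what the child is blocked on.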

Known limitations:

- metalink.xml and repomd.xml downloads are not parallelized yet.

Zdeněk Pavlas
