Keeping old versions of packages

Chris Adams cmadams at hiwaay.net
Wed Apr 10 15:33:43 UTC 2013


Once upon a time, John.Florian at dart.biz <John.Florian at dart.biz> said:
> Is there anything that could be done to make it unnecessary to pull the 
> complete metadata for every update?  For example, IIRC this is all sqlite 
> data, but what if this was in a plain-text data dump form where something 
> like rsync could be used to efficiently transfer only those bits that have 
> changed.  Client CPU time to reconstruct the DB is probably cheaper than 
> the bandwidth.  Maybe such a mode would only be used if the DB size 
> exceeded some threshold.

The metadata starts in XML before being loaded into an SQLite DB file,
and the XML is in the repodata directory with the DB.  However, both are
compressed, as they are large.  For example, the current
updates/18/x86_64 XML is over 34M (5M gzip compressed), and the DB is
41M (9M bzip2 compressed).  I'm guessing there are historical reasons
why different compression is used; both could be made noticeably smaller
with xz (XML to just over 3M, DB to 7M), but that's still a lot of data
to download (and there are also other metadata files that have to be
downloaded sometimes, especially the filelists.xml.gz, which is 10M gzip
compressed).
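
A rough way to reproduce this kind of size comparison on any uncompressed metadata file is sketched below. It uses only the Python standard library; the function name compressed_sizes is made up for illustration, and the figures you get will depend on the file and on the compression levels (library defaults here), so they will not necessarily match the numbers quoted above.

```python
import bz2
import gzip
import lzma


def compressed_sizes(data: bytes) -> dict[str, int]:
    """Return the size in bytes of `data` raw and under each compressor.

    Default compression levels are used; lzma.compress() writes the
    .xz container format by default.
    """
    return {
        "raw": len(data),
        "gzip": len(gzip.compress(data)),
        "bzip2": len(bz2.compress(data)),
        "xz": len(lzma.compress(data)),
    }


def format_sizes(sizes: dict[str, int]) -> str:
    """Render the sizes as one line per format, in mebibytes."""
    return "\n".join(
        f"{name:>6}: {size / 2**20:.1f}M" for name, size in sizes.items()
    )
```

To use it, read an uncompressed primary.xml or SQLite file in binary mode and pass the bytes to compressed_sizes(), then print format_sizes() on the result.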

I'm not sure when the XML is downloaded instead of (or in addition to)
the DB, but it does appear to happen (I see one example in my mirror
server web logs this morning).

All the metadata changes with every push (usually once per day for the
updates repo), so it has to be downloaded constantly.  Your only
practical choice for distribution is HTTP; rsync has much higher server
overhead and is only available on some mirrors.  If you want anything other
than the full download, it'll have to be in the form of additional
repodata files generated as part of the push.
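
To make the idea of an additional repodata file concrete, here is a minimal sketch of computing a per-push delta between two metadata snapshots. This is purely hypothetical: it is not an existing createrepo or yum feature, and the function name metadata_delta and the input shape (a mapping from a package identifier to a checksum of its metadata entry) are assumptions chosen for illustration.

```python
def metadata_delta(
    old: dict[str, str], new: dict[str, str]
) -> dict[str, list[str]]:
    """Compare two snapshots mapping package id -> entry checksum.

    Returns the ids that were added, removed, or whose entry changed
    between the old snapshot and the new one.
    """
    added = sorted(pkg for pkg in new if pkg not in old)
    removed = sorted(pkg for pkg in old if pkg not in new)
    changed = sorted(
        pkg for pkg in new if pkg in old and old[pkg] != new[pkg]
    )
    return {"added": added, "removed": removed, "changed": changed}
```

A push could write such a delta next to the full metadata as a static file, so a client holding the previous snapshot would fetch only the delta over plain HTTP and fall back to the full download otherwise.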

-- 
Chris Adams <cmadams at hiwaay.net>
Systems and Network Administrator - HiWAAY Internet Services
I don't speak for anybody but myself - that's enough trouble.

