Software Management call for RFEs
fweimer at redhat.com
Mon May 27 10:17:25 UTC 2013
On 05/27/2013 11:48 AM, Zdenek Pavlas wrote:
>> And there are package diffs, which are ed-style diffs of the
>> Packages file I mentioned above. This approach would work quite well
>> for primary.xml because it doesn't contain cross-references between
>> packages using non-natural keys. It doesn't work for the SQLite
>> database, either in binary or SQL dump format, because of the reliance
>> on artificial primary keys (such as package IDs).
> I once tried this. With about 10k packages in fedora-updates, the delta
> over 2-3 days was +491 -479. Assuming deletions are cheap, the delta should
> ideally be 5%. As expected, binary bsdiff yields a much bigger (~29%) delta.
A line-wise diff is much smaller because dependencies and package
descriptions mostly stay the same. (This assumes consistent sorting of
the primary.xml file.)
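To illustrate the sorting point, here is a minimal sketch using Python's difflib. The package lines are made-up sample data, not real repository contents, and line_delta is a hypothetical helper name. It counts added and removed lines between two snapshots, once with consistent ordering and once with the new snapshot in a different order.

```python
import difflib

def line_delta(old, new):
    """Return (added, removed) line counts between two lists of lines."""
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    added = removed = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            removed += i2 - i1
        if tag in ("replace", "insert"):
            added += j2 - j1
    return added, removed

# Illustrative snapshot, one line per package, sorted by name.
old = [
    "bash 4.2.45-1",
    "glibc 2.17-4",
    "kernel 3.9.2-301",
    "yum 3.4.3-91",
    "zlib 1.2.7-10",
]
# Same snapshot after a single package update, same ordering.
new = list(old)
new[1] = "glibc 2.17-5"

sorted_delta = line_delta(old, new)
# Same content as `new`, but emitted in a different order.
unsorted_delta = line_delta(old, new[::-1])
```

With stable ordering only the updated package shows up in the delta. When the ordering changes between snapshots, most lines no longer line up and the delta grows accordingly.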
Can you point me to the primary.xml -> SQLite translation in yum? I've
got a fairly efficient primary.xml parser. It might be interesting to
see if it's possible to reduce the latency introduced by the SQLite
conversion to close to zero. (Decompression and INSERTs can be
interleaved with downloading, and maybe the index creation improvements
in SQLite are sufficient these days.)
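A rough sketch of what such interleaving could look like, in Python rather than in yum's actual code. The table layout and the load_primary function are assumptions for illustration only and do not match yum's real SQLite schema. The chunks argument stands in for pieces of primary.xml.gz as they arrive from the network.

```python
import sqlite3
import zlib
import xml.etree.ElementTree as ET

# XML namespace used by primary.xml.
NS = "{http://linux.duke.edu/metadata/common}"

def load_primary(chunks, db):
    """Inflate, parse and INSERT while compressed chunks are still arriving."""
    inflate = zlib.decompressobj(zlib.MAX_WBITS | 16)  # expect gzip framing
    parser = ET.XMLPullParser(events=("end",))
    # Hypothetical, simplified table; not the schema yum uses.
    db.execute("CREATE TABLE packages (name TEXT, arch TEXT, ver TEXT, rel TEXT)")
    count = 0

    def drain():
        nonlocal count
        for _event, elem in parser.read_events():
            if elem.tag != NS + "package":
                continue
            version = elem.find(NS + "version")
            db.execute(
                "INSERT INTO packages VALUES (?, ?, ?, ?)",
                (elem.findtext(NS + "name"), elem.findtext(NS + "arch"),
                 version.get("ver"), version.get("rel")),
            )
            elem.clear()  # drop the parsed children to keep memory flat
            count += 1

    for chunk in chunks:
        parser.feed(inflate.decompress(chunk))
        drain()
    parser.feed(inflate.flush())
    drain()
    parser.close()
    # Build the index once, after the bulk load, instead of per INSERT.
    db.execute("CREATE INDEX packages_name ON packages (name)")
    db.commit()
    return count
```

Because each chunk is inflated, parsed and inserted as soon as it arrives, the only work left after the download finishes is the index creation and the commit.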
>> However, for many users that follow unstable or testing, package diffs
>> are currently slower than downloading the full Packages file because the
>> diffs are incremental (i.e., they contain the changes from file version
>> N to N+1, and you have to apply all of them to get to the current
>> version) and apt-get can easily write 100 MB or more because the
>> Packages file is rewritten locally multiple times.
> Yes, patch chaining should be avoided. I'd like to use N => 1 deltas,
> that could be applied to many recent snapshots.
The Debian package diffs could be combined efficiently in the client
because it's possible to combine diffs for two adjacent versions without
actually knowing what the old or new versions look like. But this
hasn't been implemented in APT because of the ABI impact (which is a bit
puzzling, but anyway). Instead, the diffs should soon be combined on
the archive side.
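To make the combination step concrete, here is a small sketch. It does not use the real ed-style format or anything from APT. It assumes a simplified patch representation: a list of ("copy", n), ("skip", n) and ("insert", lines) operations applied front to back, with an implicit copy of whatever remains. The names apply_patch and compose are hypothetical. The point is that compose looks only at the two patches, never at the file contents.

```python
INF = float("inf")

def apply_patch(patch, old):
    """Apply a patch (list of (op, arg) pairs) to a list of lines."""
    out, pos = [], 0
    for op, arg in patch:
        if op == "copy":
            out.extend(old[pos:pos + arg])
            pos += arg
        elif op == "skip":
            pos += arg
        else:  # "insert"
            out.extend(arg)
    return out + old[pos:]  # implicit trailing copy

def compose(first, second):
    """Combine patches A->B and B->C into one A->C patch without A, B or C."""
    src = [list(op) for op in first] + [["copy", INF]]
    out, i = [], 0

    def take(n, keep):
        # Consume n lines of B as produced by `first`.
        nonlocal i
        while n > 0:
            op = src[i]
            if op[0] == "skip":  # deletes lines of A, produces nothing in B
                out.append(("skip", op[1]))
                i += 1
                continue
            if op[0] == "copy":  # B lines that come straight from A
                k = min(n, op[1])
                out.append(("copy" if keep else "skip", k))
                op[1] -= k
                done = op[1] == 0
            else:  # B lines that `first` inserted
                k = min(n, len(op[1]))
                if keep:
                    out.append(("insert", op[1][:k]))
                op[1] = op[1][k:]
                done = not op[1]
            n -= k
            if done:
                i += 1

    for op, arg in second:
        if op == "insert":
            out.append(("insert", list(arg)))
        else:
            take(arg, keep=(op == "copy"))
    # Whatever is left of `first` (minus the implicit tail) applies unchanged.
    out.extend(tuple(op) for op in src[i:-1])
    return out
```

A client holding several such patches could fold them pairwise with compose and then rewrite the local file once, instead of once per patch.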
Florian Weimer / Red Hat Product Security Team