On Wed, 2004-01-28 at 16:39, Alexandre Oliva wrote:
The problem with the deltas is that they're static, making a guess on what the client has on its end. rsync hashes, OTOH, make no such assumptions: they tell, with a very small footprint, what the client is about to download, such that the client can tell which bits it doesn't need. The fact that rpm uses minigzip makes it as suitable for rsyncing as for xdelta.
While I really prefer this solution, the assertion that minigzip makes the packages "rsync-able" does not hold up. The rpm-delta proof-of-concept code used rpm2cpio to extract the rpm payload, and this implicitly performs an un-minigzip operation on the payload. The uncompressed cpio payloads are then given to xdelta.
I have just tried rsync-ing an updated rpm onto the previous rpm, using a couple of large rpms (on the assumption that these should have more commonality between minor version tweaks than small rpms).
For a Fedora kernel rpm (kernel-2.4.22-1.2140.nptl.i686.rpm and kernel-2.4.22-1.2149.nptl.i686.rpm) there was effectively no speedup seen by pre-heating the destination file with its previous version. The extra negotiation required was of a very similar size to the actual commonality between the packages.
With the glibc-common package (picked as it was quite likely to have no changes, and in particular would not suffer from object file timestamps), there was a slight advantage: 1717760 bytes out of 11193793 were saved, but at a cost of 60674 bytes in protocol overhead sending the hashes.
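To put those glibc-common figures in proportion, here is a quick back-of-envelope calculation. It assumes the 1717760 bytes saved is quoted before the hash overhead is subtracted, and that 11193793 is the size of the rpm being fetched; the variable names are just for illustration.

```python
# Figures quoted above for the glibc-common test.
total_bytes = 11193793    # size the saving is measured against
saved_bytes = 1717760     # bytes rsync did not have to transfer
overhead_bytes = 60674    # protocol overhead for sending the hashes

# Assumption: the saving is quoted gross, i.e. before overhead is subtracted.
net_saved = saved_bytes - overhead_bytes
gross_pct = 100.0 * saved_bytes / total_bytes
net_pct = 100.0 * net_saved / total_bytes

print(f"gross saving: {saved_bytes} bytes ({gross_pct:.1f}%)")
print(f"net saving:   {net_saved} bytes ({net_pct:.1f}%)")
```

Under those assumptions the net saving is roughly 15% of the download, which is why I call the advantage slight.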
I would guess that the commonality may well be the rpm internal metadata - the payload compression would cause the data stream to become very different quite quickly.
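A rough way to see that effect, as a sketch only: the code below builds a synthetic payload (not a real rpm), changes a handful of bytes, and counts how much of the new version can be found as whole blocks in the old version, first on the raw data and then on the zlib-compressed streams. The block matcher is a crude stand-in for rsync (no rolling checksum, just a lookup at every byte offset), and the block size, vocabulary and helper names are all made up for the illustration.

```python
import random
import zlib

BLOCK = 512  # block size in bytes; an arbitrary choice for this sketch


def matched_fraction(old: bytes, new: bytes, block: int = BLOCK) -> float:
    """Fraction of `new` that can be copied from block-aligned chunks of `old`.

    Blocks of `old` are indexed, then `new` is scanned at every byte
    offset for an identical block, much as rsync would do with checksums.
    """
    known = {old[i:i + block] for i in range(0, len(old) - block + 1, block)}
    matched = 0
    pos = 0
    while pos + block <= len(new):
        if new[pos:pos + block] in known:
            matched += block
            pos += block
        else:
            pos += 1
    return matched / len(new) if new else 0.0


def make_payload(size: int, seed: int = 0) -> bytes:
    """Deterministic, compressible filler standing in for a cpio payload."""
    rng = random.Random(seed)
    vocab = [b"kernel", b"glibc", b"locale", b"module", b"symbol",
             b"header", b"version", b"release", b"package", b"payload"]
    return b" ".join(rng.choice(vocab) for _ in range(size))[:size]


size = 256 * 1024
old_raw = make_payload(size)

# "Minor version tweak": overwrite four single bytes, keeping the length.
tweaked = bytearray(old_raw)
for offset in (100, size // 4, size // 2, 3 * size // 4):
    tweaked[offset] = ord("#")
new_raw = bytes(tweaked)

raw_match = matched_fraction(old_raw, new_raw)
comp_match = matched_fraction(zlib.compress(old_raw, 9),
                              zlib.compress(new_raw, 9))

print(f"reusable blocks, uncompressed: {raw_match:.1%}")
print(f"reusable blocks, compressed:   {comp_match:.1%}")
```

On the uncompressed data only the blocks containing a changed byte are lost. On the compressed streams I would expect far fewer reusable blocks, since an early change alters the encoding of everything that follows it.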
The numbers don't justify this with rsync at present unless rpm were changed so that the compression is done on a per-payload-file, or possibly per-block, basis. That might well make the basic rpms larger and would certainly require an rpm filespec and tools update, with backwards compatibility issues.
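As an illustration of the per-block idea, and not of anything rpm does today, the sketch below compresses a synthetic payload in independent 32 KiB chunks. The chunk size, the filler data and the helper name `compress_chunks` are assumptions made up for the example. A one-byte change then disturbs only the chunk that contains it, at the price of a somewhat larger total.

```python
import random
import zlib

CHUNK = 32 * 1024  # hypothetical per-block compression unit


def compress_chunks(data: bytes, chunk: int = CHUNK) -> list:
    """Compress each fixed-size chunk as its own independent zlib stream."""
    return [zlib.compress(data[i:i + chunk], 9)
            for i in range(0, len(data), chunk)]


# Deterministic, compressible filler standing in for a payload.
rng = random.Random(0)
vocab = [b"kernel", b"glibc", b"locale", b"module", b"symbol", b"header"]
size = 8 * CHUNK
old = b" ".join(rng.choice(vocab) for _ in range(size))[:size]

# Change a single byte inside the first chunk.
tweaked = bytearray(old)
tweaked[100] = ord("#")
new = bytes(tweaked)

old_chunks = compress_chunks(old)
new_chunks = compress_chunks(new)

# Chunks whose input did not change compress to identical bytes.
unchanged = sum(a == b for a, b in zip(old_chunks, new_chunks))

chunked_size = sum(len(c) for c in new_chunks)
whole_size = len(zlib.compress(new, 9))

print(f"{unchanged} of {len(new_chunks)} compressed chunks are identical")
print(f"chunked: {chunked_size} bytes, single stream: {whole_size} bytes")
```

Each chunk loses the compression history of the data before it and carries its own stream header, which is where the size penalty comes from.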
The rsync protocol-based solution I'm proposing would reconstruct the rpm file from the installed files, and use that to save on the download.
This doesn't seem to fly based on my very quick suck-it-and-see tests. Anyone got numbers to contradict me?
Nigel.