CC'ing fedora-infrastructure, as I think they got lost somewhere along the way.
On Tue, 2018-03-20 at 17:04 +0100, Michal Domonkos wrote:
<snip>
Yeah, the level doesn't really matter much. My point was: as long as we chunk, some of the data we download will already be present locally. Typically (according to mdhist), package updates seem to be more common than new additions, so we won't be reusing the unchanged parts of updated packages' tags. But that's inevitable if we're chunking.
Ok, I see your point, and you're absolutely right.
<snip>
The beauty of the zchunk format (or zsync, or any other chunked format) is that we don't have to download different files based on what we have, but rather, we download either fewer or more parts of the same file based on what we have. From the server side, we don't have to worry about the deltas, and the clients just get what they need.
+1
Simplicity is key, I think, even at the cost of not having a perfectly efficient solution. The whole packaging stack is already complicated enough.
+1000 on that last!
<snip>
While I'm not completely sure that application-specific boundaries are superior to buzhash (used by casync) in terms of data savings, it's clear that using HTTP range requests, and concatenating the objects in a smart way (as you suggested previously) to reduce the number of requests, is a step in the right direction.
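To make the range-request idea concrete, a rough libcurl sketch of fetching several chunks of one file in a single request might look like this (the URL and byte ranges are made-up placeholders; a real client would compute the ranges from the chunk index and parse the multipart/byteranges reply):

#include <stdio.h>
#include <curl/curl.h>

int main(void)
{
    CURL *curl = curl_easy_init();
    FILE *out = fopen("primary.zck.part", "wb");
    CURLcode res;

    if (!curl || !out)
        return 1;

    /* Made-up URL; a real client would get it from repomd.xml. */
    curl_easy_setopt(curl, CURLOPT_URL,
                     "https://example.com/repodata/primary.zck");
    /* Several byte ranges in one request.  These offsets are made up;
     * a real client computes them by comparing the chunk index against
     * the chunks it already has locally. */
    curl_easy_setopt(curl, CURLOPT_RANGE, "0-4095,16384-32767");
    /* The default write callback fwrite()s the body as-is.  A
     * multi-range reply is multipart/byteranges, so a real client has
     * to parse the MIME boundaries rather than dump it to a file. */
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, out);

    res = curl_easy_perform(curl);
    if (res != CURLE_OK)
        fprintf(stderr, "download failed: %s\n", curl_easy_strerror(res));

    curl_easy_cleanup(curl);
    fclose(out);
    return res == CURLE_OK ? 0 : 1;
}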
Just to be clear, zchunk *could* use buzhash. There's no rule about where the boundaries need to be, only that the application creating the zchunk file is consistent. I'd actually like to make the command-line utility use buzhash, but I'm trying to keep the code BSD 2-clause, so I can't just lift casync's buzhash code, and I haven't had time to write that part myself.
Currently zck.c has a really ugly if statement: in one branch it chooses chunk boundaries based on string matching, and in the other it falls back to a really naive, inefficient rolling hash. If you wanted to contribute buzhash, I'd happily take it!
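For anyone curious, a from-scratch buzhash-style rolling hash is pretty small. Here's a rough sketch (purely illustrative, not casync's code and not what zck.c does; the table seeding, window size, and boundary mask are all arbitrary choices):

#include <stdint.h>
#include <stddef.h>

#define WINDOW 48            /* rolling window size in bytes (arbitrary) */

/* 256 well-spread 32-bit constants, seeded here from xorshift32 just to
 * keep the sketch short.  A real implementation would bake in a fixed
 * table so boundaries stay consistent across builds. */
static uint32_t table[256];

static void init_table(void)
{
    uint32_t x = 0x9e3779b9u;
    for (int i = 0; i < 256; i++) {
        x ^= x << 13;
        x ^= x >> 17;
        x ^= x << 5;
        table[i] = x;
    }
}

static inline uint32_t rol32(uint32_t v, unsigned n)
{
    n &= 31;
    return (v << n) | (v >> ((32 - n) & 31));
}

/* Return the offset just past the first chunk boundary in buf, or len
 * if none is found.  A boundary is declared when the low `bits` bits of
 * the rolling hash are zero, giving ~2^bits-byte chunks on average. */
static size_t next_boundary(const uint8_t *buf, size_t len, unsigned bits)
{
    const uint32_t mask = (1u << bits) - 1;
    uint32_t h = 0;

    for (size_t i = 0; i < len; i++) {
        h = rol32(h, 1) ^ table[buf[i]];                /* byte enters  */
        if (i >= WINDOW) {
            h ^= rol32(table[buf[i - WINDOW]], WINDOW); /* byte leaves  */
            if ((h & mask) == 0)
                return i + 1;           /* boundary just after byte i   */
        }
    }
    return len;
}

(Call init_table() once before use.)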
BTW, in the original thread, you mentioned a reduction of 30-40% when using casync. I'm wondering, how did you measure it? I saw chunk reuse ranging from 80% to 90% per metadata update, which seemed quite optimistic. What I did was:
$ casync make snap1.caidx /path/to/repodata/snap1
$ casync make --verbose snap2.caidx /path/to/repodata/snap2
<snip>
Reused chunks: X (Y%)
<snip>
IIRC, I went into the web server logs and measured the number of bytes that casync actually downloaded as compared to the gzip size of the data.
Thanks so much for your interest!
Jonathan
On Thu, Mar 22, 2018 at 12:39 PM, Jonathan Dieter jdieter@gmail.com wrote:
CC'ing fedora-infrastructure, as I think they got lost somewhere along the way.
Oh, thanks. I screwed up (again), this time by hitting "Reply" instead of "Reply to all" in gmail (*facepalm*).
Just to be clear, zchunk *could* use buzhash. There's no rule about where the boundaries need to be, only that the application creating the zchunk file is consistent. I'd actually like to make the command-line utility use buzhash, but I'm trying to keep the code BSD 2-clause, so I can't just lift casync's buzhash code, and I haven't had time to write that part myself.
Makes sense, thanks for the clarification!
For completeness, I'm also copying below the "git concept" idea that I elaborated on in the "lost" email:
---8<-----
The git concept is basically just a generalization of the chunking idea we're talking about.
As long as your data semantically represents a tree, you can chunk up the content and get a structure like this:
tree
+-- tree
    +-- blob1
    +-- blob2
    +-- blob3
+-- tree
    +-- blob1
    +-- blob2
    +-- blob3
+-- ...
Now, to sync this structure locally, a simple recursive algorithm is used: look at the root tree to see which objects it needs (i.e. gather a list of hashes), download them, and repeat the same recursively until no incomplete trees are left. To avoid having too many files, blobs could be stored in one file (maybe per tree) and accessed via HTTP ranges, the same way as in zchunk.
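In rough C pseudocode (the types and helpers here are hypothetical placeholders, not any real API; it assumes that having an object locally implies its whole subtree is already complete):

#include <stddef.h>

typedef struct {
    char hash[65];   /* hex SHA-256 of the child object */
    int  is_tree;
} entry_t;

extern int    have_locally(const char *hash);
extern void   fetch_object(const char *hash);   /* e.g. HTTP range GET */
extern size_t tree_entries(const char *hash, entry_t *out, size_t max);

/* Recursively complete a tree: anything already present locally is an
 * unchanged subtree and is skipped entirely -- that skip is where the
 * bandwidth saving comes from. */
static void sync_tree(const char *hash)
{
    entry_t entries[256];
    size_t n, i;

    if (!have_locally(hash))
        fetch_object(hash);

    n = tree_entries(hash, entries, 256);
    for (i = 0; i < n; i++) {
        if (have_locally(entries[i].hash))
            continue;                     /* subtree/blob unchanged */
        if (entries[i].is_tree)
            sync_tree(entries[i].hash);   /* fetches it, then recurses */
        else
            fetch_object(entries[i].hash);
    }
}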
The point is, you only have to fetch those subtrees where some object along the path has changed. The effectiveness is then a (logarithmic) function of how deep and how well you chunk your content: with a fan-out of f and N blobs, a single changed blob costs roughly log_f(N) tree fetches plus the blob itself.
Applying this to our domain, we have:
repomd (tree)
+-- primary (tree)
    +-- srpm1 (tree)
        +-- rpm1 (blob)
        +-- rpm2 (blob)
    +-- srpm2 (tree)
    +-- srpm3 (tree)
+-- filelists (tree)
+-- ...
Doing a "checkout" of such a structure would result in the traditional metadata files we're using now. That's just for backward compatibility; we could, of course, have a different structure that's better suited for our use case.
As you can see, this is really just zchunk, only generalized (I'm not sure if compression plays a role here; I haven't considered it).
---8<-----
Regards,
Michal
On Thu, Mar 22, 2018 at 9:45 AM, Michal Domonkos mdomonko@redhat.com wrote:
<snip>
One thing I'm concerned about is handling appended metadata. For example, both Mageia and openSUSE append AppStream metadata to the repodata, using a combination of appstream-builder[1] (or appstream-generator[2]) and modifyrepo_c[3]. How does this scale to handle that?
[1]: https://www.mankier.com/1/appstream-builder
[2]: https://www.mankier.com/1/appstream-generator
[3]: https://www.mankier.com/8/modifyrepo_c