On Sat, Jan 29, 2005 at 05:07:00PM -0500, seth vidal wrote:
> For N packages the balanced load is log_2 N bins. Adding M
> packages touches only log_2 M bins. And the bins have a max size
> of 2^i packages where i goes from 0 to log_2 N - 1. And the good news
> is you touch the bins with i < M, i.e. the small ones.
> The statistical net effect is that for M package additions to
> arbitrary N you get log_2 M downloads of a total of 2M packages.
> In relevant numbers:
> o N~=4000, log_2 N~=12
> You have 12 bins.
> o 10 security/bug fix updates, (statistically) only bins 0 to 4
> are changed amounting to 32 packages. Clients download only 5
> files worth of 32 packages in size.
> Compare with the current situation, where you need to get the
> whole lot of N packages for each update.
> For this to work you need to
let's be clear - for this to work YOU need to.
But far be it from me to halt the steady march of progress - when you get a
chance to implement this stuff let me know.
Hey Seth, relax. This is just a suggested concept for improving
things. Someone may pick it up, I'm not forcing it on YOU. ;)
Oh and once more - who is it gets the benefit from all this work?
It sounds like it's mostly repo maintainers - not the users.
Did you miss the "User downloads 5 files in size of 32 package
metadata _in total_ vs 4000"? I.e. the user will typically download
less than 1% of what he's downloading now. It benefits the user base
(and perhaps mirror admins) far more than the repo creator.
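The numbers can be checked with a few lines of Python. This is only a
back-of-the-envelope sketch: the function names are made up for
illustration, and it assumes that bin i holds at most 2^i package entries
and that M updates touch bins 0 up to ceil(log_2 M).

```python
def bin_count(n_packages):
    """Smallest number of bins of sizes 1, 2, 4, ... that can hold n_packages."""
    # bit_length() is the smallest k with n_packages <= 2**k - 1
    return n_packages.bit_length()


def touched_bins(m_updates):
    """Assumed number of bins rewritten by m_updates: bins 0 .. ceil(log_2 M)."""
    top = (m_updates - 1).bit_length()  # ceil(log_2 M) for M >= 1
    return top + 1


def touched_entries(m_updates):
    """Upper bound on package entries held in the touched bins."""
    # 1 + 2 + 4 + ... + 2**top = 2**(top + 1) - 1
    return 2 ** touched_bins(m_updates) - 1


if __name__ == "__main__":
    n, m = 4000, 10
    print("bins for N:", bin_count(n))
    print("bins touched by M updates:", touched_bins(m))
    print("entries re-downloaded:", touched_entries(m))
    print("fraction of full metadata:", touched_entries(m) / n)
```

Under these assumptions 10 updates against 4000 packages rewrite 5 of
the 12 bins, holding at most 31 entries (the "32" above, rounded), which
is well under 1% of the full metadata.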
> o introduce package cancelation (anti-packages ;)
Sorry, my slang is off, does this mean "no way", or "already in
development"? From the context of the rest I'd guess the first. ;)
> o introduce multiple repodata components
which buys us not all that much other than complexity of debugging.
It buys you all the nice things already outlined.
> o keep a manifest of the last state and feed the repo creation
> with the differences (packages lost, packages gained).
And how do you feed the repo creation system this data? Where do you get
it to begin with? The only way you know this information is if you
already have it
But you do, this is about incremental updates to a repository, right?
- the only way you have it is if you checked all the packages for
what has changed. Are you beginning to see the loop here?
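One way such a manifest could work, purely as a sketch: nothing here is
existing createrepo code, the function names are hypothetical, and it
assumes that file name, size and mtime are enough to detect a changed
package, so that only stat() is needed and no package has to be opened.

```python
import os


def scan(repo_dir):
    """Map each .rpm file name to (size, mtime) using stat() only."""
    manifest = {}
    with os.scandir(repo_dir) as entries:
        for entry in entries:
            if entry.is_file() and entry.name.endswith(".rpm"):
                st = entry.stat()
                manifest[entry.name] = (st.st_size, st.st_mtime)
    return manifest


def diff_manifests(old, new):
    """Return (lost, gained) package file names between two manifests.

    A file whose size or mtime changed is reported in both lists:
    the old entry is cancelled and the new one is added.
    """
    lost = sorted(n for n in old if n not in new or old[n] != new[n])
    gained = sorted(n for n in new if n not in old or old[n] != new[n])
    return lost, gained


if __name__ == "__main__":
    old = {"a-1.0.rpm": (100, 1.0), "b-1.0.rpm": (200, 2.0)}
    new = {"b-1.0.rpm": (200, 3.0), "c-1.0.rpm": (300, 3.0)}
    print(diff_manifests(old, new))
```

The previous manifest would be saved next to the repodata at the end of
each run; only the "gained" packages then need their headers read.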
If someone wants to combine createrepo and yum-arch into one program
so it makes both at the same time, that's fine - it's about an hour or two
worth of work,
That's a completely different topic.
what you're describing above is considerably more, not to mention
redesigning the depsolvers to deal with the new repository format.
It may even make it simpler, since you don't need to split it into more
important and less important data and have file dependencies computed
in two loops.
Axel.Thimm at ATrpms.net