Better repodata performance

Alexandre Oliva aoliva at redhat.com
Sun Jan 30 23:04:57 UTC 2005


On Jan 30, 2005, seth vidal <skvidal at phy.duke.edu> wrote:

>> Definitely.  But couldn't we perhaps do it by intelligently filtering
>> information out of the rpm header and, say, generating a single
>> archive containing all of the info needed for depsolving and for
>> rpmlib's transaction verification?

> you can't do that b/c file conflicts CAN NOT be calculated via rpm w/o
> having the full header and/or all the file information present.

You surely don't need the package description and the changelog for
any of that; this was my point.

>> I was expecting depsolving wouldn't require all the headers.  And from
>> what I gather from your reply, it indeed doesn't.

> it requires all the headers of the packages involved, yes.

For solving dependencies (as opposed to testing the transaction)?!?

> yum 2.1.x ONLY DOWNLOADS THE XML FILES WHEN IT NEEDS THEM.

> go read the code and stop guessing.

Go read my e-mail.  This is all covered.  I'm not guessing.

> it downloads repomd.xml every time - that's < 1K.

Check.

> it downloads primary.xml.gz if the file has changed - that's typically <
> 1M.

Check.

> it downloads filelists.xml.gz only when there is a file dep that it
> cannot resolve with primary.xml.gz.

Check.

All covered in my e-mail.  *You* stop guessing.
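
For reference, the download policy quoted above can be sketched roughly
as follows.  This is only an illustration of the three rules as stated
in this thread, not yum's actual code; the function name, its
parameters, and the use of a checksum to detect a changed
primary.xml.gz are all assumptions made for the example.

```python
def files_to_fetch(cached_primary_checksum, remote_primary_checksum,
                   unresolved_file_deps):
    """Return the metadata files a client would download in one run.

    Hypothetical helper: the names and the checksum comparison are
    assumptions, not taken from yum itself.
    """
    # Rule 1: repomd.xml is downloaded every time (it is tiny).
    fetch = ["repomd.xml"]

    # Rule 2: primary.xml.gz is downloaded only if it has changed.
    if cached_primary_checksum != remote_primary_checksum:
        fetch.append("primary.xml.gz")

    # Rule 3: filelists.xml.gz is downloaded only when some file
    # dependency could not be resolved from primary.xml.gz alone.
    if unresolved_file_deps:
        fetch.append("filelists.xml.gz")

    return fetch
```

With an unchanged primary.xml.gz and no unresolved file dependencies,
only repomd.xml is fetched; any change to the repository brings in the
whole of primary.xml.gz again, which is the cost being argued about
here.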

>> I don't know how yum 2.0 did it, but up2date surely won't even try to
>> download a .hdr file if it already has it in /var/spool/up2date, so
>> this is not an issue.

> yum 2.0.x certainly DID NOT download a .hdr file it already had. Sheesh,
> go read the code, stop making suppositions based on anecdotes.

I'm not making suppositions.  Granted, I didn't read the code, only
observed behavior.  My analysis is still valid.

>> repodata helps the initial download, granted, but it loses terribly in
>> the long run.

> only as the number of file deps outside of /etc/* and *bin/* increases.

So you're saying the factor I put in to account for that is too small?
How much should it be to match reality?  Is repodata still a win?

> if you keep the file deps in those paths then repodata is a huge win.

I find that very hard to believe, since the downloads of
primary.xml.gz alone are enough to get above what yum 2.0 would
download.  Go read my text!

You know, text is supposed to be easier to read than code; that's why
we write comments.  So instead of fighting over the fact that I haven't
read your code, how about you pay just a little bit of attention to
text that actually matches *exactly* what's in both versions of your
code?  If you find some passage particularly difficult to understand,
I may try to explain it in other words (I'm not a native English
speaker, you know), but refraining from reading it just because you
*think* it doesn't match what your code does (even though it does
match it) is making a fool of yourself.

-- 
Alexandre Oliva             http://www.ic.unicamp.br/~oliva/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}



