Back Again
Sam Varshavchik
mrsam at courier-mta.com
Wed Aug 1 02:59:35 UTC 2007
David Boles writes:
> on 7/31/2007 7:03 PM, Sam Varshavchik wrote:
>> Todd Zullinger writes:
>>
>>> Sam Varshavchik wrote:
>>>> Nah, it's not closer. It's just that rpm is getting crappier every
>>>> year, and is long overdue for replacement.
>>> I could easily be mistaken, but AFAIK, the main difference in speed
>>> that end users notice between yum and apt is due to the fact that apt
>>> caches its metadata. In between runs of apt-get update, calls to
>>> apt-get use the data on disk without hitting the network. With yum,
>>> the update and upgrade steps from apt-get are both done in the update.
>>
>> I don't know if you've ever upgraded Fedora from one release to the next.
>> The upgrade process is as slow as molasses, even though all the metadata is
>> right there.
>
>
> Do you know just why an upgrade of a system 6 months old, or more, takes
> longer than a fresh install of a new release? You should study that
> situation. Start with package dependencies and then think about just what
> you might have changed and added from third party sites. Then think some more.
Well, I did think. The system does not have anything beyond Fedora and
Fedora Extras, plus my own RPMs. But why does it matter, anyway? Why does
the presence of a foreign RPM cause such a nervous breakdown? At most it
should result in an unsatisfied dependency. But why should this result
in rpm spinning its wheels to such an extent?
> Care for a really stupid example? Take a 2006 automobile. Examine it very
> closely. Then with a garage full of new 2007 parts make it a 2007
> automobile. All the time making sure that everything fits and still works.
>
> Email us when you're finished.
No matter which parts you do have in your automobile and where they came
from, when you have to compare its parts against a fixed list of two thousand
other parts, from a reference model, it should take the exact same amount of
time whether all your parts are OEM or aftermarket. It's the same number of
parts in your car, whether original or replacement, after all. So why would
it matter?
At most, the complexity of what RPM has to do would be O(N), and it should
really be O(log N). From unscientific observation, though, RPM's actual
complexity seems to be at least O(N^2).
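
To make the O(N) claim concrete, here is a minimal sketch in Python of
checking a list of required resources against a list of provided ones.
It is an illustration only, not how RPM or Anaconda actually works; the
function name and the resource strings are made up for the example, and
versioned dependencies are ignored entirely.

```python
# Hypothetical illustration; this is NOT RPM's actual algorithm.
def unsatisfied(required, provided):
    """Return the required resources that nothing provides, in input order."""
    provided_set = set(provided)  # one pass over the provided resources
    # Each membership test is O(1) on average, so the whole scan is
    # linear in len(required) + len(provided).
    return [res for res in required if res not in provided_set]


if __name__ == "__main__":
    provided = ["libc.so.6", "libz.so.1"]
    required = ["libc.so.6", "libfoo.so.2"]
    print(unsatisfied(required, provided))
```

Whether a provided resource came from a Fedora package or a foreign one
makes no difference to the amount of work in a check like this.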
I tell you this. I mentioned before that I use my own package management
tool internally to manage some homebrewed software. I have a compatibility
shim that sucks out pretty much the entire contents of the system RPM
database, and imports all of the dependencies into my internal package
database. This is to allow my own packages, which might have, say, a
dependency on something.so, have the dependencies satisfied by an RPM.
Basically, I read all RPM resources, and create a dummy package that
provides those resources, then install the dummy package, so my internal
package database contains all the RPM-provided resources. Each time I update
some RPMs, I rerun the import script and upgrade the old dummy RPM
compatibility package to a new one.
This operation, you understand of course, is analogous to your example --
taking an old snapshot of the entire RPM database, comparing it to a new
one, and reconciling any differences against resources required by my
internal packages, to make sure that they don't break. This operation is
also equivalent to what Anaconda has to do when it's about to upgrade the
Fedora distro -- take the current RPM database, and reconcile it with the
RPM database from the release you're updating to.
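
The snapshot comparison described above can be sketched roughly as
follows. This is a simplified Python illustration under stated
assumptions, not the actual import script (which is not shown in this
message) and not Anaconda's code: resources are plain strings, and
`reconcile` and its parameters are hypothetical names.

```python
# Hypothetical sketch: compare two snapshots of RPM-provided resources
# and report which internal packages would lose a dependency.
def reconcile(old_provides, new_provides, internal_requires):
    """Map each affected internal package to the resources it would lose.

    old_provides / new_provides: iterables of resource names taken from
    the old and new snapshots.
    internal_requires: dict of internal package name -> required names.
    """
    # Resources present in the old snapshot but gone from the new one.
    removed = set(old_provides) - set(new_provides)
    broken = {}
    for pkg, needs in internal_requires.items():
        lost = sorted(removed.intersection(needs))
        if lost:
            broken[pkg] = lost
    return broken


if __name__ == "__main__":
    old = ["liba.so.1", "libb.so.1"]
    new = ["liba.so.1", "libc.so.1"]
    mine = {"tool-x": ["liba.so.1"], "tool-y": ["liba.so.1", "libb.so.1"]}
    print(reconcile(old, new, mine))
```

The work here is one set difference plus one pass over the internal
packages' requirements.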
It takes me, oh, maybe a minute or so to crunch everything together. The
analogous step in Anaconda -- "Preparing transaction" -- takes about 5-10
minutes.
And I actually have more work to do. RPM has, I believe, three resource
classes to reconcile against each other -- provided resources, required
resources, and conflicting resources. My internal package database has six
resource classes to reconcile.
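
As an illustration of reconciling the three classes named above
(provides, requires, conflicts), here is a minimal Python sketch. The
`Package` tuple and the `check` function are hypothetical, versions are
ignored, and it makes no claim to match what RPM really does or tracks.

```python
from collections import namedtuple

# Hypothetical record: each field after the name is a list of resource names.
Package = namedtuple("Package", "name provides requires conflicts")


def check(packages):
    """Return (missing, clashes) for a proposed set of packages.

    missing: (package, resource) pairs where nothing provides the resource.
    clashes: (package, other, resource) triples where `other` provides a
             resource that `package` declares a conflict with.
    """
    # Index: resource -> names of the packages providing it.
    providers = {}
    for pkg in packages:
        for res in pkg.provides:
            providers.setdefault(res, set()).add(pkg.name)

    missing, clashes = [], []
    for pkg in packages:
        for res in pkg.requires:
            if res not in providers:
                missing.append((pkg.name, res))
        for res in pkg.conflicts:
            # A package does not conflict with itself.
            for other in sorted(providers.get(res, set()) - {pkg.name}):
                clashes.append((pkg.name, other, res))
    return missing, clashes


if __name__ == "__main__":
    a = Package("a", ["liba"], ["libb"], [])
    b = Package("b", ["libb"], [], ["liba"])
    print(check([a, b]))
```

Building the index is one pass over all provided resources, and the
checks are one pass over all required and conflicting ones.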
The performance degradation that I see in Anaconda is far more pronounced on
less-robust hardware. On my less-than-one-year-old laptop, with a fairly
speedy Pentium and 2 gigs of RAM, Anaconda is about 2-3 times slower than
my homegrown code. On an old box that I have, running a pair of decade-old
(approx) 500 MHz Celerons, with 256MB RAM, rpm is dreadfully slow -- about
10-15 times slower than my homegrown code. There's something terribly
inefficient in the way that Anaconda goes about its business. It should
/not/ take that long to do its duty.
Some of it might be due to Anaconda being Python code, and my homegrown code
being C++.