F21 downloads repository metadata in 3 places!

Hedayat Vatankhah hedayat.fwd at gmail.com
Mon Dec 15 13:09:03 UTC 2014

Richard Hughes <hughsient at gmail.com> wrote on Mon, 15 Dec 2014
09:37:27 +0000:
> On 13 December 2014 at 21:10, Hedayat Vatankhah <hedayat.fwd at gmail.com> wrote:
>> Surprisingly, PackageKit uses its own separate cache.
> Not surprising at all, when you're familiar with how PackageKit works.
> PackageKit has to accept transactions from clients and return results
> very quickly. Just something as simple as SHA'ing a metadata file
> destroys our latency, which is one of the biggest reasons nobody liked
> the command-not-found functionality when it was introduced: it was
> SLOW. This interactive command had to return results in ~100ms, not
> tens of seconds.
> By having 100% complete control of a copy of the cache we can keep
> certain files locked in memory, and we can be aggressive about caching
> pools of packages. This allows us to achieve the low-latency design
> required by gnome-software, which is firing off tons of transactions
> in parallel at startup with expected latency guarantees. Another thing
> it allows us to do is atomically update the cache, so if we're
> updating the cache in the background and we get interrupted or the
> transaction is cancelled to make room for a user-requested
> "interactive" transaction, we can just continue to use the old cache,
> and then atomically rename the new location to the proper location and
> update pools when done. You just can't do this when there are three
> things fiddling with files behind your back without any co-ordination.
> <...>
> Note, if yum or DNF wanted to use the PK cache, it's guaranteed to be
> valid, complete and up to date, although I'm not sure a dependency
> from the package manager CLI to PK would be acceptable for their
> maintainers.
> Richard.
What I think about this (I'm looking at the distribution level, rather 
than specific packages):
1. If PK really needs its own *copy* of the cache, that is acceptable 
(though not ideal), but IMHO it should not also download the metadata 
independently. I think it should just copy the DNF (librepo) cache if 
that cache is considered valid and up to date, or else ask DNF to bring 
its cache up to date first, and then copy the result atomically into its 
own cache (preferably using hardlinks where possible).

2. I believe that the user should know, and more importantly be able to 
control, WHEN the repo data is being updated. At the very least, they 
should be able to specify whether the updates are automatic, using a 
very user-friendly method (probably during/after the installation, or 
per network connection).

3. I think the repository data management backend should be separate 
from the frontends (including PK and the dnf CLI). Also, I like the idea 
of having a working cache even while new repodata is being downloaded, 
and I think it is something that DNF/Yum/... should also do. Many times 
I have ended up with a half-updated repo cache that prevented me from 
using Yum, because I didn't want to (or couldn't) let it download the 
whole repodata. This should probably be filed as a feature request 
against DNF.
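
To make the "copy with hardlinks, then switch atomically" idea from points 
1 and 3 concrete, here is a rough Python sketch. It is only an 
illustration under my own assumptions: the function names, the 
"snapshot-*" directories and the "current" symlink are hypothetical, and 
this is not how PackageKit, DNF or librepo actually lay out their caches. 
It also assumes a POSIX filesystem, where renaming a symlink over another 
one is atomic.

```python
import os
import shutil
import tempfile


def link_or_copy(src_file, dst_file):
    """Hard-link src_file to dst_file; fall back to a real copy if linking fails
    (for example when the two caches live on different filesystems)."""
    try:
        os.link(src_file, dst_file)
    except OSError:
        shutil.copy2(src_file, dst_file)


def publish_snapshot(source_dir, cache_root):
    """Snapshot source_dir under cache_root, then atomically repoint the
    (hypothetical) 'current' symlink at the new snapshot."""
    os.makedirs(cache_root, exist_ok=True)
    # Build the snapshot in a private directory, so readers of 'current'
    # never observe a half-populated cache.
    snapshot = tempfile.mkdtemp(prefix="snapshot-", dir=cache_root)
    for dirpath, _dirnames, filenames in os.walk(source_dir):
        rel = os.path.relpath(dirpath, source_dir)
        target_dir = os.path.normpath(os.path.join(snapshot, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            link_or_copy(os.path.join(dirpath, name),
                         os.path.join(target_dir, name))

    current = os.path.join(cache_root, "current")
    old = os.path.realpath(current) if os.path.islink(current) else None

    # Create a temporary symlink and rename it over 'current'.
    # The rename is the single atomic step that switches readers over.
    tmp_link = os.path.join(cache_root, "current.tmp")
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(os.path.basename(snapshot), tmp_link)
    os.replace(tmp_link, current)

    # Simplification: drop the previous snapshot right away. A real daemon
    # would wait until no running transaction still uses it.
    if old and os.path.isdir(old):
        shutil.rmtree(old)
    return snapshot
```

If the download or the copy is interrupted, "current" still points at the 
old, complete snapshot, which is the property I am asking for above. One 
caveat of hardlinks: the source cache has to replace files rather than 
rewrite them in place, otherwise the snapshot changes along with it.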

