F21 downloads repository metadata in 3 places!
hedayat.fwd at gmail.com
Mon Dec 15 13:09:03 UTC 2014
Richard Hughes <hughsient at gmail.com> wrote on Mon, 15 Dec 2014:
> On 13 December 2014 at 21:10, Hedayat Vatankhah <hedayat.fwd at gmail.com> wrote:
>> Surprisingly, PackageKit uses its own separate cache.
> Not surprising at all, when you're familiar with how PackageKit works.
> PackageKit has to accept transactions from clients and return results
> very quickly. Just something as simple as SHA'ing a metadata file
> destroys our latency, which is one of the biggest reasons nobody liked
> the command-not-found functionality when it was introduced: it was
> SLOW. This interactive command had to return results in ~100ms, not
> tens of seconds.
> By having 100% complete control of a copy of the cache we can keep
> certain files locked in memory, and we can be aggressive about caching
> pools of packages. This allows us to achieve the low-latency design
> required by gnome-software, which is firing off tons of transactions
> in parallel at startup with expected latency guarantees. Another thing
> it allows us to do is atomically update the cache, so if we're
> updating the cache in the background and we get interrupted or the
> transaction is cancelled to make room for a user-requested
> "interactive" transaction, we can just continue to use the old cache,
> and then atomically rename the new location to the proper location and
> update pools when done. You just can't do this when there are three
> things fiddling with files behind your back without any co-ordination.
> Note, if yum or DNF wanted to use the PK cache, it's guaranteed to be
> valid, complete and up to date, although I'm not sure a dependency
> from the package manager CLI to PK would be acceptable for their
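
The atomic-update scheme described above can be sketched roughly as follows (a minimal illustration, not PackageKit's actual code; the function and path names are mine): build the new cache off to the side, then publish it with a single rename so readers never observe a half-written state.

```python
import os

def publish_cache(build_dir, live_link):
    """Atomically point a 'current cache' symlink at a freshly built
    cache directory. Readers that already resolved the old path keep
    using the old files; new readers see the new cache. Illustrative
    sketch only."""
    tmp = live_link + ".new"
    if os.path.lexists(tmp):
        os.unlink(tmp)
    os.symlink(build_dir, tmp)
    # rename(2) of one symlink over another is atomic on POSIX
    os.replace(tmp, live_link)
```

If the background refresh is cancelled, nothing was ever renamed, so the old cache stays fully usable.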
What I think about this (I'm looking at the distribution level, rather
than specific packages):
1. If PK really needs its own *copy* of the cache, that's OK (well, not
OK, but acceptable); however, IMHO it should not download it
independently too. It should just copy the DNF (librepo) cache if that
is considered valid and up to date, or ask DNF to bring its cache up to
date and then copy the cache atomically into its own (preferably using
hardlinks where possible).
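
The hardlink-plus-atomic-publish idea in point 1 could look roughly like this (a hypothetical sketch; `clone_cache` and the paths are made up for illustration, and it assumes source and destination are on the same filesystem):

```python
import os
import shutil
import tempfile

def clone_cache(src, dst):
    """Clone a repodata cache directory by hard-linking its files,
    then publish the clone with one atomic rename. Hypothetical
    sketch; assumes dst does not exist yet."""
    parent = os.path.dirname(os.path.abspath(dst))
    staging = tempfile.mkdtemp(dir=parent)  # same fs, so rename is atomic
    work = os.path.join(staging, "cache")
    # copy_function=os.link creates hard links instead of copying data
    shutil.copytree(src, work, copy_function=os.link)
    os.rename(work, dst)                    # atomic publish
    os.rmdir(staging)
```

Because the files are hard links, the second "copy" costs almost no disk space or I/O.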
2. I believe that the user should know, and more importantly be able to
control, WHEN the repo data is being updated. At the very least, they
should be able to specify whether updates are automatic, using a very
user-friendly method (probably during/after the installation, or per
network connection).
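
DNF already exposes part of this control through its configuration; for example, metadata_expire in dnf.conf sets how long cached repodata is considered fresh (the values below are examples, not recommendations):

```ini
# /etc/dnf/dnf.conf -- example values only
[main]
# how long cached metadata is considered fresh before a refresh
metadata_expire=48h
# or: metadata_expire=never  (refresh only on explicit request,
#     e.g. "dnf makecache" or "dnf --refresh ...")
```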
3. I think the repository data management backend should be separate
from the frontends (including PK and the dnf CLI). Also, I like the idea
of having a working cache even while new repodata is being downloaded,
and I think it is something that DNF/Yum/... should also do. There were
many times that I ended up with a half-updated repo cache which
prevented me from using Yum, because I didn't want to (or couldn't) let
it download the whole repodata. Probably this should be filed as a
feature request against DNF.