[releng] Issue #6999: A static index for registry.fedoraproject.org
by Owen Taylor
otaylor added a new comment to an issue you are following:
``
I actually got pretty far with this in the fall - there's a fairly complete solution at:
https://github.com/owtaylor/metastore/
I started off looking at 'reg' or writing a similar static approach by hand, but I wasn't convinced that it would scale even to the anticipated needs of registry.fedoraproject.org - the basic problem is that you need a *lot* of HTTP requests to get all the data you need:
A) List all the repositories
B) List all the tags in each repository
C) Get the manifest for each tag
D) Get the config for each manifest to find out its labels
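As a rough sketch of that walk, the function below follows the Docker Registry HTTP API v2 paths (`/v2/_catalog`, `/v2/<name>/tags/list`, `/v2/<name>/manifests/<tag>`, `/v2/<name>/blobs/<digest>`). The `fetch_json` callable is a hypothetical stand-in for "do one HTTP GET and decode the JSON body"; pagination, authentication, and multi-arch manifest lists are ignored here.

```python
from typing import Callable, Dict


def scan_registry(fetch_json: Callable[[str], dict]) -> Dict[str, dict]:
    """Walk a registry and return {"repo:tag": labels}.

    fetch_json is a hypothetical helper: it takes a URL path, performs one
    HTTP GET against the registry, and returns the decoded JSON body.
    """
    labels: Dict[str, dict] = {}
    # A) list all the repositories
    repos = fetch_json("/v2/_catalog")["repositories"]
    for repo in repos:
        # B) list all the tags in the repository (may be null when empty)
        tags = fetch_json(f"/v2/{repo}/tags/list")["tags"] or []
        for tag in tags:
            # C) get the manifest for the tag
            manifest = fetch_json(f"/v2/{repo}/manifests/{tag}")
            # D) get the config blob named by the manifest to read its labels
            digest = manifest["config"]["digest"]
            config = fetch_json(f"/v2/{repo}/blobs/{digest}")
            labels[f"{repo}:{tag}"] = config.get("config", {}).get("Labels") or {}
    return labels
```

Every tag costs two requests (manifest plus config) on top of one tag-list request per repository, which is where the request count comes from.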
I *want* to get to the point where we have ~1000 flatpaks in the Fedora registry, in addition to all the server-side containers - multiply that out by different version tags, etc, and we're talking 10k+ requests at a minimum to do a complete scan. That's going to put a lot of load on the registry for multiple minutes, by my estimation. (A complete scan of the *current* registry contents to this level of detail takes several minutes, though it would likely be faster collocated.)
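A back-of-the-envelope version of that count, as a sketch: one catalog request, one tag-list request per repository, and a manifest plus a config request per tag. The figure of 5 tags per repository is purely an assumption for illustration, and the server-side containers are left out.

```python
def estimate_requests(repositories: int, tags_per_repo: int) -> int:
    """Requests for one full scan: catalog + tag lists + manifests + configs."""
    catalog = 1                                # single catalog page assumed
    tag_lists = repositories                   # one tags/list call per repository
    manifests = repositories * tags_per_repo   # one manifest per tag
    configs = repositories * tags_per_repo     # one config blob per manifest
    return catalog + tag_lists + manifests + configs


# ~1000 flatpaks, with an assumed 5 tags each
print(estimate_requests(1000, 5))  # 11001
```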
So how often could we run such a scan? Every 30 minutes? Every hour? For the candidate registry, we'd like new builds to be visible not in an hour, but ideally within seconds, so that after a build a maintainer can immediately test whether it works.
The other problem is that you end up either hardcoding the metadata and containers you need into the server config, or generating gigantic output files with ALL the metadata for ALL the containers.
The basic principles of metastore are:
* Harvest metadata out of the registry via the http API
* Update it incrementally triggered by webhooks
* Store it in a database
* Allow arbitrary queries, but pay a lot of attention to being able to cache them with a frontend cache
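To make the incremental idea concrete, here is a minimal sketch and not metastore's actual code. It assumes a notification envelope modeled on the Docker Distribution registry notifications (`events[].action`, `events[].target.repository`, `.tag`, `.digest`); treat those field names as assumptions. The `fetch_labels` callable is a hypothetical helper that reads the labels for one manifest over the HTTP API, and SQLite stands in for whatever database is really used.

```python
import json
import sqlite3
from typing import Callable, Dict


def open_store() -> sqlite3.Connection:
    """Create an in-memory store with one row per (repository, tag)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE images ("
        " repository TEXT, tag TEXT, digest TEXT, labels TEXT,"
        " PRIMARY KEY (repository, tag))"
    )
    return db


def handle_notification(
    db: sqlite3.Connection,
    envelope: dict,
    fetch_labels: Callable[[str, str], Dict[str, str]],
) -> int:
    """Update only the tags named in push events; return how many were updated."""
    updated = 0
    for event in envelope.get("events", []):
        target = event.get("target", {})
        # Skip pulls and pushes of untagged blobs: only tagged pushes change the index.
        if event.get("action") != "push" or not target.get("tag"):
            continue
        # fetch_labels is hypothetical: it costs a couple of requests per event,
        # instead of a full rescan of the registry.
        labels = fetch_labels(target["repository"], target["digest"])
        db.execute(
            "INSERT OR REPLACE INTO images VALUES (?, ?, ?, ?)",
            (
                target["repository"],
                target["tag"],
                target["digest"],
                json.dumps(labels, sort_keys=True),
            ),
        )
        updated += 1
    db.commit()
    return updated
```

With this shape, a new build shows up in the database as soon as its push notification is processed, and queries run against the database rather than against the registry.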
I've been holding off on pushing to deploy this to try and get some feedback internally/externally first, but haven't succeeded so far - partly because of lack of bandwidth to bug people.
In terms of appstream, I think we just put it into an OCI annotation at build time - though digging inside the container to extract it when indexing is theoretically possible.
``
To reply, visit the link below or just reply to this email
https://pagure.io/releng/issue/6999