Introduction
============

The original way that Flatpaks were updated was via ostree repositories. More recently, the ability was added to distribute them as container images, and that is what we do for the Fedora-built Flatpaks via registry.fedoraproject.org.
The one big gap we have between the two ways of distributing Flatpaks is delta updates. If you have downloaded a Flatpak from an ostree repository, you can update to the next version via the raw ostree protocol, downloading only the changed files. Even better, if "ostree static deltas" have been properly computed and stored in the upstream repository, you can get one big blob that uses bsdiff and other techniques to efficiently compress the differences.
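To make the file-level case concrete, here is a minimal sketch of the idea of fetching only what changed. It is not the ostree protocol itself: the path-to-checksum maps and the function name `files_to_fetch` are made up for illustration, and it only assumes that each file can be identified by a content checksum.

```python
def files_to_fetch(local, remote):
    """Return the paths that are new or whose checksum differs in `remote`."""
    return sorted(
        path for path, checksum in remote.items()
        if local.get(path) != checksum
    )


# Hypothetical path -> checksum maps for two versions of an application.
old_version = {"bin/app": "aa11", "share/icon.png": "bb22", "lib/libfoo.so": "cc33"}
new_version = {"bin/app": "dd44", "share/icon.png": "bb22", "lib/libbar.so": "ee55"}

# Only the changed binary and the new library need to be downloaded.
changed = files_to_fetch(old_version, new_version)
```

A static delta goes one step further than this: instead of whole changed files, it ships a compressed description of the differences.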
This spring, Alex Larsson came up with a way of updating containers with deltas *in general* and implemented it in podman (https://github.com/containers/image/pull/902, still pending) and in Flatpak (support released with Flatpak 1.8). See:
https://blogs.gnome.org/alexl/2020/05/13/putting-container-updates-on-a-diet...
This is inspired by ostree static deltas, but modified to fit into the container world - the end result is both super simple and works remarkably well.
The question then becomes: how do we generate deltas for the Flatpaks we ship for Fedora and make them available to users?
Generating static deltas for Flatpaks
=====================================
There are currently 3 different codebases I maintain to wrangle Flatpak metadata:
regindexer: script that queries registry.fedoraproject.org and writes an index of Flatpaks for use by Flatpak clients (deployed in Fedora infrastructure currently) https://pagure.io/regindexer
flatpak-status: daemon that queries bodhi and koji, figures out what Flatpaks are out-of-date, and generates a JSON file used to create a web user interface https://fedora.fishsoup.net/flatpak-status/ https://github.com/owtaylor/flatpak-status
flatpak-indexer: daemon that queries the Red Hat container API and an internal Koji instance, and writes an index of Flatpaks for use by Flatpak clients (e.g., https://flatpaks.redhat.io/rhel/index/static?label:org.flatpak.ref:exists=1&...)
My path forward here was: take the code from flatpak-status that queries bodhi and koji, use it to teach flatpak-indexer how to handle Fedora Flatpaks as well as RHEL flatpaks, then add the capability to generate deltas.
The result of this can be found at:
https://github.com/owtaylor/flatpak-indexer
It seems to work fine: it generates indexes and deltas for Fedora Flatpaks that work in limited testing. (The download for updating berusky from the last stable version in Fedora to the current one was reduced from 5.5MB to 18kB.)
Distributing static deltas for Flatpaks
=======================================

The eventual goal of our delta project is to upload the deltas to container registries, as described in Alex's blog post, and we've been in discussion with the Quay.io folks to figure out the best way to do this. But we didn't want to block static deltas on:

a) having OCI artifact support on quay.io
b) having a finalized way to do delta updates as OCI artifacts and a white-listed MIME type
c) getting Fedora switched to quay.io
d) having Red Hat built containers hosted on a container registry

So we added a second path to Flatpak: the image index that Flatpak consumes can point to a "delta manifest" as an HTTP URL, and that can point to the individual layer deltas, also by HTTP URL.
For now, flatpak-indexer doesn't upload the delta manifests or layer deltas - it just writes them into the static data along with the indexes and icons, and then points to them from the index.
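The chain of pointers described above (index entry, then delta manifest, then layer delta) can be sketched roughly as follows. This is only an illustration: the field names (`delta_manifest_url`, `layers`, `from`, `to`, `url`), the example.org URLs, and the function `find_layer_delta` are all hypothetical and are not the actual schema that Flatpak or flatpak-indexer use.

```python
# Hypothetical index entry; in practice this would come from the image index.
index_entry = {
    "image_digest": "sha256:new-image",
    "delta_manifest_url": "https://example.org/deltas/manifest-new-image.json",
}

# Hypothetical delta manifest, as it might look after being fetched
# from index_entry["delta_manifest_url"].
delta_manifest = {
    "layers": [
        {
            "from": "sha256:old-layer",
            "to": "sha256:new-layer",
            "url": "https://example.org/deltas/old-layer-to-new-layer.delta",
        },
    ],
}


def find_layer_delta(manifest, have_layer, want_layer):
    """Return the URL of a delta from have_layer to want_layer, or None."""
    for layer in manifest["layers"]:
        if layer["from"] == have_layer and layer["to"] == want_layer:
            return layer["url"]
    # No delta available: the client would download the full layer instead.
    return None


delta_url = find_layer_delta(delta_manifest, "sha256:old-layer", "sha256:new-layer")
```

Because everything is referenced by plain HTTP URL, the manifests and deltas can be served as ordinary static files.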
Architecture of flatpak-indexer
===============================

flatpak-indexer shares a property with the currently deployed regindexer: it generates only static content. The overall components are:
redis: used to cache data retrieved from "upstream" (koji and bodhi for Fedora), and to communicate between the indexer and differ containers

indexer: the main container - it periodically retrieves data from upstream sources, determines what layers need deltas, queues them up for the differ containers, collects the results, and writes the delta manifests and indexes

differ: containers that execute the expensive 'tar-diff' operation to compute the layer deltas. The number of containers can potentially be scaled based on the number of queued layer deltas

frontend: an Apache server that serves up the generated data - with appropriate redirects and headers
This is set up in OpenShift internally, and would presumably be done the same way in Fedora infrastructure, though we could potentially leave the "frontend" role to sundries, as is currently done with the regindexer-generated index and icons.
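The hand-off between the indexer and the differ containers can be sketched as a simple work queue. This is a minimal sketch under stated assumptions, not the flatpak-indexer code: an in-memory `deque` stands in for redis, and the names `InMemoryQueue`, `indexer_queue_deltas`, and `differ_run_once` are invented for the example. The `compute_delta` callback stands in for the tar-diff step.

```python
from collections import deque


class InMemoryQueue:
    """Stand-in for the redis data shared by the indexer and differs."""

    def __init__(self):
        self.pending = deque()  # (old_layer, new_layer) jobs waiting for a differ
        self.results = {}       # finished jobs: (old_layer, new_layer) -> delta name


def indexer_queue_deltas(queue, needed):
    """Indexer side: queue each needed pair that is neither done nor queued."""
    for pair in needed:
        if pair not in queue.results and pair not in queue.pending:
            queue.pending.append(pair)


def differ_run_once(queue, compute_delta):
    """Differ side: take one job, run the expensive diff, store the result."""
    if not queue.pending:
        return False
    pair = queue.pending.popleft()
    queue.results[pair] = compute_delta(*pair)
    return True


queue = InMemoryQueue()
indexer_queue_deltas(queue, [("layer-v1", "layer-v2"), ("layer-v2", "layer-v3")])

# A real differ would run tar-diff here; this placeholder just names the output.
while differ_run_once(queue, lambda old, new: f"{old}-to-{new}.delta"):
    pass
```

Keeping the expensive step behind a queue is what lets the number of differ containers vary independently of the indexer.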
Resource consumption
====================

The 'redis' and 'indexer' containers are lightweight - intermittent usage of 1 CPU, maybe 256MB of memory. The differ containers need to be more beefy - they could use 1-2 CPUs and 2-3GB of memory. But they are only needed intermittently and are otherwise idle. With some added complexity, they could be scaled to zero, and only scaled up when there are tar-diffs to process.
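As an illustration of the scale-to-zero idea, a scaling rule driven by queue length could look like the sketch below. The function name and the two tuning parameters (`deltas_per_differ`, `max_replicas`) are assumptions made up for this example; nothing like this is implemented in flatpak-indexer as described here.

```python
import math


def desired_differ_replicas(queued_deltas, deltas_per_differ=10, max_replicas=4):
    """Return how many differ containers to run for the current queue length."""
    if queued_deltas <= 0:
        # Nothing to diff: scale to zero so no memory is reserved while idle.
        return 0
    # One differ per batch of queued deltas, capped at max_replicas.
    return min(max_replicas, math.ceil(queued_deltas / deltas_per_differ))
```

With these example defaults, an empty queue gives zero differs, 25 queued deltas give three, and a very long queue is capped at four.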
The disk usage for index+icons+deltas for the current Fedora Flatpak set is 347MB - this will increase in proportion to the number of Flatpaks in Fedora, but I wouldn't expect it to ever be *much* bigger - deltas pointing to old versions of images will be cleaned up.
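One way to picture the cleanup is the sketch below, which assumes a delta is stale once the image it produces is no longer referenced by the index. That interpretation, the file names, and the function `stale_deltas` are assumptions for illustration only.

```python
def stale_deltas(deltas, current_images):
    """Return the delta files whose target image is no longer current.

    `deltas` maps a delta file name to the digest of the image it updates to.
    """
    return sorted(
        name for name, target in deltas.items()
        if target not in current_images
    )


# Hypothetical state: two images are still in the index, one has been replaced.
current_images = {"sha256:app-v3", "sha256:runtime-v7"}
deltas = {
    "app-v1-to-v2.delta": "sha256:app-v2",
    "app-v2-to-v3.delta": "sha256:app-v3",
    "runtime-v6-to-v7.delta": "sha256:runtime-v7",
}

to_remove = stale_deltas(deltas, current_images)
```

Since only deltas leading to current images are kept, disk usage tracks the number of Flatpaks rather than the number of past releases.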
Future work
===========

The current status is good enough to deploy as a delta solution, but there are some further possibilities:
* Teach flatpak-indexer how to upload deltas to a container registry as OCI artifacts
* Use flatpak-indexer to generate indexes for Fedora containers that are not Flatpaks, useful for
* Make flatpak-indexer the backend to the Flatpak status web page, move that inside Fedora infrastructure
infrastructure@lists.fedoraproject.org