On Wed, Apr 8, 2020 at 4:27 PM Jeremy Cline <jeremy(a)jcline.org> wrote:
Hi folks,
The Fedora kernel is moving to maintaining the package in a source
(sometimes people refer to it as an "exploded") tree. Basically just a
fork of upstream. This makes a lot of packager tasks easier, but has
introduced a minor issue with respect to the lookaside cache.
Right now, it's configured to create a tarball from the git tree and
upload it to the lookaside cache for each build. We build the rawhide
kernel every weekday (give or take) and the xz-compressed source
tarball is ~110MB. This works out to about 28GB per year for Rawhide
alone (if this is a drop in the bucket and no one cares please let me
know and we'll just do this). The old approach uploaded a release
tarball and then incremental tarballs on top of that.
If, however, Fedora allowed packagers to optionally generate tarballs
from a git repository we could just push the linux git repository. The
entire repository with history going back 15 years is under 4GB total,
which is pretty good when compared to ~419GB which is the space
required for the equivalent time using the lookaside cache.
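
For a rough sense of where those figures come from, here is a
back-of-the-envelope sketch. The build count of 260 per year (5 weekdays
× 52 weeks) is an assumption, not a measured value; "give or take" implies
the real count is a bit lower, which is why the results land slightly
above ~28GB and ~419GB.

```python
# Back-of-the-envelope estimate of lookaside cache growth.
TARBALL_MB = 110           # xz-compressed source tarball size from the post
BUILDS_PER_YEAR = 5 * 52   # assumption: one build per weekday, no gaps
REPO_GB = 4                # upper bound for the full linux git repository


def lookaside_gb(years: int) -> float:
    """Space used by uploading one full tarball per build, in decimal GB."""
    return TARBALL_MB * BUILDS_PER_YEAR * years / 1000


one_year = lookaside_gb(1)
fifteen_years = lookaside_gb(15)
print(f"1 year:   {one_year:.1f} GB")
print(f"15 years: {fifteen_years:.1f} GB")
print(f"ratio to a {REPO_GB} GB git repository: {fifteen_years / REPO_GB:.0f}x")
```

Either way, the lookaside approach ends up around two orders of magnitude
larger than the git repository over the same period.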
What would need to change:
* Fedora offers a git repository to push source trees to.
* A packager who wishes to could add a new file called "source-repos"
to the dist-git repository. It contains a git URL and a commit
identifier. For example, an entry might look like:
"https://src.fedoraproject.org/sources/kernel.git v5.6"
where v5.6 is a tag in the repository. We can restrict it so the git
repository must be hosted by Fedora so we keep all the sources
forever.
* fedpkg and fedpkg-minimal would need to be updated to pull the
source tree if the "source-repos" file is found and run
"git archive". Fortunately this work is actually already done since
Red Hat's version of fedpkg already supports this.
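
The steps above can be sketched roughly as follows. This is a minimal
illustration, not the code from fedpkg or Red Hat's tooling: the function
names, the "#" comment handling, and the "<name>-<ref>/" archive prefix
are all hypothetical, and only the "<url> <ref>" line format comes from
the example entry.

```python
import subprocess
from typing import List, NamedTuple


class SourceRepo(NamedTuple):
    url: str
    ref: str


def parse_source_repos(text: str) -> List[SourceRepo]:
    """Parse "<git url> <commit or tag>" lines from a source-repos file."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # assumption: allow comments
            continue
        parts = line.split()
        if len(parts) != 2:
            raise ValueError(f"expected '<url> <ref>', got: {line!r}")
        entries.append(SourceRepo(*parts))
    return entries


def archive_commands(entry: SourceRepo, clone_dir: str, tarball: str) -> List[List[str]]:
    """Build the git commands that fetch the tree and archive one ref."""
    name = entry.url.rstrip("/").rsplit("/", 1)[-1]
    if name.endswith(".git"):
        name = name[: -len(".git")]
    prefix = f"{name}-{entry.ref}/"  # hypothetical naming scheme
    return [
        ["git", "clone", "--bare", entry.url, clone_dir],
        ["git", "-C", clone_dir, "archive", "--format=tar.gz",
         f"--prefix={prefix}", "-o", tarball, entry.ref],
    ]


def fetch_sources(entry: SourceRepo, clone_dir: str, tarball: str) -> None:
    """Run the commands; raises CalledProcessError if git fails."""
    for cmd in archive_commands(entry, clone_dir, tarball):
        subprocess.run(cmd, check=True)
```

Cloning first and archiving locally avoids relying on the server
supporting remote archive requests.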
I'm happy to do all the work for fedpkg/fedpkg-minimal to make this
possible because the other option is to add a bunch of hacks to the
kernel tooling to spit out a bunch of incremental tarballs to reduce
what we have to upload.
I assume this is something that will need to go through the packaging
SIG, but from an infra side of things are there any thoughts/concerns?
At least with this _specific_ proposal, I don't see too many issues.
Adding a "sources" namespace to Pagure and setting up a workflow for
that isn't a horrible idea.
I still feel like my general concerns in the original proposal from two
years ago[1] haven't been sufficiently addressed. But, given that you
seem to have a specific idea in mind here, these are my questions for
the kernel (and others that would opt into this workflow):
* Are you okay with imposing the same restrictions we have on rpms/*,
modules/*, flatpaks/*, and containers/* for sources/*? That is, no
rewriting history, no branch deletion, no tag deletion, etc.
* Are you okay with blocking the usage of submodules, Git LFS,
Git-Annex, or any other mechanism that allows bypassing our
protections or cannot be replicated from an upstream repo locally?
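
To make the first set of restrictions concrete, a server-side check
could look roughly like the sketch below. It assumes git's pre-receive
hook input (one "<old> <new> <ref>" triple per updated ref, with an
all-zero id marking creation or deletion). The function name and the
injected is_ancestor callback are hypothetical, and this is not a
description of how dist-git enforces these rules today.

```python
from typing import Callable, Optional


def is_null_id(object_id: str) -> bool:
    """True for the all-zero id git uses for a missing ref."""
    return set(object_id) == {"0"}


def rejection_reason(
    old: str,
    new: str,
    ref: str,
    is_ancestor: Callable[[str, str], bool],
) -> Optional[str]:
    """Return why a ref update should be refused, or None to accept it."""
    if is_null_id(new):
        return f"deleting {ref} is not allowed"
    if is_null_id(old):
        return None  # brand-new branch or tag
    if ref.startswith("refs/tags/"):
        return f"moving existing tag {ref} is not allowed"
    if not is_ancestor(old, new):
        return f"non-fast-forward update of {ref} rewrites history"
    return None
```

In a real hook, is_ancestor could wrap "git merge-base --is-ancestor",
which exits with status 0 when the first commit is an ancestor of the
second.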
[1]: https://pagure.io/releng/issue/7498
--
真実はいつも一つ!/ Always, there's only one truth!