On Sat, Nov 20, 2021 at 03:04:18PM -0500, Matthew Miller wrote:
I mentioned on devel list that Sourcegraph is going to be indexing
source.fedoraproject.org.
You mean src.fedoraproject.org? :)
So I guess, first — heads-up! if you see their traffic, it's not
malicious.
(I asked them to provide us with the user agent they use, and I'll pass that
on.)
But then, I'd actually like to go a step further, and have them index _the
actual source for every build_. They're open to doing that, but what they
need is a git repo.
ok. I assume they want us to give them one, they don't/can't/aren't able
to do any work on their end?
And you want the source of the actual build with all patches and
changes? Or upstream and the patches and changes separately?
So...... how hard / ridiculous / bad would it be to have a step in the build
process in koji which, between the %prep and %build phases, pushes the
unpacked-and-prepped source tree to Gitlab, under, say,
https://gitlab.com/fedora/exploded-build-sources/package-name/?
It would be very bad if gitlab was down/broken.
AFAIK we don't have any SLA there, it's just a namespace on
gitlab.com.
If this process does admin things, that would potentially mean exposing
that access to all packagers.
Also unpacked sources could be... very large.
I'm imagining this would work something like this:
1. If remote repo doesn't exist, create it with the gitlab api
2. Do a shallow, no-checkout clone of that remote repo, using
--git-dir and --work-tree so that the .git directory isn't created inside
the working directory
3. copy the unpacked source tree from %prep into the work-tree dir
(are we building on btrfs? cp -l if not, to save io?)
4. git add --all
5. git commit -m "Build ID: ${buildID}
https://koji.fedoraproject.org/koji/buildinfo?buildID=${buildID}"
6. git push
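
Steps 2 through 6 above could be sketched roughly as follows. This is only a dry-run planner that builds the command lines without running them, not anything koji does today. The function name, the paths and the remote URL are hypothetical, and it swaps the "no-checkout clone" for a bare shallow clone, which is one way to keep .git outside the working directory. Step 1 (creating the repo through the gitlab api) is left out.

```python
import shlex


def plan_push_commands(git_dir, work_tree, prep_dir, remote_url, build_id):
    """Return argv lists for steps 2-6; nothing is executed here."""
    # Shared prefix so the repository data lives outside the work tree.
    git = ["git", f"--git-dir={git_dir}", f"--work-tree={work_tree}"]
    message = (
        f"Build ID: {build_id}\n"
        f"https://koji.fedoraproject.org/koji/buildinfo?buildID={build_id}"
    )
    return [
        # Step 2: shallow bare clone (assumption: a bare clone stands in
        # for the no-checkout clone, since it checks nothing out).
        ["git", "clone", "--bare", "--depth=1", remote_url, git_dir],
        # Step 3: make sure the work tree exists, then copy the %prep output.
        ["mkdir", "-p", work_tree],
        ["cp", "-a", f"{prep_dir}/.", work_tree],
        # Step 4: stage everything found in the work tree.
        git + ["add", "--all"],
        # Step 5: commit with the build ID and the koji link as the message.
        git + ["commit", "-m", message],
        # Step 6: push the current branch back to the remote.
        git + ["push", "origin", "HEAD"],
    ]


if __name__ == "__main__":
    # All values below are placeholders for illustration only.
    for argv in plan_push_commands(
        "/tmp/exploded/pkg.git",
        "/tmp/exploded/pkg-tree",
        "/builddir/build/BUILD/pkg-1.0",
        "https://gitlab.example/fedora/exploded-build-sources/pkg.git",
        12345,
    ):
        print(shlex.join(argv))
```

Printing the plan first makes it easy to see how many git invocations each build would gain before wiring anything into a real builder.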
This would add... a lot of overhead to builds. Especially builds with
lots of files. I don't think, say, kernel maintainers want another few
hours added...
Additionally, this would mean only recently built packages would be
searchable, at least until the next mass rebuild...
With some details to be worked out. :) Like, repo tags as branches,
maybe?
And, do this on all arches, or just one of them? Or maybe run up through
%prep as part of the src build rather than any of the binary builds? Make
the commit as the Fedora username of the person who did the build?
But those kinds of implementation details aside -- does this seem like
something we might be able to do?
Or, any ideas for an alternate approach? I mean, obviously, source-git would
be one such alternative, but getting that for _everything_ would require a
big rework of how everyone works. (I think we should get to that eventually,
but that's a long way off.)
There's a SIG that has been working on this very issue. I would vastly
prefer we do something that everyone buys into for their normal workflow.
https://fedoraproject.org/wiki/SIGs/Source-git
Perhaps we should ask for an update from them?
It may also be well worth looking at the debugsource packages, if they
can unpack/search those. Or perhaps we could use those to populate the
git repos.
Anyhow, we can try and figure something out here... it would be great to
have things searchable.
kevin