On Wed, May 6, 2020 at 10:37 AM Vít Ondruch vondruch@redhat.com wrote:
On 05. 05. 20 at 18:37, Fabio Valentini wrote:
On Mon, May 4, 2020 at 5:06 PM Tomas Tomecek ttomecek@redhat.com wrote:
Hi Tomas,
I'll respond below with some of my experiences and opinions ...
Let’s talk about dist-git, as a place where we work. For us, packagers, it’s a well-known place. Yet for newcomers, it may take a while to learn all the details. Even though we operate with projects in a dist-git repository, the layout doesn’t resemble the respective upstream project.
There is a multitude of tasks we tend to perform in a dist-git repo:
- Bumping a release field for the sake of a rebuild.
- Updating to the latest upstream release.
- Resolving CVEs.
- Fixing bugs by…
  - Changing a spec file.
  - Pulling a commit from upstream.
  - Or even backporting a commit.
- And more...
For some tasks, the workflow is just fine and pretty straightforward. But for others, it’s gruesome: the moment you need to touch patch files, the horror comes in. The fact that we operate with patch files, in a git repository, is just mind-boggling to me.
Luckily, we have tooling which supports the repository layout - `fedpkg prep`, `srpm` or `mockbuild` are such handy commands - you can easily inspect the source tree or make sure your local change builds.
Where am I going with this?
Over the years, multiple tools have been created to improve the development experience: rdopkg [r], rpkg-util [ru], tito [t] and probably many more (e.g. the way Fedora kernel developers work on the kernel [k]).
(snip)
In the packit project, we work in source-git repositories. These are pretty much upstream repositories combined with Fedora downstream packaging files. An example: I recently added a project called nyancat [n] to Fedora. I have worked [w] on packaging the project in the GitHub repo and then just pushed the changes to dist-git using packit tooling. These source-git repositories can live anywhere: we have support for GitHub right now and are working on supporting pagure.
Would there be an interest within the community, as opt-in, to have such source-git repositories created for respective dist-git repositories? The idea is that you would work in the source-git repo and then let packit handle synchronization with a respective dist-git repo. Our aim is to provide the contribution experience you have in GitHub when working on your packages. Dist-git would still be the authoritative source and a place where official builds are done - the source-git repo would work as a way to collaborate. We also don’t have plans right now to integrate packit into fedpkg.
So, in my experience, source-git might be a workable solution for packages with *big* downstream modifications. However, for everything else, I think it's a way to make it easy to accrue technical debt and to do cargo-culting with downstream patches.
The vast majority of packages has *no patches* (or at most, one or two of them)
(snip)
I don't really want to argue with this point, I tend to agree. Just out of interest, do we have some statistics to support this? O:-)
I did not have any stats when I wrote this, but now I do. Parsing the rawhide spec files from [0] for lines matching "^Patch[0-9]*:[ \t]*.*$", I get the following distribution:
number of patches: number of packages
total: 21970
0: 15638
1: 3287
2: 1232
3: 598
4: 325
5: 221
6: 154
7: 97
8: 83
9: 57
10: 41
11: 27
12: 26
13: 25
14: 13
15: 13
16: 14
17: 15
18: 5
19: 8
20: 2
21: 11
22: 2
23: 4
24: 4
25: 5
26: 3
27: 4
28: 5
29: 5
30: 2
31: 6
32: 4
33: 3
34: 1
35: 4
37: 2
38: 1
41: 1
42: 2
46: 1
47: 1
48: 3
49: 1
50: 2
51: 1
53: 1
54: 1
66: 1
68: 1
71: 1
75: 1
78: 1
79: 1
85: 1
127: 1
170: 1
In relative terms:
- 71% of packages have ZERO patches
- 15% have ONE patch
- 5% have TWO patches
- 3% have THREE patches
- 5% have MORE than THREE patches
Most packages have no patches (71%) or at most two (91%, my original "guess" for "vast majority"), some have 3-5 patches (5%), and a minority (3%, i.e. 669 of 21970) has six patches or more. So it seems this backs up my claim :)
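For anyone who wants to reproduce the numbers, here is a minimal sketch of the counting step in Python. It is not the script that produced the figures above: the function names (`read_specs`, `count_patches`, `patch_distribution`, `share`) and the sample spec snippets are made up for illustration, and only the regular expression and the idea of reading spec files from the tarball in [0] come from this message.

```python
import re
import tarfile
from collections import Counter

# The pattern quoted in the message, applied to every line of a spec file.
PATCH_LINE = re.compile(r"^Patch[0-9]*:[ \t]*.*$", re.MULTILINE)


def read_specs(tarball_path):
    """Return {member name: spec text} for every .spec file in a .tar.xz archive."""
    specs = {}
    with tarfile.open(tarball_path, "r:xz") as archive:
        for member in archive:
            if member.isfile() and member.name.endswith(".spec"):
                data = archive.extractfile(member).read()
                specs[member.name] = data.decode("utf-8", errors="replace")
    return specs


def count_patches(spec_text):
    """Count the lines that look like 'PatchN: ...' declarations."""
    return len(PATCH_LINE.findall(spec_text))


def patch_distribution(specs):
    """Map 'number of patches' to 'number of packages'."""
    return Counter(count_patches(text) for text in specs.values())


def share(distribution, predicate):
    """Percentage of packages whose patch count satisfies `predicate`."""
    total = sum(distribution.values())
    matching = sum(n for patches, n in distribution.items() if predicate(patches))
    return 100.0 * matching / total


if __name__ == "__main__":
    # Hypothetical spec snippets, standing in for the real tarball contents.
    sample = {
        "foo.spec": "Name: foo\nPatch0: fix-build.patch\nPatch1: cve.patch\n%patch0 -p1\n",
        "bar.spec": "Name: bar\nSource0: bar.tar.gz\n",
    }
    dist = patch_distribution(sample)
    for patches in sorted(dist):
        print(f"{patches}: {dist[patches]}")
    print(f"zero patches: {share(dist, lambda n: n == 0):.0f}%")
```

The match is case-sensitive and purely textual, so it counts patches declared inside conditionals and misses any that are generated by macros. Treat the result as an approximation.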
Fabio
[0]: https://pkgs.fedoraproject.org/repo/rpm-specs-latest.tar.xz
, and uses *unmodified upstream sources / tarballs*. I never want to deal with exploded upstream sources, unless I'm creating a patch for something.
When it's an upstream commit that applies cleanly to the latest sources, I'll just add it in the .spec file, and let the tooling handle the rest. It's pretty neat to directly link to upstream commits (it works with github and gitlab and pagure, as far as I know), and let our tooling (spectool, fedpkg) do everything else. I don't have to download, patch, or touch sources myself in any way for that.
Unfortunately, in the Ruby world this works less and less, because released packages do not contain the test suite these days. So if there is a fix for some feature together with an associated test, the patch has to be modified (the test part has to be stripped or split out). Otherwise I like this approach as well.
When I need to make changes that I am able to push back upstream, I don't do that in packaging, but fork upstream, do my changes, create a pull request, and again point my .spec file to the patches from there. No need to touch dist-git there, instead I'm working closely with upstream.
On the rare occasion that I need to make downstream-only changes with patches, I usually just explode the upstream tarball, run "git init", then "git add .", "git commit -m import", apply my changes, and then do "git diff --patch > ../00-my-changes.patch" (if it's just one commit), or "git format-patch -o ../" if there are multiple commits, and then delete the exploded sources again.
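The single-commit variant of that workflow can be scripted. Below is a rough Python sketch that drives the same git commands through `subprocess`. The helper names (`git`, `make_downstream_patch`), the placeholder identity passed via `-c`, and the demo file are assumptions for illustration, not part of any Fedora tooling.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def git(repo, *args):
    """Run a git command inside `repo` and return its stdout."""
    if shutil.which("git") is None:
        raise RuntimeError("git is not installed")
    result = subprocess.run(
        ["git",
         "-c", "user.name=Packager",              # placeholder identity
         "-c", "user.email=packager@example.com",
         "-c", "commit.gpgsign=false",
         *args],
        cwd=repo, check=True, capture_output=True, text=True,
    )
    return result.stdout


def make_downstream_patch(source_dir, rel_path, new_text):
    """Import an exploded source tree into a throwaway git repository,
    change one tracked file, and return the resulting unified diff."""
    git(source_dir, "init", "--quiet")
    git(source_dir, "add", ".")
    git(source_dir, "commit", "--quiet", "-m", "import")
    (source_dir / rel_path).write_text(new_text)   # the downstream change
    return git(source_dir, "diff", "--patch")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        tree = Path(tmp)
        (tree / "hello.txt").write_text("hello\n")  # stand-in for upstream sources
        print(make_downstream_patch(tree, "hello.txt", "hello\nworld\n"))
```

The function returns the diff as text, so the caller decides where to write it (for example next to the spec file). It only sees changes to tracked files. Newly added files, and the multi-commit "git format-patch" case, are not covered.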
Having infrastructure for exploding sources from the package would be very interesting.
Vít
I maintain ~400 packages in fedora, and the only one with substantial downstream modifications (about 10 patches on top of upstream) is Jekyll (rubygem-jekyll), where I primarily disable tests for features that are not enabled in fedora anyway (if I didn't want to run any tests, I could just drop the number of patches to 1 or 2, making the fork unnecessary - but I like running the tests).
So while I agree that for *some* packages with *huge*, non-upstreamable diffs between upstream and fedora the source-git approach might work, I doubt that it would help in 99% of cases, or even make it too easy for packagers to make more and more downstream-only changes.
Fabio
The main reason I am sending this is to gather feedback from all of you whether there is an interest in such a workflow. We don’t have concrete plans for Fedora right now but based on your feedback we could.
[r] https://github.com/softwarefactory-project/rdopkg
[ru] https://pagure.io/rpkg-util
[t] https://github.com/rpm-software-management/tito
[k] https://lists.fedoraproject.org/archives/list/infrastructure@lists.fedorapro...
[n] https://github.com/TomasTomecek/nyancat
[w] https://github.com/TomasTomecek/nyancat/pull/2
Cheers,
Tomas
_______________________________________________
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-leave@lists.fedoraproject.org
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org