On 1/13/22 2:59 PM, Richard Fontana wrote:
On Thu, Jan 13, 2022 at 8:56 AM Vít Ondruch vondruch@redhat.com wrote:
With my packager's hat on, I'd like this to be as simple as me (or some scanner) going through the code and listing all the licenses, which are then put into the `License` tag. Automation can handle the rest, e.g. whether these licenses are good and in the right combination, and what the effective Fedora license is. I am not sure why I, as a packager, should be involved in deciding whether Artistic should be listed somewhere or not.
so, are you saying you'd like to use a license scanner (e.g., ScanCode or FOSSology) to scan a proposed new package for Fedora and then have those results checked against the "good" list and then generate the License tag? This sounds like something that would require some additional tooling to be written to take the FOSSology results and compare them against the Fedora-license data, right?
BTW, reducing the license list or identifying good/bad licenses is a much easier task than identifying the licenses in the source code. If we mostly trust the scanners today, then we should leave the remaining decisions to automation as well, because these are much simpler tasks than license identification.
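The workflow discussed above (scan, check the results against the "good" list, then generate the License tag) could be sketched roughly as follows. This is only an illustration under stated assumptions: `GOOD_LICENSES`, `propose_license_tag` and the input format are hypothetical and not part of ScanCode, FOSSology or any Fedora tooling. The sketch naively joins every finding with AND, so it ignores upstream license choices (OR) and the binary-license rule.

```python
# Hypothetical sketch: neither the data nor the function belongs to any real Fedora tool.
# Illustrative subset only, NOT the actual Fedora "good" list.
GOOD_LICENSES = {"MIT", "BSD-3-Clause", "GPL-2.0-or-later"}


def propose_license_tag(detected):
    """Return (tag, needs_review) from an iterable of SPDX ids reported by a scanner."""
    unique = sorted(set(detected))
    # Anything not on the "good" list is handed back for a human to look at.
    needs_review = [lic for lic in unique if lic not in GOOD_LICENSES]
    allowed = [lic for lic in unique if lic in GOOD_LICENSES]
    # Naive assumption: every finding applies conjunctively.
    tag = " AND ".join(allowed)
    return tag, needs_review


tag, review = propose_license_tag(["MIT", "BSD-3-Clause", "MIT", "SSPL-1.0"])
print("License:", tag)
print("Needs human review:", review)
```

Even in this toy form, the part that still needs a person is visible: anything the list does not recognize, and any decision about whether findings are really conjunctive.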
This is a very important issue which I think we need to tackle somehow. Jilayne has started this in her pull request by including a section providing guidance on how to determine the license applicable to a package,
My start to this was a bit less sophisticated than what I think Vit is intending. I was merely trying to address a more manual situation where, for example, a package maintainer looks at the LICENSE file for a package and the license text *looks* like BSD-3-Clause. Instead of trusting a visual inspection, the package maintainer could use, for example, the SPDX license-diff to see if the license actually matches something on the SPDX License List, and then that info could be used to search the Fedora Good licenses.
I think (see above) that Vit is suggesting a more thorough and advanced analysis.
but I think this is sufficiently difficult and complex and contentious that I believe it should be addressed in a separate document which will require a lot of work to develop.
yes, will move it out of the PR once we have the new location to put it. Then we can iterate from there, as needed.
Part of this is figuring out what we want the identifiers to signify and how much deviation from reality we want to tolerate. We really ought to start out figuring out this question and only then develop practical guidelines around it.
For example, existing Fedora guidance indicates that the License: tag should reflect the license of the appropriate binary (a very interesting convention which I've often thought is beneficial), which among other things implies that merely scanning source code will sometimes give "incorrect" results even if the scanner is somehow perfect.
The Fedora policy of license-reflects-binary is indeed a monkey wrench in using a license scanner to scan the source code. Would it be possible to change the question or analysis instead to: the License: tag should reflect the license for the code (whatever format, source or binary) that is actually distributed in Fedora. ??
Do we trust scanning tools? There are some tools (FOSS of course) I have some relatively decent confidence in as far as scanners go (ScanCode and FOSSology) although the only one I really ever used directly was ScanCode. Even ScanCode gets lots of things wrong, has spurious or unhelpful results, etc., and doesn't necessarily produce something easily translatable or representable as a compact SPDX expression. ScanCode is also not great at detecting bespoke likely-non-FOSS license texts.
I have not really used ScanCode and have more familiarity (even if a bit outdated) with FOSSology. It is true that many scanner results require some amount of "reconciliation" as I call it - that is, manually inspecting results that are ambiguous in some way. Often, there is an easily human-identifiable "answer", but it still requires some looking. That being said, if some/most package maintainers are actually looking at all the files in a more manual way, using a scanner would be a big improvement over that. FOSSology and, I believe, ScanCode both have the capability to output scan results in various formats, including an SPDX document. This is not necessary for Fedora or the spec file, but... there is an SPDX specification field that sort of corresponds to the data that is used for the License: field in the spec file: https://spdx.github.io/spdx-spec/package-information/#714-all-licenses-infor... The only issue is that the All-License-Information-from-files field does not differentiate on license expressions (disjunctive, conjunctive), so it's just a flat list. Hmmm...
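To illustrate the "flat list" limitation mentioned above, here is a minimal sketch showing that a conjunctive and a disjunctive expression collapse to the same list once the operators are dropped. The `flatten` helper is hypothetical and deliberately simplistic: it is not an SPDX expression parser, and it would treat an exception name following WITH as if it were a license id.

```python
import re


def flatten(expression):
    """Reduce a simple SPDX-style expression to the sorted list of ids it mentions."""
    # Split on anything that is not part of an identifier (spaces, parentheses).
    tokens = re.findall(r"[A-Za-z0-9.+-]+", expression)
    # Dropping the operators is exactly what loses the AND/OR distinction.
    return sorted({t for t in tokens if t not in {"AND", "OR", "WITH"}})


a = flatten("GPL-2.0-or-later AND MIT")
b = flatten("GPL-2.0-or-later OR MIT")
print(a)
print(a == b)
```

Both calls yield the same two identifiers, so the flat list alone cannot say whether the package offers a choice or requires compliance with both licenses.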
We certainly could adopt a rule that the License: tag should be an SPDX expression representation of the "output of ScanCode" (configured in some particular way, say), despite the resultant inclusion of inaccurate or misleading information. (Inside Red Hat, we thought about this in a non-Fedora context.) We could dispense with the rules that make this difficult (such as the "binary license" rule). The resulting license tags would look a lot ... stranger than the ones Fedora has today, and would be in some ways less meaningful, but maybe that's okay. But I don't think even if we did this it would remove the need for the packager to do some level of analysis. How much analysis is desirable is a big question. Minimizing burdens on people doing packaging has to be an important consideration no matter where we end up.
Agreed.
To answer the question as to how much analysis and the burdens - I'd be interested to hear examples of how and to what extent package maintainers are doing this analysis currently? I imagine it being a wide range of answers, but I really don't have any idea. Maybe we could set up a poll?
Jilayne
Richard
Vít
On 13. 01. 22 at 4:37, Richard Fontana wrote:
On Wed, Jan 12, 2022 at 12:22 PM Jilayne Lovejoy jlovejoy@redhat.com wrote:
I think a few things got lost in translation - let me clarify. I also just went back and read this entire thread again, as I was only trying to reflect the things stated as either status quo (to then explicitly document) or clarification on things where there is sort of a status quo but may be some inconsistency (to take away inconsistency or questions for package maintainers). :)
On 1/11/22 9:17 PM, Richard Fontana wrote:
On Tue, Jan 11, 2022 at 10:49 PM Jilayne Lovejoy jlovejoy@redhat.com wrote:
So, I have just made another commit to the license packaging guidelines to update the sections on dual-licensing, multiple licenses and use of "with" for license exception over here: https://pagure.io/packaging-committee/pull-request/1142
In light of this thread, I'd suggest we update the first sentence of the Dual Licensing section to say, "If your package is dual licensed under a choice of two (or three, etc.) licenses and both licenses are "good" for Fedora, the License: field must reflect this by using "OR" as a separator. "
Note - this is a slight amendment to the current guidelines under the Dual Licensing section: it explicitly states that if both licenses are "good", the choice is passed along, which everyone on the thread seemed to agree should be the case and is the common practice.
and add the following to the Dual Licensing section:
"If your package is licensed under a choice of two licenses and one is a "good" license and one is a "bad" license, then the License: field must reflect the "good" license only and contain a comment explaining the original choice."
Note - the Multiple Licensing Scenarios - https://docs.fedoraproject.org/en-US/packaging-guidelines/LicensingGuideline... in the packaging guidelines requires a comment for these scenarios and gives some examples. So, I thought it would be consistent to use a similar approach in the "Good or Bad" dual licensing section and copied one of the examples of how to comment.
Example: Package dbfoo is dual licensed under Affero General Public License v3 or Server Side Public License and Fedora considers the Server Side Public License as "bad". Note the choice in a comment above the License: field and the License field as follows:
# The upstream package license is: AGPL-3.0-or-later OR SSPL-1.0
License: AGPL-3.0-or-later
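The proposed rule for an upstream choice of licenses can be expressed as a small function. This is a sketch only: `GOOD_LICENSES` and `license_field` are hypothetical names, the set is an illustrative stand-in for the Fedora "good" list, and the comment wording simply mirrors the example above.

```python
# Hypothetical sketch of the proposed dual-licensing rule.
# Illustrative subset only, NOT the actual Fedora "good" list.
GOOD_LICENSES = {"AGPL-3.0-or-later", "GPL-1.0-or-later", "MIT"}


def license_field(choices):
    """Build a License: value (and optional comment) from an upstream license choice."""
    good = [lic for lic in choices if lic in GOOD_LICENSES]
    if not good:
        raise ValueError("no 'good' license among the upstream choices")
    dropped = [lic for lic in choices if lic not in GOOD_LICENSES]
    # A comment is only produced when the upstream choice was narrowed.
    comment = None
    if dropped:
        comment = "# The upstream package license is: " + " OR ".join(choices)
    return " OR ".join(good), comment


value, comment = license_field(["AGPL-3.0-or-later", "SSPL-1.0"])
if comment:
    print(comment)
print("License:", value)
```

When every option is "good", the function keeps the whole choice joined with OR and emits no comment, which matches the amended first sentence of the Dual Licensing section.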
I don't think this is a good idea. Obviously if a packager wants to put in such a comment they can, but I don't think this should be required or even recommended for the following reasons:
See comment above
First, it arguably creates more work for the packager to analyze licenses. Maybe in some cases this is work that the packager would be doing already, I realize. (For example, encountering SSPL-1.0, in your hypothetical, and verifying that it actually is a match to SPDX SSPL-1.0.)
I did not mean to imply that SPDX identifiers had to be used in the comment whatsoever, so we can simply change the example to something like the following (which would be consistent with other examples in the Multiple Licensing Scenarios examples):
# The upstream package license is: Affero General Public License v3 or later or Server Side Public License
License: AGPL-3.0-or-later
or we could add another example like:
# The upstream package license is: GNU General Public License v2 or later or a commercial license
License: GPL-2.0-or-later
OK, so as to these examples -- I don't like the AGPL|SSPL one because it's not realistic (I don't believe it is known to have occurred in the real world, unlike the GPL|Artistic cases and so forth).
For the other example: If you change "commercial license" to "proprietary license" (there's a long history of semi-justifiable objections to the term "commercial license" as an antonym to open source/free software license in FOSS), the problem here is that it too is not necessarily a real world case of the sort we're talking about here. It is pretty common for purportedly-single-licensor GPL projects to say informally something like "If you find the GPL problematic, you can write to me for an alternative commercial [yes, they might say commercial] license". But historically Fedora has just paid no attention to that sort of statement for licensing purposes -- it's just background noise. (More worrisome might be such a statement coupled with a reinterpretation of the GPL, e.g. "If you want to use this project commercially, contact me for alternative licensing options". But this doesn't go to the problem of license description in the spec file at all, but rather whether the license is "really" the GPL.)
What we don't usually see is a project repository where you have LICENSE.GPL and LICENSE.PROPRIETARY, in a context suggesting that the operative terms are a disjunctive dual license consisting of the two. I believe this is sufficiently uncommon that we shouldn't worry about this case.
To put it another way, I can't think of any real-world disjunctive dual license involving a (likely) "bad" license other than the Perl module case (where of course it is really common). If it's so unusual, why should the guidelines address it at all? Or if the only likely example is the Perl one, then the guidelines should only use the Perl (GPL/Artistic) example.
I think we could also add the word "known" to the guideline so it reads:
"If your package is licensed under a _known_ choice of two licenses and one is a "good" license and one is a "bad" license, then the License: field must reflect the "good" license only and contain a comment explaining the original choice."
So thinking more about this, I think if we say anything about this, it should be specific to the Perl case, until someone encounters a counterexample.
We could say:
"If your package is licensed under a known choice of two licenses and one is a "good" license and one is a "bad" license, then the License: field must reflect the "good" license only. This is highly uncommon in Fedora packages apart from the case of Perl modules dual licensed under the GPL and the Artistic License 1.0. In that case you must pick the appropriate identifier for the GPL side (which in Perl modules will typically map to SPDX "GPL-1.0-or-later"). You are encouraged to include a comment memorializing this, for example:
# Upstream project is dual licensed GPL | Artistic 1.0"
(I can explain why I don't think SPDX identifiers should be used in the example comment.)
However, I don't see why we should go to all this trouble if it is reasonably likely that in the one case where this problem is known to occur it might go away by revisiting Fedora's classification of the Artistic License 1.0 (in its various forms) as "bad".
Other posts on the mailing thread suggested a more complex notation in the actual License field, but that seems to risk breaking some checks or something (see David's response) and then runs into the problem you expressed re: using SPDX identifiers in the License: field. But if it's merely a comment, it's more flexible and yet the info is there for anyone who cares downstream or is wondering why the License field reflects just the one license and not the choice.
One thing that bothers me is, this is creating a new requirement that didn't exist before. It didn't exist in the Perl module case because Perl modules got "special treatment".
The question is, who cares downstream? I don't think this matters to Red Hat, to take one Fedora downstream. I would be interested in knowing whether there are Fedora community members (including folks who might be at Red Hat) who would like there to be a requirement that spec files document probably-rare (outside of the Perl module case) situations where, by application of Fedora policy, an upstream dual license disjunct is not selected. If anyone is really that interested they should be analyzing the source code.
The source code of these packages will, in most cases I can imagine, still have all the information about the upstream dual license. For example, under the new rule, a Fedora Perl module package might have "License: GPL-1.0-or-later", but the source tarball in the source RPM, at least, will have a copy of the Artistic license or source file license notices that state the dual license or however the dual license is indicated upstream (let's ignore the actual common usage of the horrible "Licensed under the same terms as Perl itself"). So to the really curious person no information is lost. I don't think anyone has ever suggested that the "bad" license in these cases should be stripped out.
Documenting this explicitly also sets expectations downstream as well that Fedora is not going to pass along the option to redistribute a package under a license considered "bad" by Fedora.
We're already basically saying this -- even today, the only thing that is confusing is the "special treatment" given to Perl modules. It just doesn't seem important enough to mandate a new requirement. I just think these guidelines need to spring from and relate to real world scenarios.
I forget whether I said this in this thread, but if we require documentation for this case, why don't we require it for all cases in which the packager makes some simplification or reduction in complexity of the license expression? I am not sure we want to mandate that, though we might want to encourage it. My impression is that very few spec files today contain comments of that sort.
Richard _______________________________________________ legal mailing list -- legal@lists.fedoraproject.org To unsubscribe send an email to legal-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/legal@lists.fedoraproject.org Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure