MIT and BSD are very common licenses, and they can be tricky to convert to SPDX license identifiers. Just today, I got two questions about it. We have this covered in the FAQ:
https://fedoraproject.org/wiki/Changes/SPDX_Licenses_Phase_1#I_have_a_packag...
But still, it can be confusing, so let me explain it in more detail than the FAQ does.
Suppose you have a package that is licensed under the "MIT" license. So you run:
```
$ license-fedora2spdx 'MIT'
Warning: more options how to interpret MIT. Possible options: ['mpich2', 'libtiff', 'SMLNJ', 'SGI-B-2.0', 'NTP', 'MIT', 'MIT-open-group', 'MIT-feh', 'MIT-enna', 'MIT-Modern-Variant', 'MIT-CMU', 'ICU', 'HPND', 'BSL-1.0', 'Adobe-Glyph']
Adobe-Glyph
```
Can you choose any license from the output? No. Definitely not.
You can take the result of this tool only when there is no warning. E.g.:
```
$ license-fedora2spdx 'GPLv2'
GPL-2.0-only
```
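The rule above can be written down as a small check. This is only a minimal sketch in Python. It assumes the tool's output has been captured as text, that ambiguity warnings always start with `Warning:`, and that the result is the last non-empty line, as in the examples in this message. The helper name `usable_identifier` is hypothetical and is not part of license-fedora2spdx.

```python
from typing import Optional


def usable_identifier(tool_output: str) -> Optional[str]:
    """Return the converted identifier only if the tool printed no warning.

    Assumption: warnings start with "Warning:" and the result is the last
    non-empty line of the output.
    """
    lines = [line.strip() for line in tool_output.splitlines() if line.strip()]
    if not lines or any(line.startswith("Warning:") for line in lines):
        return None  # ambiguous or empty: a human has to pick the license
    return lines[-1]


# Sample output with the option list shortened for brevity.
ambiguous = (
    "Warning: more options how to interpret MIT. "
    "Possible options: ['mpich2', 'MIT', 'Adobe-Glyph']\n"
    "Adobe-Glyph\n"
)
print(usable_identifier(ambiguous))         # None
print(usable_identifier("GPL-2.0-only\n"))  # GPL-2.0-only
```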
Until now, what Fedora described as an "MIT" license was, in fact, a whole family of licenses. SPDX identifies each of them separately, and the differences can be subtle. E.g., compare:
* https://spdx.org/licenses/MIT.html
* https://spdx.org/licenses/MIT-feh.html
* https://spdx.org/licenses/MIT-open-group.html
If your old Fedora license was MIT, there is a very high chance that the new one will be MIT too. But it is far from 100% certain. There are 14 other options: the ones that `license-fedora2spdx` listed in the warning above.
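To see where the "14 other options" come from, the warning line can be parsed directly. This is a sketch, not something the tool offers itself: it relies on the option list being printed in Python list syntax, as in the warning shown earlier in this message, and `parse_options` is a hypothetical helper name.

```python
import ast

WARNING = (
    "Warning: more options how to interpret MIT. Possible options: "
    "['mpich2', 'libtiff', 'SMLNJ', 'SGI-B-2.0', 'NTP', 'MIT', "
    "'MIT-open-group', 'MIT-feh', 'MIT-enna', 'MIT-Modern-Variant', "
    "'MIT-CMU', 'ICU', 'HPND', 'BSL-1.0', 'Adobe-Glyph']"
)


def parse_options(warning: str) -> list[str]:
    """Extract the candidate identifiers from one warning line."""
    start = warning.index("[")
    end = warning.rindex("]") + 1
    # The list is printed in Python list syntax, so literal_eval can read it.
    options = ast.literal_eval(warning[start:end])
    # Drop duplicates while keeping the original order.
    return list(dict.fromkeys(options))


options = parse_options(WARNING)
others = [name for name in options if name != "MIT"]
print(len(options), len(others))  # 15 14
```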
The same applies to BSD, which also identified a whole family of licenses. You will likely end up with "BSD-2-Clause" or "BSD-3-Clause", but there are two other options as well.
There are two common ways to find out what SPDX identifier you should use in such cases.
1) You can use https://github.com/spdx/spdx-license-diff to identify your license. This is a Chrome and Firefox plugin: you select the license text, and the context menu lets you identify the license. It will print, e.g., that the text matches 60% of the MIT-feh license and highlight the differences. Or...
2) you can navigate to
https://docs.fedoraproject.org/en-US/legal/allowed-licenses/
In the search box above the first table, enter your license to filter the content. If you enter "MIT", it will find 26 licenses. Of those, 15 have "MIT" in the "Fedora abbreviation" column (hmm, this should be renamed to "legacy name"). Now you have to open the link in the "URL" column for each candidate and compare it with your package's license. This may look painful, but you usually find the correct license within a few clicks.
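The table lookup in the second option amounts to filtering by legacy name and then reading each candidate. Here is a rough sketch with a handful of illustrative rows. The row data is a tiny hand-picked sample taken from identifiers mentioned in this message, not the real table, and building the URL from the SPDX identifier is an assumption based on the spdx.org links above.

```python
# Illustrative rows only: (legacy Fedora name, SPDX identifier).
# The real table on the allowed-licenses page is much longer.
ROWS = [
    ("MIT", "MIT"),
    ("MIT", "MIT-feh"),
    ("MIT", "MIT-open-group"),
    ("BSD", "BSD-2-Clause"),
    ("BSD", "BSD-3-Clause"),
    ("GPLv2", "GPL-2.0-only"),
]


def candidates(legacy_name: str) -> list[tuple[str, str]]:
    """Return (SPDX identifier, URL to read) pairs for one legacy name."""
    return [
        (spdx_id, f"https://spdx.org/licenses/{spdx_id}.html")
        for legacy, spdx_id in ROWS
        if legacy == legacy_name
    ]


# Each printed URL still has to be opened and compared by a human.
for spdx_id, url in candidates("MIT"):
    print(spdx_id, url)
```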
Miroslav
Miroslav Suchý wrote:
There are two common ways to find out what SPDX identifier you should use in such cases.
- You can use https://github.com/spdx/spdx-license-diff and use it to
identify your license. This is a Chrome and Firefox plugin and allows you to select the text; and in the context menu, you can choose to identify the license. It will print, e.g., that it matches 60% of the MIT-feh license and highlight the difference. Or...
- you can navigate to
https://docs.fedoraproject.org/en-US/legal/allowed-licenses/
in the search box above the first table, you enter your license and filter the content. If you enter "MIT", it will find you 26 licenses. Out of them, 15 have "MIT" in the "Fedora abbreviation" column (Hmm, this should be changed to "legacy name"). Now you have to open the link in the "URL" column and find your package's license. This may look painful, but you usually find the correct license within a few clicks.
That is a lot of pointless work for details that almost certainly nobody is going to care about, or even notice to begin with.
I would suggest just picking the most common option (MIT→MIT and BSD→BSD-3-Clause) and letting people file a bug if it turns out to be wrong. We have had packages with more inaccurate License tags than that (wrong GPL version, GPL instead of LGPL or vice versa, etc., sometimes even entirely wrong licenses).
Kevin Kofler
On Mon, Nov 14, 2022 at 3:29 PM Miroslav Suchý msuchy@redhat.com wrote:
Until now, what Fedora described as an "MIT" license was, in fact, a whole family of licenses. SPDX identify them differently. And the differences can be subtle. E.g., compare
https://spdx.org/licenses/MIT.html https://spdx.org/licenses/MIT-feh.html https://spdx.org/licenses/MIT-open-group.html
If your old Fedora license was MIT, there is a very high chance that the new one will be MIT too. But it is far from being 100 % sure.
BTW this can vary based on the age and language community/ecosystem of the upstream project. Relatively old projects written in C are more likely to have "MIT"-like licenses that are not MIT in the OSI/SPDX sense, while, say, less old PyPI-packaged Python projects are more likely to just have that de-facto-standard MIT license. I'm pretty sympathetic to maintainers of some of the older and more (license-wise) complex packages where this process of license representation migration can be more complicated.
There are 14 other options. These that `license-fedora2spdx` listed in the warning above.
Similarly, for BSD. BSD also identified the whole family. You likely end up with "BSD-2-Clause" or "BSD-3-Clause", but there are two different options as well.
There are two common ways to find out what SPDX identifier you should use in such cases.
You can use https://github.com/spdx/spdx-license-diff and use it to identify your license. This is a Chrome and Firefox plugin and allows you to select the text; and in the context menu, you can choose to identify the license. It will print, e.g., that it matches 60% of the MIT-feh license and highlight the difference. Or...
you can navigate to
https://docs.fedoraproject.org/en-US/legal/allowed-licenses/
in the search box above the first table, you enter your license and filter the content. If you enter "MIT", it will find you 26 licenses. Out of them, 15 have "MIT" in the "Fedora abbreviation" column (Hmm, this should be changed to "legacy name"). Now you have to open the link in the "URL" column and find your package's license. This may look painful, but you usually find the correct license within a few clicks.
While that is worth checking, it assumes that you can identify a license based on its name (or what you think it might be) which will not work in all cases. I'm hoping that eventually we can develop tools that could do license text matching against the corpus of allowed and not-allowed Fedora licenses (maybe something like an adaptation of spdx-license-diff, maybe something simpler).
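To make the idea of text matching against a corpus a bit more concrete, here is a minimal sketch in Python. It uses `difflib.SequenceMatcher` from the standard library as a stand-in similarity measure, which is not necessarily what spdx-license-diff does. The corpus entries are short placeholder strings, not real license texts, and `best_match` is a hypothetical helper name.

```python
import difflib
import re


def normalize(text: str) -> str:
    """Lower-case and collapse whitespace so layout differences do not count."""
    return re.sub(r"\s+", " ", text).strip().lower()


def best_match(text: str, corpus: dict[str, str]) -> tuple[str, float]:
    """Return the corpus key whose text is most similar, with a 0..1 score."""
    target = normalize(text)
    scores = {
        name: difflib.SequenceMatcher(None, target, normalize(body)).ratio()
        for name, body in corpus.items()
    }
    name = max(scores, key=scores.get)
    return name, scores[name]


# Placeholder texts, NOT real license texts; a real corpus would hold the
# full text of every allowed and not-allowed license.
CORPUS = {
    "EXAMPLE-A": "Permission is granted to use, copy and modify this software.",
    "EXAMPLE-B": "Redistribution in source and binary forms is permitted.",
}

sample = "Permission is granted to use,\n  copy and modify this software."
name, score = best_match(sample, CORPUS)
print(name, round(score, 2))
```

A real tool would also need a threshold below which it reports "no match" instead of the closest candidate.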
Also, feel free to submit an issue at https://gitlab.com/fedora/legal/fedora-license-data or (less preferable) posting a question to legal@lists.fedoraproject.org.
Richard
On Mon, Nov 14, 2022 at 08:53:09PM +0100, Miroslav Suchý wrote:
There are two common ways to find out what SPDX identifier you should use in such cases.
- You can use https://github.com/spdx/spdx-license-diff and use it to
identify your license. This is a Chrome and Firefox plugin and allows you to select the text; and in the context menu, you can choose to identify the license. It will print, e.g., that it matches 60% of the MIT-feh license and highlight the difference. Or...
- you can navigate to
https://docs.fedoraproject.org/en-US/legal/allowed-licenses/
in the search box above the first table, you enter your license and filter the content. If you enter "MIT", it will find you 26 licenses. Out of them, 15 have "MIT" in the "Fedora abbreviation" column (Hmm, this should be changed to "legacy name"). Now you have to open the link in the "URL" column and find your package's license. This may look painful, but you usually find the correct license within a few clicks.
SPDX has a web page, https://tools.spdx.org/app/check_license/, which can help identify a license. It works to some degree: it operates on license texts, not on license declarations. It also requires the license text without any context, e.g. without warranty disclaimers and without programming language comment markers.
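Removing the comment markers by hand gets tedious, so here is a small sketch that strips the most common ones before the text is pasted into such a checker. It is a heuristic that only covers C-style blocks and leading `*`, `//` or `#` markers, it does not remove warranty disclaimers, and `strip_comment_markers` is a hypothetical helper, not part of any SPDX tool.

```python
import re


def strip_comment_markers(header: str) -> str:
    """Remove common comment decoration from a license header.

    Handles C-style /* ... */ blocks and leading '*', '//' or '#' markers.
    It is a heuristic and will not cover every language.
    """
    cleaned = []
    for line in header.splitlines():
        line = line.strip()
        line = re.sub(r"^/\*+", "", line)          # opening /* or /**
        line = re.sub(r"\*+/$", "", line)          # closing */
        line = re.sub(r"^(\*|//|#)\s?", "", line)  # per-line marker
        cleaned.append(line.strip())
    # Drop lines that held nothing but comment decoration.
    return "\n".join(line for line in cleaned if line)


HEADER = """/*
 * This code is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License version 2 as
 * published by the Free Software Foundation.
 */"""

print(strip_comment_markers(HEADER))
```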
Though I usually resort to opening all MIT-like licenses ("X11" is one of them) from the SPDX license list at https://spdx.org/licenses/ and comparing them against my license.
-- Petr
On Mon, Nov 14, 2022 at 08:53:09PM +0100, Miroslav Suchý wrote:
MIT and BSD are very common licenses and can be tricky to convert to SPDX license identifiers. Just today, I got two questions about it. We have this covered in FAQ
https://fedoraproject.org/wiki/Changes/SPDX_Licenses_Phase_1#I_have_a_packag...
But still, it can be confusing. So let me explain more verbosely than we have in the FAQ.
You have a package that is licensed under the "MIT" license. So you run
$ license-fedora2spdx 'MIT'
Warning: more options how to interpret MIT. Possible options: ['mpich2', 'libtiff', 'SMLNJ', 'SGI-B-2.0', 'NTP', 'MIT', 'MIT-open-group', 'MIT-feh', 'MIT-enna', 'MIT-Modern-Variant', 'MIT-CMU', 'ICU', 'HPND', 'BSL-1.0', 'Adobe-Glyph']
Adobe-Glyph
Interestingly, when I run 'license-fedora2spdx MIT' it always prints 'mpich2', and the list of suggestions is entirely reversed from what you show here. Why 'mpich2'? Simply because it is the last in the list to be loaded. This is rather misleading and unhelpful IMHO.
Can you choose any license from the output? No. Definitely no.
In that case, IMHO, the tool should NOT suggest a license from the list at all, and definitely not arbitrarily suggest the last one it loaded, which is highly likely to be wrong. If it wants to suggest, then suggest the most likely option out of the variants. Or can it suggest an intentionally invalid placeholder eg "{MIT choice}" to make it explicit that the maintainer has to actively make a choice.
$ license-fedora2spdx "(GPLv2 or MIT or BSD)"
Warning: more options how to interpret MIT. Possible options: ['Adobe-Glyph', 'BSL-1.0', 'BSL-1.0', 'HPND', 'HPND', 'ICU', 'MIT-CMU', 'MIT-Modern-Variant', 'MIT-enna', 'MIT-feh', 'MIT-open-group', 'MIT', 'NTP', 'SGI-B-2.0', 'SMLNJ', 'SMLNJ', 'libtiff', 'libtiff', 'mpich2']
Warning: more options how to interpret BSD. Possible options: ['BSD-2-Clause-FreeBSD', 'BSD-2-Clause-Views', 'BSD-2-Clause', 'BSD-3-Clause', 'BSD-3-Clause', 'BSD-3-Clause', 'BSD-3-Clause']
( GPL-2.0-only OR {{pick MIT choice}} OR {{pick BSD choice}} )
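The placeholder behaviour proposed here could look roughly like the following Python sketch. It is not the tool's actual implementation: the mapping table is a hypothetical, heavily shortened sample, `convert` is an illustrative name, and legacy names that contain spaces are not handled.

```python
import re

# Hypothetical, heavily shortened mapping: legacy name -> possible SPDX ids.
LEGACY_TO_SPDX = {
    "GPLv2": ["GPL-2.0-only"],
    "MIT": ["MIT", "MIT-feh", "MIT-open-group"],
    "BSD": ["BSD-2-Clause", "BSD-3-Clause"],
}


def convert(expression: str) -> str:
    """Convert a legacy license expression, never guessing on ambiguity."""
    # Split into parentheses and whitespace-separated words.
    tokens = re.findall(r"\(|\)|[^\s()]+", expression)
    out = []
    for token in tokens:
        if token in ("(", ")"):
            out.append(token)
        elif token.lower() in ("and", "or"):
            out.append(token.upper())
        else:
            options = LEGACY_TO_SPDX.get(token, [])
            if len(options) == 1:
                out.append(options[0])
            else:
                # Ambiguous or unknown: emit an invalid placeholder on purpose.
                out.append("{{pick %s choice}}" % token)
    return " ".join(out)


print(convert("(GPLv2 or MIT or BSD)"))
# ( GPL-2.0-only OR {{pick MIT choice}} OR {{pick BSD choice}} )
```

Because the placeholder is not a valid identifier, a License tag that still contains it cannot be mistaken for a finished conversion.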
With regards, Daniel
On 15. 11. 22 10:44, Daniel P. Berrangé wrote:
Interestingly, when I run 'license-fedora2spdx MIT' it always prints 'mpich2', and the list of suggestions is entirely reversed from what you show here. Why 'mpich2'? Simply because it is the last in the list to be loaded. This is rather misleading and unhelpful IMHO.
Yes. It is just the first one in the list.
In that case, IMHO, the tool should NOT suggest a license from the list at all, and definitely not arbitrarily suggest the last one it loaded, which is highly likely to be wrong. If it wants to suggest, then suggest the most likely option out of the variants. Or can it suggest an intentionally invalid placeholder eg "{MIT choice}" to make it explicit that the maintainer has to actively make a choice.
$ license-fedora2spdx "(GPLv2 or MIT or BSD)"
Warning: more options how to interpret MIT. Possible options: ['Adobe-Glyph', 'BSL-1.0', 'BSL-1.0', 'HPND', 'HPND', 'ICU', 'MIT-CMU', 'MIT-Modern-Variant', 'MIT-enna', 'MIT-feh', 'MIT-open-group', 'MIT', 'NTP', 'SGI-B-2.0', 'SMLNJ', 'SMLNJ', 'libtiff', 'libtiff', 'mpich2']
Warning: more options how to interpret BSD. Possible options: ['BSD-2-Clause-FreeBSD', 'BSD-2-Clause-Views', 'BSD-2-Clause', 'BSD-3-Clause', 'BSD-3-Clause', 'BSD-3-Clause', 'BSD-3-Clause']
( GPL-2.0-only OR {{pick MIT choice}} OR {{pick BSD choice}} )
Good idea. I will try to implement it. Thank you.
Miroslav
On 15. 11. 22 10:44, Daniel P. Berrangé wrote:
In that case, IMHO, the tool should NOT suggest a license from the list at all, and definitely not arbitrarily suggest the last one it loaded, which is highly likely to be wrong. If it wants to suggest, then suggest the most likely option out of the variants. Or can it suggest an intentionally invalid placeholder eg "{MIT choice}" to make it explicit that the maintainer has to actively make a choice.
$ license-fedora2spdx "(GPLv2 or MIT or BSD)"
Warning: more options how to interpret MIT. Possible options: ['Adobe-Glyph', 'BSL-1.0', 'BSL-1.0', 'HPND', 'HPND', 'ICU', 'MIT-CMU', 'MIT-Modern-Variant', 'MIT-enna', 'MIT-feh', 'MIT-open-group', 'MIT', 'NTP', 'SGI-B-2.0', 'SMLNJ', 'SMLNJ', 'libtiff', 'libtiff', 'mpich2']
Warning: more options how to interpret BSD. Possible options: ['BSD-2-Clause-FreeBSD', 'BSD-2-Clause-Views', 'BSD-2-Clause', 'BSD-3-Clause', 'BSD-3-Clause', 'BSD-3-Clause', 'BSD-3-Clause']
( GPL-2.0-only OR {{pick MIT choice}} OR {{pick BSD choice}} )
Your wish has been granted. I implemented it, built it, and just submitted it to Bodhi as license-validate-14-1.
Miroslav
On 14. 11. 22 20:53, Miroslav Suchý wrote:
- You can use https://github.com/spdx/spdx-license-diff and use it to identify
your license. This is a Chrome and Firefox plugin and allows you to select the text; and in the context menu, you can choose to identify the license. It will print, e.g., that it matches 60% of the MIT-feh license and highlight the difference. Or...
Do we have a command line tool for this? Does licensecheck support SPDX identifiers?
(I find the use of browser extension for this very weird. I have the LICENSE file unpackaged with the sources on my machine, I am not browsing it on the web.)
On Tue, Nov 15, 2022 at 6:24 AM Miro Hrončok mhroncok@redhat.com wrote:
On 14. 11. 22 20:53, Miroslav Suchý wrote:
- You can use https://github.com/spdx/spdx-license-diff and use it to identify
your license. This is a Chrome and Firefox plugin and allows you to select the text; and in the context menu, you can choose to identify the license. It will print, e.g., that it matches 60% of the MIT-feh license and highlight the difference. Or...
Do we have a command line tool for this? Does licensecheck support SPDX identifiers?
(I find the use of browser extension for this very weird. I have the LICENSE file unpackaged with the sources on my machine, I am not browsing it on the web.)
licensecheck supports SPDX, you just have to run it with "--shortname-scheme spdx".
Neal Gompa wrote:
On Tue, Nov 15, 2022 at 6:24 AM Miro Hrončok mhroncok@redhat.com wrote:
Do we have a command line tool for this? Does licensecheck support SPDX identifiers?
(I find the use of browser extension for this very weird. I have the LICENSE file unpackaged with the sources on my machine, I am not browsing it on the web.)
licensecheck supports SPDX, you just have to run it with "--shortname-scheme spdx".
In my recent & limited experience, licensecheck did not produce valid SPDX output in many cases. As an example, take a file with the following license header:
/*
 * test-run-command.c: test run command API.
 *
 * (C) 2009 Ilari Liusvaara ilari.liusvaara@elisanet.fi
 *
 * This code is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License version 2 as
 * published by the Free Software Foundation.
 */
I expect it to return GPL-2.0-only, but it returns GPL-2:
$ licensecheck --shortname-scheme spdx t/helper/test-run-command.c
t/helper/test-run-command.c: GPL-2
I did not see any files in the git source labeled with the appropriate SPDX identifier for GPL-2.0*. Similar for LGPL. For BSD-3-Clause, licensecheck used a lower-case C, which then fails to match a valid license in rpmlint.
Am I missing something obvious or does licensecheck not work as expected? This is with licensecheck-3.3.0-2.fc36.noarch.
On Tue, Nov 15, 2022 at 11:05 AM Todd Zullinger tmz@pobox.com wrote:
Neal Gompa wrote:
On Tue, Nov 15, 2022 at 6:24 AM Miro Hrončok mhroncok@redhat.com wrote:
Do we have a command line tool for this? Does licensecheck support SPDX identifiers?
(I find the use of browser extension for this very weird. I have the LICENSE file unpackaged with the sources on my machine, I am not browsing it on the web.)
licensecheck supports SPDX, you just have to run it with "--shortname-scheme spdx".
In my recent & limited experience, licensecheck did not produce valid SPDX output in many cases. As an example, take a file with the following license header:
/*
 * test-run-command.c: test run command API.
 *
 * (C) 2009 Ilari Liusvaara ilari.liusvaara@elisanet.fi
 *
 * This code is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License version 2 as
 * published by the Free Software Foundation.
 */
I expect it to return GPL-2.0-only, but it returns GPL-2:
$ licensecheck --shortname-scheme spdx t/helper/test-run-command.c
t/helper/test-run-command.c: GPL-2
That is DEP-5 SPDX(ish) identifiers, which is what Debian uses for debian/copyright files. I am a bit surprised it gives DEP-5 for "spdx", but since the tool is from Debian, I guess it makes some sense...
The identifier is considered valid, as SPDX GPL-2.0 is considered equivalent to DEP-5 GPL-2, and SPDX-3.0 GPL-2.0-only is equivalent to SPDX-2.0 GPL-2.0.
Cf. https://wiki.debian.org/Proposals/CopyrightFormat
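If you want to post-process such output yourself, the equivalence described above can be captured in a small lookup. This is only a sketch: the mapping is an assumed, incomplete sample covering a few GPL-family names, the case fix-up covers the lower-case "c" problem mentioned in this thread, and `to_spdx` is a hypothetical helper, not a licensecheck feature.

```python
# Assumed mapping from DEP-5 style short names to current SPDX identifiers.
# Only a few GPL-family entries are shown; extend as needed.
DEP5_TO_SPDX = {
    "GPL-2": "GPL-2.0-only",
    "GPL-2+": "GPL-2.0-or-later",
    "LGPL-2.1": "LGPL-2.1-only",
    "LGPL-2.1+": "LGPL-2.1-or-later",
}

# Identifiers whose only problem is letter case, e.g. a lower-case "c".
KNOWN_SPDX = ["BSD-2-Clause", "BSD-3-Clause", "MIT"]
_BY_LOWER = {name.lower(): name for name in KNOWN_SPDX}


def to_spdx(name: str) -> str:
    """Translate a licensecheck-style name; return it unchanged if unknown."""
    if name in DEP5_TO_SPDX:
        return DEP5_TO_SPDX[name]
    return _BY_LOWER.get(name.lower(), name)


print(to_spdx("GPL-2"))         # GPL-2.0-only
print(to_spdx("BSD-3-clause"))  # BSD-3-Clause
```

The result still deserves a manual look, since the source file may say "or later" in a way the scanner did not pick up.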
I did not see any files in the git source labeled with the appropriate SPDX identifier for GPL-2.0*. Similar for LGPL. For BSD-3-Clause, licensecheck used a lower-case C, which then fails to match a valid license in rpmlint.
Am I missing something obvious or does licensecheck not work as expected? This is with licensecheck-3.3.0-2.fc36.noarch.
licensecheck does not follow/use SPDX-License-Identifier at all. It predates that scheme.
-- 真実はいつも一つ!/ Always, there's only one truth!
Neal Gompa wrote:
On Tue, Nov 15, 2022 at 11:05 AM Todd Zullinger tmz@pobox.com wrote:
Am I missing something obvious or does licensecheck not work as expected? This is with licensecheck-3.3.0-2.fc36.noarch.
licensecheck does not follow/use SPDX-License-Identifier at all. It predates that scheme.
Then it seems odd to recommend it when someone asks "Does licensecheck support SPDX identifiers?" I think. :)
The answer is closer to: "sort of, but not really in a way that helps Fedora maintainers do the SPDX license conversion."
On Tue, Nov 15, 2022 at 12:17 PM Todd Zullinger tmz@pobox.com wrote:
Neal Gompa wrote:
On Tue, Nov 15, 2022 at 11:05 AM Todd Zullinger tmz@pobox.com wrote:
Am I missing something obvious or does licensecheck not work as expected? This is with licensecheck-3.3.0-2.fc36.noarch.
licensecheck does not follow/use SPDX-License-Identifier at all. It predates that scheme.
Then it seems odd to recommend it when someone asks "Does licensecheck support SPDX identifiers?" I think. :)
The answer is closer to: "sort of, but not really in a way that helps Fedora maintainers do the SPDX license conversion."
It's very rare to see SPDX-License-Identifier in source code. Licensecheck will attempt to evaluate source files to determine a license for each file. This is because in Debian, the DEP-5 debian/copyright file needs per-file license declarations. The ability to return SPDX short names instead of full names is a courtesy.
-- 真実はいつも一つ!/ Always, there's only one truth!
On Tue, Nov 15, 2022 at 6:29 AM Miro Hrončok mhroncok@redhat.com wrote:
On 14. 11. 22 20:53, Miroslav Suchý wrote:
- You can use https://github.com/spdx/spdx-license-diff and use it to identify
your license. This is a Chrome and Firefox plugin and allows you to select the text; and in the context menu, you can choose to identify the license. It will print, e.g., that it matches 60% of the MIT-feh license and highlight the difference. Or...
Do we have a command line tool for this? Does licensecheck support SPDX identifiers?
(I find the use of browser extension for this very weird. I have the LICENSE file unpackaged with the sources on my machine, I am not browsing it on the web.)
Yeah, this tool was developed by a lawyer and I think is mainly aimed at lawyers. Since it does seem to be somewhat useful a good deal of the time I've wondered whether it would be straightforward to create a command line tool based on it (in addition to creating a similar tool that would target the Fedora license list data rather than SPDX license identifiers as such).
Richard
On Tue, Nov 15, 2022, at 8:23 PM, Miro Hrončok wrote:
On 14. 11. 22 20:53, Miroslav Suchý wrote:
- You can use https://github.com/spdx/spdx-license-diff and use it to identify your license. This is a Chrome and Firefox plugin and allows you to select the text; and in the context menu, you can choose to identify the license. It will print, e.g., that it matches 60% of the MIT-feh license and highlight the difference. Or...
Do we have a command line tool for this? Does licensecheck support SPDX identifiers?
(I find the use of browser extension for this very weird. I have the LICENSE file unpackaged with the sources on my machine, I am not browsing it on the web.)
There's also a cli tool and library called "askalono" https://github.com/jpeddicord/askalono that can detect license and output SPDX identifiers (the data set is sourced from SPDX). It also outputs a score for similarity. There are also other tools mentioned in the README, licensee (ruby), ScanCode (python).
Kan-Ru
On Tue, Nov 15, 2022, 12:24 Miro Hrončok mhroncok@redhat.com wrote:
On 14. 11. 22 20:53, Miroslav Suchý wrote:
- You can use https://github.com/spdx/spdx-license-diff and use it to identify your license. This is a Chrome and Firefox plugin and allows you to select the text; and in the context menu, you can choose to identify the license. It will print, e.g., that it matches 60% of the MIT-feh license and highlight the difference. Or...
Do we have a command line tool for this? Does licensecheck support SPDX identifiers?
(I find the use of browser extension for this very weird. I have the LICENSE file unpackaged with the sources on my machine, I am not browsing it on the web.)
-- Miro Hrončok
I package askalono-cli which can detect license texts and outputs an SPDX identifier: https://github.com/jpeddicord/askalono
We use it in go2rpm.
Best regards,
Robert-André
On 16. 11. 22 7:47, Bob Mauchin wrote:
I package askalono-cli which can detect license texts and outputs an SPDX identifier: https://github.com/jpeddicord/askalono
This is great. I have added it to the Change documentation.
Mirek
On Tue, Nov 15, 2022, 12:24 Miro Hrončok <mhroncok(a)redhat.com> wrote:
I package askalono-cli which can detect license texts and outputs an SPDX identifier: https://github.com/jpeddicord/askalono
We use it in go2rpm.
Looking through this thread, it seems there are a range of tools that package maintainers are using to inspect license texts. It'd be great to capture this info along the lines of a list with info like where to get the tool, how it's used, how you are using it in this context, advantages/disadvantages. That could then be used as part of our documentation. What would be the best place to collect this info? (I could make a new documentation page and people could add MRs to it - one idea, but open to better ones!)
Thanks, Jilayne