I was curious how many packages are already converted to SPDX.
I downloaded all spec files and counted how many of them contain the string "spdx" (case insensitive), assuming most packagers mention it either in the changelog or in a comment near the License field.
The string is in 347 out of 23155 spec files. That is less than 2% of packages.
It is a wild guess with many incorrect assumptions, but I guess the order of magnitude is correct.
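For anyone who wants to repeat the count, here is a minimal sketch of that check. It assumes the spec files have already been downloaded into one local directory tree; the function name `count_spdx_mentions` and the directory layout are made up for illustration and are not the script that produced the numbers above.

```python
from pathlib import Path


def count_spdx_mentions(spec_dir):
    """Return (matching, total) for the *.spec files found under spec_dir.

    A file counts as matching if the string "spdx" appears anywhere in it,
    compared case-insensitively (changelog, comments, or anywhere else).
    """
    matching = 0
    total = 0
    for spec in Path(spec_dir).rglob("*.spec"):
        total += 1
        # Some old spec files are not valid UTF-8, so replace undecodable bytes.
        text = spec.read_text(encoding="utf-8", errors="replace")
        if "spdx" in text.lower():
            matching += 1
    return matching, total
```

With the figures from the message, 347 / 23155 is roughly 1.5%, which is consistent with "less than 2%".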
That is just FYI. I will not do anything to progress faster, because `fedora-license-data` has enough issues (and mainly flow of new issues).
Miroslav
On Mon Sep 19, 2022 at 9:22 AM CDT, Miroslav Suchý wrote:
I was curious how many packages are already converted to SPDX.
I downloaded all spec files and counted how many of them contain the string "spdx" (case insensitive), assuming most packagers mention it either in the changelog or in a comment near the License field.
The string is in 347 out of 23155 spec files. That is less than 2% of packages.
It is a wild guess with many incorrect assumptions, but I guess the order of magnitude is correct.
That is just FYI. I will not do anything to progress faster, because `fedora-license-data` has enough issues (and mainly flow of new issues).
FWIW, here are two other possible metrics:
$ rg -l '^License:.*(AND|OR|WITH)' | wc -l
394
This assumes that packages with SPDX identifiers will have uppercase boolean operators if they are multi licensed. It of course doesn't cover single licensed SPDX packages.
I was also curious how many packages are automatically compliant due to identifiers that are the same between Callaway and SPDX. This yields a much larger number.
$ rg -l '^License:\s*(MIT|Unlicense|Beearware|WTFPL|MIT-0|Zed|0BSD|OpenSSL|Ruby|PostgreSQL)$' | wc -l
4636
(There are probably more cases. I didn't bother to iterate over the entire license list to find all of the cases of shared license identifiers. I did this by hand.)
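As a rough illustration of how the two heuristics could be combined in one pass, here is a small Python sketch. It is not the script mentioned below, only an assumption about what a first version might look like. The names `classify_license_tag`, `SPDX_OPERATOR`, and `SHARED_ID` are hypothetical. The identifier list is the same hand-picked subset as in the command above (with `Beerware` spelled as in the license lists), and the operator pattern adds word boundaries, so it is slightly stricter than the `rg` one-liner.

```python
import re

# Uppercase boolean operators on the License line hint at an SPDX expression.
# The \b word boundaries are an addition; the rg command matched substrings.
SPDX_OPERATOR = re.compile(r"^License:.*\b(AND|OR|WITH)\b", re.MULTILINE)

# Hand-picked identifiers that are spelled the same in Callaway and SPDX.
# This list is incomplete, exactly like the one in the rg command.
SHARED_ID = re.compile(
    r"^License:[ \t]*"
    r"(MIT|Unlicense|Beerware|WTFPL|MIT-0|Zed|0BSD|OpenSSL|Ruby|PostgreSQL)$",
    re.MULTILINE,
)


def classify_license_tag(spec_text):
    """Return 'spdx-operators', 'shared-identifier', or None for a spec file."""
    if SPDX_OPERATOR.search(spec_text):
        return "spdx-operators"
    if SHARED_ID.search(spec_text):
        return "shared-identifier"
    return None
```

Neither label says anything about whether the license analysis behind the tag is correct; it only records which spelling convention the License line appears to follow.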
Maybe I'll write a script to get more accurate metrics if I find some spare time, but as you said, there are more important issues to deal with.
-- Maxwell G (@gotmax23) Pronouns: He/Him/His
On Mon, Sep 19, 2022 at 11:22 AM Maxwell G gotmax@e.email wrote:
I was also curious how many packages are automatically compliant due to identifiers that are the same between Callaway and SPDX. This yields a much larger number.
This raises the issue of what "automatically compliant" means. Nominally, "License: MIT" is both Callaway-compliant and SPDX-compliant, but of course using "MIT" in the Callaway sense is not what is expected in the SPDX/post-Callaway era. Even in those cases where the Callaway identifier is not conceived as an 'umbrella' label, I am not sure it is right to view, say, "License: Apache-2.0" resulting from a superficial translation of "License: ASL 2.0" as compliant with post-Callaway standards (or even strict application of Callaway standards, come to think of it). I think Jilayne may see this differently though. :)
Richard
On Mon Sep 19, 2022, Richard Fontana wrote:
On Mon, Sep 19, 2022 at 11:22 AM Maxwell G gotmax@e.email wrote:
I was also curious how many packages are automatically compliant due to identifiers that are the same between Callaway and SPDX. This yields a much larger number.
This raises the issue of what "automatically compliant" means. Nominally, "License: MIT" is both Callaway-compliant and SPDX-compliant, but of course using "MIT" in the Callaway sense is not what is expected in the SPDX/post-Callaway era.
That's a good point; it's impossible to tell whether "MIT" refers to the Callaway umbrella "MIT" or the more narrow SPDX "MIT." I brought up this issue when the licensing Change Proposal was initially proposed. I recall being told that it didn't make sense to explicitly mark packages that converted to SPDX and that the MIT ambiguity wasn't important for the first phase.
Even in those cases where the Callaway identifier is not conceived as an 'umbrella' label, I am not sure it is right to view, say, "License: Apache-2.0" resulting from a superficial translation of "License: ASL 2.0" as compliant with post-Callaway standards (or even strict application of Callaway standards, come to think of it). I think Jilayne may see this differently though. :)
I don't think the post-Callaway guidelines are significantly different in this regard. The effective license analysis only applied to GPL family licensing. Searching for packages that were converted and have e.g. "GPL-3.0-or-later" isn't foolproof either; you still can't tell whether the maintainer did a full re-audit to find secondary licenses. Whether or not the multi-licensing is always handled properly (it's not) is orthogonal.
My goal wasn't to determine whether every package in this count is fully compliant. I just wanted to see which packages at least use the new license identifiers. That's about as far as you can get with the current implementation. For the packages I maintain with "License: MIT" or "License: Unlicense," I'm not going to add an "Adopt new licensing guidelines" changelog entry/commit if there's nothing that changed.
-- Best,
Maxwell G (@gotmax23) Pronouns: He/Him/His
On Mon, Sep 19, 2022 at 10:22 AM Miroslav Suchý msuchy@redhat.com wrote:
I was curious how many packages are already converted to SPDX.
I downloaded all spec files and counted how many of them contain the string "spdx" (case insensitive), assuming most packagers mention it either in the changelog or in a comment near the License field.
The string is in 347 out of 23155 spec files. That is less than 2% of packages.
It is a wild guess with many incorrect assumptions, but I guess the order of magnitude is correct.
That is just FYI. I will not do anything to progress faster, because `fedora-license-data` has enough issues (and mainly flow of new issues).
I'm actually pleasantly surprised there could be as many as 347 spec files already attempting to apply the current license metadata standards after a couple of months. :)
Richard
On Mon, 19 Sept 2022, 16:22 Miroslav Suchý, msuchy@redhat.com wrote:
I was curious how many packages are already converted to SPDX.
I downloaded all spec files and counted how many of them contain the string "spdx" (case insensitive), assuming most packagers mention it either in the changelog or in a comment near the License field.
The string is in 347 out of 23155 spec files. That is less than 2% of packages.
It is a wild guess with many incorrect assumptions, but I guess the order of magnitude is correct.
That is just FYI. I will not do anything to progress faster, because `fedora-license-data` has enough issues (and mainly flow of new issues).
Miroslav
Hello,
I didn't mention SPDX anywhere but I've converted dozens of Golang packages to it. Our tool, go2rpm, now outputs SPDX by default.
Best regards,
Robert-André
On Mon, Sep 19, 2022 at 8:56 PM Bob Mauchin zebob.m@gmail.com wrote:
On Mon, 19 Sept 2022, 16:22 Miroslav Suchý, msuchy@redhat.com wrote:
I was curious how many packages are already converted to SPDX.
I downloaded all spec files and counted how many of them contain the string "spdx" (case insensitive), assuming most packagers mention it either in the changelog or in a comment near the License field.
The string is in 347 out of 23155 spec files. That is less than 2% of packages.
It is a wild guess with many incorrect assumptions, but I guess the order of magnitude is correct.
That is just FYI. I will not do anything to progress faster, because `fedora-license-data` has enough issues (and mainly flow of new issues).
Miroslav
Hello,
I didn't mention SPDX anywhere but I've converted dozens of Golang packages to it. Our tool, go2rpm, now outputs SPDX by default.
Best regards,
Same goes for all Rust packages that were updated since SPDX expressions were allowed in License tags (probably a few hundred packages), because rust2rpm has defaulted to using the upstream project's SPDX license identifier string directly since then. As a matter of fact, the "# SPDX license identifier: Foo" spec comment was present *before the switch to using SPDX in the License tag*, but it is no longer generated in most cases. So the string "SPDX" does not appear in Rust packages that now use an SPDX license expression, but it *does* appear in Rust packages that have not been converted yet :)
Looks like your "progress" check relies on somebody having mentioned this switch in the package's changelog? That might explain why you missed so many packages: many converted packages now use %autochangelog, and exported .spec files will not contain a changelog at all?
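To get a feel for how much this affects the count, one could check which spec files delegate their changelog to the macro. This is a minimal sketch, assuming the spec text is the dist-git version, where the macro appears literally on its own line; the names `AUTOCHANGELOG` and `uses_autochangelog` are hypothetical.

```python
import re

# Matches a line consisting only of %autochangelog or %{autochangelog}.
AUTOCHANGELOG = re.compile(r"^[ \t]*%\{?autochangelog\}?[ \t]*$", re.MULTILINE)


def uses_autochangelog(spec_text):
    """Return True if the spec has no inline changelog entries to search,
    because it relies on the %autochangelog macro instead."""
    return bool(AUTOCHANGELOG.search(spec_text))
```

Spec files for which this returns True cannot mention the conversion in a changelog entry, so a search for "spdx" can only hit a comment in them.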
Fabio
On 19. 09. 22 16:22, Miroslav Suchý wrote:
I downloaded all spec files and counted how many of them contain the string "spdx" (case insensitive), assuming most packagers mention it either in the changelog or in a comment near the License field.
I only mentioned it in the commit message for my transitions.