On Mon, Oct 12, 2020 at 8:22 PM Jean-Baptiste Holcroft <jean-baptiste@holcroft.fr> wrote:
TL;DR: here are the translation memories for the 318 languages, built
from all software available in Fedora 32:
https://jibecfed.fedorapeople.org/partage/compendium-full/

Last August, I talked about my project "to provide translation memories
for translators and measure localization progress over version" [1].

Thanks to darknao's help with automation, I'm now able to analyze the
whole Fedora Linux distribution.

Thanks Jean-Baptiste, this is very interesting indeed.
I would like to make it a Fedora initiative and publish these files on
an official Fedora website.
Would someone be willing to help? The one constraint is to use Hugo, so
that the website can be localized.

I can't help wondering if there is any way to integrate this with Transtats in the future.
For Fedora 32, it means:

* 21 000 SRPMs extracted (source RPM packages)
* 121 000 po files detected (other formats exist, but I'm starting with
this one), which represent 7 GiB of data

From that, I derived:

* 318 languages. For each of them, it produces:
** a compendium [2]
** a terminology [3]
** a translation memory (tmx file)
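
For readers unfamiliar with the tmx output, here is a minimal sketch of how
a translation memory could be written from (source, target) string pairs.
It is not the script used for these files. It assumes the pairs have
already been extracted from the po files by some other tool, and the
function name build_tmx and the "compendium-sketch" tool name are
hypothetical. Only the Python standard library is used.

```python
import xml.etree.ElementTree as ET

# ElementTree serializes this namespaced attribute as xml:lang.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tmx(pairs, target_lang, source_lang="en"):
    """Return a TMX 1.4 document (as a string) for (source, target) pairs."""
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(
        tmx,
        "header",
        {
            "creationtool": "compendium-sketch",  # hypothetical tool name
            "creationtoolversion": "0.1",
            "segtype": "sentence",
            "o-tmf": "po",
            "adminlang": "en",
            "srclang": source_lang,
            "datatype": "plaintext",
        },
    )
    body = ET.SubElement(tmx, "body")
    seen = set()
    for source, target in pairs:
        # Skip untranslated entries and exact duplicates.
        if not target or (source, target) in seen:
            continue
        seen.add((source, target))
        tu = ET.SubElement(body, "tu")
        for lang, text in ((source_lang, source), (target_lang, target)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")
```

Calling build_tmx([("File", "Fichier"), ("Save", "Enregistrer")], "fr")
returns one <tu> element per pair, each holding an English and a French
<tuv>. Deduplicating on the exact (source, target) pair is a simplification:
a real compendium also has to deal with context, plural forms and
conflicting translations of the same source string.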