On Thu, Mar 29, 2018 at 10:12 AM Kevin Kofler <kevin.kofler@chello.at> wrote:
Hi,

we did more debugging on #fedora-kde (thanks in particular to lupinix) and
we found what seems to be the primary source of the bloat: CJK fonts!

CJK fonts are by far the largest of all fonts due to the huge number of
characters used in those languages.

Up to Fedora 26, Fedora shipped 4 CJK fonts:
      <packagereq type="default">adobe-source-han-sans-cn-fonts</packagereq>
      <packagereq type="default">adobe-source-han-sans-tw-fonts</packagereq>
      <packagereq type="default">naver-nanum-gothic-fonts</packagereq>
      <packagereq type="default">vlgothic-fonts</packagereq>
The KDE and LXQt Spins actually opted to blacklist these fonts in their
kickstart, and ship one compact CJK font instead: wqy-microhei-fonts.

In Fedora 27, this Change:
https://fedoraproject.org/wiki/Changes/ChineseSerifFonts
added 2 additional fonts (for a total of 6):
      <packagereq type="default">adobe-source-han-serif-cn-fonts</packagereq>
      <packagereq type="default">adobe-source-han-serif-tw-fonts</packagereq>
which were unfortunately missing from the blacklist:
https://bugzilla.redhat.com/show_bug.cgi?id=1530006
and therefore already increased the size of the image.

But now in Fedora 28, after:
https://fedoraproject.org/wiki/Changes/JPDefaultFontsToNoto
https://fedoraproject.org/wiki/Changes/KRDefaultFontsToNoto
https://fedoraproject.org/wiki/Changes/ChineseDefaultFontsToNoto
we actually ship a whopping 12 CJK fonts:
      <packagereq type="default">google-noto-sans-jp-fonts</packagereq>
      <packagereq type="default">google-noto-sans-kr-fonts</packagereq>
      <packagereq type="default">google-noto-sans-mono-cjk-jp-fonts</packagereq>
      <packagereq type="default">google-noto-sans-mono-cjk-kr-fonts</packagereq>
      <packagereq type="default">google-noto-sans-mono-cjk-sc-fonts</packagereq>
      <packagereq type="default">google-noto-sans-mono-cjk-tc-fonts</packagereq>
      <packagereq type="default">google-noto-sans-sc-fonts</packagereq>
      <packagereq type="default">google-noto-sans-tc-fonts</packagereq>
      <packagereq type="default">google-noto-serif-jp-fonts</packagereq>
      <packagereq type="default">google-noto-serif-kr-fonts</packagereq>
      <packagereq type="default">google-noto-serif-sc-fonts</packagereq>
      <packagereq type="default">google-noto-serif-tc-fonts</packagereq>
none of which are blacklisted in the Spins! According to lupinix, these
amount to a download size of 364 MiB! (That is an xz-compressed size, and xz
is also the compression algorithm used for the live images.)

The fix is to update the blacklists in the KDE and LXQt spin kickstarts, as
per the discussion under:
https://bugzilla.redhat.com/show_bug.cgi?id=1530006
IMHO, this needs to be implemented, urgently.
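
As a rough illustration of what updating the blacklists involves, the
following Python sketch turns comps-style <packagereq> entries into kickstart
exclusion lines (inside %packages, a leading "-" excludes a package). The
function name and the embedded fragment are only illustrative; the actual spin
kickstart files and their layout are not shown here.

```python
import xml.etree.ElementTree as ET

# Comps-style fragment wrapped in a root element so it parses on its own.
# The package names are the 12 Fedora 28 CJK fonts listed in this message.
COMPS_FRAGMENT = """
<packagelist>
  <packagereq type="default">google-noto-sans-jp-fonts</packagereq>
  <packagereq type="default">google-noto-sans-kr-fonts</packagereq>
  <packagereq type="default">google-noto-sans-mono-cjk-jp-fonts</packagereq>
  <packagereq type="default">google-noto-sans-mono-cjk-kr-fonts</packagereq>
  <packagereq type="default">google-noto-sans-mono-cjk-sc-fonts</packagereq>
  <packagereq type="default">google-noto-sans-mono-cjk-tc-fonts</packagereq>
  <packagereq type="default">google-noto-sans-sc-fonts</packagereq>
  <packagereq type="default">google-noto-sans-tc-fonts</packagereq>
  <packagereq type="default">google-noto-serif-jp-fonts</packagereq>
  <packagereq type="default">google-noto-serif-kr-fonts</packagereq>
  <packagereq type="default">google-noto-serif-sc-fonts</packagereq>
  <packagereq type="default">google-noto-serif-tc-fonts</packagereq>
</packagelist>
"""


def kickstart_exclusions(comps_xml):
    """Return one '-package' line per <packagereq> element, sorted by name."""
    root = ET.fromstring(comps_xml)
    names = sorted({el.text.strip() for el in root.iter("packagereq") if el.text})
    return ["-" + name for name in names]


if __name__ == "__main__":
    # Print a %packages section that could be pasted into a kickstart file.
    print("%packages")
    for line in kickstart_exclusions(COMPS_FRAGMENT):
        print(line)
    print("%end")
```

The output is just a list of "-name" lines between %packages and %end, so the
same helper could be rerun whenever the default font list in comps changes.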

But I think we also need to generally consider whether it makes sense to
force 3 font variants for each CJK language on all users worldwide, and
whether there are smaller fonts that could be used. (E.g.,
wqy-microhei-fonts is very effective, but unfortunately it only covers
Simplified Chinese and the syllabic parts of Japanese and Korean, not the
Traditional Chinese, Japanese or Korean renderings of the CJK Unified
Ideographs.)

 
Kevin, thanks for the investigation and the detailed analysis. Great work tracking this down!

I think our font strategy is a bit complicated: we do strive to be an international distribution, but there must be better ways to accomplish this than simply installing all possible fonts by default. I know that for GNOME at least, there's some PackageKit integration that prompts users to install fonts as needed (though it also seems to trigger whenever I accidentally `cat` a binary file in the terminal :-) ). Perhaps we can look into a desktop-agnostic way of doing this, and then limit the fonts installed by default to a few minimal ones based on the language selected in Anaconda?
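
To make that last idea a bit more concrete, here is a minimal sketch of a locale-to-font lookup. Everything in it is hypothetical: the mapping, the function name and the fallback behaviour are made up for illustration and do not correspond to any existing Anaconda or PackageKit interface. Only the package names come from the Fedora 28 list quoted above.

```python
# Hypothetical mapping from a locale prefix to a single CJK font package.
# Package names are from the Fedora 28 list quoted above; picking the sans
# variant as the "minimal" choice is an assumption.
CJK_FONTS_BY_LOCALE = {
    "ja": "google-noto-sans-jp-fonts",
    "ko": "google-noto-sans-kr-fonts",
    "zh_CN": "google-noto-sans-sc-fonts",
    "zh_TW": "google-noto-sans-tc-fonts",
}


def fonts_for_locale(locale):
    """Return the CJK font packages to install for a locale such as 'ja_JP.UTF-8'."""
    base = locale.split(".")[0]  # drop the encoding suffix, e.g. '.UTF-8'
    # Try the full language_TERRITORY key first, then the bare language code.
    for key in (base, base.split("_")[0]):
        if key in CJK_FONTS_BY_LOCALE:
            return [CJK_FONTS_BY_LOCALE[key]]
    return []  # locales without an entry get no CJK font by default
```

With this table, `fonts_for_locale("ja_JP.UTF-8")` picks the Japanese sans font and `fonts_for_locale("en_US.UTF-8")` returns an empty list. Locales without an entry (zh_HK, for instance) would also get nothing, which is exactly the kind of gap an on-demand install prompt would have to cover.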