[Resending from correct address to make it to the list...]
I understand that some of the information might be missing, but even if this is true, I think fontconfig should come up with a somewhat better default rendering than it does now. The solution might be as simple as putting the Uming/Ukai or wqy fonts right before the Japanese "Mincho" and "Gothic" font series, because the Japanese fonts do not have larger Unicode coverage than Uming/wqy (whereas Uming/wqy do cover the Japanese code points). For Japanese locales, we can match lang=ja and put the Mincho/Gothic fonts in front of the Chinese fonts.
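To make the proposed ordering concrete, here is a minimal Python sketch of the idea. It is only a model of the rule described above, not fontconfig syntax; the short names "Uming", "wqy", "Mincho" and "Gothic" are placeholder labels rather than real family names, and the function name is hypothetical.

```python
# Placeholder labels for the font groups discussed above (not real family names).
CHINESE_FONTS = ["Uming", "wqy"]
JAPANESE_FONTS = ["Mincho", "Gothic"]


def cjk_fallback_order(lang):
    """Return the CJK fallback order for a requested language tag.

    Japanese requests put Mincho/Gothic first; everything else puts the
    Chinese fonts first, on the assumption that they also cover the
    Japanese code points.
    """
    primary = lang.lower().replace("_", "-").split("-")[0]
    if primary == "ja":
        return JAPANESE_FONTS + CHINESE_FONTS
    return CHINESE_FONTS + JAPANESE_FONTS


print(cjk_fallback_order("en-us"))
print(cjk_fallback_order("ja"))
```

With this rule an en-us desktop would reach the Chinese fonts first, while a lang=ja request would keep the Japanese fonts in front.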
To remind you of my motivation for repeatedly asking for a better default Chinese rendering, attached is a screenshot of browsing a Chinese web page under a fresh F8 installation (en-us, of course). I doubt anyone would like to read this on a regular basis.
On Nov 30, 2007 3:03 PM, Behdad Esfahbod behdad@behdad.org wrote:
> This is because by default fontconfig doesn't come with a mind-reader. You have to tell it which CJK language you want it to prefer. You can do that by any of:
> - Setting $LANG to zh_CN for example.
> - Making sure your HTML pages have the lang="zh-cn" tag. No, lang="zh" is not enough.
> - With recent Pango and a Pango-enabled firefox, you can set $LANGUAGE=en_US,zh_CN, or set $PANGO_LANGUAGE=en_US,zh_CN. It does the right thing then.
-- behdad http://behdad.org/
Fedora-fonts-list mailing list Fedora-fonts-list@redhat.com https://www.redhat.com/mailman/listinfo/fedora-fonts-list
hi
I have made some progress on the fontconfig file; the new version is attached.
Two changes:
1. To replace wqy-bitmap-song with Chinese vector fonts when displaying >16px or <10px sizes, I changed match="font" to match="pattern", and now it works. I used 4 match blocks to handle the serif>16px, serif<10px, sans>16px and sans<10px cases.
2. I added a match block to solve the high priority of wqy fonts under zh locales and the monospace alias. An additional test, <lang contains zh>, was inserted into the test list to minimize the impact on non-zh users.
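The four size-based match blocks from change 1 can be modeled roughly as follows. This is a Python sketch of the logic only, not the attached config file: the vector font labels and the function name are hypothetical, and the sketch leaves out the zh language test from change 2.

```python
# Hypothetical labels for the Chinese vector fonts that replace the bitmap font.
VECTOR_SERIF = "ChineseVectorSerif"
VECTOR_SANS = "ChineseVectorSans"

# One entry per match block: (generic family, size test, font to prepend).
RULES = [
    ("serif", lambda px: px > 16, VECTOR_SERIF),
    ("serif", lambda px: px < 10, VECTOR_SERIF),
    ("sans-serif", lambda px: px > 16, VECTOR_SANS),
    ("sans-serif", lambda px: px < 10, VECTOR_SANS),
]


def apply_size_rules(family, pixelsize):
    """Return the requested family list after the size-based rules ran.

    Rules act on the request (the "pattern"), so a vector font is put in
    front of the generic alias before any font is selected.
    """
    families = [family]
    for generic, size_matches, vector_font in RULES:
        if family == generic and size_matches(pixelsize):
            families.insert(0, vector_font)
    return families


print(apply_size_rules("serif", 20))
print(apply_size_rules("serif", 12))
```

Sizes from 10px to 16px fall through all four rules, so the bitmap font stays in charge exactly where it has strikes.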
With this config file, rendering under en-us and zh-cn/zh-tw all looks fine (almost). The screenshot of the en-us desktop is at http://wenq.org/gallery/albums/userpics/10002/F8-wqy-newconfig_enUS.png and the one of the zh-cn desktop is at http://wenq.org/gallery/albums/userpics/10002/F8-wqy-newconfig_zhCN.png
The "almost" comes from the fact that when users specify "WenQuanYi Bitmap Song" as the family rather than the generic alias, the larger/smaller sizes are still replaced by uming/ukai (see the bottom-right corner of the two test pages). However, this seems OK to me.
I also want to mention two non-wqy bugs that can be seen in the screenshots:
1. On the zhCN screenshot, in the sans-serif test block, in the text "... lasy dog 0123456789", you can see that the numbers were rendered by the wqy bitmap fonts rather than the smooth DejaVu. This also happens on all Chinese web pages when a number follows a Chinese character. I was told that this is a bug in Pango. Behdad, do you have some insight on this?
2. The date-time applet used the reversed language on both screenshots; I believe this is Gnome's bug.
please let me know what you think about this file.
thank you
Qianqian
On Sat, 2007-12-01 at 15:22 -0500, Qianqian Fang wrote:
> - on the zhCN screenshot, in the sans-serif test block, in the text "... lasy dog 0123456789", you can see that the numbers were rendered by the wqy bitmap fonts rather than the smooth DejaVu. This also happens on all Chinese web pages when a number follows a Chinese character. I was told that this is a bug in Pango. Behdad, do you have some insight on this?
My insight is, well, you are getting what you asked for. This is where some people track this issue, but I've got used to ignoring it:
http://bugzilla.gnome.org/show_bug.cgi?id=481210
> - the date-time applet used the reversed language on both screenshots; I believe this is Gnome's bug.
hi Behdad
You may well be right that the behavior of Pango is not logically flawed. Perhaps this problem should be filed as a feature request rather than a bug.
From a Chinese user's perspective, the Latin script and the Common script are both non-Hanzi (non-CJK) characters, so users expect a similar look and feel when these characters are rendered. I guess users of other languages more or less share the same view: numbers and basic Latin characters (Basic ASCII, or keyboard characters) are the most frequently used symbols that do not depend on the local language. As long as the local language does not redefine these symbols, they are expected to be rendered in a similar style.
I don't know the exact definition of PANGO_SCRIPT_COMMON and PANGO_SCRIPT_LATIN, but I think it is more natural to render numbers using a Latin font rather than a Chinese font, as numbers and Latin letters are much closer.
Huang Peng provided a patch to get the commonly expected behavior in this situation. If it could be implemented, even only under Chinese locales, that would be a great help. I've seen this report many times on the Mandriva, Debian and Red Hat bugzillas and on almost all Chinese Linux forums.
Back to the original topic of this thread: what do you think of the fontconfig file in my last email? I have heard complaints on some Chinese forums about font changes caused by removing the original fontconfig file. I hope I can get something to commit to put an end to these complaints.
Qianqian
On Mon, 2007-12-03 at 21:58 -0500, Qianqian Fang wrote:
> hi Behdad
Hi,
> You may well be right that the behavior of Pango is not logically flawed. Perhaps this problem should be filed as a feature request rather than a bug.
I'm not stuck at semantic issues like feature-request vs bug. When I say it's technically infeasible, I mean it.
> From a Chinese user's perspective, the Latin script and the Common script are both non-Hanzi (non-CJK) characters, so users expect a similar look and feel when these characters are rendered. I guess users of other languages more or less share the same view: numbers and basic Latin characters (Basic ASCII, or keyboard characters) are the most frequently used symbols that do not depend on the local language. As long as the local language does not redefine these symbols, they are expected to be rendered in a similar style.
Let me repeat what's happening again: You are setting a Chinese locale, so when Pango sees digits, it assumes that you want to use those digits with Chinese text, and you have provided a Chinese font that has glyphs for those digits, so it believes it has found the perfect font for them (your preferred font indeed) and uses it. If those digits are not desired, remove them from the font.
> I don't know the exact definition of PANGO_SCRIPT_COMMON and PANGO_SCRIPT_LATIN, but I think it is more natural to render numbers using a Latin font rather than a Chinese font, as numbers and Latin letters are much closer.
Then fix your font.
> Huang Peng provided a patch to get the commonly expected behavior in this situation. If it could be implemented, even only under Chinese locales, that would be a great help. I've seen this report many times on the Mandriva, Debian and Red Hat bugzillas and on almost all Chinese Linux forums.
That's not going to happen. Pango's core has nothing language or script specific hardcoded in it except for the data that is computer-generated from the Unicode Character Database. In Unicode, ASCII digits are marked script Common. There is a very small part of the issue you are seeing that can be improved in Pango:
http://bugzilla.gnome.org/show_bug.cgi?id=345386
but other than that, the behavior looks very reasonable to me. If you can think of an explanation of the behavior you want, without using "change character class of digits" and "special-case Chinese", I'm interested to hear that.
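The behavior described above can be modeled with a short Python sketch. It is a deliberately simplified illustration of the explanation in this message, not Pango's code: the script classifier covers only three cases, the coverage tables and font orders are made up, and the assumption that Latin letters are matched with an English preference is mine.

```python
import string

# Hypothetical coverage tables; real coverage comes from the font files.
COVERAGE = {
    "wqy": set("中文字体") | set(string.digits) | set(string.ascii_letters),
    "DejaVu": set(string.digits) | set(string.ascii_letters),
}

# Hypothetical per-language font preference order.
FONT_ORDER = {"zh": ["wqy", "DejaVu"], "en": ["DejaVu", "wqy"]}


def script_of(ch):
    """Tiny stand-in for the script data derived from the Unicode database."""
    if "\u4e00" <= ch <= "\u9fff":
        return "HAN"
    if ch in string.ascii_letters:
        return "LATIN"
    return "COMMON"  # digits, spaces and punctuation end up here


def language_for(ch, locale_lang):
    """Pick the language used to order fonts for one character."""
    script = script_of(ch)
    if script == "HAN":
        return "zh"
    if script == "LATIN":
        return "en"  # assumption of this sketch
    return locale_lang  # Common says nothing, so the locale decides


def pick_font(ch, locale_lang):
    """Return the first preferred font that has a glyph for ch."""
    for font in FONT_ORDER[language_for(ch, locale_lang)]:
        if ch in COVERAGE[font]:
            return font
    return None


print(pick_font("7", "zh"), pick_font("7", "en"), pick_font("d", "zh"))
```

In this model the digit lands on the Chinese font only because the locale is zh and that font claims to cover it; nothing in the selection logic is specific to Chinese or to digits.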
There are a few ways to fix your problem:
- Remove Latin and ASCII digits from your font. Why are they there if they are not desired? Nicolas suggested that fontconfig add support for conditional blacklisting of individual blocks/glyphs in a font. That would help too, but it's not in fontconfig yet.
- If you were doing your font in an OpenType container, you could split the Latin and Chinese parts into two different fonts stuffed into a single container and having the same name. Then Pango will not see your Chinese font as having ASCII digits and will not use them.
But at the end, it all comes down to real or hacky ways of removing those glyphs from the font.
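The effect of removing the glyphs can be illustrated with a small Python sketch of coverage-based selection. The coverage sets and font labels are invented for the example, and the helper name is hypothetical; real coverage is read from the font files.

```python
import string


def first_covering_font(ch, font_order, coverage):
    """Return the first font in preference order that covers ch."""
    for font in font_order:
        if ch in coverage[font]:
            return font
    return None


# Invented coverage: the Chinese font also ships ASCII digits.
coverage = {
    "wqy": set("中文") | set(string.digits),
    "DejaVu": set(string.digits) | set(string.ascii_letters),
}
order = ["wqy", "DejaVu"]  # Chinese font preferred, as under a zh locale

before = first_covering_font("5", order, coverage)

# Same fonts, but with the digits stripped from the Chinese font.
stripped = dict(coverage, wqy=coverage["wqy"] - set(string.digits))
after = first_covering_font("5", order, stripped)

print(before, after)  # prints "wqy DejaVu"
```

The preference order never changes here; only the claimed coverage does, which is why every fix in the list above comes down to hiding those glyphs one way or another.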
> Back to the original topic of this thread: what do you think of the fontconfig file in my last email? I have heard complaints on some Chinese forums about font changes caused by removing the original fontconfig file. I hope I can get something to commit to put an end to these complaints.
No idea.
> Qianqian
hi
I respect your philosophy of propagating style based on context and the nature of each script. I think it is indeed an elegant solution to use a COMMON charset to represent language-independent symbols and render them based on context.
IMHO, the confusion comes from the fact that "language-neutral" and "local-language-dependent" characters are not distinguished within the COMMON script. In other words, the COMMON charset is a mixture of characters that are essentially not tied to any specific language (such as digits) and characters that are redefined by local languages (such as some punctuation and the geometric shapes U+2500-U+25FF). For the former, I think they should not be influenced by local language preferences; rather, using the system fall-back setup (likely Latin-preferred) should be the best solution. For the latter, using the local font preference is best, as in your current COMMON charset handling.
In short, I think the current COMMON set should be further refined into NEUTRAL and LOCAL_DEPENDENT char sets, using the system fall-back configuration for the NEUTRAL set and local-language preferences for the LOCAL_DEPENDENT set. Specifically, digits are language-neutral and should be rendered according to the system fall-back settings rather than the local language settings.
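A rough Python sketch of what I mean follows. Everything in it is hypothetical: NEUTRAL and LOCAL_DEPENDENT are only the names proposed above, neither exists in Pango or Unicode, and I assume here that the NEUTRAL set holds just the ASCII digits.

```python
import string

# Assumption for this sketch: only ASCII digits are language-neutral.
NEUTRAL = set(string.digits)


def refined_class(ch):
    """Classify a COMMON character under the proposed refinement."""
    return "NEUTRAL" if ch in NEUTRAL else "LOCAL_DEPENDENT"


def font_order_for(ch, system_fallback, local_preference):
    """Choose which font order applies to a COMMON character."""
    if refined_class(ch) == "NEUTRAL":
        return system_fallback
    return local_preference


system_fallback = ["DejaVu", "wqy"]   # hypothetical Latin-preferred order
local_preference = ["wqy", "DejaVu"]  # hypothetical zh locale order

print(font_order_for("3", system_fallback, local_preference))
print(font_order_for("\u25a0", system_fallback, local_preference))
```

Under this split a digit would follow the system fall-back order, while a geometric shape such as U+25A0 would still follow the local preference.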
Qianqian