On CJK font selection (was Re: [Fwd: Re: Request for review and advice on wqy-bitmap-fonts fontconfig settings])

Behdad Esfahbod behdad at behdad.org
Thu Dec 20 22:42:07 UTC 2007


On Thu, 2007-12-20 at 15:30 -0500, Qianqian Fang wrote:
> Hi Behdad
> 
> I don't think the tone in your reply is going to be helpful in any way
> toward a solution of this problem.

I try to respond decently.  However, I can't help it when someone does
not put in the same effort that I put into my replies, has not read my
previous mails in the thread, and uses caps...

Writing these replies takes time.  No abundance of it here.  And I get
frustrated by saying the same thing again and again.


> I hope you understand that people raise these issues for the good of
> Pango.

I don't agree.  I'm very clearly saying: Pango doesn't want to fix this
issue.  Fix it somewhere else if you want the issue fixed.


> They want to make it more powerful and logical under all possible 
> circumstances.

I want to keep Pango as clean in design as it is.  That means, no "if
(CJK)".  Pango is a true international text layout system.  It's quite
different from a MS Windows "Chinese Edition" or Adobe Photoshop "Asian
Edition", etc.  It is supposed to be able to render all languages and
scripts, in the same process, in the same document.


> Second, there must be a reason for this issue being raised again and again
> in the past many years.

Compared to other scripts:

  - No one has "fixed" it properly so far, so it keeps coming up.

  - Chinese users are a very large group.  The only comparable groups
are: Latin/Cyrillic script users, Arabic script users, and Indic script
users.  Latin and Arabic pretty much work.  Indic has lots of issues,
and it comes up again and again, more often than CJK, believe me.

There was a time when Arabic was a disaster too.  And people complained
about it, a lot.  But it's fixed now, because there were people who
fixed it all.  Not by attacking the maintainer, BTW.  Not by taking it
personally.


> I think insufficient explanations and poor guidance for users toward a
> good solution play roles here (I am sorry to say that your "proof" in
> the last email still did not help because it was not what I was asking
> for).

I knew it wouldn't help, because it was an obvious fact to anyone
thinking about it without prejudice.  You asked that I say exactly why
it's impossible, and I did.  Now either read and try to understand that,
or take my word when I say it's impossible.


> Reading the replies from the past few days, I came to realize the key to
> the problem is to set up a "correct" fall-back path for the untagged
> (or COMMON) text. Obviously, you are reluctant to explicitly
> tag it as LATIN in pango. You may be right if differentiating
> COMMON from LATIN is practically necessary (I mean "practically", not
> semantically as in the Unicode standard). You have your rationales
> here.

If I hardcode them to LATIN, I'm sure *you* would come back and complain
about it too, when you see in a monospace piece of text that you've got
crisp bitmap glyphs for the Chinese characters, but a wider, fuzzy glyph
for your '['.


> Unfortunately, the current fall-back mechanism eventually assigns
> the current locale info to this untagged text.

No.  Not current locale.  Adjacent scripts.  If there's none, then
current locale.
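
To make that order concrete, here is a minimal sketch in Python.  It is
not Pango's code: toy_script() and resolve_scripts() are hypothetical
names, the classifier only knows a slice of Han and ASCII letters, and a
real itemizer handles more cases than this.  It only shows the priority:
adjacent script first, locale last.

```python
COMMON = "Common"

def toy_script(ch):
    """Hypothetical, very rough classifier (illustration only)."""
    if "\u4e00" <= ch <= "\u9fff":        # one CJK ideograph block
        return "Han"
    if ch.isascii() and ch.isalpha():     # ASCII letters only
        return "Latin"
    return COMMON                         # brackets, digits, spaces, ...

def resolve_scripts(text, locale_script):
    """Assign a script to every character, resolving COMMON ones."""
    scripts = [toy_script(ch) for ch in text]

    # Forward pass: COMMON inherits the nearest preceding real script.
    last = None
    for i, s in enumerate(scripts):
        if s == COMMON:
            if last is not None:
                scripts[i] = last
        else:
            last = s

    # Backward pass: leading COMMON takes the first real script after it.
    nxt = None
    for i in range(len(scripts) - 1, -1, -1):
        if scripts[i] == COMMON:
            if nxt is not None:
                scripts[i] = nxt
        else:
            nxt = scripts[i]

    # Still COMMON means there was no adjacent script at all:
    # only now does the locale's script get used.
    return [locale_script if s == COMMON else s for s in scripts]
```

With this sketch, the '[' in "中[" resolves to Han regardless of the
locale argument, while a string made only of brackets falls through to
the locale's script.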

> And it turns out that for some users (if not all), particularly for CJK
> users (where the practical differences between Latin/Common are not
> significant), it created unpleasant formatting results due to the
> mixing of fonts.

Read above.  It's going to create an unpleasant result in one case or
the other.  There's no magic bullet here.


> So, it seems obvious that additional info is needed to assist the
> fall-back of this untagged text to the preferred settings. This info
> can be introduced by the patched fontconfig, using a block preference
> font list; or using the current keyboard layout as suggested by Sergey
> and Chris. Maybe a third way is to create an LC variable, say
> LC_COMMON, independent of LC_ALL/LANG, taking care of the untagged text
> formatting. I actually felt that this is probably more suitable than
> the other two approaches, because this is a locale-based preference,
> not a font or keyboard preference (this is just my first thought on
> it, I may be wrong).

I don't agree completely, but do note that none of the above involves
Pango (at least initially).


> In any case, I "think" I understand your argument, although there are still
> details that need to be verified. But I think it will be useful if we focus on
> clarifying a solution rather than arguing who is right and who is wrong.

Except that I'm not interested in fixing it if it doesn't involve Pango.


Regards,

-- 
behdad
http://behdad.org/

...very few phenomena can pull someone out of Deep Hack Mode, with two
noted exceptions: being struck by lightning, or worse, your *computer*
being struck by lightning.  -- Matt Welsh



