Re: [gtk-i18n-list] Re: On CJK font selection (was Re: [Fwd: Re: Request for review and advice on wqy-bitmap-fonts fontconfig settings])
by Behdad Esfahbod
On Fri, 2007-12-21 at 01:24 +0900, mpsuzuki(a)hiroshima-u.ac.jp wrote:
> Sorry, I forgot to attach the picture; here it is.
>
> On Fri, 21 Dec 2007 01:17:08 +0900
> mpsuzuki(a)hiroshima-u.ac.jp wrote:
>
> >On Thu, 20 Dec 2007 10:08:05 -0500
> >Behdad Esfahbod <behdad(a)behdad.org> wrote:
> >
> >>On Thu, 2007-12-20 at 23:04 +0900, mpsuzuki(a)hiroshima-u.ac.jp wrote:
> >>> On Thu, 20 Dec 2007 07:48:50 -0500
> >>> Behdad Esfahbod <behdad(a)behdad.org> wrote:
> >>> >Setting locale is actually enough. If that's not desired,
> >>> >$PANGO_LANGUAGE can be set as a fallback. So far seems like most of the
> >>> >issues happen because either the users are not setting locale correctly
> >>> >or are using crappy fonts. How do I don't care enough about those cases
> >>> >I'm not surprised.
> >>>
> >>> Excuse me, PANGO_LANGUAGE is the solution to modify the
> >>> Pango's behaviour that Qianqian & Abel ask for fix?
> >>
> >>It's a way to tell Pango which of the CJK languages to prefer. Its
> >>main use is when running under a non-CJK locale (en_US for example) and
> >>the text doesn't have language tags. It solves most of the "multiple
> >>fonts used in the same line" issues with CJK characters.
> >
> >Excuse me again, please let me know more detail.
> >I attached a picture to describe the behaviour I want to fix.
Thanks for raising a concrete issue.
> >The picture (1), (2), (3) are screenshots under English.
> >
> >If I execute gedit as
> > $ env LANG=C PANGO_LANGUAGE=en gedit
> >the font does not change while I type "[" then "a".
> >
> >The picture (1'), (2'), (3'), (4') are screenshots under Japanese.
> >
> >If I execute gedit as
> > $ env LANG=ja_JP.euc-jp PANGO_LANGUAGE=ja gedit
As long as your LANG and PANGO_LANGUAGE are the same, you don't need
both. PANGO_LANGUAGE is mostly useful when you set LANG to en. That's
not relevant to your issue here though.
> >and I type "[" then "a" then "あ". The font to display
> >"[" is dynamically changed as (2'), (3'), (4') during
> >typing keys. The dynamic font switching shifts the
> >baseline up and down, which looks like strange zig-zag behaviour.
> >I could not stop this switching by setting PANGO_LANGUAGE=en
> >nor PANGO_LANGUAGE=ja. How can I stop this switching?
I'll tell you what's happening, and you tell me what Pango is doing wrong
and how you think it can be fixed:
- In image 2', you are running under Japanese locale, you type a
COMMON character ('[') only, Pango assumes you are going to type
Japanese text, your preferred Japanese font has a glyph for '[', so
Pango uses it, hoping that it will use the same font when you enter
Japanese text.
- In image 3', you entered a Latin letter, not Japanese (an unexpected
event given that you run under Japanese locale), so Pango now associates
the bracket to the Latin text, because, well, that's the only non-COMMON
script there. You sure have a bracket and Latin text in it. So it
renders the bracket using the same font that it uses for the Latin text.
- In image 4', you add a Japanese character. No surprises here: you
have two fonts, the line takes the height of the taller font. So the
Latin text is shifted down a bit.
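To make the three steps above concrete, here is a rough Python sketch of that kind of resolution. It is not Pango's code: `base_script`, `resolve_scripts`, the crude character ranges, and the "Japanese" label for kana and Han are all assumptions made for illustration. COMMON characters borrow the script of neighbouring text, and the locale is used only when no other script is present.

```python
def base_script(ch):
    """Crude script lookup; a real implementation uses the Unicode Script property."""
    cp = ord(ch)
    if "a" <= ch <= "z" or "A" <= ch <= "Z":
        return "Latin"
    # Hiragana, Katakana and CJK ideographs, lumped together for this sketch.
    if 0x3040 <= cp <= 0x30FF or 0x4E00 <= cp <= 0x9FFF:
        return "Japanese"
    return "Common"


def resolve_scripts(text, locale_script):
    """Give every COMMON character the script of nearby text, else the locale's."""
    scripts = [base_script(ch) for ch in text]

    # Forward pass: a COMMON character inherits the script that precedes it.
    previous = None
    for i, script in enumerate(scripts):
        if script == "Common":
            if previous is not None:
                scripts[i] = previous
        else:
            previous = script

    # Backward pass: leading COMMON characters inherit the script that follows,
    # and fall back to the locale's script if the text has no other script.
    following = None
    for i in range(len(scripts) - 1, -1, -1):
        if scripts[i] == "Common":
            scripts[i] = following if following is not None else locale_script
        else:
            following = scripts[i]
    return scripts


for typed in ("[", "[a", "[aあ"):
    print(repr(typed), resolve_scripts(typed, "Japanese"))
```

With a Japanese locale this yields Japanese for "[" alone, Latin for both characters of "[a", and Latin, Latin, Japanese for "[aあ", which mirrors images 2', 3' and 4'.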
So, the issue comes down to the fact that:
- It's unexpected to enter Latin under Japanese locale.
- You have a COMMON character at the beginning of the line.
- Your Japanese and Latin fonts have different heights.
And this case is rare enough that I normally don't consider it an issue
at all. But apparently multiplying that by 1 billion makes it quite
visible!
One fix one might suggest is that Pango should reserve a minimum line
height that is enough to fit the default Japanese font, because it's
running under Japanese locale after all. That would fix the jump from
3' to 4', but would make English-only paragraphs look very ugly and badly
spaced vertically, so that's not an option either.
The jump from 2' to 3' can't be fixed. I already proved that. If one
fixes it, it would introduce the bug that '[' followed by a Japanese
character will choose separate fonts for those chars, OR that the font
used for '[' will change when you type a Japanese char. It's as simple
as this: Pango can't know what you are going to type next. It can just
guess, and it's guessing pretty well. It's just not reading your mind
yet :).
I have two suggestions for what you can do that may achieve better
results for you.
- Run under LANG=en_US LC_MESSAGES=ja_JP
- Choose a non-generic font family in gedit. That is, something other
than Sans, Sans-serif, and Monospace.
> >Regards,
> >mpsuzuki
Regards,
--
behdad
http://behdad.org/
...very few phenomena can pull someone out of Deep Hack Mode, with two
noted exceptions: being struck by lightning, or worse, your *computer*
being struck by lightning. -- Matt Welsh
15 years, 11 months
Re: [gtk-i18n-list] Re: On CJK font selection (was Re: [Fwd: Re: Request for review and advice on wqy-bitmap-fonts fontconfig settings])
by Behdad Esfahbod
On Fri, 2007-12-21 at 09:12 +0900, mpsuzuki(a)hiroshima-u.ac.jp wrote:
> >Arabic is like Japanese in that regard, no difference. I actually saw
> >that coming; I should have clarified. By unexpected, I mean it's not the
> >most likely event. Japanese text coming is more expected.
> >
> >That said, we don't have that issue as much in Arabic because it's
> >considered bad writing to start an Arabic/Persian paragraph with an
> >English word written in Latin. It also screws bidirectional code in
> >Pango and you end up with a left-to-right paragraph (because that's what
> >it looks like from your text), so people just avoid it.
>
> I see. Hearing "people just avoid it" is quite interesting.
As I said, it's not just technical. It's bad style to start a
Persian/Arabic paragraph with a Latin word.
> >But then when rendering a Japanese only text, all the punctuation marks
> >will be rendered using a different font! Now imagine that in a
> >monospace text, with bitmap Japanese font and non-bitmap punctuation
> >font.
>
> Yes. Do you think it's worse than contextual font switching?
I think rendering single-script text correctly is more important, yes.
If you have plain text in one script, it should only use the preferred
font of that script. Can't compromise here.
> I don't think so. But it's because my fonts have varied/inconsistent
> baselines and heights (and their inconsistency makes the contextual
> font switching quite ugly), so my disagreement is not so strong at present.
>
> Anyway, your mention on bidi reminded me that binding a fixed font
> to COMMON characters may confuse bidi glyph shaping of punctuation.
> If so, it would be problematic and binding should be disabled even
> if it's possible. Oops.
No, bidi reordering is done independent of font selection. Those are
completely separate processes.
> >> >I have two suggestions for what you can do that may achieve better
> >> >results for you.
> >> >
> >> > - Run under LANG=en_US LC_MESSAGES=ja_JP
> >> >
> >> > - Choose a non-generic font family in gedit. That is, something other
> >> >than Sans, Sans-serif, and Monospace.
> >>
> >> Oops, it's too application specific...
> >
> >No. Give it a try. It should have the effect you asked for. All
> >punctuation should be chosen from the non-generic font you choose. I
> >said do it in gedit just to test, otherwise it's nothing specific to
> >gedit, that's how fontconfig works.
>
> OK, I will try to set up ~/.fonts.conf.
I don't think that would do it. Just set it in gnome-font-properties.
> It seems that my
> request (binding a same font to COMMON character, at
> least in Latin & CJK context) can be realized by it
Not exactly. Hardcoding a font in your fontconfig config to always
return a certain font as the first font is not a good idea, and is
actually what started this thread at the beginning.
> - so it's off-topic to this list? Should I move to fontconfig?
I don't think forcing to use the same font for COMMON characters is
really a solution. The simplest solution for the case you showed is to
use a font that has both Japanese and Latin glyphs (plus all the
punctuation). Again, what started this thread was that the CJK font had
Latin glyphs, but crappy ones.
> Anyway, thank you for enlightening me.
>
> Regards,
> mpsuzuki
--
behdad
http://behdad.org/
15 years, 11 months
Re: [gtk-i18n-list] Re: On CJK font selection (was Re: [Fwd: Re: Request for review and advice on wqy-bitmap-fonts fontconfig settings])
by Behdad Esfahbod
On Fri, 2007-12-21 at 08:21 +0900, mpsuzuki(a)hiroshima-u.ac.jp wrote:
> Thank you very much!
You are very welcome.
[...]
> > - In image 2', you are running under Japanese locale, you type a
> >COMMON character ('[') only, Pango assumes you are going to type
> >Japanese text, your preferred Japanese font has a glyph for '[', so
> >Pango uses it, hoping that it will use the same font when you enter
> >Japanese text.
>
> Oh! That corrects my misunderstanding. I had assumed that when
> only '[' was given, the text was recognized as Latin because
> '[' is included in ASCII. Now I understand that ASCII numerical
> digits are also COMMON characters.
Yes.
> >So, the issue comes down to the fact that:
> >
> > - It's unexpected to enter Latin under Japanese locale.
> >
> > - You have a COMMON character at the beginning of the line.
> >
> > - Your Japanese and Latin fonts have different heights.
>
> I see. The first clause is quite important. I guess inputting
> Latin text under an Arabic locale is possible but unusual
> (am I guessing right? please let me know), but inserting
> Latin text (more precisely, a string of ASCII
> letters) under a Japanese locale is common, especially in text
> about information technology. For example, please check the
> website of the Japanese Standards Association http://www.jsa.or.jp/
Arabic is like Japanese in that regard, no difference. I actually saw
that coming; I should have clarified. By unexpected, I mean it's not the
most likely event. Japanese text coming is more expected.
That said, we don't have that issue as much in Arabic because it's
considered bad writing to start an Arabic/Persian paragraph with an
English word written in Latin. It also screws bidirectional code in
Pango and you end up with a left-to-right paragraph (because that's what
it looks like from your text), so people just avoid it.
> >One way one may suggest is that Pango should reserve a minimum line
> >height that is enough to fit the default Japanese font, because it's
> >running under Japanese locale after all. That would fix the jump from
> >3' to 4', but makes English-only paragraphs look very ugly and badly
> >spaced vertically, so that's not an option either.
> >
> >The jump from 2' to 3' can't be fixed. I already proved that. If one
> >fixes it, it would introduce the bug that '[' followed by a Japanese
> >character will choose a separate fonts for those chars, OR, that font
> >used for '[' will change when you type a Japanese char.
>
> Umm. Is it possible for Pango to bind COMMON characters to a single
> font? I understand the font switching in my example is caused by
> the fact that the appropriate font to show a COMMON character is
> determined by its context. If the font to show COMMON characters is
> fixed to a single font, my problem will be slightly better although
> the line height shifting still occurs.
But then when rendering a Japanese only text, all the punctuation marks
will be rendered using a different font! Now imagine that in a
monospace text, with bitmap Japanese font and non-bitmap punctuation
font.
> >It's as simple as this: Pango can't know what you are going to type next.
> >It can just guess, and it's guessing pretty good. It's just not reading
> >your mind yet :).
>
> Indeed. I wish somebody could implement it in Pango2 :-)
That's already on my wishlist. I may as well open a bug for it.
> >I have two suggestions for what you can do that may achieve better
> >results for you.
> >
> > - Run under LANG=en_US LC_MESSAGES=ja_JP
> >
> > - Choose a non-generic font family in gedit. That is, something other
> >than Sans, Sans-serif, and Monospace.
>
> Oops, it's too application specific...
No. Give it a try. It should have the effect you asked for. All
punctuation should be chosen from the non-generic font you choose. I
said do it in gedit just to test, otherwise it's nothing specific to
gedit, that's how fontconfig works. Let's see:
[behdad@behdad berlin-fest]$ fc-match 'sans:lang=en' --sort | head -4
DejaVuSans.ttf: "DejaVu Sans" "Book"
DejaVuSans-ExtraLight.ttf: "DejaVu Sans" "ExtraLight"
DejaVuSans-BoldOblique.ttf: "DejaVu Sans" "Bold Oblique"
luxisr.ttf: "Luxi Sans" "Regular"
[behdad@behdad berlin-fest]$ fc-match 'sans:lang=ja' --sort | head -4
sazanami-gothic.ttf: "Sazanami Gothic" "Regular"
DejaVuSans.ttf: "DejaVu Sans" "Book"
DejaVuSans-ExtraLight.ttf: "DejaVu Sans" "ExtraLight"
DejaVuSans-BoldOblique.ttf: "DejaVu Sans" "Bold Oblique"
[behdad@behdad berlin-fest]$ fc-match 'DejaVu Sans:lang=en' --sort | head -4
DejaVuSans.ttf: "DejaVu Sans" "Book"
DejaVuSans-ExtraLight.ttf: "DejaVu Sans" "ExtraLight"
DejaVuSans-BoldOblique.ttf: "DejaVu Sans" "Bold Oblique"
luxisr.ttf: "Luxi Sans" "Regular"
[behdad@behdad berlin-fest]$ fc-match 'DejaVu Sans:lang=ja' --sort | head -4
DejaVuSans.ttf: "DejaVu Sans" "Book"
DejaVuSans-ExtraLight.ttf: "DejaVu Sans" "ExtraLight"
DejaVuSans-BoldOblique.ttf: "DejaVu Sans" "Bold Oblique"
sazanami-gothic.ttf: "Sazanami Gothic" "Regular"
That is, if you ask for a non-generic font (DejaVu Sans here) for
language Japanese, it first gives you DejaVu Sans (even if it doesn't
cover Japanese), then the best Japanese font available.
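A toy model of that ordering, in Python. This is a sketch under stated assumptions, not fontconfig's matcher: the font table, the language sets, and `sort_families` are made up, and the real algorithm weighs many more properties. The model only encodes the idea that an explicitly named family outranks language coverage, which in turn outranks the generic fallback order.

```python
# Hypothetical installed fonts, listed in generic "sans" preference order,
# each with the languages it is assumed to cover.
INSTALLED = [
    ("DejaVu Sans", {"en"}),
    ("Luxi Sans", {"en"}),
    ("Sazanami Gothic", {"ja"}),
]


def sort_families(requested, lang):
    """Order family names for a request such as 'requested:lang=lang'."""
    def key(entry):
        rank, (family, langs) = entry
        return (
            family.lower() != requested.lower(),  # the named family comes first
            lang not in langs,                    # then fonts covering the language
            rank,                                 # then the generic preference order
        )

    ranked = sorted(enumerate(INSTALLED), key=key)
    return [family for _, (family, _) in ranked]


for request in (("sans", "en"), ("sans", "ja"),
                ("DejaVu Sans", "en"), ("DejaVu Sans", "ja")):
    print(request, sort_families(*request))
```

Asking for the generic name with lang=ja puts Sazanami Gothic first, while asking for DejaVu Sans with lang=ja keeps DejaVu Sans first and moves Sazanami Gothic to second place, the same family order as in the fc-match runs above.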
> Regards,
> mpsuzuki
--
behdad
http://behdad.org/
15 years, 11 months
Re: On CJK font selection (was Re: [Fwd: Re: Request for review and advice on wqy-bitmap-fonts fontconfig settings])
by Behdad Esfahbod
On Thu, 2007-12-20 at 04:22 +0800, Abel Cheung wrote:
> Hi,
Hi,
> My reply is followed below, inline...
So is mine.
> On Dec 17, 2007 7:22 AM, Behdad Esfahbod <behdad(a)behdad.org> wrote:
> [..........tons of quasi-maths ...........]
> >
> > > Secondly, you said that "contextual font selection" is a "cool"
> > > feature; I am wondering which languages benefit from this feature?
> > > (I believe there are some, but just want to know).
> >
> > Pretty much every non-Latin script. In some situations even the Latin
> > script.
> >
> > Take the Unicode character U+002E FULL STOP, aka ASCII period. It is
> > used in more than just Latin, in Arabic for example, in Hebrew, possibly
> > in Indic and many other scripts. If it was not grouped with neighboring
> > characters for font selection purposes all those people would have got
> > their Arabic/Hebrew/... text assigned an Arabic/Hebrew/... font while
> > the periods at the end of sentences assigned a different (default
> > Latin for example) font.
> >
> > The same happens for Latin under a document tagged as non-Latin. It's
> > not a luxury thing. It's just how things are supposed to work.
>
> That means font change depending on context is actually preferred in
> some fonts or some languages, is it? If that's true, then this would be
> a per-language preference: some want it, some don't.
>
> So does Pango support toggling this behavior yet? (I guess not?)
What exactly do you mean by "this behavior"? Which behavior? Show me
the source code line. I'm getting tired of all the hand waving.
> > > > The main font issue though, is that Chinese (Simplified, Traditional),
> > > > Korean, and Japanese share some Unicode code points, but they require
> > > > slightly different renderings. Now if you don't tell Pango which
> > > > version is preferred, how can it know which font to choose? It
> > > > explicitly doesn't prefer any one over the others to avoid cultural
> > > > problems.
> > > >
> > > > The symptoms of this problem are "multiple fonts used in the same line".
> > > > Solution is: Either run under a CJK locale, or give hints to Pango about
> > > > your preferred CJK locale using the env var PANGO_LANGUAGE.
> > > >
> > > > Note that theoretically Pango can do text analysis to come up with a
> > > > best guess, but doing that would then introduce another bug with
> > > > symptoms "changes font when typing a few characters on the same line".
>
> Let me set the record straight here. Most people seeing this problem are not
> exactly complaining about the font changing, but about the font changing TO
> SOME BAD LATIN GLYPH THEY DON'T LIKE. It is understood that font changing is
> almost unavoidable, since typing just a few characters may not provide enough
> information on what kind of font should be picked, and typing more
> gives more info.
> So far it is determined per sentence, or per what?
Believe me, I know that. And I understand it if you don't WRITE IN CAPS
too. Does it help if I say THEN GO REMOVE THE CRAPPY FONT?
[...]
> Sadly this way absolutely won't satisfy everybody -- one party only. And in
> particular, the font picked is determined per glyph, causing a sentence to be
> intermixed by multiple CJK fonts as described.
This is totally wrong. Pango first tags each piece of text with a
language, then asks fontconfig to sort fonts for that language, then
uses the sorted list to assign a font to each character. That is, if you
mark your text zh_CN (by running under that locale, by setting
PANGO_LANGUAGE to that, or by otherwise marking it), have a suitable
font for that language, and, if you also have crappy fonts for it, have
fontconfig configured to prefer the good one, then Pango chooses the
right font. All the "bugs" you show me are in the steps
mentioned, not in what Pango itself is doing.
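The last step of that pipeline (walking the sorted list for each character) can be sketched in Python as follows. The coverage ranges and `assign_fonts` are hypothetical and much cruder than real font coverage; the sketch is only meant to show why the order fontconfig returns decides which font the punctuation ends up in.

```python
# Hypothetical coverage tests: which code points each family has glyphs for.
COVERAGE = {
    # ASCII plus kana and CJK ideographs (assumed for this sketch).
    "Sazanami Gothic": lambda cp: cp < 0x80
    or 0x3000 <= cp <= 0x30FF
    or 0x4E00 <= cp <= 0x9FFF,
    # Latin ranges only (assumed for this sketch).
    "DejaVu Sans": lambda cp: cp < 0x250,
}


def assign_fonts(text, sorted_families):
    """Pick, for each character, the first family in the list that covers it."""
    result = []
    for ch in text:
        family = next(
            (name for name in sorted_families if COVERAGE[name](ord(ch))),
            None,  # no installed font covers this character
        )
        result.append((ch, family))
    return result


# List as sorted for Japanese-tagged text: the Japanese font comes first.
print(assign_fonts("[あ", ["Sazanami Gothic", "DejaVu Sans"]))
# List with a Latin font first: the bracket comes from a different font.
print(assign_fonts("[あ", ["DejaVu Sans", "Sazanami Gothic"]))
```

With the Japanese font sorted first, the bracket and the kana share one font; with the Latin font first, the bracket is taken from the Latin font and only the kana falls through to the Japanese one.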
> What if the font determination is not chopped glyph by glyph, but also
> determined heuristically with context?
Pango already does that. That's exactly what you call "contextual"
something above and condemn.
> If my guess is correct this would work most of the
> cases, even among language variants (think zh_CN and zh_TW).
No. You need to go back and read and understand my "tons of
quasi-maths".
> > > > Another symptom, "digits change font after typing character" is in fact
> > > > a very cool Pango feature, just badmouthed by the above problem. Fix
> > > > the problem.
>
> When a solution is not universal enough to be accepted by everybody,
> and causes more trouble than it's worth for specific people, it will be
> badmouthed no matter what. Or not? I don't know the rule here.
You officially don't know what you are talking about.
behdad
> Abel
>
>
> > > >
> > > >
> > > >
> > > >> As you see from the bug lists, this problem has existed for many
> > > >> years, and I am pretty sure that it will come back again and again, as
> > > >> long as the expected rendering is not achieved. If the current pango
> > > >> formatting logic is not sufficient to handle the CJK preferences as
> > > >> said above, I think to refine the logic to take it into consideration
> > > >> is better than stick with a fixed but incomplete logic.
> > > >>
> > > >
> > > > I consider patches improving Pango's font selection algorithm, but none
> > > > that I've seen so far had been an improvement (from my point of view).
> > > > If it has words like CJK or "special case", I'm most probably not
> > > > interested. Of the bugs you listed, only the one I opened myself is
> > > > valid IMO. The rest is just left open because no matter how many times
> > > > I close them, they will be reopened... Oh well.
> > > >
> > > >
> > > >
> > > >> please let me know your thoughts and reasoning on whether this is
> > > >> feasible or not, if yes, where to get start.
> > > >>
> > > >
> > > > Does the above make sense? I understand that it's easier to apply a two
> > > > line patch to Pango instead of doing one of the things I listed above,
> > > > but that just doesn't fit in the design, and it introduces other
> > > > problems you don't see right now.
> > > >
> > > >
> > > >
> > > >> thank you for paying attention to this issue.
> > > >>
> > > >> Qianqian
> > > >>
> > > >
> > > > Regards,
> > > >
> > > > behdad
> > > >
> > > >
> > > >
> > > >> ===============================================================
> > > >> Bug 321113 - Wrong glyph subsituation algorithm for digital characters
> > > >> and punctuations
> > > >> http://bugzilla.gnome.org/show_bug.cgi?id=321113
> > > >>
> > > >>
> > > >> Bug 345072 - changes font when typing different scripts on the same
> > > >> line
> > > >> http://bugzilla.gnome.org/show_bug.cgi?id=345072
> > > >>
> > > >>
> > > >> Bug 345386 - Language and direction propagation in and between
> > > >> PangoLayouts
> > > >> http://bugzilla.gnome.org/show_bug.cgi?id=345386 (opened by yourself)
> > > >> https://bugzilla.redhat.com/bugzilla/attachment.cgi?id=103679
> > > >>
> > > >>
> > > >> Bug 481210 - [All lang] [firefox] - Face of the number is changing
> > > >> when enter number + Char, in any Locale
> > > >> http://bugzilla.gnome.org/show_bug.cgi?id=481210
> > > >>
> > > >>
> > > >> Bug 481188 - ascii text space too narrow for Chinese encodings
> > > >> http://bugzilla.gnome.org/show_bug.cgi?id=481188
> > > >>
> > > >>
> > > >> Bugzilla Bug 129541: changes font when typing different scripts on the
> > > >> same line
> > > >> https://bugzilla.redhat.com/show_bug.cgi?id=129541
> > > >>
> > > >>
> > > >> Bugzilla Bug 131218: [RHEL4] Characters get truncated in new pango
> > > >> https://bugzilla.redhat.com/show_bug.cgi?id=131218
> > > >>
> > > >>
> > > >> Bugzilla Bug 149991: [CJK pango] digits and punctuation in textbox
> > > >> give bad eol rendering and cursor placement
> > > >> https://bugzilla.redhat.com/show_bug.cgi?id=149991 (filed by Jens
> > > >> Petersen)
> > > >>
> > > >>
> > > >> https://bugzilla.redhat.com/show_bug.cgi?id=220885 (broken link)
> > > >>
> > > >>
> > > >> Bugzilla Bug 228804: [All lang] [firefox] - Face of the number is
> > > >> changing when enter number + Char, in any Locale
> > > >> https://bugzilla.redhat.com/show_bug.cgi?id=228804
> > > >>
> > > >>
> > > >> Bugzilla Bug 221361: [pango] ascii text space and punctuation is
> > > >> narrow for CJK
> > > >> https://bugzilla.redhat.com/show_bug.cgi?id=221361
> > > >>
> > > >>
> > > >> Bug 379125 - chinese punctuations after english letters are wrongly
> > > >> displayed
> > > >> https://bugzilla.mozilla.org/show_bug.cgi?id=379125
> > > >> https://bugzilla.mozilla.org/attachment.cgi?id=263185
> > > >> ===============================================================
> > > >>
> > > >
> > > >
> > >
> > --
> > behdad
> > http://behdad.org/
> >
> > ...very few phenomena can pull someone out of Deep Hack Mode, with two
> > noted exceptions: being struck by lightning, or worse, your *computer*
> > being struck by lightning. -- Matt Welsh
> >
> > _______________________________________________
> > gtk-i18n-list mailing list
> > gtk-i18n-list(a)gnome.org
> > http://mail.gnome.org/mailman/listinfo/gtk-i18n-list
> >
>
>
>
--
behdad
http://behdad.org/
15 years, 11 months
Re: [gtk-i18n-list] Re: On CJK font selection (was Re: [Fwd: Re: Request for review and advice on wqy-bitmap-fonts fontconfig settings])
by Behdad Esfahbod
On Thu, 2007-12-20 at 23:04 +0900, mpsuzuki(a)hiroshima-u.ac.jp wrote:
> On Thu, 20 Dec 2007 07:48:50 -0500
> Behdad Esfahbod <behdad(a)behdad.org> wrote:
> >Setting locale is actually enough. If that's not desired,
> >$PANGO_LANGUAGE can be set as a fallback. So far seems like most of the
> >issues happen because either the users are not setting locale correctly
> >or are using crappy fonts. How do I don't care enough about those cases
> >I'm not surprised.
>
> Excuse me, is PANGO_LANGUAGE the solution for modifying the
> Pango behaviour that Qianqian & Abel are asking to have fixed?
It's a way to tell Pango which of the CJK languages to prefer. Its
main use is when running under a non-CJK locale (en_US for example) and
the text doesn't have language tags. It solves most of the "multiple
fonts used in the same line" issues with CJK characters.
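A minimal sketch of that fallback in Python, under stated assumptions: the locale is consulted first, and PANGO_LANGUAGE only matters when the locale is not a CJK one. The function name, the colon-separated parsing of PANGO_LANGUAGE, and the three-language list are assumptions for illustration, not Pango's implementation.

```python
CJK_LANGUAGES = ("ja", "ko", "zh")  # assumed list for this sketch


def preferred_cjk_language(env):
    """Return the CJK language tag to prefer for untagged Han text, or None."""
    candidates = []

    # The locale comes first, using the usual POSIX precedence.
    for variable in ("LC_ALL", "LC_CTYPE", "LANG"):
        if env.get(variable):
            candidates.append(env[variable])
            break

    # PANGO_LANGUAGE is treated as a fallback list (assumed colon-separated).
    candidates.extend(env.get("PANGO_LANGUAGE", "").split(":"))

    for tag in candidates:
        # "ja_JP.eucJP" -> "ja-jp": drop the codeset, normalise separators.
        language = tag.split(".")[0].replace("_", "-").lower()
        if language.split("-")[0] in CJK_LANGUAGES:
            return language
    return None


print(preferred_cjk_language({"LANG": "en_US.UTF-8", "PANGO_LANGUAGE": "ja"}))
print(preferred_cjk_language({"LANG": "en_US.UTF-8"}))
```

Under en_US with PANGO_LANGUAGE=ja the sketch prefers Japanese; under en_US alone it has no preference, which is the situation where multiple CJK fonts tend to show up on one line.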
> Regards,
> mpsuzuki
--
behdad
http://behdad.org/
15 years, 11 months
Re: On CJK font selection (was Re: [Fwd: Re: Request for review and advice on wqy-bitmap-fonts fontconfig settings])
by Behdad Esfahbod
On Thu, 2007-12-20 at 15:41 +0600, Christopher Fynn wrote:
>
> Language specific rendering *can* be achieved using OpenType lookups -
> but, even if the font contains the necessary language specific lookups
> (and most don't), for this feature to function correctly the system
> somehow needs to know which language is being used. This cannot always
> be determined by current locale, the keyboard/IME used to type the text,
> or from the range of Unicode characters involved so especially with
> multilingual documents you need users to reliably mark up text. Language
> then needs to be indicated by some high-level form of mark-up or tagging
> within the documents - which right away excludes plain text.
Setting locale is actually enough. If that's not desired,
$PANGO_LANGUAGE can be set as a fallback. So far seems like most of the
issues happen because either the users are not setting locale correctly
or are using crappy fonts. How do I don't care enough about those cases
I'm not surprised.
> - Chris
--
behdad
http://behdad.org/
15 years, 11 months
Re: [Fwd: Re: Request for review and advice on wqy-bitmap-fonts fontconfig settings]
by Nicolas Mailhot
On Tue, December 4, 2007 08:35, Behdad Esfahbod wrote:
> On Mon, 2007-12-03 at 21:58 -0500, Qianqian Fang wrote:
Hi,
I've let Behdad answer so far because he's the most qualified on the
Pango front, but I've wanted to reaffirm some points for a few days, so
I'll do it now:
Your core problem, as I wrote in one of my first mails, is that your font is
providing bad glyphs for Unicode blocks you don't really want to
touch, and you're changing locales you shouldn't change, so the easiest
and fastest solution for you has always been to
> - Remove Latin and ASCII digits from your font. Why is it there if
> it's not desired?
You have the chance to package a free/open-libre font. This is
something that couldn't be done for most fonts, but you can do it, so
don't hesitate to do it.
> Nicolas suggested that fontconfig adds support for
> conditional blacklisting of individual blocks/glyphs in a font. That
> would help too, but it's not in fontconfig yet.
Unfortunately many fonts are not so open and users still depend on
them. So some sort of fontconfig blacklisting support is needed to
support those fonts and users. From these exchanges, it seems Chinese
users are most affected by this problem.
Since you have contacts in the Chinese fonts community, do consider
reviving the patches posted on the fontconfig list in the past or
writing others. Have Chinese users indicate on the fontconfig list
their support for them. It's not a short-term fix, but it's the right
long-term fix, and if you don't push it this year you'll hit the same
problem again and again till someone does this work.
Last time the problem was discussed on the fontconfig lists, almost no one
stepped in to say they needed this change. So the fontconfig developers
decided it was a lot of work with no real need, and passed.
The moral of this story is: your problems won't be fixed if you only
focus on workarounds (as you're doing now) and let others with no core
interest at stake drive changes. I know that culturally Chinese people
tend to avoid open disagreement, but if you need fontconfig to change,
silently hoping that the fontconfig maintainers will realise it won't work.
Similarly, if you need good Chinese rendering in non-Chinese locales,
chinifying en_US is not the solution. We've not heard from Japanese
users yet, but I'm sure they would strongly object to Chinese-oriented
defaults. That means you need to push for apps that do not do it yet
to pass language info for properly tagged text to Pango (like Firefox
does) and push for some sort of input language notification system.
You can of course pass and hope others will do it, but in the meantime
you'll have to accept that any workaround that affects users in other
locales won't be accepted in the distro. And since getting proper
localised input working is the only way to get your stuff working
without side effects for those other users, that means Chinese users
won't have optimal defaults in the meantime.
>> Back to the original topic of this thread, what do you think of the
>> fontconfig file in my last email?
The version posted on
http://www.redhat.com/archives/fedora-fonts-list/2007-November/msg00088.html
looks mostly fine, except I'm not sure the DejaVu LGC Sans Mono in
monospace is needed and you rely on a high priority (61) to stomp on
other CJK fonts (and probably others). IMHO this needs to be approved
by Jens and the language teams affected.
For the version on
http://www.redhat.com/archives/fedora-fonts-list/2007-December/msg00002.html
I'm not sure what the selectfont is there for. And likewise you have
all sorts of stuff in monospace that assumes specific Latin defaults
out of your control. It will probably work most of the time, but removing
the Latin glyphs in your fonts would solve this in a more robust way.
Regards,
--
Nicolas Mailhot
15 years, 11 months
Re: [gtk-i18n-list] On CJK font selection (was Re: [Fwd: Re: Request for review and advice on wqy-bitmap-fonts fontconfig settings])
by Behdad Esfahbod
On Thu, 2007-12-13 at 12:23 +0900, mpsuzuki(a)hiroshima-u.ac.jp wrote:
>
>
> http://www.pango.org/ScriptGallery?action=AttachFile&do=get&target=Vertic...
> http://www.pango.org/ScriptGallery?action=AttachFile&do=get&target=Vertic...
>
> In the vertical texts, the 3rd character (punctuation
> after "好") seems (for me) to be located at too-low
> position, as if they were vertically-centerlined glyph
> based on horizontal-writing mode. Qianqian, for Chinese
> users' eyes, they seem to be correctly positioned?
Most probably the font doesn't have the vertical variants. If it has,
Pango will use it, as you can see in the brackets in the last line where
the brackets unlike other characters are actually rotated:
http://www.pango.org/ScriptGallery?action=AttachFile&do=get&target=Vertic...
> Regards,
> mpsuzuki
--
behdad
http://behdad.org/
15 years, 12 months
Re: [gtk-i18n-list] On CJK font selection (was Re: [Fwd: Re: Request for review and advice on wqy-bitmap-fonts fontconfig settings])
by Qianqian Fang
hi
For traditional Chinese used in Taiwan, people generally put commas
and periods (full stops) near the center lines of the Chinese characters.
But for simplified Chinese in mainland China, these marks are placed
within the lower quadrant from the bottom of the glyph, similar to Latin.
Funny that in traditional Chinese literature, no punctuation was
systematically used until the last 100 years, when it was introduced from
the Western world :)
As to the screenshots, I guess those are related to the settings of the
font (arphic/uming in this case), and the comma does look to be a little
bit too low and overlaps with the next character in vertical mode.
mpsuzuki(a)hiroshima-u.ac.jp wrote:
>
> http://www.pango.org/ScriptGallery?action=AttachFile&do=get&target=Vertic...
> http://www.pango.org/ScriptGallery?action=AttachFile&do=get&target=Vertic...
>
> In the vertical texts, the 3rd character (punctuation
> after "好") seems (for me) to be located at too-low
> position, as if they were vertically-centerlined glyph
> based on horizontal-writing mode. Qianqian, for Chinese
> users' eyes, they seem to be correctly positioned?
>
> Regards,
> mpsuzuki
>
>
15 years, 12 months
Fwd: Re: Request for review and advice on wqy-bitmap-fonts fontconfig settings]]]
by Qianqian Fang
hi Nicolas
I got more feedback on the config file testing from Jens. It seems the new
config file works fine (the difference between the ja/zh monospace fonts is
because ja users use Gothic as the default monospace, not DejaVu Mono).
Do you think it is OK for me to commit it to F8 as well?
thanks
Qianqian
---------- Forwarded message ----------
From: Jens Petersen <petersen(a)redhat.com>
Date: Dec 12, 2007 2:33 AM
Subject: [Fwd: Re: [Fwd: Re: [Fwd: Re: Request for review and advice on
wqy-bitmap-fonts fontconfig settings]]]
To: Qianqian Fang <fangqq(a)gmail.com>
Hi,
Here is a mail from Caius who tried to test your config a bit.
I didn't have time yet to review his comments.
Does it help at all?
Jens
Hi Jens,
I tested zh_TW.UTF-8 and zh_CN.UTF-8; the monospace fonts are displayed
properly.
For ja_JP.UTF-8, the monospace fonts are also displayed properly.
Between ja and zh, the monospace fonts used seem to be different. ja is
using a narrower monospace font than the zh ones (they are all monospaced).
Please kindly point out if this info is not what you were asking for.
Best Regards,
Caius.
Jens Petersen wrote:
> Caius,
>
> Could you take a look at this please, test the fontconfig file
> and follow up on the list with your findings.
>
> Thanks,
>
> Jens
>
15 years, 12 months