Please do not reply directly to this email. All additional comments should be made in the comments box of this bug.
Summary: Lohit Malayalam font does not have support for 0D4E MALAYALAM LETTER DOT REPH
https://bugzilla.redhat.com/show_bug.cgi?id=799565
Product: Fedora
Version: 16
Platform: Unspecified
OS/Version: Unspecified
Status: NEW
Severity: unspecified
Priority: unspecified
Component: lohit-malayalam-fonts
AssignedTo: psatpute@redhat.com
ReportedBy: samjnaa@gmail.com
QAContact: extras-qa@fedoraproject.org
CC: fonts-bugs@lists.fedoraproject.org, psatpute@redhat.com, i18n-bugs@lists.fedoraproject.org
Classification: Fedora
Story Points: ---
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
Description of problem:
It was announced (https://www.redhat.com/archives/lohit-devel-list/2012-February/msg00011.html) that version 2.5.1 of the Lohit fonts supports the latest Unicode 6.0 characters. Malayalam was specifically mentioned.
However I find that the Lohit Malayalam 2.5.1 release downloadable from https://fedorahosted.org/releases/l/o/lohit/lohit-malayalam-ttf-2.5.1.tar.gz does NOT provide support for 0D4E MALAYALAM LETTER DOT REPH.
As this is one of the three Malayalam characters encoded for Unicode 6.0 (see http://www.unicode.org/Public/UNIDATA/DerivedAge.txt and search for 0D4E) it should also be supported.
[The other two characters are provided but 0D3A has a wrong glyph which I have reported as bug 798870.]
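The identity and properties of the character can be double-checked with Python's standard unicodedata module. This is only a sketch: it reports what the interpreter's bundled Unicode database knows about the code points, and says nothing about what any font supports.

```python
import unicodedata


def describe(codepoint):
    """Return (name, general category, combining class) for a code point."""
    ch = chr(codepoint)
    return (
        unicodedata.name(ch, "<unassigned>"),
        unicodedata.category(ch),
        unicodedata.combining(ch),
    )


# The two code points discussed in this report.
for cp in (0x0D3A, 0x0D4E):
    print(f"U+{cp:04X}", describe(cp))
```

A general category of Lo with combining class 0 for U+0D4E means the character is a letter rather than a combining mark, which matters for how fonts and shapers can position it.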
Version-Release number of selected component (if applicable):
2.5.1
Steps to Reproduce:
1. Install Lohit Malayalam 2.5.1 font.
2. Try to use 0D4E MALAYALAM LETTER DOT REPH
Actual results:
This character is not available.
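A coverage check of this kind can be sketched as below. The function works on any mapping from code point to glyph name; the sample mapping and its glyph names are hypothetical and only stand in for a real font's cmap. With the third-party fontTools package installed, TTFont(path).getBestCmap() should return a mapping of this shape, but that step is left out so the sketch stays self-contained.

```python
def missing_codepoints(cmap, required):
    """Return the required code points that are absent from a cmap-like mapping."""
    return sorted(cp for cp in required if cp not in cmap)


# Hypothetical cmap fragment (code point -> glyph name), NOT read from a real font.
sample_cmap = {0x0D15: "U0D15", 0x0D3A: "uni0D3A"}

missing = missing_codepoints(sample_cmap, [0x0D3A, 0x0D4E])
print([f"U+{cp:04X}" for cp in missing])
```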
Expected results:
This character was encoded to support the old Malayalam orthography. As such it should be made available for full Unicode 6.0 (or 6.1) support.
Additional info:
You might need to do some smart font programming to position the dot reph correctly. Note that this character is a special rendering character (hence the dotted box around it in the code chart http://www.unicode.org/charts/PDF/U0D00.pdf).
The special rendering is that it should be placed on top of the character *following* it. See the bottom of page 3 and the top of page 4 of the original proposal.
I think the e-Malayalam OTC font (http://www.aai.uni-hamburg.de/indtib/INDOLIPI/Malayalam.zip) has pre-composed glyphs using this character on top of other consonants which might help you in positioning this character. Note that most often it is found with doubled consonants (i.e. DOT_REPH + GA + VIRAMA + GA etc) so you will have to be able to position this character above stacked consonant clusters.
I hope this is sufficient feedback for supporting this character which is important for old Malayalam orthography.
--- Comment #1 from Pravin Satpute psatpute@redhat.com 2012-03-04 23:33:08 EST ---
Can you provide a screenshot of its rendering?
--- Comment #2 from Shriramana Sharma samjnaa@gmail.com 2012-03-05 01:24:56 EST ---
The original proposal http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3676.pdf has many examples on pp 3-5. Is that enough?
Pravin Satpute psatpute@redhat.com changed:
What    |Removed    |Added
----------------------------------------------------------------------------
Status  |NEW        |MODIFIED
--- Comment #3 from Pravin Satpute psatpute@redhat.com 2012-03-05 04:33:43 EST ---
Yes, fixed upstream. Latest TTF: http://pravins.fedorapeople.org/Lohit-Malayalam.ttf
--- Comment #4 from Shriramana Sharma samjnaa@gmail.com 2012-03-05 09:42:05 EST ---
Created attachment 567649
  --> https://bugzilla.redhat.com/attachment.cgi?id=567649
Revised Lohit Malayalam for bugs #799565 and #798870
There are some corrections. Please find attached a revised font.
The correction is that the dot reph should have a positive LSB. Otherwise it will become positioned on the previous letter rather than on the following letter. Please see original proposal. PA DOT_REPH VA പൎവ should place dot reph on VA, and not PA. So I have now moved the dot reph to the right.
However, while it seems to be working correctly (at least in LibreOffice) with medium-size consonants like ഗ വ etc, it still does not look good with wide consonants like ണ. It should ideally be centered on top of the consonant as you can see in the proposal samples. Can you please implement proper glyph positioning? I don't know how to do that.
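Why a fixed left side bearing cannot center the dot over consonants of different widths can be seen from the usual anchor arithmetic. The sketch below assumes the dot is drawn after the consonant it sits on and has zero advance, and that both glyphs carry an anchor point; all metrics are hypothetical font units, not values measured from Lohit Malayalam.

```python
def mark_offset(base_anchor, mark_anchor, base_advance):
    """Offset to apply to a zero-advance mark that follows its base glyph.

    Anchors are (x, y) points in each glyph's own coordinate space.
    The mark is drawn after the pen has already moved by base_advance,
    so that advance is subtracted from the horizontal offset.
    """
    bx, by = base_anchor
    mx, my = mark_anchor
    return (bx - mx - base_advance, by - my)


# Hypothetical metrics: a medium-width and a wide consonant, each with its
# anchor at the horizontal center, and a dot whose anchor is at x = 0.
medium = mark_offset(base_anchor=(500, 1200), mark_anchor=(0, 1200), base_advance=1000)
wide = mark_offset(base_anchor=(800, 1200), mark_anchor=(0, 1200), base_advance=1600)
print(medium, wide)
```

The two offsets differ because the consonant widths differ, so a single built-in shift in the dot glyph can be right for at most one width.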
I will attach ODT and PDF samples of font as currently modified by me for testing.
--- Comment #5 from Shriramana Sharma samjnaa@gmail.com 2012-03-05 09:44:23 EST ---
Created attachment 567651
  --> https://bugzilla.redhat.com/attachment.cgi?id=567651
ODT and PDF for test-case
--- Comment #6 from Pravin Satpute psatpute@redhat.com 2012-03-06 00:40:27 EST ---
This is a peculiar case, where the mark comes first and then the base character. Need to check it.
--- Comment #7 from Shriramana Sharma samjnaa@gmail.com 2012-03-06 01:06:54 EST ---
Yes, as I said, that is why the character in the Unicode chart has a dotted box around it: to indicate that it is a special rendering character.
I quote from: http://www.unicode.org/versions/Unicode6.0.0/ch09.pdf p 310:
Dot Reph. U+0D4E MALAYALAM LETTER DOT REPH is used to represent the dead consonant form of U+0D30 MALAYALAM LETTER RA, when **it is displayed as a dot over the consonant following it**. Conceptually, the dot reph is analogous to the sequence <RA, VIRAMA>, but when followed by another consonant, the Malayalam cluster <RA, VIRAMA, C2> normally assumes the C2 conjoining form. **U+0D4E MALAYALAM LETTER DOT REPH occurs first, in logical order, even though it displays as a dot above the succeeding consonant**. It has the character properties of a letter, and is not considered a combining mark.
--- Comment #8 from Pravin Satpute psatpute@redhat.com 2012-03-07 02:03:13 EST ---
If we consider this as a base character, then there is no feature in the OT spec for base-to-base positioning. We can use the dist/kern feature, but I don't think it will give the expected results.
Dunno, do we need reordering for this character? We are reordering "ra+virama" in Devanagari script.
Santhosh Thottingal santhosh.thottingal@gmail.com changed:
What    |Removed    |Added
----------------------------------------------------------------------------
CC      |           |santhosh.thottingal@gmail.com
--- Comment #9 from Santhosh Thottingal santhosh.thottingal@gmail.com 2012-03-08 12:55:52 EST ---
I used akhn for this in Meera. Positioning the dot with a positive LSB is not optimal and will result in the dot appearing in the wrong position, since the ligature below it is of variable width. The dot should generally sit at the top center, but there are exceptions too. Meera has separate glyphs for all valid dot rephs. But I ran into some issues with this implementation too.
--- Comment #10 from Shriramana Sharma samjnaa@gmail.com 2012-03-09 11:43:21 EST ---
I asked about this on the Unicore list. One Microsoft engineer replied that they use reordering for this. I am guessing that they are treating the dot_reph character just like any other reph in any Indic script -- move it to the end of the syllable. The only difference is that Malayalam has a distinct character for this whereas in other scripts it is just RA + VIRAMA which is rendered as the reph.
So probably you can just treat it like the reph of other scripts. However, as Santhosh says (and I already said) there will be a problem with the positioning.
One solution is to use akhanda ligatures as Santhosh says but that is just a hack. The proper solution is to use GPOS. After all, that is what GPOS is for, isn't it? To position combining marks properly?
That said, I myself am not knowledgeable about all this GPOS-GSUB thing -- I'm more a Graphite person. So I leave it to you people to decide how to implement this.
I would only suggest that if you do use akhanda ligatures that you use composite glyphs instead of duplicating existing outlines. Would help in keeping the size of the font under check.
--- Comment #11 from Pravin Satpute psatpute@redhat.com 2012-03-12 07:46:57 EDT ---
Santhosh, I think that by reordering the dot reph to the end of the syllable we can solve this problem.
We can simply use positioning/kerning for it. Once it gets reordered, we can position it the same way as the Anuswara (U+0902) in Devanagari script.
For the time being, for testing purposes, we can type U+0D4E at the end of the syllable and check.
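Test strings with the dot reph typed after the consonant cluster can be generated mechanically. The following is a minimal sketch under simplifying assumptions: a cluster is taken to be a consonant followed by any number of VIRAMA + consonant pairs, and vowel signs and other syllable parts are ignored. It is not the reordering logic of any shaping engine.

```python
DOT_REPH = "\u0D4E"
VIRAMA = "\u0D4D"


def is_consonant(ch):
    """True for Malayalam consonant letters U+0D15..U+0D3A."""
    return "\u0D15" <= ch <= "\u0D3A"


def move_dot_reph_after_cluster(text):
    """Move each DOT REPH to just after the consonant cluster that follows it."""
    out = []
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if ch == DOT_REPH and i + 1 < n and is_consonant(text[i + 1]):
            j = i + 2
            # Extend the cluster over VIRAMA + consonant pairs (stacked forms).
            while j + 1 < n and text[j] == VIRAMA and is_consonant(text[j + 1]):
                j += 2
            out.append(text[i + 1:j])
            out.append(DOT_REPH)
            i = j
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# PA DOT_REPH VA, and DOT_REPH GA VIRAMA GA, as mentioned earlier in this bug.
for sample in ("\u0D2A\u0D4E\u0D35", "\u0D4E\u0D17\u0D4D\u0D17"):
    print([f"U+{ord(c):04X}" for c in move_dot_reph_after_cluster(sample)])
```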
Meanwhile, I cannot find any rule written for dot-reph ligatures in the Meera font.
Shriramana, can you ask the Microsoft engineer for a link to this in the specification? We can then ask Behdad to make that change in HarfBuzz.
--- Comment #12 from Shriramana Sharma samjnaa@gmail.com 2012-03-12 12:09:29 EDT ---
(In reply to comment #11)
> For time being for testing purpose we can type u0d4e at the end of syllable and check.
Agreed. If the kerning is done first, we can later bring in reordering support from the software side.
> Shriramana, can you ask Microsoft guy regarding any link for it in specification, we can ask behdad to do that change in harfbuzz.
AFAIK this is not present in the Microsoft OT docs on Malayalam (http://www.microsoft.com/typography/otfntdev/malayot/shaping.aspx). The Unicode publication only says right now that it should be placed after the following consonant but clearly this is insufficient description. I will ask the Unicode people to update the wording. Hopefully Behdad can implement this as you say.
Behdad Esfahbod behdad@fedoraproject.org changed:
What    |Removed    |Added
----------------------------------------------------------------------------
CC      |           |behdad@fedoraproject.org
--- Comment #13 from Behdad Esfahbod behdad@fedoraproject.org ---
Hi all,
We understand the Repha now. Will implement for Malayalam soon (eg. tomorrow).
Cheers, behdad
--- Comment #14 from Pravin Satpute psatpute@redhat.com ---
That's nice.
http://pravins.fedorapeople.org/Lohit-Malayalam.ttf
Test font with added GPOS 'abvm' for U+0D4E (though the positioning is not very accurate).
--- Comment #15 from Shriramana Sharma samjnaa@gmail.com ---
Created attachment 599365
  --> https://bugzilla.redhat.com/attachment.cgi?id=599365&action=edit
Desired rendering of dot reph and comparison of Bengali syllable structure
@Pravin/Behdad: I'm not very knowledgeable about this, but isn't the feature tag for reph characters spelt "reph" or "rphf" or something?
@Behdad: Basically if you treat this 0D4E character equivalent to the cluster-initial RA + VIRAMA sequences of other Indic scripts, especially Bengali, I think it would be sufficient. Why Bengali? Because it also has two part vowel signs ো ৌ like Malayalam ൊ ോ ൌ and the reph also. But Bengali doesn't seem to have post-base VA unlike Malayalam so you may have to look out for that.
FWIW I have attached a document (ODT and PDF) showing the desired rendering of the reph (using the e-Malayalam OTC font from the "Indolipi" package [link above] which hack-renders the Malayalam RA + Virama combination as the reph) and the equivalent Bengali sequences in two standard Bengali fonts.
You might also like to see https://sites.google.com/site/jamadagni/files/utcsubmissions/12106-ed-update... §3 (on p 4) for more details on the Malayalam dot reph.
--- Comment #16 from Pravin Satpute psatpute@redhat.com ---
A feature tag will not come into the picture for U+0D4E. Once reordering is done by the OTLS as per the syllable structure, we can simply apply the GPOS feature 'abvm' and get the desired positioning.
I will update Lohit as per desired positioning.
--- Comment #17 from Pravin Satpute psatpute@redhat.com ---
(In reply to comment #13)
> Hi all,
> We understand the Repha now. Will implement for Malayalam soon (eg. tomorrow).
I just checked with HarfBuzz NG. We are still not reordering U+0D4E to the end of the syllable.
"ൎക" hb-shape returns :: [uni0D4E=0|U0D15=1+1015]
Any specific plan for this?
--- Comment #18 from Behdad Esfahbod behdad@fedoraproject.org ---
Ok, I'll work on this now. Thanks for pinging, I was out of issues to fix and was getting bored... :)
--- Comment #19 from Behdad Esfahbod behdad@fedoraproject.org ---
Fixed upstream. Please test.
--- Comment #20 from Pravin Satpute psatpute@redhat.com ---
(In reply to comment #7)
> I quote from: http://www.unicode.org/versions/Unicode6.0.0/ch09.pdf p 310:
> Dot Reph. U+0D4E MALAYALAM LETTER DOT REPH is used to represent the dead consonant form of U+0D30 MALAYALAM LETTER RA, when **it is displayed as a dot over the consonant following it**. Conceptually, the dot reph is analogous to the sequence <RA, VIRAMA>, but when followed by another consonant, the Malayalam cluster <RA, VIRAMA, C2> normally assumes the C2 conjoining form. **U+0D4E MALAYALAM LETTER DOT REPH occurs first, in logical order, even though it displays as a dot above the succeeding consonant**. It has the character properties of a letter, and is not considered a combining mark.
I still do not get this output. Do I need to use a different OT feature here?
--- Comment #21 from Behdad Esfahbod behdad@fedoraproject.org ---
Testing with the font in comment 14, I see the reordering happening. Here's the hb-shape output:
$ ./hb-unicode-encode d4e,d15 | build/util/hb-shape ./Lohit-Malayalam.ttf --shaper ot
[U0D15=0+1015|uni0D4E=0@-971,-41+0]
--- Comment #22 from Behdad Esfahbod behdad@fedoraproject.org ---
This is what I get with Lohit-Malayalam 2.5.2:
$ ./hb-unicode-encode d4e,d15 | build/util/hb-shape indic-fonts-lohit/malayalam.ttf
[U0D15=0+1015|uni0D4E=0@-971,-41+0]
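The two hb-shape outputs quoted in this bug can be compared mechanically. The sketch below assumes each item has the shape name=cluster, optionally followed by @dx,dy and by +advance, which matches the lines pasted here; other hb-shape versions or options may print a different format. The glyph name uni0D4E is taken from those outputs.

```python
import re

# name=cluster[@dx,dy][+advance], as seen in the outputs pasted in this bug.
GLYPH_RE = re.compile(
    r"(?P<name>[^=|\[\]]+)=(?P<cluster>\d+)"
    r"(?:@(?P<dx>-?\d+),(?P<dy>-?\d+))?"
    r"(?:\+(?P<advance>-?\d+))?"
)


def parse_hb_shape(line):
    """Parse one bracketed hb-shape output line into a list of glyph dicts."""
    glyphs = []
    for item in line.strip().strip("[]").split("|"):
        m = GLYPH_RE.fullmatch(item)
        if m is None:
            raise ValueError(f"unrecognised item: {item!r}")
        glyphs.append({
            "name": m.group("name"),
            "cluster": int(m.group("cluster")),
            "dx": int(m.group("dx") or 0),
            "dy": int(m.group("dy") or 0),
            "advance": int(m.group("advance") or 0),
        })
    return glyphs


def dot_reph_reordered(glyphs, reph_name="uni0D4E"):
    """True if the dot reph glyph comes after at least one other glyph."""
    names = [g["name"] for g in glyphs]
    return reph_name in names and names.index(reph_name) > 0


before = parse_hb_shape("[uni0D4E=0|U0D15=1+1015]")
after = parse_hb_shape("[U0D15=0+1015|uni0D4E=0@-971,-41+0]")
print(dot_reph_reordered(before), dot_reph_reordered(after))
```

In the second output the dot reph follows the consonant and carries an offset of (-971, -41), which is the GPOS positioning moving it back over the base glyph.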
--- Comment #23 from Pravin Satpute psatpute@redhat.com ---
My mistake. I messed up with git; it's working fine now. I will do a release of Lohit with this fix. Thanks a lot, Behdad.
--- Comment #24 from Behdad Esfahbod behdad@fedoraproject.org ---
Cool. Please close when you do.
Pravin Satpute psatpute@redhat.com changed:
What         |Removed    |Added
----------------------------------------------------------------------------
Status       |MODIFIED   |CLOSED
Resolution   |---        |UPSTREAM
Last Closed  |           |2013-01-03 00:10:57
--- Comment #25 from Pravin Satpute psatpute@redhat.com ---
Completed GPOS in upstream, will be available with the next release of lohit-malayalam-fonts.