line-breaking rules for CJK

Daiki Ueno ueno at unixuser.org
Thu Dec 16 06:55:53 UTC 2010


Hi,

I'm collecting line-breaking rules for East Asian languages[1] in order
to improve groff's rendering of man pages[2][3].

The current data sets are taken from Emacs' kinsoku.el and the OOXML
specification.  It would be helpful if anyone could check them against
other data sources (TeX, etc.).

The data sets we could improve are:

- Characters that are not allowed at the start of a line
http://ueno.fedorapeople.org/groff/make-cjk-tmac/prepunct

- Characters that are not allowed at the end of a line
http://ueno.fedorapeople.org/groff/make-cjk-tmac/postpunct
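
As a rough illustration of how two such lists are typically applied,
here is a minimal Python sketch.  It is not groff's algorithm, and the
two character sets are a small hand-picked sample, not the contents of
the files above.  The function name break_lines and the fixed-width
greedy strategy are assumptions made for the example only.

```python
# Sample characters only (assumption); the real lists are in the files above.
NO_LINE_START = set("、。，．）」』】！？")  # must not begin a line
NO_LINE_END = set("（「『【")                # must not end a line


def break_lines(text, width, no_start=NO_LINE_START, no_end=NO_LINE_END):
    """Split text into lines of at most `width` characters.

    A break is moved earlier while it would put a forbidden character
    at the start of the next line or at the end of the current one.
    """
    lines = []
    start = 0
    while len(text) - start > width:
        cut = start + width

        def violates(pos):
            return text[pos] in no_start or text[pos - 1] in no_end

        # Push characters down to the next line until the break is legal.
        while cut > start + 1 and violates(cut):
            cut -= 1
        if violates(cut):
            # No legal break in this window; fall back to a hard break.
            cut = start + width
        lines.append(text[start:cut])
        start = cut
    lines.append(text[start:])
    return lines


if __name__ == "__main__":
    for line in break_lines("これは「テスト」です。", 4):
        print(line)
```

With a width of 4, the opening bracket is carried down to the second
line instead of being left at the end of the first, and the closing
bracket and full stop are never placed at the start of a line.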

Footnotes: 
[1]  https://secure.wikimedia.org/wikipedia/en/wiki/Line_breaking_rules_in_East_Asian_language

[2]  http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=552201

[3]  https://bugzilla.redhat.com/show_bug.cgi?id=596900

Regards,
-- 
Daiki Ueno

