[Issue 113558] Change Case broken by language tags and/or ligatures

Sun Aug 1 04:25:40 UTC 2010

One more comment...

After posting this report yesterday, I started playing with the new user
dictionary interface in M85 (the default for new user dictionary files has
changed from binary to UTF-8). There are bugs there, too, which may be related
to the casing errors. So, please don't treat the following as a separate bug
report (it's in the wrong place for that, I know), but instead as a clue to
the possible cause of the casing errors.

In short, when I add a word to a user dictionary that contains a multi-byte
character (e.g. a letter combined with an unusual accent, such as a dot
underneath), or if the user dictionary already contains such words, things
start getting buggy: in some cases, an *incomplete* copy of the last word in
the list gets appended to the dictionary; in other cases, the word is not added
to the selected dictionary at all, but to another one.

Again, I'm not a programmer, but if I had to bet on it, I'd guess that both
sets of errors are caused by a bug in a text parsing library used by both the
casing and the user dictionary routines.

The reason for saying this is that all the errors, casing and dictionary alike,
appear to involve miscalculating text bounds.
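
As a rough illustration of the kind of bounds miscalculation suspected here,
the following Python sketch shows how a word can end up truncated when a
character count is mistaken for a byte length while writing UTF-8. This is a
hypothetical example only: the function names are invented for illustration
and it is not the application's actual code.

```python
def write_word_buggy(word: str) -> bytes:
    """Hypothetical buggy writer: treats the character count as the byte length."""
    encoded = word.encode("utf-8")
    # Wrong: len(word) counts code points, but the slice is taken in bytes.
    return encoded[:len(word)]


def write_word_correct(word: str) -> bytes:
    """Writes every byte of the UTF-8 encoding."""
    return word.encode("utf-8")


word = "\u1e0dal"  # "d with dot below" + "al": 3 code points, 5 UTF-8 bytes

print(len(word), len(word.encode("utf-8")))      # 3 5
print(write_word_buggy(word).decode("utf-8"))    # only the first letter survives
print(write_word_correct(word).decode("utf-8"))  # the whole word
```

With plain ASCII words the two functions agree, which would explain why the
problem only shows up once accented letters are involved.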

The parallel is particularly compelling when comparing Capitalize Every Word's
mangling of text with ligatures (which could be counted as one, two or more
characters) to the dictionary parser's mangling of user dictionaries that
contain non-precomposed characters with combining accents (which could also be
counted as one, two or more characters).
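
To make the "one, two or more characters" point concrete, here is a small
Python sketch that measures the same text in several ways. The helper name
`count_units` and the sample strings are assumptions chosen for illustration;
the sketch says nothing about which of these measures the application itself
uses.

```python
import unicodedata


def count_units(text: str) -> dict:
    """Return several different 'lengths' of the same piece of text."""
    return {
        "code_points": len(text),
        "utf8_bytes": len(text.encode("utf-8")),
        "after_uppercasing": len(text.upper()),
        "after_nfd": len(unicodedata.normalize("NFD", text)),
    }


samples = {
    "fi ligature": "\ufb01",              # one code point, uppercases to "FI"
    "precomposed d-dot-below": "\u1e0d",  # one code point
    "d + combining dot below": "d\u0323", # two code points, same visible letter
}

for label, sample in samples.items():
    print(label, count_units(sample))
```

The ligature grows from one code point to two when uppercased, and the
dotted letter is one or two code points depending on how it was typed, so any
routine that computes a length before such a conversion and reuses it
afterwards would get the text bounds wrong.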

Has something recently changed in a text parsing component?

