[FZH] ibus-table update to 1.8.1

Mike FABIAN maiku.fabian at gmail.com
Sat Jun 7 16:59:46 UTC 2014


Mike FABIAN <maiku.fabian at gmail.com> wrote:

> Felix Yan <felixonmars at gmail.com> wrote:
>
>> On Thursday, June 05, 2014 12:17:45 Mike FABIAN wrote:
>>> As the database format has been changed, the ibus-table-chinese*
>>> packages have to be updated as well. ibus-table-chinese* packages which
>>> have been built against old versions of ibus-table will not work due to
>>> the change in the database format.
>>
>> Hi, I've tested with ibus-table-chinese* 1.4.6 built against ibus-table 1.7.0, 
>> which still seems to work with 1.8.1 even without a rebuild. Is this normal, 
>> or am I missing something?
>
> This is normal; ibus-table-1.7.0 already had the new database format.

You can check what database format you have by doing:

sqlite3 /usr/share/ibus-table/tables/wubi-jidian86.db
sqlite> select * from phrases;
...
136829|zzzy|ㄦ|392|0
sqlite> 

This is the new database format. The first column is an integer rowid,
the 2nd column is the input sequence (“zzzy”), the 3rd column is the
phrase for that input sequence (“ㄦ”), the 4th column is the system
frequency (392), and the 5th column is the user frequency (always 0 in
a system database).

The old database format looks like this:

sqlite3 /usr/share/ibus-table/tables/wubi-jidian86.db.orig
sqlite> select * from phrases;
...
136829|4|1|26|26|26|25|3|ㄦ|392|0
sqlite> 

The first column is again the integer rowid, the 2nd column is the
length of the input sequence (4), the 3rd column is the length of the
phrase (1), the next 4 columns are integer codes for the input sequence
“zzzy”, the 8th column is a bitfield indicating whether this phrase is
simplified or traditional Chinese (3 means it is both), the 9th column
is the phrase (ㄦ), and the last two columns are again the system
frequency and the user frequency.
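
The two layouts described above can also be told apart from a script.
The following is only a rough sketch, not how ibus-table itself checks
the format: the function name phrases_format is made up for this
example, and the heuristic (5 columns with a string in the 2nd column
means new, more columns with an integer in the 2nd column means old) is
an assumption derived from the two sample rows shown in this mail.

```python
import sqlite3


def phrases_format(conn):
    """Guess the layout of the 'phrases' table from its first row.

    Hypothetical helper; the heuristic is an assumption based on the
    two sample rows quoted in this mail, not on ibus-table's own code.
    """
    row = conn.execute("SELECT * FROM phrases LIMIT 1").fetchone()
    if row is None:
        return "unknown"  # empty table, nothing to inspect
    # New format: rowid, input sequence (string), phrase, freq, user_freq
    if len(row) == 5 and isinstance(row[1], str):
        return "new"
    # Old format: rowid, two length columns, integer key codes, bitfield, ...
    if len(row) > 5 and isinstance(row[1], int):
        return "old"
    return "unknown"
```

To try it, open a table with
sqlite3.connect("/usr/share/ibus-table/tables/wubi-jidian86.db") and
pass the connection to phrases_format().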

The new database no longer needs the bitfield for traditional and
simplified Chinese because this is now calculated at runtime, not at
build time of the database. The separate columns with integer codes for
the input sequence have been merged into one column holding the input
sequence as a string. And the columns for the length of the input
sequence and the phrase are not needed, as these can easily be
calculated at runtime.
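
To illustrate how an old-format row maps onto a new-format row, here is
a small sketch. It is not the conversion code from ibus-table. The
function name old_row_to_new is invented for this example, and the
mapping of integer codes to letters (1 = a ... 26 = z) is an assumption
that merely fits the sample row above (26, 26, 26, 25 for “zzzy”); the
real tables may use codes for other characters as well.

```python
def old_row_to_new(row):
    """Convert one old-format row tuple into the new 5-column layout.

    Hypothetical helper. Assumes key codes 1..26 stand for 'a'..'z',
    which matches the sample row but is not confirmed for all tables.
    """
    rowid = row[0]
    key_len = row[1]                 # length of the input sequence
    codes = row[3:3 + key_len]       # integer codes for the input keys
    input_seq = "".join(chr(ord("a") + code - 1) for code in codes)
    # The last three columns are phrase, system frequency, user frequency.
    phrase, freq, user_freq = row[-3:]
    # The phrase-length column and the bitfield are simply dropped.
    return (rowid, input_seq, phrase, freq, user_freq)


old = (136829, 4, 1, 26, 26, 26, 25, 3, "ㄦ", 392, 0)
print(old_row_to_new(old))
```

The two length columns and the simplified/traditional bitfield are
discarded because, as described above, they can be derived at runtime.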

-- 
Mike FABIAN <mfabian at redhat.com>
☏ Office: +49-69-365051027, internal 8875027
Lack of sleep is the enemy of good work.

