2nd intl.workshop on FOSS rule-based machine translation

Jesús Franco tezcatl at fedoraproject.org
Wed Jan 19 11:11:08 UTC 2011

Yaron, thank you very much for your quick reply; I'll comment inline.

Yaron Shahrabani wrote:

> On Wed, Jan 19, 2011 at 10:28 AM, Jesús Franco
> <tezcatl at fedoraproject.org> wrote:
>> There is no doubt that automated machine translation is helping the
>> work of every translator at Fedora/FOSS at large. But we shouldn't take
>> for granted that "Google is getting better and better and I don't need to
>> do anything after copy-pasting my strings for translation into it", as
>> someone has said in the past.

I use Google only marginally, when I'm not able to find a sensible
translation (done by another human) among the shared TMs in TAUS, and maybe
also in forums. I prefer quality of translation over the number of strings
translated (unfortunately, I can't speak for other members of the Spanish
translation team :S).

> I think most of the efforts should be aimed at building Apertium corpuses.

Unfortunately, I think Apertium is not packaged in Fedora, and in my own
experience translating from English to Spanish it returns something like
what you call a "stupid TM". Maybe I'm not getting the point, but my feeling
is that machines are far from replacing people at this moment, whatever app
you use.

> The problem is that we can't build corpuses so easily, in order to create
> a working corpus many hours of work are needed, if there was only a
> simpler way of creating them and maintaining them using AI it would be
> great (or even creating a graphical tool to aid in this mission).

My thoughts about sharing come from my own experience using Virtaal,
which lets me access my own TM. Domingo Becker has come up with an idea
about connecting his translation app to Google or something kinda much the

> Building "stupid" TMs with statistical data can be sometimes false and
> they need lots of AI to get better, building good corpuses with the right
> rules for every language will help computers understand human language in
> source and destination languages.

I think we are talking about different directions. I don't believe computers
can understand the obscure corners of a whole human language any time soon.
But I'm pretty sure people are not as stupid as machines if they try to do
their best. This is why I'm talking about sharing TMs among people, not CPUs.

> I want to present you with a problem I had with translating text from
> Arabic to Hebrew. Google's mechanism follows this procedure: the Arabic
> text is translated to English, and the English is then translated to
> Hebrew. You can't even possibly imagine how strong the phrase "Lost in
> translation" is in this case.

Imagine what could have happened if a "translator" had taken that result just
"because it's Google". That's exactly my point.

> BTW, Microsoft translator does a much better job than Google's translator
> when translating from English to Hebrew.

I can't have an opinion about that; I use and promote only free (as in
freedom) software. It's not about "morality"; it's just that I can't suggest
to a fellow translator "buy this software" (whichever provider he/she goes
through).

> Hebrew and Arabic share basic rules, instead of using English in the
> middle we can use a mechanism that will take advantage of their
> similarities to translate between them without going through 3rd language.
> Same for Czech and Slovak, apparently many Czech translators are using the
> Slovak translation instead of translating from English and vice versa.

I think the same about Portuguese (and even more Catalan), "twin" languages
of Castilian (Spanish): it would be easier to work among similar languages
than to put English in the middle. Actually, that wicked idea is the
"official" approach in Fedora for guides written in a language other than
English: translating first to en_US and from there to other languages.

Maybe we should keep using English as a common language, but I'm sure that
for Brazilian and Catalan people it's easier to understand me in my mother
tongue than through a machine smashing my words in a statistical based

> Apertium website: http://www.apertium.org/

I'm going to visit the Apertium site to see if there are new ideas I can get
from there.

> Kind regards,
> Yaron Shahrabani
> <Hebrew translator>

Thanks for sharing your vision about this.
Best regards.
Jesús Franco - Fedora Ambassador and Translator
