Re: 2nd intl.workshop on FOSS rule-based machine translation

Wednesday, 19 January 2011

On Wed, Jan 19, 2011 at 10:28 AM, Jesús Franco <tezcatl(a)fedoraproject.org> wrote:

...
 Hi!

 An event that could interest all of us, related to translation *and*
 free/open-source, is coming up, and it will be hosted by the prestigious
 Universitat Oberta de Catalunya.

 http://www.uoc.edu/freerbmt11/

 The introduction of the site:

        This workshop aims to bring together the experience of researchers
        and developers in the field of rule-based machine translation who
        have decided to get on board the free/open-source train and are
        *effectively contributing to creating a commons of explicit
        knowledge*: machine translation rules and dictionaries, and
        machine translation systems whose behaviour is transparent and
        clearly traceable through their explicit logic.

 (I've added the *bold*.) This is an area of real interest to me: there are
 pretty good translation memories we can access today *because people are
 sharing*. You can read all about this here:

 http://www.tausdata.org/blog/2010/08/everyones-sharing-translations/

 My own take on why this is worthy of your attention:

 There is no doubt that automated machine translation is helping the work
 of every translator at Fedora/FOSS at large. But we shouldn't take for
 granted that "Google is getting better and better and I don't need to do
 anything after copy-pasting my strings for translation into it", as someone
 has said in the past.

 By the way, I'm about to start a project for sharing translation memories
 among Fedora/FOSS translators in a common repository (by language), and
 I'd like to hear about your experience with this:

 What is the best way to share your translation memories in a repo we can
 adapt to our needs?

 I've thought about putting everything together in a git repo (by language
 pairs), where everybody can push their updates to the memories and pull
 those contributed by their peers, loading them into their preferred
 translation software.
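
 A minimal sketch of the merge step such a shared repo would need, assuming
 each contributor's memory has already been loaded as a plain
 source-to-target mapping. The names merge_memories and conflicts, and the
 sample entries, are hypothetical illustrations and not part of any existing
 tool; real memories would be read from TMX or PO files.

```python
from collections import defaultdict


def merge_memories(memories):
    """Merge several {source: target} memories, keeping every distinct target.

    Nothing is overwritten: when contributors disagree, all their
    translations are kept so a reviewer can choose between them.
    """
    merged = defaultdict(list)
    for memory in memories:
        for source, target in memory.items():
            if target not in merged[source]:
                merged[source].append(target)
    return dict(merged)


def conflicts(merged):
    """Return only the source segments that have more than one translation."""
    return {source: targets for source, targets in merged.items() if len(targets) > 1}


# Hypothetical English-to-Spanish memories from two contributors.
alice = {"File": "Archivo", "Save": "Guardar"}
bob = {"File": "Fichero", "Save": "Guardar"}

merged = merge_memories([alice, bob])
print(merged)
print(conflicts(merged))
```

 Keeping the conflicting entries, instead of letting the last push win, is
 one possible design choice: the hard part of sharing memories is usually
 deciding between competing translations, not storing them.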

 But I think it is easier said than done? Or not?

 How do you reuse your own translation memories? How do you share them
 with your fellows on your Fedora translation team?

 Thanks in advance for every comment (even the tiniest) you want to share.

I think most of the efforts should be aimed at building Apertium corpuses.

English is a Germanic language; Apertium allows languages that share the
same basic rules to be more transparent to one another (although corpuses
work in one direction only).

The problem is that we can't build corpuses so easily: creating a working
corpus takes many hours of work. It would be great if there were a simpler
way of creating and maintaining them using AI (or even a graphical tool to
aid in this mission).

"Stupid" TMs built from statistical data can sometimes be wrong, and they
need lots of AI to get better; building good corpuses with the right rules
for every language will help computers understand human language in both the
source and destination languages.

Why is it so important?
I want to present you with a problem I had when translating text from Arabic
to Hebrew. Google's mechanism follows this procedure: the Arabic text is
translated to English, and the English is then translated to Hebrew. You
can't possibly imagine how strongly the phrase "lost in translation" applies
in this case.
BTW, Microsoft's translator does a much better job than Google's translator
when translating from English to Hebrew.

Hebrew and Arabic share basic rules; instead of using English in the middle,
we can use a mechanism that takes advantage of their similarities to
translate between them without going through a third language.
The same goes for Czech and Slovak: apparently many Czech translators use
the Slovak translation instead of translating from English, and vice versa.
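
To illustrate what gets lost when English sits in the middle, here is a toy
sketch. It relies on one fact: Arabic and Hebrew both distinguish masculine
and feminine "you", while English does not. The three tiny lexicons (in
Latin transliteration) and the function names are assumptions made up for
this example; this is not how Google or Apertium actually work.

```python
# Toy lexicons in Latin transliteration; illustrative only.
AR_TO_HE = {"anta": "ata", "anti": "at"}    # you (masc.), you (fem.)
AR_TO_EN = {"anta": "you", "anti": "you"}   # English drops the gender
EN_TO_HE = {"you": "ata"}                   # the pivot has to guess one form


def translate_direct(word):
    """Arabic to Hebrew in one step, keeping the gender distinction."""
    return AR_TO_HE[word]


def translate_via_english(word):
    """Arabic to English to Hebrew; the gender is lost at the English step."""
    return EN_TO_HE[AR_TO_EN[word]]


for word in ("anta", "anti"):
    print(word, translate_direct(word), translate_via_english(word))
```

The direct lexicon keeps feminine "anti" as "at", while the pivot route turns
it into masculine "ata", because the information is already gone by the time
the text reaches English.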

Apertium website: http://www.apertium.org/

Kind regards,
Yaron Shahrabani

<Hebrew translator>

...

 --
 "We cannot solve our problems with the same thinking we used when we
 created them." Albert Einstein
 Jesús Franco - Fedora Ambassador and Translator
 http://fedoraproject.org/wiki/User:Tezcatl
 http://identi.ca/tzk

 --
 trans mailing list
 trans(a)lists.fedoraproject.org
 https://admin.fedoraproject.org/mailman/listinfo/trans 
