Transifex expression check problem

John Dennis jdennis at redhat.com
Tue May 7 17:55:13 UTC 2013


On 05/07/2013 12:41 PM, Dimitris Glezos wrote:
>
>
> On Tue, May 7, 2013 at 8:34 AM, John Dennis <jdennis at redhat.com
> <mailto:jdennis at redhat.com>> wrote:
>
>     To solve this problem we've written tools that analyze every po
>     file downloaded from TX to make sure substitutions done via format
>     conversions are not lost or mangled. I'm thrilled that TX has now
>     apparently added consistency checks to prevent these problems much
>     like we've been forced to do.
>
>
> Hi John,
>
> Can you share some more information on specific checks which are/were
> not supported in Tx? Ideally, please open them as issues on GitHub.
> There have been more than 50 checks in the editor for many years now,
> but there is a possibility we missed something.

I can't tell you what might be slipping through today; I can only tell
you about issues we've seen in the past and proactively built defenses
against. Most of our code is Python with a small amount of C. The
problems we saw were msgstrs that would cause Python's string formatting
to raise an exception. Examples of things we've seen in the past are:

* unmatched parentheses, e.g. "%foo)s"
* using braces instead of parentheses, e.g. "%{foo}s"
* dropping the format specifier, e.g. "%(foo)"
* adding an extra parenthesis, e.g. "%(foo))s"
* mixing parentheses and braces, e.g. "%{foo)s"
* misspelling the named substitution, e.g. "%(fooo)s"
* completely omitting a substitution, e.g. "%(foo)s"
   when it should have been "%(foo)s failed because %(bar)s"
* replacing a named substitution with an unnamed one
* dropping or inserting escape characters
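
To make the failure modes above concrete, here is a minimal sketch (not
our actual tooling; the helper name format_error and the sample args
dict are made up for illustration) that feeds a msgstr to Python's %
operator and reports what, if anything, it raises:

```python
def format_error(msgstr, args):
    """Return None if msgstr % args succeeds, else 'ExceptionName: message'."""
    try:
        msgstr % args
    except (KeyError, TypeError, ValueError) as exc:
        return "%s: %s" % (type(exc).__name__, exc)
    return None

# Hypothetical substitution values, standing in for what the program passes.
args = {"foo": "login", "bar": "timeout"}

samples = ["%foo)s", "%{foo}s", "%(foo)", "%(foo))s", "%{foo)s",
           "%(fooo)s", "%(foo)s"]
for sample in samples:
    print(repr(sample), "->", format_error(sample, args))
```

Note that the last sample, where a substitution was simply omitted,
formats without any exception. Catching that case takes a comparison
against the msgid rather than a formatting attempt.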

I'm sure there were others; this is just what I remember off the top of
my head. We also noticed it was very translator specific: some were
quite fastidious and others remarkably careless (no surprises there).

Some of the problems were more pernicious before we required our
developers to *always* use named substitutions (or indexed ones in C
code). In Python, if you didn't use a named substitution you would often
get "not enough arguments for format string" or "not all arguments
converted during string formatting" exceptions. Now that everything uses
named substitutions we just need to ensure all the dict keys are correct
and there are no syntax violations.
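
The dict key check can be sketched roughly as follows. This is an
illustration only, not the code we run: the regular expression and the
names named_keys and key_mismatch are assumptions of mine, and the
pattern only recognizes keys made of word characters.

```python
import re

# Matches a well-formed named substitution such as %(foo)s or %(count)03d.
# Assumption: keys consist of word characters only.
NAMED = re.compile(r"%\((\w+)\)[#0\- +]*\d*(?:\.\d+)?[diouxXeEfFgGcrsa]")

def named_keys(text):
    """Return the set of keys used by well-formed named substitutions."""
    # Remove escaped percent signs first so "%%" never starts a match.
    return set(NAMED.findall(text.replace("%%", "")))

def key_mismatch(msgid, msgstr):
    """Return (keys missing from msgstr, keys only present in msgstr)."""
    src, dst = named_keys(msgid), named_keys(msgstr)
    return sorted(src - dst), sorted(dst - src)

print(key_mismatch("%(foo)s failed because %(bar)s", "%(foo)s"))
print(key_mismatch("%(foo)s", "%(fooo)s"))
```

A malformed substitution such as "%(foo)" is not recognized as a key at
all, so it shows up as a key missing from the msgstr.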

Are the TX checks only done in the editor? If so, how does that protect
uploaded po files? Shouldn't the error checking also be done when a po
file is uploaded?

FWIW our tools also enforce proper coding conventions on developers. For
instance, you can't check in code that fails to use named or indexed
substitutions or has any of the problems mentioned above. This keeps
the POT file clean (at times we were guilty of producing bad msgids).
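
A developer-side check for unnamed substitutions in a Python msgid could
look something like the sketch below. Again this is only an
illustration under my own assumptions (the pattern and the name
has_unnamed_substitution are hypothetical); it deliberately ignores the
space flag so that text like "100% done" is not flagged.

```python
import re

# A '%' that is not followed by '(' and is completed by a conversion
# character, e.g. %s, %d or %5.2f. The space flag is left out on purpose.
UNNAMED = re.compile(r"%(?!\()[#0\-+]*\d*(?:\.\d+)?[diouxXeEfFgGcrsa]")

def has_unnamed_substitution(msgid):
    """Return True if msgid contains a positional (unnamed) substitution."""
    # Escaped percent signs ("%%") are literal text, so drop them first.
    return UNNAMED.search(msgid.replace("%%", "")) is not None

for msgid in ["%s failed", "%(foo)s failed", "%d of %(total)d", "100%% done"]:
    print(repr(msgid), "->", has_unnamed_substitution(msgid))
```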

At the time we put these checks into place (2-3 years ago) I was not 
aware TX was attempting to rectify them as well.

-- 
John Dennis <jdennis at redhat.com>
