[Zope-dev] Translator 0.1 released

Fri, 24 Mar 2000 21:29:42 +0100

Hello David,

Just a quick note about your message. I haven't tested your product yet but 
I will this weekend.

I have a lot of misgivings about any automated translation system based on 
words (I am a freelance translator and I turn green whenever I see the 
output of automated translators on the Web).

It just doesn't work because you almost never have a one-to-one mapping 
between words in different languages. The example you described below could 
generate quite a mess in French, for instance, because adjectives agree in 
gender and number with the noun (e.g. "green" can be "vert", "verte", 
"verts" or "vertes" depending on the context).

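To make the problem concrete, here is a rough Python sketch of the 
split/translate/join idea applied to French. The word table and the 
function name are my own invention for illustration, not the product's 
actual data or API:

```python
# Hypothetical word-for-word table (English -> French), illustration only.
WORDS_FR = {"my": "mon", "house": "maison", "is": "est", "green": "vert"}

def translate_words(text, table):
    """Split on whitespace, translate each word, join again."""
    return " ".join(table.get(word, word) for word in text.split())

# Yields "mon maison est vert"; correct French is "ma maison est verte",
# because "maison" is feminine and both "my" and "green" must agree with it.
print(translate_words("my house is green", WORDS_FR))
```
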
IMO no translation should be performed if there isn't a 100% match between 
the glossary table and the localizable string.

Example:

Translation glossary:

{'Squids are beautiful': {'fr': 'Les poulpes sont magnifiques'}, ...}

Localizable strings in a text:

1st instance: <dtml-translate>Squids are beautiful</dtml-translate>
2nd instance: <dtml-translate>Squids are beautiful in summer</dtml-translate>

The 1st example should be translated because it matches. The 2nd shouldn't. 
If no match is found then the default (source) language is left as is.
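
In Python terms, the rule I have in mind could look roughly like this. It 
is only a sketch: the function name is made up, and the glossary simply 
reuses the shape of the example above.

```python
# Glossary keyed by the complete localizable string, as in the example.
GLOSSARY = {
    "Squids are beautiful": {"fr": "Les poulpes sont magnifiques"},
}

def translate_exact(text, lang, glossary=GLOSSARY):
    """Translate only on a 100% match; otherwise return the source text."""
    entry = glossary.get(text)
    if entry is None:
        return text                  # no match: leave the source language
    return entry.get(lang, text)     # match, but maybe not for this language

print(translate_exact("Squids are beautiful", "fr"))
print(translate_exact("Squids are beautiful in summer", "fr"))
```

The first call returns the French string, the second returns its input 
unchanged.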

If you want to create a glossary of common phrases (a so-called 
"translation memory") you need to store individual sentences or groups of 
sentences. Then you can reuse translated sentences across texts. Automated 
sentence translation is a bit safer because sentences provide much more 
context than individual words.
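
A very naive sketch of such a translation memory in Python, assuming 
sentences end with ".", "!" or "?" followed by whitespace (real sentence 
segmentation is much harder than this), with names I made up for the 
illustration:

```python
import re

# Hypothetical translation memory: whole sentences, not single words.
MEMORY = {
    "Squids are beautiful.": {"fr": "Les poulpes sont magnifiques."},
}

def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def translate_text(text, lang, memory=MEMORY):
    """Reuse stored sentences; leave unknown sentences in the source language."""
    translated = []
    for sentence in split_sentences(text):
        translated.append(memory.get(sentence, {}).get(lang, sentence))
    return " ".join(translated)

print(translate_text("Squids are beautiful. They live in the sea.", "fr"))
```

Only the first sentence is in the memory, so only that one is replaced; 
the second stays in English.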

To summarize, there is a trade-off: when you use longer translation units, 
you get much higher translation quality but less leverage (i.e. reuse) 
across texts.

This is a complex issue... if I get started on it I will talk you to sleep :-).

Cheers.

Alexandre

At 03:37 24/03/2000 -0800, you wrote:
>Download the translator from
>
>   http://www.zope.org/Members/jdavid/translator
>
>The translator product is just the old vocabulary product, renamed
>following Michell's advice, with a new translate tag that
>is a first implementation of Shane's proposal.
>
>Suppose the data contained by our Translator instance is:
>
>  {'my': {'es': 'mi'},
>   'house': {'es': 'casa'},
>   'is': {'es': 'es'},
>   'green': {'es': 'verde'}
>  }
>
>Then if you type:
>
>   "<dtml-translate>my house is green</dtml-translate>"
>
>and "languages" is "['es']" then the result is "mi casa es verde"
>(and it's right!!).
>
>It does it by:
>
>  - split the string
>  - translate each word
>  - join the string
>
>Yes, this is only a bogus algorithm, but it shows how it would work.
>Future releases should allow plugging in more intelligent translators.