[Zope-dev] Re: Translator Rules...How should we attack the issue?
Alexandre Ratti
alex@gabuzomeu.net
Sat, 25 Mar 2000 00:03:05 +0100
Hello Stephan,
One suggestion: just don't try it in Zope :-).
Sorry if I sound overly positive but as they say in the Jargon File this
problem is "AI-complete" (i.e. just too difficult).
Very complex algorithms are needed to implement even basic automated
translation. And they still output gobbledygook.
So IMO we just need to implement a translation memory (i.e. a string
repository where human-created translations are stored). A translation
glossary + a block substitution system should be sufficient to support
multilingual pages in Zope. This is what is needed in the short term.
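
A minimal sketch of the translation memory plus block substitution idea, in plain Python rather than any actual Zope API. The `TranslationMemory` class, the `render` helper and the `[[msgid]]` placeholder syntax are all assumptions made up for illustration:

```python
import re


class TranslationMemory:
    """String repository for human-written translations (hypothetical class)."""

    def __init__(self, default_lang="en"):
        self.default_lang = default_lang
        self._store = {}  # maps (msgid, lang) -> translated text

    def add(self, msgid, lang, text):
        self._store[(msgid, lang)] = text

    def lookup(self, msgid, lang):
        # Try the requested language, then the default one, then give up
        # and return the message id itself so the gap is visible on the page.
        for candidate in (lang, self.default_lang):
            if (msgid, candidate) in self._store:
                return self._store[(msgid, candidate)]
        return msgid


# Assumed placeholder syntax for a translatable block: [[msgid]]
_BLOCK = re.compile(r"\[\[([\w.-]+)\]\]")


def render(template, memory, lang):
    """Replace every [[msgid]] block with its stored translation."""
    return _BLOCK.sub(lambda m: memory.lookup(m.group(1), lang), template)


tm = TranslationMemory()
tm.add("greeting", "en", "Welcome")
tm.add("greeting", "fr", "Bienvenue")
tm.add("greeting", "de", "Willkommen")
tm.add("footer", "en", "Contact us")

page = "<h1>[[greeting]]</h1><p>[[footer]]</p>"
print(render(page, tm, "fr"))  # <h1>Bienvenue</h1><p>Contact us</p>
```

Nothing here parses or generates language: a human translator fills the store, and a block with no translation falls back to the default language.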
Cheers.
Alexandre
PS: I don't want to sound finicky but you need to say "ma maison" in French
:-).
At 15:22 24/03/2000 -0600, Stephan Richter wrote:
>Another issue we all should be concerned with is grammar. I know that
>Chinese grammar is as simple as it gets (maybe Mandarin should be the
>world language). Furthermore, the European languages (especially the
>Latin-based ones) are similar. But there are small differences:
>
>So for example (I will try my best with French):
>
>I seek my red home.
>Ich suche mein rotes zu Hause.
>Je cherche mon maison rouge.
>
>I see so many problems here. For example, the sentence structure of
>German and English is pretty much the same. BUT, in German you have to
>conjugate "suchen" (seek) as well as "rot" (red). The translator certainly
>should not attempt to translate the entire sentence. It should be smart about
>grammar.
>But it gets much worse than that. In French, adjectives are "usually"
>(geez, another issue) placed behind the noun. Additionally, besides
>conjugating "chercher" (seek), you have to conjugate "mon" (my), since that
>depends on whether you are a guy or a girl.
>
>So here are some of my thoughts about the issue (they are not organized or
>well thought through):
>
>- Evaluating a sentence should work like parsing an algebraic expression
>into reverse Polish notation (RPN) using stacks or a tree.
>- Each subtree will automatically represent a phrase.
>- Each word and phrase is an object that contains a lot of information,
>including grammar.
> So for example, "suchen" should contain all its conjugations. The same
> for the adjective "rot".
> I think an abstract base class called WORD should be written, and then
> derived classes called ADJECTIVE, VERB, NOUN, ADVERB ...
> These classes should also reference each other, since in German you can
> easily turn adverbs into nouns and nouns into verbs.
> Mmmh, that brings me to another point, specific to German. We have a
> lot of compound words... That will be hard...
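
A rough sketch of the class layout described in the quoted list above, in Python. The class names follow the WORD / VERB / NOUN / ADJECTIVE idea, but the method names, the feature tuples and the form tables are assumptions for illustration, and only a few forms are filled in:

```python
class Word:
    """Base class: a lemma plus a table of inflected forms."""

    def __init__(self, lemma, forms=None):
        self.lemma = lemma
        # Maps a tuple of grammatical features to the inflected form.
        self.forms = dict(forms or {})

    def inflect(self, *features):
        # Fall back to the bare lemma when no form is recorded.
        return self.forms.get(tuple(features), self.lemma)


class Verb(Word):
    def conjugate(self, person, number):
        return self.inflect(person, number)


class Noun(Word):
    def __init__(self, lemma, gender, forms=None):
        super().__init__(lemma, forms)
        self.gender = gender  # "m", "f" or "n"


class Adjective(Word):
    def decline(self, case, gender):
        return self.inflect(case, gender)


# A few German forms only; a real entry would need the full paradigm.
suchen = Verb("suchen", {(1, "sg"): "suche", (2, "sg"): "suchst", (3, "sg"): "sucht"})
rot = Adjective("rot", {("acc", "m"): "roten", ("acc", "f"): "rote", ("acc", "n"): "rotes"})

print(suchen.conjugate(1, "sg"))  # suche
print(rot.decline("acc", "n"))    # rotes
```

Cross-references between the classes and compound words are left out of this sketch.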
>
>So here is an example for: I seek my red home.
>                              /         \
>                         I seek         my red home
>                         /    \          /      \
>                        I     seek     my       red home
>                                                 /    \
>                                              home    red
>
>So I would translate (walking the tree):
>
>I --> Ich (Note: we know that it is first person singular)
>seek --> suchen ---> suche (since we have first person singular)
>I seek --> Ich suche
>
>my --> mein (signals possession: 4th case --> Akkusativ)
>home --> zu Hause (Note: we know it is neuter, because: das zu Hause)
>red --> rot --> rotes (because 4th case neuter)
>red home --> rotes zu Hause
>my red home --> mein rotes zu Hause
>
>I seek my red home. --> Ich suche mein rotes zu Hause.
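
The walk quoted above can be sketched as a recursive traversal of a binary phrase tree. This is a deliberately simplified illustration: the `Node` class and the `LEXICON` table are assumptions, and the lexicon stores the already-inflected forms from the example instead of computing agreement, which is the hard part:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """A leaf carries a word; an inner node carries two sub-phrases."""
    word: Optional[str] = None
    left: Optional["Node"] = None
    right: Optional["Node"] = None


# Assumed lookup table with the inflected forms taken from the example.
LEXICON = {
    "I": "Ich",
    "seek": "suche",
    "my": "mein",
    "red": "rotes",
    "home": "zu Hause",
}


def translate(node):
    """Translate the leaves, then join the sub-phrases bottom-up."""
    if node.word is not None:
        return LEXICON.get(node.word, node.word)
    parts = [translate(child) for child in (node.left, node.right) if child]
    return " ".join(parts)


tree = Node(
    left=Node(left=Node("I"), right=Node("seek")),
    right=Node(left=Node("my"), right=Node(left=Node("red"), right=Node("home"))),
)

print(translate(tree))  # Ich suche mein rotes zu Hause
```

A word-order change, such as placing a French adjective after its noun, would have to happen where the two sub-phrases are joined.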
>
>Now that was built on German grammar rules. We certainly could do this
>with French and Spanish as well.
>
>We probably need some language experts who can tell us which words are
>more important in defining the grammar than others. While I was doing the
>example, I noticed that nouns are more important than adjectives.
>Furthermore, we should consult a graph theorist who can help us with
>creating trees, based on these rules. He might be able to use some math to
>optimize the algorithm.
>
>As I said, these are just some ideas. Any comments?
>
>Regards,
>Stephan
>--
>Stephan Richter - (901) 573-3308 - srichter@cbu.edu
>CBU - Physics & Chemistry; Framework Web - Web Design & Development
>PGP Key: 735E C61E 5C64 F430 4F9C 798E DCA2 07E3 E42B 5391
>