[Zope-dev] Translator Rules...How should we attack the issue?
Stephan Richter
srichter@cbu.edu
Fri, 24 Mar 2000 15:22:17 -0600
Another issue we all should be concerned with is grammar. I know that
Chinese grammar is as simple as it gets (maybe Mandarin should be the
world language). Furthermore, the European languages (especially the
Latin-based ones) are similar. But there are small differences:
So for example (I will try my best with French):
I seek my red home.
Ich suche mein rotes zu Hause.
Je cherche ma maison rouge.
I see so many problems here. For example, the sentence structure of
German and English is pretty much the same. BUT, in German you have to
conjugate "suchen" (seek) and decline "rot" (red). The translator certainly
should not attempt to translate the entire sentence. It should be smart about
grammar.
But it gets much worse than that. In French adjectives are "usually" (geez,
another issue) placed behind the noun. Additionally, besides conjugating
"chercher" (seek), you have to inflect "mon" (my), since its form depends on
the gender of the noun that follows.
So here are some of my thoughts about the issue (they are not organized or
well thought through):
- Evaluating a sentence should work like parsing an algebraic expression
into reverse Polish notation (RPN) using stacks or a tree.
- Each subtree will automatically represent a phrase.
- Each word and phrase is an object that contains a lot of information,
including grammar.
So for example, "suchen" should contain all its conjugations. The same
for the adjective "rot".
I think an abstract base class called WORD should be written, and then
derived classes called ADJECTIVE, VERB, NOUN, ADVERB ...
These classes should also reference each other, since in German you can
easily turn adverbs into nouns and nouns into verbs.
Mmmh, that brings me to another point, specific to German. We have a
lot of compound words... That will be hard...
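To make the class idea a bit more concrete, here is a minimal sketch in Python. Only the names Word, Verb and Adjective come from the proposal above; the inflect() method, the feature names (person, number, case, gender) and the tiny tables are my own assumptions for illustration. Real German adjective endings also depend on the determiner, which this sketch ignores.

```python
from abc import ABC, abstractmethod


class Word(ABC):
    """Abstract base for a dictionary entry (the WORD class proposed above)."""

    def __init__(self, lemma):
        self.lemma = lemma

    @abstractmethod
    def inflect(self, **features):
        """Return the surface form for the given grammatical features."""


class Verb(Word):
    def __init__(self, lemma, conjugations):
        super().__init__(lemma)
        # Maps (person, number) to the conjugated form (hypothetical layout).
        self.conjugations = conjugations

    def inflect(self, person, number):
        return self.conjugations[(person, number)]


class Adjective(Word):
    def __init__(self, lemma, endings):
        super().__init__(lemma)
        # Maps (case, gender) to the ending appended to the lemma.
        self.endings = endings

    def inflect(self, case, gender):
        return self.lemma + self.endings[(case, gender)]


# Illustrative excerpts only, not complete paradigms.
suchen = Verb("suchen", {(1, "sg"): "suche", (2, "sg"): "suchst", (3, "sg"): "sucht"})
rot = Adjective("rot", {("acc", "neuter"): "es"})

print(suchen.inflect(person=1, number="sg"))     # suche
print(rot.inflect(case="acc", gender="neuter"))  # rotes
```

Each word object carries its own forms, so the translator only has to work out which features apply and ask the word for the matching form.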
So here is an example for: I seek my red home.

          I seek my red home
             /           \
        I seek         my red home
        /    \          /     \
       I    seek      my    red home
                             /   \
                          red    home
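A tree like this could be represented roughly as follows. This is only a sketch: the Node class, the leaf() helper and the postorder() function are hypothetical names, not part of any existing translator code. Visiting the children before their parent gives an order similar to what you get when evaluating an RPN expression.

```python
class Node:
    """A phrase (inner node) or a single word (leaf) in the sentence tree."""

    def __init__(self, *children, word=None):
        self.children = list(children)
        self.word = word

    def text(self):
        # A leaf returns its word; a phrase joins the text of its subtrees.
        if self.word is not None:
            return self.word
        return " ".join(child.text() for child in self.children)


def leaf(word):
    return Node(word=word)


def postorder(node):
    """Yield the text of every subtree, children before their parent."""
    for child in node.children:
        yield from postorder(child)
    yield node.text()


sentence = Node(
    Node(leaf("I"), leaf("seek")),
    Node(leaf("my"), Node(leaf("red"), leaf("home"))),
)

for phrase in postorder(sentence):
    print(phrase)
```

Every subtree prints as a phrase of its own ("I seek", "red home", "my red home"), which is the property the second bullet above relies on.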
So I would translate (walking the tree):
I --> Ich (Note: we know that it is first person singular)
seek --> suchen ---> suche (since we have first person singular)
I seek --> Ich suche
my --> mein (signals possession: 4th case --> Akkusativ)
home --> zu Hause (Note: we know it is neuter, because: das zu Hause)
red --> rot --> rotes (because 4th case neuter)
red home --> rotes zu Hause
my red home --> mein rotes zu Hause
I seek my red home. --> Ich suche mein rotes zu Hause.
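The walk above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the toy lexicon tables, the feature names and the translate() function cover only this one sentence shape (subject, verb, possessive, adjective, noun) and are not a general design.

```python
# Toy lexicon; all entries and feature names are illustrative assumptions.
PRONOUNS = {"I": ("Ich", (1, "sg"))}
VERBS = {"seek": {(1, "sg"): "suche", (3, "sg"): "sucht"}}
NOUNS = {"home": ("zu Hause", "neuter")}
POSSESSIVES = {"my": {("acc", "neuter"): "mein", ("acc", "feminine"): "meine"}}
ADJECTIVES = {"red": ("rot", {("acc", "neuter"): "es", ("acc", "feminine"): "e"})}


def translate(tree):
    """Translate a tree shaped like ((subject, verb), (possessive, (adjective, noun)))."""
    (subject, verb), (possessive, (adjective, noun)) = tree

    # The subject fixes person and number, which select the verb form.
    subject_de, agreement = PRONOUNS[subject]
    verb_de = VERBS[verb][agreement]

    # The noun fixes the gender; as the direct object it is in the accusative.
    noun_de, gender = NOUNS[noun]
    key = ("acc", gender)

    # Adjective and possessive both agree with the noun's case and gender.
    stem, endings = ADJECTIVES[adjective]
    adjective_de = stem + endings[key]
    possessive_de = POSSESSIVES[possessive][key]

    return f"{subject_de} {verb_de} {possessive_de} {adjective_de} {noun_de}."


print(translate((("I", "seek"), ("my", ("red", "home")))))
# Ich suche mein rotes zu Hause.
```

Note that the noun has to be looked up before the adjective and the possessive can be inflected, since both take their endings from its gender.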
Now that was built on German grammar rules. We certainly could do this with
French and Spanish as well.
We probably need some language experts who can tell us which words are
more important in defining the grammar than others. While I was doing the
example, I noticed that nouns are more important than adjectives.
Furthermore, we should consult a graph theorist who can help us with
creating trees, based on these rules. He might be able to use some math to
optimize the algorithm.
As I said, these are just some ideas. Any comments?
Regards,
Stephan
--
Stephan Richter - (901) 573-3308 - srichter@cbu.edu
CBU - Physics & Chemistry; Framework Web - Web Design & Development
PGP Key: 735E C61E 5C64 F430 4F9C 798E DCA2 07E3 E42B 5391