----- Original Message -----
From: "Joachim Werner" <joe@iuveno-net.de>
To: "Andreas Jung" <andreas@zope.com>
Cc: <zope@zope.org>
Sent: Wednesday, January 09, 2002 13:16
Subject: Re: [Zope] ISO-Splitter again: German Umlaute
Sorry, I haven't had the time to look into the Splitter code yet. So that's why I am asking again:
I don't get the concept of having to specify the locale at Zope startup for the catalog to work properly. What happens if I WANT en_US locale settings in general, but the catalog should be able to handle French, German, or Spanish words? How can I build multi-lingual Zope systems with that concept?
ZopeSplitter + locale settings should fit your needs for all Western European languages, since they are all covered by ISO-8859-1. The ISO-8859-1 splitter should fulfill your needs without changing your locale.
Shouldn't the catalog always split words correctly? I am not talking about languages like Japanese that have a different concept of splitting. Those need different splitter code, of course. But is there ANY reason why German umlauts or other language-specific special characters are supposed to be splitting characters, other than that the programmers of the original splitter code might have taken the easy way of making all characters that are not A-Z splitting characters?
A splitter is currently bound to a vocabulary. This means you cannot change the splitter during indexing. For a multilingual environment you should use Unicode and the new UnicodeSplitter.

Andreas
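
To illustrate the behaviour being discussed, here is a minimal Python sketch. It is not the Zope Splitter code: the names split_ascii and split_unicode are hypothetical and exist only for this example. The first function treats every character outside A-Z/a-z as a separator, which is the "easy way" described in the question; the second works on Unicode strings and accepts any letter, which is roughly the idea behind splitting Unicode text instead of relying on the process locale.

```python
import re

# Hypothetical ASCII-only splitter: anything that is not A-Z or a-z
# is treated as a word separator.
ASCII_WORD = re.compile(r"[A-Za-z]+")

# Hypothetical Unicode-aware splitter: for str patterns, \w matches
# letters of any script; [^\W\d_] narrows that to letters only.
UNICODE_WORD = re.compile(r"[^\W\d_]+")


def split_ascii(text):
    """Split on every non-ASCII-letter character and lowercase the words."""
    return [word.lower() for word in ASCII_WORD.findall(text)]


def split_unicode(text):
    """Split on non-letters only, keeping umlauts and similar characters."""
    return [word.lower() for word in UNICODE_WORD.findall(text)]


sample = "Über die Straße nach München"
print(split_ascii(sample))    # umlauts and ß act as separators
print(split_unicode(sample))  # words stay intact
```

With the ASCII-only rule, "München" is indexed as the fragments "m" and "nchen", so a search for the full word cannot match. The Unicode-aware variant keeps the word whole and does not depend on the locale the process was started with.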