[Zope] ZCatalog and foreign characters
Dieter Maurer
dieter@handshake.de
Tue, 29 Aug 2000 23:57:08 +0200 (CEST)
Radim Gelner writes:
> is it possible to make ZCatalog work correctly with words containing
> characters other than those given in ISO-8859-1.
>
> Now, it reports "no found" for all such queries even when these words
> are present inside the documents on site.
I have made a very crude patch to "splitter.c" which lets it
treat every non-ASCII character as a letter.
Obviously, this is not correct. It may include punctuation in
words, and such words will then not be found unless the search
uses exactly the same punctuation.
Furthermore, non-ASCII letters are not converted to lowercase.
So far, it gives acceptable results for us.
However, we have not yet run stress tests.
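As an illustration only (this is not the actual patch to "splitter.c";
the function name "crude_split" is made up), the behaviour described
above can be sketched in Python roughly like this:

```python
def crude_split(data: bytes) -> list:
    """Split a byte string into words, treating every byte >= 0x80 as a letter.

    Hypothetical sketch of the crude approach, not the real splitter code.
    """
    words = []
    current = bytearray()
    for byte in data:
        is_ascii_alnum = byte < 0x80 and chr(byte).isalnum()
        if is_ascii_alnum:
            # ASCII letters are folded to lowercase.
            current.append(ord(chr(byte).lower()))
        elif byte >= 0x80:
            # Non-ASCII bytes are kept unchanged: no case folding, and
            # non-ASCII punctuation is (wrongly) treated as a letter.
            current.append(byte)
        elif current:
            words.append(bytes(current))
            current = bytearray()
    if current:
        words.append(bytes(current))
    return words


# Non-ASCII punctuation sticks to the word (ISO-8859-1 guillemets):
print(crude_split("«mot», mot".encode("latin-1")))
# Uppercase and lowercase non-ASCII letters yield different index terms:
print(crude_split("Žena".encode("iso-8859-2")) ==
      crude_split("žena".encode("iso-8859-2")))
```

The first call shows both limitations at once: the quoted word keeps its
guillemets and so differs from the unquoted one, and the second call
shows that "Žena" and "žena" are not indexed as the same word.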
For a correct solution, the splitter must be told the encoding
of the text, and Unicode letter classification and case
conversion must be applied.
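A minimal sketch of that idea in Python, assuming the encoding is passed
in by the caller (the function name "unicode_split" and its parameters
are hypothetical, not part of ZCatalog): decode first, then rely on
Unicode character properties for classification and lowercasing.

```python
def unicode_split(data: bytes, encoding: str) -> list:
    """Split encoded text into lowercased words using Unicode properties.

    Hypothetical sketch; the caller must supply the correct encoding.
    """
    text = data.decode(encoding)
    words = []
    current = []
    for ch in text:
        if ch.isalnum():
            # str.isalnum() and str.lower() follow the Unicode database,
            # so they also cover letters outside ASCII.
            current.append(ch.lower())
        elif current:
            words.append("".join(current))
            current = []
    if current:
        words.append("".join(current))
    return words


print(unicode_split("«Žena», kůň!".encode("utf-8"), "utf-8"))
```

Here punctuation such as the guillemets is dropped and "Žena" is
indexed as "žena", so a search no longer depends on case or on the
surrounding punctuation.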
If you work with a fixed locale, you can use the "-L" switch
to inform Zope about your locale. Then the splitter should
work correctly (for your locale).
Dieter