Radim Gelner writes:
Is it possible to make ZCatalog work correctly with words containing characters other than those in ISO-8859-1?
Now, it reports "no found" for all such queries, even when these words are present in the documents on the site.

I have made a very crude patch to "splitter.c" which makes it treat every non-ASCII character as a letter.
Obviously, this is not correct. It may include punctuation in words, so a word will not be found unless it is searched for with exactly the same punctuation. Furthermore, non-ASCII letters are not converted to lowercase. So far, it gives acceptable results for us. However, we have not yet run stress tests.

For a correct solution, the splitter must be told the encoding, and Unicode letter classification and case transformation must be applied.

If you work with a fixed locale, you can use the "-L" switch to inform Zope about your locale. Then the splitter should work correctly (for your locale).

Dieter
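
To make the difference between the crude patch and the correct solution concrete, here is a minimal Python sketch. It is not the code in "splitter.c" and not the actual patch; the function names crude_split and unicode_split are made up for illustration, and ISO-8859-2 is only an assumed example encoding.

```python
def crude_split(data: bytes) -> list:
    """Mimic the crude approach: every byte >= 0x80 counts as a letter.

    Only ASCII letters are lowercased; high bytes pass through unchanged,
    so non-ASCII punctuation ends up inside words.
    """
    words, current = [], bytearray()
    for b in data:
        if b >= 0x80:
            current.append(b)                      # assumed to be a letter
        elif chr(b).isalnum():
            current.append(ord(chr(b).lower()))    # ASCII case folding only
        elif current:
            words.append(bytes(current))
            current = bytearray()
    if current:
        words.append(bytes(current))
    return words


def unicode_split(data: bytes, encoding: str) -> list:
    """Decode with a known encoding, then classify and lowercase per character."""
    words, current = [], []
    for ch in data.decode(encoding):
        if ch.isalnum():                 # Unicode-aware letter/digit test
            current.append(ch.lower())   # Unicode-aware case transformation
        elif current:
            words.append("".join(current))
            current = []
    if current:
        words.append("".join(current))
    return words


sample = "Kůň §Čeština".encode("iso-8859-2")
print(crude_split(sample))
print(unicode_split(sample, "iso-8859-2"))
```

With this sample, the crude variant keeps the "§" sign attached to the second word and leaves "Č" in uppercase, while the encoding-aware variant yields clean lowercase words. The price is that the splitter has to know the encoding in advance.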