Re: [Zope] ZCatalog and foreign characters
Radim Gelner writes:
Is it possible to make ZCatalog work correctly with words containing characters other than those in ISO-8859-1?
At the moment it reports "no found" for all such queries, even when the words are present in documents on the site.
I have made a very crude patch to "splitter.c" that makes it treat every non-ASCII character as a letter.
Obviously, this is not correct. It may include punctuation in words, so such words will not be found unless the query uses exactly the same punctuation. Furthermore, non-ASCII letters are not converted to lowercase. So far, it gives acceptable results for us. However, we have not yet run stress tests.
For a correct solution, the splitter must be told the encoding, and Unicode letter classification and case transformation must be applied.
If you work with a fixed locale, you can use the "-L" switch to inform Zope about your locale. Then the splitter should work correctly (for your locale).
Dieter
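To illustrate what "Unicode letter classification and case transformation" would mean for a splitter, here is a minimal sketch in Python. It is not the code in splitter.c and not anything ZCatalog ships; the function name split_words is hypothetical, and the sketch assumes the text has already been decoded to a Unicode string.

```python
import unicodedata


def split_words(text):
    """Split decoded text into lowercase words.

    Hypothetical helper, not part of Zope: any character that Unicode
    classifies as a letter or digit is a word character, everything else
    (punctuation, whitespace) separates words.
    """
    # Compose base letters and combining marks so that accented letters
    # are single code points and are classified as letters.
    text = unicodedata.normalize("NFC", text)
    words = []
    current = []
    for ch in text:
        if ch.isalnum():
            # Case transformation uses Unicode rules, not the C locale.
            current.append(ch.lower())
        elif current:
            words.append("".join(current))
            current = []
    if current:
        words.append("".join(current))
    return words


if __name__ == "__main__":
    print(split_words("K\u016f\u0148, \u00daP\u011aL!"))
```

With this approach punctuation never ends up inside a word and non-ASCII letters are lowercased, which are the two defects of the crude patch described above.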
On Tue, Aug 29, 2000 at 11:57:08PM +0200, Dieter Maurer wrote:
If you work with a fixed locale, you can use the "-L" switch to inform Zope about your locale. Then the splitter should work correctly (for your locale).
Yes, that helped. I started Zope with the -L cs_CZ switch, rebuilt the catalog, and now searching works as it should. Thanks. Radim
On Wed, 30 Aug 2000, Radim Gelner wrote:
On Tue, Aug 29, 2000 at 11:57:08PM +0200, Dieter Maurer wrote:
If you work with a fixed locale, you can use the "-L" switch to inform Zope about your locale. Then the splitter should work correctly (for your locale).
Yes, that helped. I've called Zope with -L cs_CZ switch, rebuilt the catalog and now searching works as it should.
Great, but what about FreeBSD? I work on Linux at home, but a few days ago someone decided that our production server will run FreeBSD. What is the right locale string format on FreeBSD (version 3.5)? I've tried "pl_PL", "iso-8859-2", "pl_PL.iso-8859-2" - no success... :( For now, I made a hack in Spliter.c: I wrote my own replacements for the isalpha and isalnum functions.
ololo@zeus.polsl.gliwice.pl
/--------------------------------------\
| `long long long' is too long for GCC |
\--------------------------------------/
For now, I made hack in Spliter.c - I wrote my own replacements for isalpha and isalnum functions.
Oops, my solution doesn't work. It probably needs more hacking in the Python code... So, how do I do this under FreeBSD?
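One way to find out which locale strings a given system accepts is to try them from Python, since a name the C library rejects makes locale.setlocale raise locale.Error. The sketch below is only a probe under that assumption: the helper name supported_locales is hypothetical, the first two candidates are taken from the message above, and the other spellings are guesses that may or may not exist on a particular FreeBSD release.

```python
import locale


def supported_locales(candidates, category=locale.LC_CTYPE):
    """Return the candidate names that setlocale() accepts on this system.

    Hypothetical helper for probing; the original locale setting is
    restored before returning.
    """
    saved = locale.setlocale(category)  # query the current setting
    accepted = []
    try:
        for name in candidates:
            try:
                locale.setlocale(category, name)
            except locale.Error:
                continue  # this spelling is not known to the C library
            accepted.append(name)
    finally:
        locale.setlocale(category, saved)
    return accepted


if __name__ == "__main__":
    # The last two spellings are assumptions, not confirmed names.
    candidates = ["pl_PL", "pl_PL.iso-8859-2",
                  "pl_PL.ISO8859-2", "pl_PL.ISO_8859-2"]
    print(supported_locales(candidates))
```

Any name the probe reports as accepted is a reasonable candidate to pass to the "-L" switch; an empty result suggests the locale data is not installed at all.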
participants (3)
- Aleksander Salwa
- Dieter Maurer
- Radim Gelner