Toby Dickenson wrote:
All except TextIndex should 'just work' with unicode. Note that you can't mix unicode and non-ASCII plain strings in the same index.
It works, up to a point. Let me describe the problem in a little more detail.

I have created a Catalog with a ZCTextIndex and a Lexicon, and indexed some documents into the ZCatalog. The documents are PropertyObjects that have some properties stored as unicode strings. When I look into the Lexicon, I see that the non-ASCII characters have been stripped out of all the words (these were the familiar Scandinavian characters äöÄÖ, which are normally in latin-1).

I've created standard search and report interfaces (Page Templates) and tried some searches, which seem to work. However, the characters ä, Ä, ö and Ö appear to have been treated as separate common words. For example, if I search for ö, I get this error message:

  Error Value: Query contains only common words: '\xf6'

Or take an example with a real word that is in the content: lähiviikot. I cannot find it with:

  lähi*
  *viikot

but I can find it with, for example:

  hiviikot
  l hiviikot
  lähiviikot
  ..

So in some cases the search looks like it works, since content with those words is found. ,-)

The locale on my Zope is set to: fi_FI@EURO.ISO-8859-1

Any ideas on how to progress on this?

-huima
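
The symptom described above (lähiviikot being findable as "l hiviikot") can be reproduced outside Zope with a minimal sketch. It assumes the word splitter only recognises ASCII letters as word characters; that is a guess about the cause, not something confirmed from the ZCTextIndex source. The function names split_ascii and split_unicode are made up for illustration and are not part of Zope.

```python
import re

# Assumption: a splitter whose notion of a "word character" is ASCII-only.
ASCII_WORD = re.compile(r"\w+", re.ASCII)

# For comparison: a Unicode-aware pattern, where letters such as ä count as word characters.
UNICODE_WORD = re.compile(r"\w+")


def split_ascii(text):
    """Hypothetical splitter: lowercases, then keeps only runs of ASCII word characters."""
    return ASCII_WORD.findall(text.lower())


def split_unicode(text):
    """Hypothetical splitter: lowercases, then keeps runs of Unicode word characters."""
    return UNICODE_WORD.findall(text.lower())


if __name__ == "__main__":
    word = "lähiviikot"
    print(split_ascii(word))    # the ä acts as a separator
    print(split_unicode(word))  # the word stays whole
```

If the index is built with something like the ASCII-only splitter, the Lexicon would contain "l" and "hiviikot" rather than "lähiviikot", which would match what the searches above show.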