Thanks to all who responded to my original post, particularly to Toby, who pointed me in the right direction. I was mistakenly (and stupidly ...) using:

    <meta http-equiv="content-type" content="text/html;charset=&dtml-encoding;">

instead of:

    <dtml-call "RESPONSE.setHeader('content-type','text/html;charset=utf-8')">

in my standard_html_header, so I was declaring the encoding to the browser in the page markup, but not in the HTTP header!

This solved everything, but one issue remains. I started fiddling with encodings when I wanted to full-text index my utf-8 encoded unicode content with ZCTextIndex, and the lexicon gave me the usual "ordinal not in range" decoding error when building the index. Now I have a clean unicode setup (i.e. no locale when starting Zope and no sys.setdefaultencoding when starting Python 2.1.3), and the lexicon has started giving me errors again, for example when indexing a string containing "isn't" (the errors are raised at line 133 in lexicon.py).

I searched the mailing list archives and found references to an old ZCTextIndex bug (597 in the collector), whose resolution seems to require starting Zope with a -L option.

Now I am a little confused, and I would like to ask whether someone has a firm understanding of the status of Zope find/search support for unicode strings containing high chars. Specifically:

1. Does the standard ZCTextIndex that comes with Zope 2.6.1 support this?
2. If yes, do I need to start Zope with a particular locale?
3. Regarding these issues, is the recently released TextIndexNG ver. 2 a better solution?

NB: if this matters, I have utf-8 encoded content in various languages, so I would prefer not to have to use any -L setting when starting Zope, as I do not need to support TTW content editing.

TIA,
--peppo
-----Original Message-----
From: Toby Dickenson [mailto:tdickenson@geminidataloggers.com]
Sent: Thursday, 24 July 2003 9:33
To: giuseppe.bonelli@tiscali.it; Giuseppe Bonelli; zope@zope.org
Subject: Re: [Zope] strange unicode behaviour
[...snip...]
> I have utf-8 as sys.defaultencoding and I do not load any locale when starting Zope.
That is old advice that predates Zope 2.6. It was never a particularly good idea, because it affects all of Python's internals. You only need to encode your unicode as utf-8 (or another encoding) before sending it over the network, and ZPublisher is capable of doing that itself if you tell it the encoding in the header.
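The general idea of encoding only at the network boundary can be sketched in plain Python. This is a rough illustration under stated assumptions, not Zope or ZPublisher code: render_response is a hypothetical helper invented for this example, and the sample text is made up. It keeps the text as unicode internally, encodes it once when the response is built, and declares the same charset in the content-type header.

```python
# Hypothetical helper for illustration only; it is not part of Zope
# or ZPublisher and does not reproduce their internals.
def render_response(body_text, charset="utf-8"):
    """Encode unicode text once, at the point where it leaves the application.

    Returns a (headers, payload) pair: the header declares the charset,
    and the payload holds the bytes produced with that same charset.
    """
    headers = {"content-type": "text/html;charset=%s" % charset}
    payload = body_text.encode(charset)
    return headers, payload


# The text stays unicode until the response is built.
headers, payload = render_response(u"<p>caf\u00e9</p>")
```

Because the header and the bytes are produced from the same charset argument, they cannot disagree, and no process-wide default encoding has to be changed.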
-- Toby Dickenson - http://www.geminidataloggers.com/people/tdickenson
Want a job like mine? http://www.geminidataloggers.com/jobs for Software Engineering jobs at Gemini Data Loggers in Chichester, West Sussex, England