unicode, zmi, utf-8, management_page_charset and default-zpublisher-encoding
Hi *,

I have spent hours reading documentation, blogs and mails, and still have not got my head around the whole unicode issue. This is what I would like to do:

1. Have content and forms displayed using zpts, encoded in utf-8.
2. Get the data back into zope as unicode.
3. Store unicode in properties of objects, e.g. folders.
4. Edit those properties in the ZMI.
5. Search them in the catalog.

I have documented my findings so far at http://www.baach.de/content/unicode, basically using:

1. setHeader('Content-Type','text/html;; charset=utf-8')
2. marking form fields (firstname:ustring:utf8)
3. using ustrings as properties.

This works fine, but what do I do about the title property of objects, e.g. a folder? It seems that I can't change it to anything else, because once it is deleted I can't replace it with a ustring.

I found that there are default-zpublisher-encoding and management_page_charset, but these make ustring etc. vanish from the ZMI. So, should I use management_page_charset and default-zpublisher-encoding instead of ustrings for storing data?

I can't find anything that explains those six components (setHeader, form-field-ustring:utf8, ustrings, management_page_charset, default-zpublisher-encoding, title property) together, but I would really like to understand how it all works, preferably across Zope versions. I am very willing to write a tutorial on it, so I am happy about any pointers.

I am currently testing with Zope 2.8.5-final + Python 2.4.4, and am in the process of testing with the newest 2.10.x.

Cheers,
Joerg
Joerg Baach wrote at 2007-5-23 16:13 +0100:
... 1. Have content and forms displayed using zpts, encoded in utf-8.
That's easy: you store content in "utf-8" and have your ZPTs encoded in "utf-8" and tell the HTTP clients that you deliver "utf-8" by a 'RESPONSE.setHeader("Content-Type", "text/html; charset=utf-8")'.
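As a rough sketch of that recipe: the charset announced in the header and the encoding of the bytes actually sent have to agree. This is not Zope code. `FakeResponse` and `render_page` are hypothetical stand-ins for Zope's RESPONSE object and a page template, and the snippet is written for present-day Python 3 rather than the Python 2.4 discussed in this thread.

```python
class FakeResponse:
    """Hypothetical stand-in for Zope's RESPONSE; it only mimics setHeader."""

    def __init__(self):
        self.headers = {}

    def setHeader(self, name, value):
        # Store headers case-insensitively, keyed by lower-cased name.
        self.headers[name.lower()] = value


def render_page(response, text, charset="utf-8"):
    """Announce the charset and encode the body with that same charset."""
    response.setHeader("Content-Type", "text/html; charset=%s" % charset)
    return text.encode(charset)


response = FakeResponse()
body = render_page(response, "<p>J\u00f6rg</p>")
```

If the header named one charset while the body was encoded in another, the browser would display garbage for every non-ASCII character.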
2. Get the data back into zope as unicode.
That does not work. At system boundaries (e.g. between the browser and Zope), you always get some encoding. Usually, the browser uses the same encoding as the one it found on the page delivering the form.
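To illustrate the boundary: what crosses the wire is bytes in some encoding, and turning them back into unicode is an explicit decoding step that has to use the encoding the browser chose. The sketch below uses only the Python 3 standard library. `browser_submit` and `decode_field` are hypothetical helper names, not Zope or browser APIs.

```python
from urllib.parse import quote, unquote_to_bytes


def browser_submit(value, page_charset):
    """Mimic a browser: encode the field in the page's charset, then percent-encode."""
    return quote(value.encode(page_charset))


def decode_field(raw, charset):
    """Server side: undo the percent-encoding, then decode the bytes to text."""
    return unquote_to_bytes(raw).decode(charset)


wire = browser_submit("J\u00f6rg", "utf-8")   # what travels over HTTP
good = decode_field(wire, "utf-8")            # decoded with the matching charset
bad = decode_field(wire, "latin-1")           # decoded with the wrong charset
```

Decoding with the matching charset recovers the original name, while decoding the same bytes as latin-1 silently produces mojibake instead of an error.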
3. Store unicode in properties of objects, e.g. folders.
You use the "u*" form for the properties. The "u" stands for "unicode". There are "ustring", "utext", "ulines" and "utokens".
4. Edit those properties in the ZMI.
You set the "management_page_charset" to "utf-8" and you will be able to edit your "u*" properties through the ZMI. Almost surely, other textual properties should be encoded in "utf-8".
5. Search them in the catalog.
No problem, as long as you take care that you do not mix unicode and non-unicode (with non-ASCII characters) in the same index.

-- Dieter
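A small sketch of why mixing matters, assuming an index has to compare and sort the values it holds. `to_text` is a hypothetical helper, not part of the catalog. In Python 3 the mix fails with a TypeError; on the Python 2 versions discussed in this thread the symptoms differ, but the remedy of normalizing everything to one type before indexing is the same.

```python
def to_text(value, encoding="utf-8"):
    """Return value as text, decoding byte strings with the given encoding."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# One value is an encoded byte string, the others are unicode text.
mixed = ["Z\u00fcrich", "K\u00f6ln".encode("utf-8"), "Aachen"]

try:
    sorted(mixed)
    mixed_sortable = True
except TypeError:
    # Text and bytes cannot be ordered against each other in Python 3.
    mixed_sortable = False

# Normalizing to text first gives the index one consistent type to compare.
normalized = sorted(to_text(v) for v in mixed)
```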
participants (2): Dieter Maurer, Joerg Baach