[Zope] Re: Zope iso-8859-1 to utf-8

Pascal Peregrina Pperegrina at Lastminute.com
Tue Sep 13 15:36:22 EDT 2005


Thanks Dieter, you are right, I was a little confused.

I think the reason my pages have not broken so far is that we use HTML
entities for non-ASCII characters...

I will have to convert everything in a rush now to avoid issues :(



-----Original Message-----
From: Dieter Maurer <dieter at handshake.de>
To: Pascal Peregrina <Pperegrina at Lastminute.com>
CC: 'Max M' <maxm at mxm.dk>; zope at zope.org <zope at zope.org>
Sent: Tue Sep 13 19:10:08 2005
Subject: RE: [Zope] Re: Zope iso-8859-1 to utf-8

Pascal Peregrina wrote at 2005-9-13 14:21 +0100:
>I see... And what Python function would you use for the conversion?

   unicode(iso_string, 'iso-8859-1').encode('utf-8')
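
On Python 3, where the `unicode` builtin no longer exists, the same
two-step conversion can be sketched as below. The helper name
`iso_to_utf8` and the sample bytes are only illustrative assumptions,
not part of Zope or Python.

```python
def iso_to_utf8(iso_bytes: bytes) -> bytes:
    """Re-encode ISO-8859-1 bytes as UTF-8 (hypothetical helper)."""
    # Step 1: decode the raw bytes into a Unicode string.
    # Step 2: encode that string into the target charset.
    return iso_bytes.decode('iso-8859-1').encode('utf-8')


# b'caf\xe9' is the word "cafe" with an e-acute, encoded as ISO-8859-1.
converted = iso_to_utf8(b'caf\xe9')
print(converted)
```

Pure ASCII input comes out unchanged, so running it over data that is
already ASCII-only is harmless.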

>I ran some tests and was surprised by the results...
>I switched ZMI to UTF-8 (management_page_charset) and edited some of my
>documents / properties and all went fine.

Strange. I would have expected non-ASCII characters to be displayed
incorrectly.

>The generated documents are still sent to browsers as iso-8859-1, and are
>not broken.

If you switched to "utf-8", then *you* should ensure that
they are sent as "utf-8".
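
To illustrate why the declared charset matters, here is a small sketch
(Python 3 syntax; the sample word is my own choice) of what effectively
happens when UTF-8 bytes are interpreted as iso-8859-1:

```python
# "cafe" with an e-acute (U+00E9), written with an escape so the
# example does not depend on the source file's own encoding.
text = 'caf\u00e9'

# What gets sent once the content is stored as UTF-8.
utf8_bytes = text.encode('utf-8')

# What a client sees if those bytes are labelled iso-8859-1:
# each of the two UTF-8 bytes becomes a separate character.
misread = utf8_bytes.decode('iso-8859-1')

print(repr(text), repr(misread), len(text), len(misread))
```

ASCII-only text survives this mismatch unchanged, which may be why
pages that rely on HTML entities show no visible damage.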

>So my question would be: which valid UTF-8 characters (for typical Western
>languages like English, French, Spanish, ...) would be invalid in
>iso-8859-1?

This is a strange question...
The problem does not lie with the characters but with their byte codes.

The byte codes agree between UTF-8 and iso-8859-1 for precisely the
ASCII characters (Unicode code points 0-127). Code points
128-255 use 2 bytes in UTF-8 but 1 byte in "iso-8859-1". Code points
256 and up can be encoded in "UTF-8" but not in "iso-8859-1".
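
The three ranges can be checked with a short sketch (Python 3 syntax).
The helper `encoded_lengths` and the three sample characters are
illustrative assumptions, one character picked from each range:

```python
def encoded_lengths(ch):
    """Return (UTF-8 byte count, ISO-8859-1 byte count or None)."""
    utf8_len = len(ch.encode('utf-8'))
    try:
        latin1_len = len(ch.encode('iso-8859-1'))
    except UnicodeEncodeError:
        # The character has no representation in ISO-8859-1.
        latin1_len = None
    return utf8_len, latin1_len


samples = [
    ('A', 'code point 65, ASCII'),
    ('\u00e9', 'code point 233, in the 128-255 range'),
    ('\u20ac', 'code point 8364, above 255'),
]

for ch, label in samples:
    print(label, encoded_lengths(ch))
```

Only the first sample yields identical bytes in both encodings; the
second has the same code point but a different byte sequence, and the
third cannot be written in iso-8859-1 at all.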

> ...
>Last thing, if ZMI is switched to UTF-8, then what is the difference
>between ustring/string, etc properties ?

"ustring" is a unicode string: stored inside Zope as unicode,
sent to the browser UTF-8 encoded and expected to come back
UTF-8 encoded.

"string" is a plain (non-unicode) string. It should use
the encoding of your page (UTF-8, once you have switched to UTF-8).

-- 
Dieter




