Re: [Zope] Zope 2.6.1 and UTF-8

10 Sep 2003

Most of this discussion is over my head. But there's one pretty basic
misunderstanding exhibited:
...
...
...
...
I've got some stuff that's in strings, so I guess not unicode, but
which is UTF-8 encoded, and I'm wondering how I make sure Zope does
"the right thing" here. Are there any docs about?
and
...
...
Hmmm, that's interesting. I'd been planning on keeping everything as
UTF-8 encoded strings rather than actual unicode. What leads you to
suggest storing everything as unicode?
and
...
...
Finally, is ZCTextIndex compatible with either unicode or strings that
contain UTF-8 encoding?

UTF-8 is one way of encoding the Unicode character set. They are not
different things: when you use UTF-8 you are using Unicode.
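
To make the point concrete, here is a minimal sketch. It is written for
current Python 3, where `bytes` plays the role of the 8-bit strings
discussed in this thread and `str` the role of unicode objects; the sample
text is just an illustration, not anything from the original question.

```python
# Unicode text: a sequence of code points, independent of any encoding.
text = "na\u00efve caf\u00e9"

# The same text serialized as UTF-8 bytes ("a string that is UTF-8 encoded").
utf8_bytes = text.encode("utf-8")

# Decoding the bytes recovers exactly the same Unicode text.
round_trip = utf8_bytes.decode("utf-8")
same_text = (round_trip == text)

# Characters and bytes are counted differently: each accented letter
# here takes two bytes in UTF-8.
char_count = len(text)
byte_count = len(utf8_bytes)
```

So a UTF-8 string and a unicode object hold the same characters; the
practical difference is whether the program sees bytes or characters when
it slices, counts, or indexes.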

UTF-8 exists to allow systems to migrate gently, because it translates a
very large character set into a format that will not normally break file
systems which expect 8-bit character data. There are also 16-bit and
32-bit representations of Unicode (UTF-16 and UTF-32).
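
As a rough illustration of those alternative representations, the sketch
below encodes one character three ways using Python 3's standard codecs.
The choice of the euro sign and of the big-endian codec variants is mine,
made only so that the byte values are easy to read.

```python
ch = "\u20ac"  # EURO SIGN, code point U+20AC

utf8 = ch.encode("utf-8")       # variable width, built from 8-bit units
utf16 = ch.encode("utf-16-be")  # 16-bit units, big-endian, no byte-order mark
utf32 = ch.encode("utf-32-be")  # 32-bit units, big-endian, no byte-order mark

# Same character, three different byte sequences.
sizes = (len(utf8), len(utf16), len(utf32))
```

All three decode back to the same character; only the byte layout differs.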

UTF-8's representation of ASCII is identical to ASCII's. So for
applications which internally process only ASCII, the encoding is moot.
But if you have user input you need to watch out: browsers on Windows and
MacOS support UTF-8 for form input, and non-ASCII input can seriously
break old apps.
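
A small sketch of both halves of that paragraph, again in Python 3. The
form value and the two helper functions are hypothetical; they only mimic
an application that assumes ASCII and one that decodes UTF-8, and are not
Zope code.

```python
# For pure ASCII text, the UTF-8 bytes and the ASCII bytes are identical.
ascii_text = "Zope"
ascii_same = ascii_text.encode("utf-8") == ascii_text.encode("ascii")

# Hypothetical form input, as a browser might submit it in UTF-8.
form_input = "Z\u00fcrich".encode("utf-8")


def decode_as_ascii(raw):
    """Mimic an old app that assumes every incoming byte is ASCII."""
    try:
        return raw.decode("ascii")
    except UnicodeDecodeError:
        # The first non-ASCII byte makes the decode fail.
        return None


def decode_as_utf8(raw):
    """Decode the input as UTF-8, which also accepts plain ASCII."""
    return raw.decode("utf-8")


old_app_result = decode_as_ascii(form_input)
new_app_result = decode_as_utf8(form_input)
```

The ASCII-only path fails on the first non-ASCII byte, while the UTF-8
path handles both kinds of input.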

best

Mark Barratt