[Zope] Zope 2.6.1 and UTF-8
Mark Barratt
markb at textmatters.com
Wed Sep 10 17:56:54 EDT 2003
Most of this discussion is over my head. But there's one pretty basic
misunderstanding exhibited:
>>>> I've got some stuff that's in strings, so I guess not unicode, but
>>>> which is UTF-8 encoded, and I'm wondering how I make sure Zope does
>>>> "the right thing" here. Are there any docs about?
and
>> Hmmm, that's interesting. I'd been planning on keeping everything as
>> UTF-8 encoded strings rather than actual unicode. What leads you to
>> suggest storing everything as unicode?
and
>> Finally, is ZCTextIndex compatible with either unicode or strings that
>> contain UTF-8 encoding?
UTF-8 is one way of encoding the Unicode character set. They are not
different things: when you use UTF-8 you are using Unicode.
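
To make that concrete, here is a minimal sketch. It assumes a modern
Python 3 interpreter rather than the Python 2 that Zope 2.6.1 runs on
(there the corresponding types are `unicode` and `str`). The text is a
sequence of Unicode code points, and UTF-8 is just one way of writing
those code points out as bytes.

```python
# Python 3 sketch (an assumption; Zope 2.6.1 itself runs on Python 2).
text = "caf\u00e9"                  # four Unicode code points: c, a, f, e-acute
utf8_bytes = text.encode("utf-8")   # the same text serialised as UTF-8

# The e-acute needs two bytes in UTF-8, so there are five bytes in total.
char_count = len(text)
byte_count = len(utf8_bytes)

# Decoding the bytes gives back exactly the same Unicode text.
roundtrip = utf8_bytes.decode("utf-8")
```

The bytes and the text carry the same characters; only the
representation differs.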
UTF-8 exists to allow systems to migrate gently, because it translates a
very large character set into a format that will not normally break file
systems which expect 8-bit character data. There are also 16-bit and
32-bit representations of Unicode (UTF-16 and UTF-32).
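
A rough illustration of the difference, again assuming Python 3: the
same character encoded three ways. The big-endian codec names are chosen
here only so that no byte-order mark is added to the output.

```python
# One character, three encodings of the same Unicode code point.
euro = "\u20ac"                       # EURO SIGN, U+20AC

utf8 = euro.encode("utf-8")           # variable width, 8-bit units
utf16 = euro.encode("utf-16-be")      # 16-bit units
utf32 = euro.encode("utf-32-be")      # 32-bit units

# A plain ASCII letter in a 16-bit encoding contains a zero byte,
# which is the kind of thing that upsets software expecting 8-bit text.
letter_utf16 = "A".encode("utf-16-be")
letter_utf8 = "A".encode("utf-8")
```

UTF-8 never emits a zero byte for anything except the NUL character
itself, which is a large part of why it passes through older 8-bit
systems unharmed.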
UTF-8's representation of ASCII is identical to ASCII's. So for
applications which internally process only ASCII, the encoding is moot.
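
A quick check of that claim, sketched in Python 3 (an assumption, as
above; the sample string is arbitrary):

```python
# For pure ASCII text, the UTF-8 bytes and the ASCII bytes are identical.
ascii_text = "Zope 2.6.1"

as_ascii = ascii_text.encode("ascii")
as_utf8 = ascii_text.encode("utf-8")
same_bytes = as_ascii == as_utf8

# Every byte stays below 0x80, so nothing looks "8-bit" at all.
all_seven_bit = all(b < 0x80 for b in as_utf8)
```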
But if you have user input you need to watch out: browsers on Windows
and MacOS support UTF-8 in form input, and that input can seriously
break old apps which expect only ASCII.
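
One defensive approach, sketched in Python 3 under the assumption that
the form data arrives as raw bytes: decode it strictly as UTF-8 at the
boundary and reject anything that does not decode. The function name
`decode_form_value` is hypothetical, not part of Zope.

```python
def decode_form_value(raw: bytes) -> str:
    """Decode one submitted form value, insisting on valid UTF-8.

    Hypothetical helper for illustration; not a Zope API.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        # Fail loudly at the boundary instead of letting bad bytes
        # travel further into the application.
        raise ValueError(f"form value is not valid UTF-8: {exc}") from exc


# An app that assumes ASCII fails on the very same input.
submitted = b"caf\xc3\xa9"
try:
    submitted.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
```

Decoding once on the way in means the rest of the code deals with text
rather than with bytes of uncertain encoding.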
best
Mark Barratt