[Zope-dev] RFV: Unicode in Zope 2
Jim Fulton
jim at zope.com
Tue Dec 13 09:46:17 EST 2005
Martijn Faassen wrote:
> Jim Fulton wrote:
>
>> I forgot a very important need:
>>
>> - Common approach to Unicode
>>
>> In particular, in Zope 3, all text is stored and managed as Unicode.
>> The publisher decodes request data and encodes response data. The vast
>> majority of application and library code can ignore encoding issues.
>> (The exceptions are applications and frameworks that need to exchange
>> text with non-Unicode-aware external systems.) This has provided
>> great simplifications and allowed us to avoid common pitfalls from
>> mixing Unicode and encoded text.
>>
>> We need to migrate Zope 2 to use a similar strategy. We need volunteers
>> to brainstorm how this can be done and make one or more proposals.
>> This is likely a prerequisite for finishing the publisher and ZPT
>> work.
>
>
> This is definitely a scary topic, and I speak from years of experience
> with Zope 2 unicode here. This sounds like a very hard transition that
> would touch *a lot* of code in non-Zope 2 core. How do you envision all
> the form inputs to suddenly produce unicode strings, for instance?
>
> We've struggled hard with Formulator to make it work with unicode for
> instance (and still it's buggy, as I wanted to support the non-unicode
> scenarios too). I can imagine any system in Zope that uses forms at all
> would need to be touched.
>
> I'll volunteer to help brainstorm on this, but right now my brainstorm
> is only very dark and full of lightning.
You and I brainstormed this a few months ago. I think this was on the
list. I think that, for starters, we would arrange that all Zope 3
views used in Zope 2 would get unicode input. If you like, I can try
to find this discussion. :)
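
To make the "decode on the way in, encode on the way out" idea concrete, here is a minimal sketch. It is not Zope code: decode_form, greeting_view and publish are hypothetical names invented for illustration, the charset default is an assumption, and it is written for Python 3 (where text is str and encoded data is bytes) rather than the Python 2 of this thread.

```python
# Hypothetical sketch only: none of these names are Zope APIs.

def decode_form(raw_form, charset="utf-8"):
    """Decode every byte-string field of a request form to text, once."""
    return {
        name: value.decode(charset) if isinstance(value, bytes) else value
        for name, value in raw_form.items()
    }


def greeting_view(form):
    """Application code: receives and returns text, never encoded bytes."""
    return "Hello, " + form["name"] + "!"


def publish(raw_form, view, charset="utf-8"):
    """Decode input at the boundary, call the view, encode output once."""
    body = view(decode_form(raw_form, charset))
    return body.encode(charset)


# The form arrives as encoded bytes; the view never sees them.
response = publish({"name": "Zoë".encode("utf-8")}, greeting_view)
```

The point of the shape is that only publish knows about the charset; the view can ignore encoding issues entirely.
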
> Anyway, in some basics, Zope 2 does have an approach to unicode for
> *output* that's fairly similar to Zope 3's: if you use unicode strings
> your entire output (including page templates) will be unicode (if you
> don't mix with non-unicode non-ascii strings..). Then the response
> encoding setting is read and everything is encoded once into that
> encoding. Silva uses this. It also struggles to make sure all its input is
> transformed to unicode (among other ways using Formulator).
>
> In Plone, the situation is quite different -- its
> PlacelessTranslationService monkeypatches into the page template engine
> and puts in ways so that you can mix UTF-8 and unicode strings together.
> This then goes on to break assumptions of code that uses the page
> template engine in a unicode-pure environment (which is what happened to
> Silva).
Ick.
I'm not suggesting this is easy. We may have some messy deprecation
and backward compatibility code. But we *do* need to solve this problem
eventually, and the solution doesn't get any closer without taking steps.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org