[Zope-CMF] Unicode for ReST?
Charlie Clark
charlie.clark at clark-consulting.eu
Mon Apr 26 16:05:42 EDT 2010
On 26.04.2010 at 11:24, yuppie <y.2010 at wcm-solutions.de> wrote:
> Actually *all* strings passed to PageTemplates should be decoded, no
> matter which browser you use. That's the only sane way to mix encoded
> strings with unicode strings.
'Tis true, but those are still most pathetic as it means they get offered
Latin-1.
>> I looked a bit into the system and saw that we still use ReST in a very
>> Wallace & Gromit way: ReST encodes the generated HTML using the default
>> encoding from zope.conf and we promptly decode it back to unicode every
>> time we want to display it, and make sure default-encoding and
>> rest-encoding match. Adding output='unicode' to Document's CookedBody()
>> removes the double-encoding and doesn't break any tests. Would it be
>> okay to add this for Document and News objects and adjust the views
>> accordingly?
> Not sure I understand what you propose. Would that mean calling
> CookedBody(output='unicode') converts the persistent cooked_text to
> unicode and calling CookedBody() converts it back?
Sorry, that was a very poor explanation on my part. The underlying
conversion from ReST to HTML can accept an output_encoding:
    def HTML(src,
             writer='html4css1',
             report_level=1,
             stylesheet=None,
             input_encoding=default_input_encoding,
             output_encoding=default_output_encoding,
             language_code=default_language_code,
             initial_header_level = initial_header_level,
             warnings = None,
             settings = {}):
And later on:
        if output_encoding != 'unicode':
            return output.encode(output_encoding)
        else:
            return output
So it is really quite braindead not to add output_encoding='unicode' to
the ReST call in Document.py.
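
To make the double conversion concrete, here is a minimal sketch of the
branch quoted above. The function name finish() and the sample string are
made up for illustration and are not part of Zope or CMF. It is written
for Python 3, where str is the unicode type and bytes is the encoded
string (in Python 2 these would be unicode and str).

```python
# Hypothetical stand-in for the tail end of the HTML() function quoted above.
def finish(output, output_encoding='utf-8'):
    if output_encoding != 'unicode':
        return output.encode(output_encoding)
    return output


rendered = '<p>D\u00fcsseldorf</p>'

# Current behaviour: the renderer encodes, the view decodes again.
encoded = finish(rendered, 'utf-8')
round_tripped = encoded.decode('utf-8')

# Proposed behaviour: ask for unicode and skip both conversions.
direct = finish(rendered, 'unicode')

# If the two configured encodings differ, the decode step can fail.
try:
    finish(rendered, 'latin-1').decode('utf-8')
    mismatch_detected = False
except UnicodeDecodeError:
    mismatch_detected = True
```

The round trip only gives back the original text when both encodings
agree, which is why default-encoding and rest-encoding currently have to
match.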
> CookedBody() is meant to *get* the cooked body. It only updates
> cooked_text if you use a new STX or ReST level. (BTW a nasty
> write-on-read.)
Yes, probably more important to fix these warts.
> _edit() normally *sets* cooked_text.
> On interface level, I think we can explicitly allow CookedBody() to
> return encoded strings *or* unicode. I'd prefer that strategy over
> adding an 'output' argument to all get methods.
> On implementation level, content types shipped with CMF could always set
> cooked_text as unicode.
> The most work would be to write an upgrade step (including tests) that
> works reliably. So far we don't have any upgrade steps that update
> content items.
Okay. We'll see how it goes.
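
As a rough idea of what yuppie's suggestion might look like, here is a
sketch under heavy assumptions: FakeDocument, to_unicode() and
upgrade_cooked_text() are hypothetical names, the encoding argument would
have to come from the site's configuration, and nothing here touches the
real CMF or GenericSetup APIs. It is Python 3 syntax, so bytes stands for
the encoded string.

```python
def to_unicode(value, encoding='utf-8'):
    """Return text unchanged; decode byte strings with the given encoding."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


class FakeDocument(object):
    """Hypothetical stand-in for a content item, not the real CMF Document."""

    def __init__(self, cooked_text):
        self.cooked_text = cooked_text

    def CookedBody(self):
        # May return an encoded string or unicode, as proposed above.
        return self.cooked_text


def upgrade_cooked_text(items, encoding='utf-8'):
    """Decode encoded cooked_text in place; return how many items changed."""
    changed = 0
    for item in items:
        if isinstance(item.cooked_text, bytes):
            item.cooked_text = item.cooked_text.decode(encoding)
            changed += 1
    return changed


docs = [FakeDocument(b'<p>caf\xc3\xa9</p>'), FakeDocument('<p>plain</p>')]
first_run = upgrade_cooked_text(docs)
second_run = upgrade_cooked_text(docs)
```

Callers that cannot rely on the upgrade having run could wrap
CookedBody() in to_unicode(). Skipping items that are already unicode
keeps the step safe to run more than once.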
Charlie
--
Charlie Clark
Managing Director
Clark Consulting & Research
German Office
Helmholtzstr. 20
Düsseldorf
D- 40215
Tel: +49-211-600-3657
Mobile: +49-178-782-6226