[Zope-dev] zope.publisher and ZPublisher: decoding form input
yuppie
y.2011 at wcm-solutions.de
Mon Mar 7 03:58:46 EST 2011
Hi!
As discussed in a different thread, zope.publisher compatible decoding
should be added to the ZPublisher.
But does that code from zope.publisher make any sense?
    def _decode(self, text):
        """Try to decode the text using one of the available charsets."""
        if self.charsets is None:
            envadapter = IUserPreferredCharsets(self)
            self.charsets = envadapter.getPreferredCharsets() or ['utf-8']
        for charset in self.charsets:
            try:
                text = unicode(text, charset)
                break
            except UnicodeError:
                pass
        return text
Using getPreferredCharsets()[0] is correct because zope.publisher uses
the same charset for encoding responses. (For ZPublisher we decided we
don't want to support charset negotiation.) But what about the other
charsets?
AFAICS
- There are no tests in zope.publisher for that fallback behavior.
- That fallback behavior doesn't cause trouble because it is very rarely
or never used.
- The fact that no error is raised by unicode(text, charset) doesn't mean
we have the right charset. Here is some background information:
http://chardet.feedparser.org/docs/index.html
- Returning the encoded strings if all attempts fail might not be the
best choice.
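
The last two points can be illustrated with a minimal sketch. It is written for Python 3, where data.decode(charset) plays the role of unicode(text, charset). The function name decode_with_fallback and the charset lists are made up for illustration; this is not the zope.publisher code itself.

```python
def decode_with_fallback(data, charsets):
    """Mimic the fallback loop: the first charset that decodes without error wins."""
    for charset in charsets:
        try:
            return data.decode(charset)
        except UnicodeError:
            pass
    # All attempts failed: the caller gets the undecoded bytes back.
    return data


raw = "caf\u00e9".encode("utf-8")  # the UTF-8 bytes b'caf\xc3\xa9'

# ISO-8859-1 maps every byte to a character, so decoding never raises,
# even though the data was actually UTF-8 and the result is mojibake.
wrong = decode_with_fallback(raw, ["iso-8859-1", "utf-8"])

# If no charset fits, the return type silently changes from str to bytes.
undecoded = decode_with_fallback(b"\xff\xfe\xfd", ["ascii", "utf-8"])
```

Here wrong is a text string, but not the one the client sent, and undecoded is still a byte string, so callers cannot rely on the return type.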
Proposal:
Just use unicode(text, charset, 'replace') with the same charset used
for encoding responses.
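
A minimal sketch of what that would look like, again in Python 3 terms (data.decode(charset, 'replace') instead of unicode(text, charset, 'replace')). The function name decode_form_value and the 'utf-8' default are assumptions for illustration, not existing ZPublisher API:

```python
def decode_form_value(data, charset="utf-8"):
    """Decode with a single charset; undecodable bytes become U+FFFD."""
    return data.decode(charset, "replace")


# Input that matches the response charset decodes cleanly.
ok = decode_form_value("caf\u00e9".encode("utf-8"))

# ISO-8859-1 bytes are not valid UTF-8: the bad byte is replaced,
# but the result is always text and no exception is raised.
broken = decode_form_value(b"caf\xe9")
```

The result is always a text string, and wrongly encoded input shows up as a visible replacement character instead of being guessed at.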
If there are no objections, I'll implement it that way in ZPublisher.
What about zope.publisher? I don't use zope.publisher, but I think it
should always use 'utf-8' instead of trying to be smart.
Cheers,
Yuppie