[zope2-tracker] [Bug 530620] [NEW] UnicodeDecodeError when using IE, Safari
Ole Christian Helset
ochelset at gmail.com
Tue Mar 2 06:35:00 EST 2010
Public bug reported:
Using Zope 2.11.5 with default-zpublisher-encoding set to utf-8, rendering
content that contains a utf-8 string fails in IE and Safari, because these
browsers (at the time of writing) don't send the Accept-Charset header.
In http.py (zope/publisher/http.py), the
HTTPCharsets.getPreferredCharsets() method then returns an empty list,
causing a UnicodeDecodeError in Zope when a tal:content string contains
a utf-8 encoded string with, e.g., Norwegian characters (ø -> \xc3\xb8).
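The report does not include the traceback, so the following is only a sketch of the likely mechanism, on the assumption that the failure is the usual Python 2 implicit ASCII decode that happens when a utf-8 byte string is combined with unicode. It is written for Python 3, where the decode has to be spelled out explicitly:

```python
# Assumption: the error comes from decoding utf-8 bytes with the ASCII codec,
# which is what Python 2 does implicitly when str and unicode are mixed.
title = 'P\u00f8lse'               # "Pølse" as a text string
utf8_bytes = title.encode('utf-8')  # b'P\xc3\xb8lse'

error = None
try:
    utf8_bytes.decode('ascii')      # 0xc3 is not valid ASCII
except UnicodeDecodeError as exc:
    error = exc

# Decoding with the matching codec round-trips without error.
roundtrip = utf8_bytes.decode('utf-8')
```

Here `error` ends up holding the UnicodeDecodeError raised at byte position 1, and `roundtrip` equals the original title.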
I made a simple test: a default page template with a title containing such a character (e.g. Pølse):
<html>
<head>
<meta http-equiv="content-type" content="text/html;charset=utf-8">
</head>
<body>
<tal:block content="python:repr(template.title)" /><br />
<tal:block content="python:repr(template.title.encode('latin-1'))" /><br />
<tal:block content="python:repr(template.title.encode('utf-8'))" /><br />
<tal:block content="python:title" define="title python:template.title" /><br />
<tal:block content="python:title" define="title python:template.title.encode('utf-8')" /><br />
</body>
</html>
In Firefox the output is fine:
u'P\xf8lse'
'P\xf8lse'
'P\xc3\xb8lse'
Pølse
Pølse
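For reference, the three repr lines above correspond to the following encodings. This is a Python 3 sketch (the report itself uses Python 2, where the first value prints with a u prefix and the byte strings without the b prefix):

```python
title = 'P\u00f8lse'              # "Pølse"; ø is U+00F8

latin1 = title.encode('latin-1')  # one byte for ø
utf8 = title.encode('utf-8')      # two bytes for ø

print(ascii(title))   # 'P\xf8lse'
print(repr(latin1))   # b'P\xf8lse'
print(repr(utf8))     # b'P\xc3\xb8lse'
```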
In IE and Safari it raises a UnicodeDecodeError.
If HTTPCharsets.getPreferredCharsets() returns ['utf-8'], it works fine
in IE and Safari as well.
My changes to http.py:
 from zope.publisher.base import RequestDataGetter
+from ZPublisher import Converters
 ...
         # Quoting RFC 2616, $14.2: If no "*" is present in an Accept-Charset
         # field, then all character sets not explicitly mentioned get a
         # quality value of 0, except for ISO-8859-1, which gets a quality
         # value of 1 if not explicitly mentioned.
         # And quoting RFC 2616, $14.2: "If no Accept-Charset header is
         # present, the default is that any character set is acceptable."
         if not sawstar and not sawiso88591 and header_present:
-            charsets.append((1.0, 'iso-8859-1'))
+            charsets.append((1.0, Converters.default_encoding))
         # UTF-8 is **always** preferred over anything else.
         # Reason: UTF-8 is not specific and can encode the entire unicode
         # range , unlike many other encodings. Since Zope can easily use very
         # different ranges, like providing a French-Chinese dictionary, it is
         # always good to use UTF-8.
         charsets.sort(sort_charsets)
         charsets = [charset for quality, charset in charsets]
-        if sawstar and 'utf-8' not in charsets:
+        if not sawstar and 'utf-8' not in charsets: # IS THIS BAD, TO FORCE IN UTF-8???
             charsets.insert(0, 'utf-8')
The question, then: is it a problem to force utf-8 (or the
default-zpublisher-encoding) here when HTTP_ACCEPT_CHARSET is missing
from the request?
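To make the question concrete, here is a simplified, standalone Python 3 sketch of the logic in the diff. It is not the Zope code: the function name preferred_charsets, the patched flag and the default_encoding parameter are hypothetical, the header parsing is reduced to the basics, and a plain sort by quality stands in for sort_charsets.

```python
def preferred_charsets(accept_charset, patched=False, default_encoding='utf-8'):
    """Simplified stand-in for the logic in the diff (hypothetical helper)."""
    charsets = []
    sawstar = sawiso88591 = False
    header_present = bool(accept_charset)

    for item in (accept_charset or '').split(','):
        item = item.strip().lower()
        if not item:
            continue
        name, _, param = item.partition(';')
        name, param = name.strip(), param.strip()
        quality = 1.0
        if param.startswith('q='):
            try:
                quality = float(param[2:])
            except ValueError:
                continue
        if quality == 0.0:
            continue  # q=0 means "not acceptable"
        if name == '*':
            sawstar = True
        if name == 'iso-8859-1':
            sawiso88591 = True
        charsets.append((quality, name))

    # First changed line of the diff: which charset is added implicitly.
    if not sawstar and not sawiso88591 and header_present:
        charsets.append((1.0, default_encoding if patched else 'iso-8859-1'))

    # Assumption: highest quality first; the real sort_charsets may differ.
    charsets.sort(key=lambda pair: -pair[0])
    names = [name for quality, name in charsets]

    # Second changed line of the diff: when utf-8 is forced to the front.
    force_utf8 = (not sawstar) if patched else sawstar
    if force_utf8 and 'utf-8' not in names:
        names.insert(0, 'utf-8')
    return names


no_header_original = preferred_charsets(None)                # []
no_header_patched = preferred_charsets(None, patched=True)   # ['utf-8']
```

Under these assumptions the patch does fix the missing-header case, but it also changes two other cases: a header of just "*" yields ['utf-8', '*'] with the original condition and only ['*'] with the patched one, and a header of just "iso-8859-1" yields ['iso-8859-1'] originally but ['utf-8', 'iso-8859-1'] when patched, so utf-8 is preferred even for a client that listed only iso-8859-1.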
** Affects: zope2
Importance: Undecided
Status: New
--
UnicodeDecodeError when using IE, Safari
https://bugs.launchpad.net/bugs/530620