[Zope3-dev] Re: zope.tal.xmlparser.XMLParser() dislikes unicode
Chris Withers
chris at simplistix.co.uk
Sun Jan 14 13:14:45 EST 2007
Dieter Maurer wrote:
> A halfway intelligent parser would accept Unicode when it gets it
> and concentrate on the remaining part of its task: either reporting
> structural events or building a parse tree.
The trivial fix I use in Twiddler is as follows:
    if isinstance(source, unicode):
        source = source.encode('utf-8')
Of course, this assumes an XML declaration of either <?xml version="1.0"
encoding="utf-8"?> or one with no encoding attribute, in which case the
XML spec says that, absent a byte order mark, the document must be
UTF-8 encoded.
The problem comes when someone sends you something like:
u'<?xml version="1.0" encoding="something-else"?><node />'
What should be done then?
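
One possible answer, sketched under assumptions rather than taken from Twiddler or zope.tal: a unicode string has already been decoded, so the declared encoding no longer describes it, and the declaration can be rewritten to utf-8 before encoding. The helper name to_utf8_bytes and the regular expression are hypothetical, and the sketch is written for Python 3, where str plays the role of Python 2's unicode.

```python
import re

# Hypothetical pattern: captures the encoding pseudo-attribute of an XML
# declaration at the very start of the text.
_DECL_ENCODING = re.compile(
    r'''^(<\?xml\b[^>]*?\bencoding\s*=\s*)(["'])([A-Za-z][A-Za-z0-9._-]*)\2'''
)


def to_utf8_bytes(source):
    """Return UTF-8 bytes whose XML declaration (if any) says utf-8."""
    if isinstance(source, bytes):
        # Already encoded; the declaration is assumed to be truthful.
        return source
    # Replace whatever encoding name was declared with utf-8, keeping
    # the original quote character.
    source = _DECL_ENCODING.sub(r'\g<1>\g<2>utf-8\g<2>', source, count=1)
    return source.encode('utf-8')


doc = to_utf8_bytes('<?xml version="1.0" encoding="something-else"?><node />')
```

The trade-off is that the original declaration is lost, which only matters if the parsed document is later written back out in its declared encoding.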
Chris
--
Simplistix - Content Management, Zope & Python Consulting
- http://www.simplistix.co.uk