[Zope3-dev] Re: zope.tal.xmlparser.XMLParser() dislikes unicode
Andreas Jung
lists at zopyx.com
Mon Jan 15 07:42:00 EST 2007
--On 15 January 2007 13:26:16 +0100 Martijn Faassen
<faassen at startifact.com> wrote:
>
> How would you propose to parse the following unicode string?
>
> u'<?xml version="1.0" encoding="ISO-8859-1"?><foo />'
If your parser is unicode-aware, then the encoding named in the preamble
does not matter: you already have unicode internally and can process
your document purely at the XML level.
If your parser isn't unicode-aware, then you will likely convert the input
to utf-8 and work internally with utf-8 encoded strings. In fact,
xml.parsers.expat seems to support unicode (it can return unicode strings
to the handlers, see the 'returns_unicode' property). However, you need to
reconstruct the XML preamble when you rebuild your XML from the
parsed data.
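
A minimal sketch of the "convert to utf-8" route, assuming Python 3 (where,
as far as I know, expat handlers always receive str objects and the
'returns_unicode' switch is gone). The helper names to_utf8_document and
parse_start_tags are made up for illustration and are not part of zope.tal
or expat. The idea is to rewrite the declared encoding so that it matches
the bytes actually handed to the parser:

```python
import re
from xml.parsers import expat

# Matches the encoding pseudo-attribute of a leading XML declaration.
_DECL_ENCODING = re.compile(
    r'^(<\?xml[^>]*?encoding\s*=\s*)(["\'])[A-Za-z][A-Za-z0-9._-]*\2'
)


def to_utf8_document(text):
    """Return UTF-8 bytes for text, with any declared encoding set to utf-8.

    Hypothetical helper: the declaration is rewritten because the original
    encoding name no longer describes the bytes we are about to produce.
    """
    fixed = _DECL_ENCODING.sub(r'\1\2utf-8\2', text, count=1)
    return fixed.encode("utf-8")


def parse_start_tags(text):
    """Parse a unicode XML string and return (name, attrs) per start tag."""
    events = []
    parser = expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: events.append((name, attrs))
    # The second argument marks this chunk as the final one.
    parser.Parse(to_utf8_document(text), True)
    return events


doc = '<?xml version="1.0" encoding="ISO-8859-1"?><foo name="caf\u00e9" />'
events = parse_start_tags(doc)
```

Here events holds one entry for the foo element, with the attribute value
decoded back to unicode even though the original preamble claimed
ISO-8859-1. When serializing again, the preamble has to be written out to
match whatever encoding is chosen for the output.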
Or am I missing something?
Andreas