[Zope3-dev] Re: zope.tal.xmlparser.XMLParser() dislikes unicode
Dieter Maurer
dieter at handshake.de
Tue Jan 16 14:06:32 EST 2007
Martijn Faassen wrote at 2007-1-15 15:44 +0100:
> ....
>Hey,
>
>On 1/15/07, Andreas Jung <lists at zopyx.com> wrote:
>[snip]
>> ok, got it. But this problem can be solved easily by changing the encoding
>> within the preamble.
>
>I would say refusing to guess and bailing out with an error message is
>better in this case.
I disagree with you.
Logically, parsing an encoded XML document consists of two
passes: decode the encoded string into unicode and reconstruct
the XML info elements from the serialization.
Traditionally, these two passes are not performed one after
the other but folded together in a single pass.
But that tradition should not prevent us from separating out the
(Unicode) decoding phase. And after this phase is done,
there is no ambiguity left with the "XML declaration".
Its encoding attribute is simply irrelevant for the second phase
(apart from generating the PI info element).
Thus, there is no guessing; someone else has just performed
the first phase of the complete process -- maybe using the
"encoding" attribute or some overriding information.
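
To illustrate the two phases, here is a minimal sketch in Python. It does
not use zope.tal; it uses the standard library's expat binding instead, and
the helper names decode_document and parse_text are made up for this
example. The caller performs phase one with whatever encoding information
it trusts, and phase two is told explicitly to ignore the declared encoding.

```python
import xml.parsers.expat


def decode_document(raw, encoding):
    """Phase 1 (hypothetical helper): decode the encoded bytes into unicode."""
    return raw.decode(encoding)


def parse_text(text):
    """Phase 2 (hypothetical helper): rebuild elements from decoded text."""
    events = []
    # Passing an explicit encoding to ParserCreate overrides the encoding
    # named in the XML declaration, so the declaration no longer matters.
    parser = xml.parsers.expat.ParserCreate(encoding="utf-8")
    parser.buffer_text = True  # deliver adjacent character data in one piece
    parser.StartElementHandler = lambda name, attrs: events.append(("start", name))
    parser.CharacterDataHandler = lambda data: events.append(("text", data))
    # expat consumes bytes, so hand it the unicode text in a known encoding.
    parser.Parse(text.encode("utf-8"), True)
    return events


# The declaration says iso-8859-1, and the bytes really are iso-8859-1.
raw = '<?xml version="1.0" encoding="iso-8859-1"?><a>\u00e9</a>'.encode("iso-8859-1")
text = decode_document(raw, "iso-8859-1")  # phase 1, done by "someone else"
events = parse_text(text)                  # phase 2, declaration ignored
parsed_text = "".join(data for kind, data in events if kind == "text")
```

In this sketch the declaration still claims iso-8859-1 while the parser
receives UTF-8 bytes, and the character data is nevertheless recovered
correctly because the override, not the declaration, decides the encoding.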
--
Dieter