On Fri, 23 Jul 2004 16:09:02 -0400, Passin, Tom <tpassin@mitretek.org> wrote:
> Don't want to harp on this, but the XML Rec does not agree with the notion of "delimiter in context". A CDATA section exists specifically to say "This may look like markup but it isn't". The XML 1.0 Rec says
That's why we like XML better than SGML. ;-) The problem here isn't the XML parser; that one's doing fine (it's Expat ;). The input is being treated as HTML intentionally (because that's what the content type is).
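
To illustrate the XML side of this, here is a minimal sketch, assuming Python 3 and the standard library's xml.parsers.expat bindings. The collect_text helper and the sample document are made up for illustration and are not taken from the report under discussion. It only shows that Expat hands the contents of a CDATA section back as character data instead of treating them as markup:

```python
import xml.parsers.expat


def collect_text(document):
    """Parse an XML string with Expat.

    Returns a tuple of (all character data joined together,
    list of element names that were opened).
    Hypothetical helper, for illustration only.
    """
    chunks = []
    started = []
    parser = xml.parsers.expat.ParserCreate()
    # Expat may deliver character data in several pieces, so collect them.
    parser.CharacterDataHandler = chunks.append
    parser.StartElementHandler = lambda name, attrs: started.append(name)
    parser.Parse(document, True)  # True: this is the final chunk of input
    return "".join(chunks), started


# The CDATA section contains text that looks like a tag and a bare ampersand.
text, elements = collect_text("<doc><![CDATA[<b>not markup</b> & more]]></doc>")
print(text)      # the CDATA contents, reported as plain character data
print(elements)  # only the real element is reported as a start tag
```

The "<b>" inside the CDATA section never reaches the start-element handler; only "doc" does. None of this says anything about how an SGML or HTML parser should treat the same bytes.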
> I'd say that is pretty definitive, wouldn't you? If there is some folk "knowledge" about CDATA sections built into the parser that thinks otherwise, I'd say the parser is non-conformant about CDATA sections (hmm, I almost wrote that "C-sections"!).
For XML, yes. But here we have SGML (in the form of HTML). Perhaps there's something in the SGML Handbook or the ISO spec, but those were outside my price range the last time I checked.
> Anyway, as I said I have seen this problem myself, and it occurred entirely within the browser (without any CDATA sections) - no server
You got an exception from the Python HTMLParser module from a browser? That's pretty interesting!

  -Fred

-- 
Fred L. Drake, Jr. <fdrake at gmail.com>