[Zope-dev] Re: Decoding of source for text/xml ZPTs
Tres Seaver
tseaver at palladion.com
Sat Oct 8 14:16:34 EDT 2005
Chris Withers wrote:
> During complication, the XML parser that processes non-HTML mode ZPT's
         ^
         +- "compilation", I'm guessing, but see below ;)
> decodes the string of the source into unicode instructions.
>
> In HTML mode, the parse does no decoding and so we get string instructions.
>
> My question as a result is: what characterset does the XML parser in
> non-HTML mode assume and can it be controlled in any way?
XML is UTF-8, unless specified otherwise in the top-level
processing-directive-like thingy (the "xml declaration"), e.g.:
  <?xml version="1.0" encoding="iso-8859-1"?>
*or* unless the transmission channel spells the encoding (the HTTP
"Content-type" header, for instance). See Mark Pilgrim's rant[1] on
the "insanely complicated" interactions between the Content-type header
and the document encoding.
XML files on the filesystem *must* be encoded as UTF-8, or have an
explicit encoding in the declaration.
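
To make the rule concrete, here is a minimal sketch using Python's
standard-library ElementTree (which sits on top of expat). It is not
the ZPT parser itself, and the names TEXT, parse_text and parses_ok are
made up for the illustration. It only shows how a stock XML parser
treats bytes with and without an encoding declaration:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample text containing one non-ASCII character.
TEXT = "caf\u00e9"

# No declaration: the parser assumes UTF-8, which matches these bytes.
utf8_doc = "<p>{}</p>".format(TEXT).encode("utf-8")

# Latin-1 bytes, with a declaration that says so.
latin1_doc = (
    '<?xml version="1.0" encoding="iso-8859-1"?><p>{}</p>'.format(TEXT)
).encode("iso-8859-1")

# Latin-1 bytes with no declaration: not valid UTF-8, so parsing fails.
undeclared_latin1_doc = "<p>{}</p>".format(TEXT).encode("iso-8859-1")


def parse_text(data):
    """Parse XML bytes and return the decoded text of the root element."""
    return ET.fromstring(data).text


def parses_ok(data):
    """Return True if the bytes parse as well-formed XML."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print(parse_text(utf8_doc) == TEXT)
    print(parse_text(latin1_doc) == TEXT)
    print(parses_ok(undeclared_latin1_doc))
```

In both of the first two cases the parser hands back decoded (unicode)
text; the third document is rejected because its bytes do not match
the encoding the parser assumes.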
[1] http://diveintomark.org/archives/2004/02/13/xml-media-types
Tres.
- --
===================================================================
Tres Seaver +1 202-558-7113 tseaver at palladion.com
Palladion Software "Excellence by Design" http://palladion.com