Content Type Meta tag stripping in zope.pagetemplate
Hello all,

I'm a fairly new Zope developer and came across a "bug" in my application: <meta http-equiv="content-type" content="text/html;charset=UTF-8" /> tags were being stripped out of ZPT templates. Is there a reason for this? The stripping is done in the _prepare_html function of zope.pagetemplate.pagetemplatefile.PageTemplateFile.

My application produces XHTML containing non-ASCII characters, which is then consumed by other applications, so the content type needs to be set in the document itself in addition to the HTTP headers.

Secondly, the meta tag is found and stripped using a regular expression, so simply changing the order of the attributes on the <meta> tag would make the regexp fail to match.

Attached is a patch that uses HTMLParser to find the content type meta tag instead of a regex. It stops parsing the HTML as soon as it encounters the required meta tag.

Miano
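The attached patch is not reproduced in this thread. As a rough sketch of the approach described above, assuming Python 3's html.parser module (the 2012 patch would have used the Python 2 HTMLParser module), the lookup might go as follows. The names ORDER_SENSITIVE, ContentTypeMetaFinder and find_content_type are hypothetical, and the regular expression is only an illustration of an order-sensitive pattern, not the one actually used in zope.pagetemplate.

```python
import re
from html.parser import HTMLParser

# Illustrative order-sensitive pattern (an assumption for this sketch,
# NOT the regex used by zope.pagetemplate): http-equiv must come first.
ORDER_SENSITIVE = re.compile(
    r'<meta\s+http-equiv=["\']content-type["\']\s+content=["\']([^"\']*)["\']',
    re.IGNORECASE,
)


class _Found(Exception):
    """Raised to abort parsing once the meta tag has been seen."""


class ContentTypeMetaFinder(HTMLParser):
    """Hypothetical helper: records the content-type meta tag's value."""

    def __init__(self):
        super().__init__()
        self.content = None

    def handle_starttag(self, tag, attrs):
        # Self-closing <meta ... /> tags also end up here, because the
        # default handle_startendtag delegates to handle_starttag.
        if tag != "meta":
            return
        attrs = dict(attrs)  # attribute names arrive lowercased
        if (attrs.get("http-equiv") or "").lower() == "content-type":
            self.content = attrs.get("content")
            raise _Found  # stop parsing the rest of the document


def find_content_type(html):
    """Return the meta tag's content attribute, or None if absent."""
    finder = ContentTypeMetaFinder()
    try:
        finder.feed(html)
        finder.close()
    except _Found:
        pass
    return finder.content


if __name__ == "__main__":
    swapped = ('<html><head><meta content="text/html;charset=UTF-8" '
               'http-equiv="content-type" /></head></html>')
    print(ORDER_SENSITIVE.search(swapped))  # the regex misses this ordering
    print(find_content_type(swapped))       # the parser still finds the tag
```

Raising an exception from the handler is one simple way to stop early; the parser instance is discarded afterwards, so its half-consumed state does not matter.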
On Wed, Feb 22, 2012 at 10:28 AM, Miano Njoka <mianonjoka@gmail.com> wrote:
<meta http-equiv="content-type" content="text/html;charset=UTF-8" /> tags were being stripped out from ZPT templates. Is there a reason for this?
As I recall, the rationale goes like this:
1. We're sniffing the input encoding from the charset setting.
2. We're storing the content-type on the instance (I hope this is still true).
3. The template/application/publisher is responsible for delivering the output with an appropriate content-type header.

--
Fred L. Drake, Jr. <fred at fdrake.net>
"A storm broke loose in my mind." --Albert Einstein
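To make step 1 concrete, here is a minimal sketch of pulling the charset parameter out of a content-type value, using only the standard library's email.message module. This is not zope.pagetemplate's own code; the helper name charset_from_content is hypothetical.

```python
from email.message import Message


def charset_from_content(content, default=None):
    """Return the lowercased charset parameter of a content-type value.

    Hypothetical helper for illustration; it relies on the standard
    library's MIME header parsing rather than anything in Zope.
    """
    msg = Message()
    msg["Content-Type"] = content
    return msg.get_content_charset(default)


if __name__ == "__main__":
    print(charset_from_content("text/html;charset=UTF-8"))
    print(charset_from_content("text/html", default="ascii"))
```

The sniffed value can then be used to decode the template source, while the response header is set separately by whichever component delivers the output.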
On Wed, Feb 22, 2012 at 8:08 PM, Fred Drake <fred@fdrake.net> wrote:
On Wed, Feb 22, 2012 at 10:28 AM, Miano Njoka <mianonjoka@gmail.com> wrote:
<meta http-equiv="content-type" content="text/html;charset=UTF-8" /> tags were being stripped out from ZPT templates. Is there a reason for this?
As I recall, the rationale goes like this:
1. We're sniffing the input encoding from the charset setting.
2. We're storing the content-type on the instance (I hope this is still true).
3. The template/application/publisher is responsible for delivering the output with an appropriate content-type header.
Yes, this is true, but why strip out the meta tag from the resulting HTML?
On Thu, Feb 23, 2012 at 2:54 AM, Miano Njoka <mianonjoka@gmail.com> wrote:
Yes, this is true, but why strip out the meta tag from the resulting HTML?
Two reasons:
1. It may be incorrect.
2. If multiple templates are used to construct a response, different values may be included from each template, which may be inconsistent.

Since the meta element is unnecessary, it seemed better to leave it out of the result, and rely on other components to render the correct values without requiring them to insert correct values into the rendered template. (The publisher, for instance, shouldn't need to know how to edit that into the finished HTML.)

-Fred

--
Fred L. Drake, Jr. <fred at fdrake.net>
"A storm broke loose in my mind." --Albert Einstein
participants (2): Fred Drake, Miano Njoka