HTML post processing in Zope
Hi all,
I am trying to post-process all HTTP responses before they are sent to the browser. I am using Zope 2.7.3 and Plone 2.0.5.
I had a look at the ZServer class: it seems to be the right place, but I don't understand all the code there and I am afraid of breaking something :-(
Am I on the right track? Could someone with good Zope knowledge be kind enough to point out which method I should change? (I want to change the HTML content, not the headers.)
Any pointers would be appreciated.
Cheers.
Cyrille.
I would recommend getting Apache or something like it to act as a proxy and do the rewriting there. That would likely be much cleaner than futzing with ZServer, though I can't say exactly how it would be done. But, yes, ZServer is probably the right place to go. I believe, based on a very quick look at the code, that continue_request in lib/python/ZServer/HTTPServer.py is probably the last place to get hold of the response before it's sent.
--jcc
-- http://plonebook.packtpub.com/
On Thursday, 05.05.2005, at 14:58 +1200, Cyrille Bonnet wrote:
Hi all,
I am trying to perform a post-processing on all HTTP responses, before they get sent to the browsers. I am using Zope 2.7.3 and Plone 2.0.5.
I had a look at the ZServer class: it seems to be the right place, but I don't understand all the code there and I am afraid to break something :-(
Am I on the right track there? Could someone with great Zope knowledge be kind enough to point out which method I should change? (I want to change the HTML content, not the headers).
I wonder what kind of post-processing you want. Since Zope creates all the responses anyway, why not create the responses you want in the first place?
I think "ZPublisher.HTTPResponse.HTTPResponse" is the better place for tidying up. -- Dieter
Thanks for all your answers,
I usually use Apache to change HTTP headers. But here, I need to post-process the HTML.
The reason is that the NZ Government Web Guidelines require HTML 4.01 :-( and I'd like to keep Plone content and templates XHTML compliant.
One way to do that is obviously to post-process the HTML with a language that is good at regular expressions (Perl?).
But I thought it would be neat if the post-processing could be done in Zope itself.
Anyway, I am looking at ZPublisher.HTTPResponse.HTTPResponse and it looks like the right place.
Thanks for your help!
Cyrille
Yes, it's irritating, isn't it? I've worked on an NZ government site before, but in PHP. You can use Apache to process the HTML: mod_python lets you register a PythonOutputFilter, in which you can use a regexp to clean up the HTML. Personally I'd expect Apache to be faster, but I may be wrong.
-- Phillip Hutchings http://www.sitharus.com/ sitharus@gmail.com / sitharus@sitharus.com
Answering my own question: testing for a content type of text/html does the trick, as WebDAV responses use some other content type (probably text/xml). It works great!
Cyrille
_______________________________________________ Zope maillist - Zope@zope.org http://mail.zope.org/mailman/listinfo/zope ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope-dev )
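The content-type test Cyrille describes can be sketched as a small predicate. This is only an illustration, not the code he added to Zope: the function name is_html_response is hypothetical, and it assumes the value of the Content-Type header is available as a plain string at the point where the body is rewritten.

```python
def is_html_response(content_type):
    """Return True only when the Content-Type header names text/html."""
    if not content_type:
        return False
    # Drop parameters such as "; charset=utf-8" and normalise case
    # before comparing the media type itself.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "text/html"


if __name__ == "__main__":
    print(is_html_response("text/html; charset=utf-8"))
    print(is_html_response('text/xml; charset="utf-8"'))
```

Comparing only the media type, rather than the whole header string, keeps the test working when a charset parameter is appended.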
Hi all,
I got the filter to work. I just added 3 lines of code in "ZPublisher.HTTPResponse.HTTPResponse" (thanks for your suggestion, Dieter):
    doctype_str_search = re.compile(r'<!DOCTYPE.*>')
    body = doctype_str_search.sub('<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">', body)
    body = body.replace('/>', '>')
It works great except that... it breaks WebDAV (I can still connect but can't see the files).
I'd like to add a condition there: if the request is WebDAV, don't do anything. But all I have is the body. No port, for instance...
I thought of testing whether the content type is 'text/html', but WebDAV responses will probably be of that type too.
Any suggestions will be welcome.
Cyrille
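For readers who want to try the rewrite outside Zope, the three lines above can be wrapped in a standalone function. This is a sketch under assumptions: the function name xhtml_to_html401 is hypothetical, and the pattern uses [^>]* instead of the original .* so that it stops at the first '>' and still matches a DOCTYPE declaration split over two lines. It is just as blunt as the original: every '/>' in the body is rewritten, wherever it occurs.

```python
import re

HTML401_DOCTYPE = (
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" '
    '"http://www.w3.org/TR/html4/loose.dtd">'
)

# A negated character class also matches newlines, so a DOCTYPE that
# spans several lines is replaced as a whole.
DOCTYPE_RE = re.compile(r'<!DOCTYPE[^>]*>')


def xhtml_to_html401(body):
    """Swap the DOCTYPE and turn XHTML empty-element syntax into HTML syntax."""
    body = DOCTYPE_RE.sub(HTML401_DOCTYPE, body, count=1)
    return body.replace('/>', '>')


if __name__ == "__main__":
    sample = (
        '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"\n'
        '    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\n'
        '<p>line<br /></p>'
    )
    print(xhtml_to_html401(sample))
```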
You can't be serious, right? The above does NOT suddenly make it HTML 4.01, I'm 90% sure ;-) Really, you should be customising the templates to serve HTML in the format you need rather than pursuing this insanity...
Chris
-- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk
Hi Chris,
Well, you'd better believe it :-) It works for me so far. But if you have specific examples that I can use to improve the filter, they would be very welcome.
Two additional things that I had to do to be HTML 4.01 compliant:
* replace <html ...> with <html> (remove the namespace information)
* remove the login portlet: Plone uses form parameters __ac_name and __ac_password, which the W3C validator rejects as invalid.
I have been customising the templates in the past and it takes a lot of work, on many templates, all over the place. In addition, I'd like to keep the content stored in the ZODB as XHTML. And, last but not least, I can upgrade Plone without having to rework all my templates now.
But if you have a better idea, your suggestions are most welcome.
Back to the insanity ;-)
Cheers.
Cyrille
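The first of the two extra steps, replacing <html ...> with a bare <html>, could look something like the following. It is a sketch, not Cyrille's actual code: the function name is hypothetical, and it assumes that no attribute value on the opening tag contains a '>' character. Note that it drops every attribute on the tag, including lang, not only the namespace declaration.

```python
import re

# Match the opening <html ...> tag together with all of its attributes.
HTML_TAG_RE = re.compile(r'<html\b[^>]*>', re.IGNORECASE)


def strip_html_tag_attributes(body):
    """Replace the first opening <html ...> tag with a bare <html>."""
    return HTML_TAG_RE.sub('<html>', body, count=1)


if __name__ == "__main__":
    sample = ('<html xmlns="http://www.w3.org/1999/xhtml" '
              'xml:lang="en" lang="en"><head></head></html>')
    print(strip_html_tag_attributes(sample))
```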
Cyrille Bonnet wrote:
It works for me so far. But if you have specific examples that I can use to improve the filter, they would be very welcome.
I think a filter is a totally abhorrent way of attempting to tackle this...
* replace <html ...> with <html> (remove the namespace information)
* remove the login portlet: Plone uses form parameters __ac_name and __ac_password, which the W3C validator rejects as invalid.
Tee hee, so much for Plone's amazing standards compliance ;-)
I have been customising the templates in the past and it takes a lot of work, on many templates, all over the place.
Well, your filter only changes things that are in main_template...
In addition, I'd like to keep the content stored in the ZODB as XHTML.
Why?
And, last but not least, I can upgrade Plone without having to rework all my templates now.
Bwahahaha... the other great myth ;-) Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk
Chris Withers wrote:
Cyrille Bonnet wrote:
It works for me so far. But if you have specific examples that I can use to improve the filter, they would be very welcome.
I think a filter is a totally abhorrent way of attempting to tackle this...
OK, but again, if you have a better idea, it is welcome. Modifying 12 templates does not look much better to me.
* replace <html ...> with <html> (remove the namespace information)
* remove the login portlet: Plone uses form parameters __ac_name and __ac_password, which the W3C validator rejects as invalid.
Tee hee, so much for Plone's amazing standards compliance ;-)
Looking closer, it was actually the ids that were causing the problem. Ids can't start with _ in HTML 4.01 but it is perfectly legitimate in XHTML. So, Plone is compliant with XHTML.
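The id rule can be checked mechanically. HTML 4.01 requires ID tokens to begin with a letter, followed by letters, digits, hyphens, underscores, colons or periods. The sketch below lists the id values in a fragment that break that rule; the function name and the sample markup are illustrative only, and the scan only sees double-quoted id attributes.

```python
import re

# Only double-quoted id attributes are recognised in this sketch.
ID_ATTR_RE = re.compile(r'\sid\s*=\s*"([^"]*)"', re.IGNORECASE)

# HTML 4.01 ID token: a letter, then letters, digits, "-", "_", ":" or ".".
HTML4_ID_RE = re.compile(r'[A-Za-z][A-Za-z0-9_:.\-]*\Z')


def invalid_html4_ids(html):
    """Return the id values that are not valid HTML 4.01 ID tokens."""
    return [value for value in ID_ATTR_RE.findall(html)
            if not HTML4_ID_RE.match(value)]


if __name__ == "__main__":
    sample = ('<input id="__ac_name" name="__ac_name">'
              '<input id="submit-1" name="submit">')
    print(invalid_html4_ids(sample))
```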
I have been customising the templates in the past and it takes a lot of work, on many templates, all over the place.
Well, your filter only changes things that are in main_template...
??? The filter runs on the HTTPResponse object, thus changing all the HTML output, not just the output from main_template.
In addition, I'd like to keep the content stored in the ZODB as XHTML.
Why?
Well, looking forward, if the NZ government guidelines finally support XHTML, we'll just need to remove the filter. In addition, we want to be able to transform the content with XSL transformations. Finally, Kupu and Epoz are good at producing XHTML, but don't support HTML 4.01.
And, last but not least, I can upgrade Plone without having to rework all my templates now.
Bwahahaha... the other great myth ;-)
Before, I had to modify 10-12 templates at least. Between Plone 2.0.4 and 2.0.5, these templates got changed and I had to spend 20 hours or so reworking the HTML output and testing. Now, when we move to Plone 2.1, I hope to do no work at all. I don't think it is a myth :-)
Cyrille
I have a product called unitws (written by somebody else) that modifies the resulting HTML code. It can be adapted to make any modification; I have modified it to produce WML-compliant national characters, and more. You call it simply with
<dtml-unitws> your content here </dtml-unitws>
Place this file, __init__.py, in a "unitws" folder inside your Products directory:
# cat /var/zope/lib/python/Products/unitws/__init__.py
    from string import split, strip, join, find
    import DocumentTemplate.DT_Util
    from DocumentTemplate.DT_String import String

    class UnitWhitespaceTag:
        """ Removes redundant whitespace (not very conservatively) """

        name='unitws'
        blockContinuations=()

        def __init__(self, blocks):
            tname, args, section = blocks[0]
            args = DocumentTemplate.DT_Util.parse_params(args)
            self.blocks = section.blocks

        def render(self,md):
            a = []
            # Render the enclosed DTML first, then clean it line by line.
            for line in split(DocumentTemplate.DT_Util.render_blocks(self.blocks, md),'\n'):
                line = strip(line)
                if line:
                    b = []
                    for word in split(line,' '):
                        if word!='':
                            b.append(strip(word))
                    a.append(join(b,' '))
            return join(a,'\n')

        __call__=render

    # Register the tag so that <dtml-unitws> is recognised.
    String.commands['unitws']=UnitWhitespaceTag
--
Jaroslav Lukesh
-----------------------------------------------------------
This e-mail can not contain any viruses because I use Linux
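The whitespace clean-up that render() performs can be tried outside Zope with a few lines of plain Python. This is a rough equivalent, not a port: the function name is hypothetical, and it splits on any run of whitespace, whereas the product above splits on single spaces only, so tabs inside a line are treated differently.

```python
def unit_whitespace(text):
    """Collapse whitespace runs within each line and drop blank lines."""
    lines = []
    for line in text.split("\n"):
        # str.split() with no argument splits on any whitespace run
        # and never yields empty strings.
        words = line.split()
        if words:
            lines.append(" ".join(words))
    return "\n".join(lines)


if __name__ == "__main__":
    print(unit_whitespace("  <p>  hello   world </p>\n\n   <br>  \n"))
```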
participants (7)
- Chris Withers
- Cyrille Bonnet
- Dieter Maurer
- J Cameron Cooper
- Jaroslav Lukesh
- Phillip Hutchings
- Tino Wildenhain