[Zope] Patch to check all pages with html-tidy

Thomas Guettler <thomas@thomas-guettler.de>
Wed, 9 Apr 2003 11:08:07 +0200


--PNTmBPCT7hxwcZjr
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

Hi!

Maybe someone is interested in this patch:

If you have html-tidy[1] installed, you can apply this patch to
lib/python/ZPublisher/HTTPResponse.py to run every HTML page that
Zope serves through html-tidy.

Warnings from html-tidy will be displayed in the debug log.

[1]: http://tidy.sourceforge.net/
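
For anyone who wants to try the filtering idea outside of Zope first,
here is a rough standalone sketch. It is written for a current Python 3
(3.7 or later) rather than the Python that Zope runs on, and the names
filter_tidy_output and run_tidy are made up for this example. It assumes
that a "tidy" executable is on the PATH and that it reads the page from
stdin when no file name is given. Unlike the patch, it also drops empty
lines.

```python
import subprocess

# Same substrings the attached patch ignores.
IGNORE = [
    'Warning: <table> lacks "summary" attribute',
    "Can't open",
    "Warning: <nobr> is not approved by W3C",
    "Warning: inserting missing 'title' element",
]


def filter_tidy_output(lines, ignore=IGNORE):
    """Return the stripped, non-empty lines that contain no ignored substring."""
    kept = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # the patch keeps empty lines; this sketch skips them
        if any(ign in line for ign in ignore):
            continue
        kept.append(line)
    return kept


def run_tidy(html):
    """Pipe html through tidy and return the warnings that are not ignored.

    Assumption: a `tidy` binary is installed and on the PATH.
    """
    proc = subprocess.run(
        ["tidy", "-q", "-errors"],  # same flags as in the patch
        input=html,
        capture_output=True,
        text=True,
    )
    # Collect both streams, as popen4 does in the patch.
    return filter_tidy_output((proc.stdout + proc.stderr).splitlines())
```

filter_tidy_output() can be tried without tidy installed by passing it a
list of lines; run_tidy() needs the real program.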

You apply this patch like this:

cd zope/lib/python/ZPublisher
patch < html-tidy-patch.txt

 thomas

-- 
Thomas Guettler <guettli@thomas-guettler.de>
http://www.thomas-guettler.de


--PNTmBPCT7hxwcZjr
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="html-tidy-patch.txt"

--- HTTPResponse.py.orig	Wed Apr  9 08:37:36 2003
+++ HTTPResponse.py	Wed Apr  9 08:26:52 2003
@@ -176,6 +176,46 @@
         self.stdout = stdout
         self.stderr = stderr
 
+    def html_tidy(self):
+        """
+        Small hack to call html-tidy for every HTML
+        page which is served by Zope.
+        Call it from lib/python/ZPublisher/HTTPResponse.setBody()
+        after self.body is set
+        
+        if content_type == 'text/html':
+            self.html_tidy()
+        """
+        import tempfile
+        import popen2
+        ignore=[
+            'Warning: <table> lacks "summary" attribute',
+            "Can't open",
+            "Warning: <nobr> is not approved by W3C",
+            "Warning: inserting missing 'title' element"]
+        htmlfile=tempfile.mktemp()
+        fd=open(htmlfile, "wt")
+        fd.write(self.body)
+        fd.close()
+        stdout, stdin = popen2.popen4("tidy -q -errors %s" % htmlfile)
+        out=stdout.readlines()
+        os.unlink(htmlfile)
+        for line in out:
+            line=line.strip()
+            cont=0
+            for ign in ignore:
+                if line.find(ign)!=-1:
+                    cont=1
+                    break
+            if cont:
+                continue
+            base="unknown base"
+            if hasattr(self, "base"):
+                base=self.base
+            print "HTML-Tidy: %s %s" % (
+                base, line)
+        
+
     def retry(self):
         """Return a response object to be used in a retry attempt
         """
@@ -329,6 +369,8 @@
             body = '&gt;'.join(body.split('\233'))
 
         self.setHeader('content-length', len(self.body))
+        if content_type == 'text/html':
+            self.html_tidy()
         self.insertBase()
         if self.use_HTTP_content_compression and \
             not self.headers.get('content-encoding',None):

--PNTmBPCT7hxwcZjr--