Evan Simpson:
I've got a rather crude module going which parses an input string for HTML-ish tags. It allows only tags from an explicit list, and ensures that non-empty tags are closed (either by complaining or adding closing tags). If 'script' is not one of the allowed tags, it also disallows all "On*" attributes and "javascript:*" attribute values in any tag.
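For anyone following along, the rules described above can be sketched roughly as follows. This is not the SafeHTML module itself. It assumes Python 3's html.parser in place of sgmllib.py, and the names SafeHTMLSketch, sanitize and EMPTY_TAGS are made up for illustration. It silently drops disallowed tags and adds missing closing tags rather than complaining.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical list of "empty" tags that never take a closing tag.
EMPTY_TAGS = {"br", "hr", "img"}


class SafeHTMLSketch(HTMLParser):
    """Illustrative whitelist filter; not the actual SafeHTML code."""

    def __init__(self, allowed_tags):
        super().__init__(convert_charrefs=True)
        self.allowed = {t.lower() for t in allowed_tags}
        # Script-ish attributes are only tolerated if 'script' is allowed.
        self.allow_script = "script" in self.allowed
        self.out = []        # output fragments
        self.open_tags = []  # stack of non-empty tags still open

    def _attr_ok(self, name, value):
        if self.allow_script:
            return True
        if name.startswith("on"):  # onclick, onload, ...
            return False
        if value and value.strip().lower().startswith("javascript:"):
            return False
        return True

    def handle_starttag(self, tag, attrs):
        if tag not in self.allowed:
            return  # drop the tag, keep any text inside it
        kept = [(n, v) for n, v in attrs if self._attr_ok(n, v)]
        text = "".join(
            " %s" % n if v is None else ' %s="%s"' % (n, escape(v, quote=True))
            for n, v in kept
        )
        self.out.append("<%s%s>" % (tag, text))
        if tag not in EMPTY_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray or disallowed closing tag
        # Close everything opened after this tag, then the tag itself.
        while self.open_tags:
            top = self.open_tags.pop()
            self.out.append("</%s>" % top)
            if top == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))

    def result(self):
        self.close()  # flush any buffered text
        while self.open_tags:  # add closing tags that were left out
            self.out.append("</%s>" % self.open_tags.pop())
        return "".join(self.out)


def sanitize(text, allowed_tags=("b", "i", "a", "p", "br")):
    parser = SafeHTMLSketch(allowed_tags)
    parser.feed(text)
    return parser.result()


print(sanitize('<b onclick="x()">hi <i>there'))  # <b>hi <i>there</i></b>
```

Text inside a dropped tag is kept and escaped, so the contents of a disallowed script element come out as plain text. Whether to drop or complain is a design choice; this sketch only shows the dropping variant.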
Unfortunately, the module isn't very efficient, since it is based on sgmllib.py. I had wanted to make it use SAX to do the parsing, so that sgmlop or another high-performance library could be plugged in, but never got there. Also, it has no DTML-level interface; you'd have to wrap it in an External Method to use it from DTML.
I've gone ahead and put it up at http://www.zope.org/Members/4am/SafeHTML to see if anyone can make anything of it.
This looks a lot like the code I have lying around, only yours is more comprehensive and user friendly :)
Anyway, I assume you are familiar with SAX for Python?
http://www.stud.ifi.uio.no/~lmariusg/download/python/xml/saxlib.html
It supports sgmlop, like you mentioned.
Your code will do beautifully for our project; we are not dependent on fast code in that specific part. Thanks a lot.
Now, can somebody tell me how to get Zope to spit out XHTML 1.0-compliant tags? :]
--
Alexander Limi
alexander@limi.net