----- Original Message ----- From: Alexander Limi <alexander@limi.net>
2. Malicious HTML tags - is anything being done here? Filtering of these is one of the features Zope 2.2 really shouldn't go without. Most Zope sites have user interaction in some way, and the idea that a post containing a stray </html> or, even worse, script tags could destroy a page is totally unacceptable IMHO. I'd just like to ask what the status is on this, as I think it is one of the most overlooked gaps in Zope.
I know Evan Simpson (malicious tags) and Christopher Petrilli (HTML quality of Zope) have talked about this earlier. Any comments?
I've got a rather crude module going which parses an input string for HTML-ish tags. It allows only tags from an explicit list, and ensures that non-empty tags are closed (either by complaining or by adding closing tags). If 'script' is not one of the allowed tags, it also disallows all "On*" attributes and "javascript:*" attribute values in any tag.

Unfortunately, it isn't very efficient (it is based on sgmllib.py). I had wanted to make it use SAX to do the parsing, so that sgmlop or another high-performance library could be plugged in, but never got there. Also, it has no DTML-level interface; you'd have to wrap it in an External Method to use it from DTML.

I've gone ahead and put it up at http://www.zope.org/Members/4am/SafeHTML to see if anyone can make anything of it.

Cheers,
Evan @ digicool & 4-am
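
For readers who want a concrete picture of the allow-list approach Evan describes, here is a minimal sketch. It is not the SafeHTML module: it uses the html.parser module from current Python rather than sgmllib.py, and the names TagFilter, sanitize, ALLOWED_TAGS and EMPTY_TAGS, as well as the tag lists themselves, are assumptions made for illustration. It always adds missing closing tags instead of offering the "complain" option.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical defaults; the tag lists used by the real module are not
# given in the post.
ALLOWED_TAGS = frozenset({"a", "b", "i", "em", "strong", "p", "br", "ul", "ol", "li"})
EMPTY_TAGS = frozenset({"br", "hr", "img"})  # tags that take no closing tag


class TagFilter(HTMLParser):
    """Keep only allow-listed tags and remember which ones are still open."""

    def __init__(self, allowed=ALLOWED_TAGS):
        super().__init__(convert_charrefs=True)
        self.allowed = frozenset(allowed)
        self.scripts_ok = "script" in self.allowed
        self.out = []          # pieces of filtered output
        self.open_tags = []    # stack of allowed, not yet closed tags
        self.skipping = None   # name of a dropped script/style element

    def _safe_attrs(self, attrs):
        kept = []
        for name, value in attrs:
            value = value or ""
            if not self.scripts_ok:
                # Drop event handlers (onclick, onload, ...) and
                # javascript: URLs unless 'script' itself is allowed.
                if name.startswith("on"):
                    continue
                if value.strip().lower().startswith("javascript:"):
                    continue
            kept.append(' %s="%s"' % (name, escape(value, quote=True)))
        return "".join(kept)

    def handle_starttag(self, tag, attrs):
        if tag not in self.allowed:
            if tag in ("script", "style"):
                self.skipping = tag  # also drop the element's body
            return
        self.out.append("<%s%s>" % (tag, self._safe_attrs(attrs)))
        if tag not in EMPTY_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag == self.skipping:
            self.skipping = None
        if tag not in self.open_tags:
            return  # stray close tag such as </html>
        # Close anything opened inside this element first.
        while self.open_tags:
            top = self.open_tags.pop()
            self.out.append("</%s>" % top)
            if top == tag:
                break

    def handle_data(self, data):
        if self.skipping is None:
            self.out.append(escape(data, quote=False))


def sanitize(text, allowed=ALLOWED_TAGS):
    """Return text with disallowed markup removed and open tags closed."""
    parser = TagFilter(allowed)
    parser.feed(text)
    parser.close()
    while parser.open_tags:
        parser.out.append("</%s>" % parser.open_tags.pop())
    return "".join(parser.out)


print(sanitize('<b>bold <i>both'))
print(sanitize('hi</html><script>alert(1)</script>'))
print(sanitize('<a href="javascript:alert(1)" onclick="x()">go</a>'))
```

This is only meant to show the shape of the technique. The javascript: check here is a simple prefix match and has not been hardened against obfuscated values, so it should not be treated as a complete defence.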