Malicious HTML - Zope - Zope lists


Malicious HTML


Graham Chiu

8 Mar 2000

12:55 a.m.

I wish to allow users to enter comments into my database which are then viewable thru the browser.

Is there a Zope function that I can pass their text thru to remove all HTML?

-------
Regards, Graham Chiu
gchiu<at>compkarori.co.nz
http://www.compkarori.com/dynamo - The Homebuilt Dynamo
http://www.compkarori.com/dbase - The dBase bulletin


Michel Pelletier

8 Mar

10:12 a.m.

New subject: [Zope] Malicious HTML

Graham Chiu wrote:

I wish to allow users to enter comments into my database which are then viewable thru the browser.

Is there a Zope function that I can pass their text thru to remove all HTML?

Nope. There may be some standard python library module that I don't know about, however. Otherwise, you will have to write your own.

-Michel


Evan Simpson

1:05 p.m.


----- Original Message ----- From: Michel Pelletier <michel@digicool.com>

Graham Chiu wrote:

...
I wish to allow users to enter comments into my database which are then viewable thru the browser.

Is there a Zope function that I can pass their text thru to remove all HTML?

Nope. There may be some standard python library module that I don't know about, however. Otherwise, you will have to write your own.

On the other hand, if you tell the user up front that HTML tags are not allowed, you can simply html_quote the text. That will prevent any tags from rendering, and prevent normal uses of '<>&' characters from breaking anything.

Cheers,
Evan @ 4-am & digicool
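
For readers who want to see what quoting does to a comment outside of Zope, here is a minimal sketch. It assumes Python 3 and uses the standard library's `html.escape` as a stand-in for `html_quote`; it is not Zope's implementation, and the function name `quote_comment` is made up for illustration.

```python
import html


def quote_comment(text):
    """Escape HTML-significant characters so user text renders literally.

    Hypothetical helper: html.escape stands in for Zope's html_quote here.
    """
    # html.escape replaces & first, then < and >; with quote=True it also
    # replaces double and single quotes, which matters inside attributes.
    return html.escape(text, quote=True)


if __name__ == "__main__":
    print(quote_comment('<script>alert("hi")</script> & more'))
```

The escaped text still shows the user's angle brackets on the page, but the browser treats them as characters rather than markup.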


Duncan Booth

5:31 p.m.


Graham Chiu wrote:

...
I wish to allow users to enter comments into my database which are then viewable thru the browser.

Is there a Zope function that I can pass their text thru to remove all HTML?

Nope. There may be some standard python library module that I don't know about, however. Otherwise, you will have to write your own.

-Michel

Try htmllib. The following bit of python will strip all formatting from some HTML. It replaces all anchors with footnote style references and images with their alt text.

If you want something a bit fancier you could add methods to the MyParser class to pass through particular tags (see the commented out methods as an example).

It shouldn't be too hard to wrap something like this up in an external method (as presented it is a complete runnable program that retrieves a URL and displays the text).

--------- File strip.py --------------
# Strip all HTML formatting.
import sys,formatter,StringIO,htmllib,string
from urllib import urlretrieve,urlcleanup

class MyParser(htmllib.HTMLParser):
    def __init__(self):
        self.bodytext = StringIO.StringIO()
        writer = formatter.DumbWriter(self.bodytext)
        htmllib.HTMLParser.__init__(self,
            formatter.AbstractFormatter(writer))

    def gettext(self):
        return self.bodytext.getvalue()

    # Uncomment these to pass through bold tags.
    # def start_b(self, attrs):
    #     self.formatter.add_flowing_data('<b>')
    #
    # def end_b(self):
    #     self.formatter.add_flowing_data('</b>')

def GetPage(url):
    try:
        fn, h = urlretrieve(url)
        text = open(fn, "r").read()
    finally:
        urlcleanup()
    return text

if __name__=='__main__':
    data = GetPage(sys.argv[1])
    p = MyParser()
    p.feed(data)
    p.close()
    text = string.replace(p.gettext(), '\xa0', ' ')
    print text
    anchors = p.anchorlist
    for i in range(len(anchors)):
        print "[%d]: %s" % (i+1, anchors[i])
--------- end of strip.py --------------

--
Duncan Booth duncan@dales.rmplc.co.uk
int month(char *p){return(124864/((p[0]+p[1]-p[2]&0x1f)+1)%12)["\5\x8\3"
"\6\7\xb\1\x9\xa\2\0\4"];} // Who said my code was obscure?
http://dales.rmplc.co.uk/Duncan
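
The script above targets the Python of its day; htmllib is not available in Python 3. As a rough sketch of the same idea on Python 3, the block below uses the standard `html.parser.HTMLParser` instead. The names `TextExtractor` and `strip_html` are invented for this example, and skipping the contents of script and style elements is an added assumption, not something the original script does.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text, footnote-style link markers, and image alt text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []          # pieces of output text, in document order
        self.anchors = []        # href values, indexed by footnote number - 1
        self._open_link = False  # True while inside an <a href=...>
        self._skip = 0           # depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("script", "style"):
            self._skip += 1
        elif tag == "a" and attrs.get("href"):
            self.anchors.append(attrs["href"])
            self._open_link = True
        elif tag == "img" and attrs.get("alt"):
            # Images are replaced by their alt text.
            self.parts.append(attrs["alt"])

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1
        elif tag == "a" and self._open_link:
            # Mark the link with its footnote number, e.g. "[1]".
            self.parts.append("[%d]" % len(self.anchors))
            self._open_link = False

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def strip_html(markup):
    """Return (plain_text, list_of_link_targets) for an HTML string."""
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()  # flush any text still buffered by the parser
    text = "".join(parser.parts).replace("\xa0", " ")
    return text, parser.anchors


if __name__ == "__main__":
    text, anchors = strip_html(
        '<p>See <a href="http://example.com/">this</a> <b>now</b></p>')
    print(text)
    for i, href in enumerate(anchors, 1):
        print("[%d]: %s" % (i, href))
```

Text recovered this way can still contain '<', '>' and '&' characters that came from entities in the input, so it is probably safest to quote it again before showing it in a page.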

