[Zope] Malicious HTML
Marcus Collins
mcollins@sunesi.com
Wed, 8 Mar 2000 16:09:12 +0200
Or, use something like w3m
http://ei5nazha.yz.yamagata-u.ac.jp/~aito/w3m/eng/index.html
to take the html and turn it into plaintext.
Pop it into an external method something like the following, and it should
do the trick:
---html2msg.py
from popen2 import popen2

def html2msg(htmlstring):
    try:
        (r, w) = popen2('/usr/local/bin/w3m -dump -S -cols 70 -T "text/html"')
        w.write(htmlstring)
        w.close()
        s = r.read()
        r.close()
        return s
    except:
        return 'Error converting article to plaintext.'
---
It's an ugly hack, and you lose things like <br> and <p>, so you may want to
replace \n with <br> afterwards. It does, however, do a brilliant job
layout-wise: tables and lists remain fairly well formatted, for example.
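
For anyone on a Python without the popen2 module, here is a rough sketch of the same idea using subprocess, plus the "\n to <br>" step mentioned above. The w3m path and flags are copied from the snippet above. The names W3M_CMD, ERROR_MSG, text_to_html and the timeout value are my own assumptions, not anything Zope or w3m provides. Note that w3m decodes entities, so the sketch escapes '<', '>' and '&' again before adding <br>, in the same spirit as the html_quote suggestion quoted below.

```python
import html
import subprocess

# Path and flags come from the original snippet; adjust for your system.
W3M_CMD = ["/usr/local/bin/w3m", "-dump", "-S", "-cols", "70", "-T", "text/html"]
ERROR_MSG = "Error converting article to plaintext."


def html2msg(htmlstring, cmd=W3M_CMD, timeout=10):
    """Pipe HTML through w3m and return the plain-text rendering."""
    try:
        result = subprocess.run(
            cmd,
            input=htmlstring,
            capture_output=True,
            text=True,
            check=True,
            timeout=timeout,  # assumed limit so a hung w3m cannot block the request
        )
    except (OSError, subprocess.SubprocessError):
        # Missing binary, non-zero exit status, or timeout.
        return ERROR_MSG
    return result.stdout


def text_to_html(text):
    """Escape '<', '>' and '&', then mark each line break with <br>."""
    return html.escape(text, quote=False).replace("\n", "<br>\n")
```

Usage would be something like text_to_html(html2msg(user_input)) just before storing or rendering the comment. Passing the command as a list avoids going through a shell, which seemed the safer choice when the input comes from users.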
hth,
-- Marcus
> -----Original Message-----
> From: Evan Simpson [mailto:evan@digicool.com]
> Sent: 08 March 2000 15:06
> To: Michel Pelletier; Graham Chiu
> Cc: zope@zope.org
> Subject: Re: [Zope] Malicious HTML
>
>
> ----- Original Message -----
> From: Michel Pelletier <michel@digicool.com>
> > Graham Chiu wrote:
> > >
> > > I wish to allow users to enter comments into my database which are then
> > > viewable thru the browser.
> > >
> > > Is there a Zope function that I can pass their text thru to remove all
> > > HTML?
> >
> > Nope. There may be some standard python library module that I don't
> > know about, however. Otherwise, you will have to write your own.
>
> On the other hand, if you tell the user up front that HTML tags are not
> allowed, you can simply html_quote the text. That will prevent any tags
> from rendering, and prevent normal uses of '<>&' characters from breaking
> anything.
>
> Cheers,
>
> Evan @ 4-am & digicool