[Zope] Squishdot 1.1.0 (stripogram.py): HTML filtering problems
J M Cerqueira Esteves
jmce@artenumerica.com
Tue, 8 May 2001 20:25:00 +0100
On Tue, May 08, 2001 at 12:48:10AM +0100, Chris Withers wrote:
> - HTML parsing now done using the Strip-o-Gram library.
A few minutes ago, I ran a few tests with the html2safehtml function in
stripogram.py and found that it is possible to force inclusion of arbitrary
tags in the output text.
html2safehtml ('Roses <b>are</B> red,<br>violets <i>are</i> blue',
valid_tags=['b', 'i', 'br'])
returns
'Roses <b>are</b> red,<br>violets <i>are</i> blue'
as expected, but
html2safehtml ('Roses <b>are</B> red,<br/>violets <i>are</i> blue',
valid_tags=['b','i','br'])
returns
'Roses <b>are</b> red,<br>>violets <i>are<i> blue'
Notice that the (valid for XHTML) '<br/>' becomes '<br>>'
and the closing '</i>' at the end comes out as... '<i>'.
But it gets more interesting: the result of
html2safehtml ('Roses <b>are</B> red,<br/QUACK>violets <i>are</i> blue',
valid_tags=['b','i','br'])
is
'Roses <b>are</b> red,<br>QUACK>violets <i>are<i> blue'
inspiring one to write
html2safehtml ('Roses <b>are</B> red,<br/<QUACK>violets <i>are</i> blue',
valid_tags=['b','i','br'])
getting 'Roses <b>are</b> red,<br><QUACK>violets <i>are<i> blue'
or even
html2safehtml ('Roses <b>are</B> red,<br/<blink>QUACK<//blink> violets '
'<i>are</i> blue',
valid_tags=['b','i','br'])
successfully smuggling a <blink>...</blink> inside the result:
'Roses <b>are</b> red,<br><blink>QUACK</blink> violets <i>are</i> blue'
(Notice that the closing '</i>' is now OK again, and that I had to use
'<//blink>' in order to get '</blink>'.)
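A quick way to spot results like the ones above is to scan the filtered text for any tag name that is not in valid_tags. The following is only a rough sketch in present-day Python: the helper name disallowed_tags and its regular expression are made up for illustration and are not part of stripogram.py, and the two sample strings are copied from the results quoted above rather than produced by calling html2safehtml.

```python
import re

# Captures the name of any opening or closing tag, e.g. '<b>', '</i>', '<br/>'.
TAG_NAME = re.compile(r'</?\s*([A-Za-z][A-Za-z0-9]*)')


def disallowed_tags(text, valid_tags):
    """Return the sorted names of tags in text that are not in valid_tags.

    Hypothetical helper for checking a filter's output; not part of stripogram.
    """
    allowed = {t.lower() for t in valid_tags}
    found = {m.group(1).lower() for m in TAG_NAME.finditer(text)}
    return sorted(found - allowed)


valid = ['b', 'i', 'br']
# Output strings as reported in this message.
expected = 'Roses <b>are</b> red,<br>violets <i>are</i> blue'
smuggled = 'Roses <b>are</b> red,<br><blink>QUACK</blink> violets <i>are</i> blue'

print(disallowed_tags(expected, valid))  # []
print(disallowed_tags(smuggled, valid))  # ['blink']
```

A check like this only reports tag names; it would not notice the stray '>' or the '</i>' that came out as '<i>'.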
Maybe a problem with sgmllib? I have no time for further tests now...
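For comparison, a filter that only ever writes tags it builds itself from the whitelist, and escapes all text it passes through, cannot let a '<blink>' through no matter how the underlying parser reads malformed input. Below is a minimal sketch of that idea. It assumes the html.parser module from the current Python standard library instead of sgmllib, it is not how stripogram.py works, and the names WhitelistFilter, filter_html and VOID_TAGS are invented for this example. Attributes are dropped entirely to keep it short.

```python
from html import escape
from html.parser import HTMLParser

# Elements that take no closing tag; only 'br' matters for this example.
VOID_TAGS = {'br'}


class WhitelistFilter(HTMLParser):
    """Hypothetical filter: re-emits whitelisted tags, escapes all text."""

    def __init__(self, valid_tags):
        super().__init__(convert_charrefs=True)
        self.valid_tags = {t.lower() for t in valid_tags}
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lower-cased; attributes are dropped on purpose.
        if tag in self.valid_tags:
            self.parts.append('<%s>' % tag)

    def handle_endtag(self, tag):
        if tag in self.valid_tags and tag not in VOID_TAGS:
            self.parts.append('</%s>' % tag)

    def handle_data(self, data):
        # Escaping '<', '>' and '&' means text can never turn into markup.
        self.parts.append(escape(data, quote=False))


def filter_html(text, valid_tags):
    parser = WhitelistFilter(valid_tags)
    parser.feed(text)
    parser.close()  # flush any text still buffered by the parser
    return ''.join(parser.parts)


valid = ['b', 'i', 'br']
print(filter_html('Roses <b>are</B> red,<br/>violets <i>are</i> blue', valid))
print(filter_html('Roses <b>are</B> red,<br/<blink>QUACK<//blink> violets', valid))
```

What survives of the malformed input as text depends on the parser, but the only markup in the result is markup the filter wrote itself.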
--
jmce: +351 919838775 ~ http://jmce.artenumerica.org/