[Zope-CMF] script crashing zope
Tres Seaver
tseaver@palladion.com
Fri, 21 Dec 2001 10:42:23 -0500
Dan Keshet wrote:
> I've written an external python script to convert plain old html documents
> (stored as DTML documents) to CMF Documents. It works on small documents,
> but anything moderately large and it crashes the server. I'm not quite
> sure what moderately large is b/c I wasn't keen to keep crashing the
> server, but it's between 1709 bytes and 5094 bytes.
>
> So...
>
> 1) Is this a general Zope bug (well, clearly it shouldn't be crashing) or
> a CMF-specific bug?
>
> 2) Does anybody have a workaround or a script that they've written for the
> same purpose?
>
> Thanks,
>
> Dan
>
>
> Setup: CMF1.1, Zope 2.4.3, python 2.1.1, freebsd4.1:
>
> ---Begin script----
> def convert(self):
>     from Products.CMFDefault.Document import addDocument
>
>     text = self.document_src()
>     title = self.title_or_id()
>     id = self.getId()
>     self.manage_renameObject(id, id + '.dtml')
>     self.manage_addProduct['CMFDefault'].addDocument( id, title, '',
>                                                       "html", text)
> ---- End Script----

Dan,

There was a bug in CMF 1.1 which had similar symptoms. Can you
try either CMF 1.2 beta1 or a CVS checkout and let us know if the
problem persists?

You could also apply the fix yourself. Here is the diff between
CMF-1_1-release and the fix for the bug::
--- CMF/CMFDefault/Document.py 2001/06/05 18:23:53 1.24
+++ CMF/CMFDefault/Document.py 2001/08/13 21:00:18 1.28
@@ -239,10 +243,8 @@
     security.declarePrivate('guessFormat')
     def guessFormat(self, text):
         """ Simple stab at guessing the inner format of the text """
-        if bodyfinder.search(text) is not None:
-            return 'html'
-        else:
-            return 'structured-text'
+        if utils.html_headcheck(text): return 'html'
+        else: return 'structured-text'
 
     security.declarePrivate('handleText')
     def handleText(self, text, format=None, stx_level=None):
@@ -260,9 +262,9 @@
             headers.update(parser.metatags)
             if parser.title:
                 headers['Title'] = parser.title
-            bodyfound = bodyfinder.search(text)
+            bodyfound = bodyfinder(text)
             if bodyfound:
-                cooked = body = bodyfound.group('bodycontent')
+                cooked = body = bodyfound
         else:
             headers, body = parseHeadersBody(text, headers)
             cooked = _format_stx(text=body, level=level)
and here is the diff to CMFDefault.utils::
--- CMF/CMFDefault/utils.py 2001/06/05 23:01:12 1.6
+++ CMF/CMFDefault/utils.py 2001/08/13 21:08:00 1.8
@@ -141,10 +141,18 @@
         self.setliteral()
-bodyfinder = re.compile(r'<body.*?>(?P<bodycontent>.*?)</body>',
-                        re.DOTALL|re.I)
-htfinder = re.compile(r'<html', re.DOTALL|re.I)
+_bodyre = re.compile(r'<body.*?>', re.DOTALL|re.I)
+_endbodyre = re.compile(r'</body', re.DOTALL|re.I)
+
+def bodyfinder(text):
+    bod = _bodyre.search(text)
+    if not bod: return text
+    end = _endbodyre.search(text)
+    if not end: return text
+    else: return text[bod.end():end.start()]
+
+htfinder = re.compile(r'<html', re.DOTALL|re.I)
 def html_headcheck(html):
     """ Returns 'true' if document looks HTML-ish enough """
     if not htfinder.search(html):
@@ -156,5 +164,5 @@
             continue
         elif lower(line[:5]) == '<html':
             return 1
-        elif line[:2] not in ('<!', '<?'):
+        elif line[0] != '<':
             return 0
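
If you would rather read the fix than apply the diff, here is a minimal
standalone sketch of the two helpers as they appear to look after the
patch, written for a current Python rather than 2.1. The regexes and the
bodyfinder logic are taken from the diff above. In html_headcheck, the
line splitting, the stripping, and the final return are assumptions,
since the diff shows only part of that function. The apparent point of
the change is that the body is located with two simple searches and a
slice, instead of one non-greedy DOTALL group that has to span the whole
document.

```python
import re

# Patterns as they appear on the '+' side of the utils.py diff.
_bodyre = re.compile(r'<body.*?>', re.DOTALL | re.I)
_endbodyre = re.compile(r'</body', re.DOTALL | re.I)
htfinder = re.compile(r'<html', re.DOTALL | re.I)


def bodyfinder(text):
    """Return what lies between '<body ...>' and '</body'.

    Falls back to the whole text when either tag is missing,
    as the patched function in the diff does.
    """
    bod = _bodyre.search(text)
    if not bod:
        return text
    end = _endbodyre.search(text)
    if not end:
        return text
    return text[bod.end():end.start()]


def html_headcheck(html):
    """Return 1 if the document looks HTML-ish enough, else 0."""
    if not htfinder.search(html):
        return 0
    # ASSUMPTION: the split/strip loop and the trailing 'return 1'
    # are not visible in the diff; only the branches below are.
    for line in re.split(r'[\n\r]+', html):
        line = line.strip()
        if not line:
            continue
        elif line[:5].lower() == '<html':
            return 1
        elif line[0] != '<':
            return 0
    return 1


if __name__ == '__main__':
    # A document a little larger than the size reported to crash.
    page = ('<html><head><title>t</title></head>'
            '<body bgcolor="white">' + 'x' * 6000 + '</body></html>')
    print(len(bodyfinder(page)))
    print(html_headcheck(page))
    print(html_headcheck('Title: plain text\n\nSome structured text.'))
```

Note that bodyfinder now hands back a plain string rather than a match
object, which is why the Document.py diff drops the .group('bodycontent')
call.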
Tres.
--
===============================================================
Tres Seaver tseaver@zope.com
Zope Corporation "Zope Dealers" http://www.zope.com