[Zope] DTML, Zope and Regex

Jim Penny jpenny@universal-fasteners.com
Wed, 10 Jul 2002 13:17:50 -0400


On Wed, Jul 10, 2002 at 09:56:32AM -0700, Charlie Reiman wrote:
> I was agreeing with Toby, until it dawned on me that string.* is available
> unrestricted. Yes, my regexs may be vulnerable to a DOS attack if someone
> foists a 4M string at me. But so is string.index and string.rindex and (even
> worse) string.lower. Besides, as Oliver points out, limiting access to re
> doesn't mean I can't write code that wantonly consumes all CPU and memory.
> His example is artificial but it could easily be modified to take parameters
> from the HTTP REQUEST and still do stupid things.

Yes, but at least each of those is linear w.r.t. input size.  A
backtracking regex match can be exponential in the length of the input.
Damn, I am trying to remember: it feels to me that they can be factorial
(but this would be hard to do accidentally).
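
A rough sketch of the exponential case, in Python.  The pattern
(a+)+$ and the helper count_attempts are my own illustration, not
anything from Zope; the helper is only a toy model of a naive
backtracker, not how the re module is actually implemented.  On a run
of n 'a' characters followed by a 'b', every way of splitting the run
into groups gets tried before the match fails.

```python
import re

def count_attempts(text):
    """Toy model: count the group boundaries a naive backtracker
    tries while matching (a+)+$ against text."""
    attempts = 0

    def match_groups(pos):
        nonlocal attempts
        # Find how far one a+ group starting at pos could extend.
        end = pos
        while end < len(text) and text[end] == 'a':
            end += 1
        # Try each possible group end, longest (greedy) first.
        for stop in range(end, pos, -1):
            attempts += 1
            if stop == len(text):      # $ matches here: success
                return True
            if match_groups(stop):     # otherwise try another group
                return True
        return False                   # backtrack

    match_groups(0)
    return attempts

PATTERN = re.compile(r'(a+)+$')

for n in (4, 8, 12, 16):
    text = 'a' * n + 'b'
    # The real engine also fails to match; it just takes longer as n grows.
    assert PATTERN.match(text) is None
    print(n, count_attempts(text))     # 2**n - 1 attempts in this model
```

In this model the count doubles with every extra character, whereas
string.lower on the same input does one pass over it.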

> 
> If the issue is resource (CPU or memory or disk) consumption, then trying to
> limit package availability is never going to be a 100% solution. To limit
> resource consumption, you must (wait for it....) limit resource consumption.
> In other words, requests need CPU timeouts and memory quotas.

True -- it is not, and was never intended to be, a 100% solution.  It was
an engineering tradeoff.  And I suspect that the needs of Zope hosting
providers were weighed heavily.  They would want to be able to look at a
user's code that was consuming a lot of resources and quickly decide
whether to keep him as a customer.  regexes would certainly make that
more difficult.  I don't know any of this for a fact; I am as far
outside Zope decision making circles as can be.

> 
> So to rephrase the original question: Assuming I'm willing to risk the DOS
> attacks, is there any other security risk to opening up regexs for Zope use?
> Is there some way a hacker can assume control of my Zope server or change
> its content because I decided to utilize regexes in my Python scripts?

Not to my knowledge.  In fact, I doubt it; as long as the pattern itself
is fixed in your script, the regex compilation step cannot be influenced
by input, and I would be surprised if there were any problems in the
match algorithm that could be exploited by input (although I dimly
recall problems with Unicode).
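
To illustrate the point about compilation, here is a minimal sketch in
plain Python.  The names PHONE, is_phone and find_literal are
hypothetical, and how you would expose re to a Zope Python script is
not shown.  A pattern written as a literal is fixed before any request
arrives, so only the subject string comes from the user.  If request
data ever has to become part of a pattern, re.escape keeps it literal.

```python
import re

# Fixed pattern: nothing in the request can change what gets compiled.
PHONE = re.compile(r'^\d{3}-\d{4}$')

def is_phone(value):
    """Return True if value looks like NNN-NNNN."""
    return PHONE.match(value) is not None

def find_literal(needle, haystack):
    """Search for user-supplied text as a literal string.

    re.escape quotes the regex metacharacters in needle, so the user
    cannot inject operators such as nested quantifiers."""
    return re.search(re.escape(needle), haystack) is not None
```

This does nothing about a slow pattern that you write yourself; it only
keeps the request from choosing the pattern.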

> 
> You don't have to tell me how, of course. Just let me know if it is
> possible.

Jim Penny