Hi Zopers:

I am trying to do some very basic pattern matching, but to my surprise Zope does not seem to support it very well now (or I have not been able to find the right docs). I found a couple of items from Runyaga in ZopeZen and ZopeLabs regarding regexes:

<http://zopezen.org/SDot/993647662/index_html>
<http://www.zopelabs.com/cookbook/993591088>

The above postings are old (about 1 year), and I wonder if there's better support for regexes now...? Where can I look?

What I want to do is awfully simple: I have a folder object with other objects (pictures, html docs, etc.) which I want to traverse, and based on the object ids (file names are a string consisting of date + obj type, for example: 2002_July_08_pic_1.jpg), I want to group them and print headers for each group, for example:

Photos for July 7
<dtml-if expr="id =~ /2002_July_7/"> <- Just a 'perlarized' example
<img src="&dtml-id;">
</dtml-if>

Photos for July 8
<dtml-if expr="id =~ /2002_July_8/"> <- Just a 'perlarized' example
<img src="&dtml-id;">
</dtml-if>

I know I could go to properties and add some kind of property to each object, but that is not practical as I have lots of them, and it would be too hard to go TTW to add a property to each one... Or should I just give up on DTML in this case?

Any idea is welcome. Thanks!

Jorge M.

-- Jorge O. Martinez MIS Senior Associate FDCH-eMedia Inc. 2400 Forbes Blvd., Suite 200 Lanham, MD 20706 E-mail => jmartinez@eMediaMillWorks.com Phone => (301)731-1228 ext. 105 Fax => (301)731-0937
On Tuesday 09 July 2002 16:31, Jorge O. Martinez wrote:
<dtml-if expr="id =~ /2002_July_7/"> <- Just a 'perlarized' example <img src="&dtml-id;"> <dtml-if>
What is this? First, you should use _python_. Second, regexps are not allowed in Zope: use an External Method or a Product. -- Sincerely yours, Bogdan M. Maryniuck "If the future navigation system [for interactive networked services on the NII] looks like something from Microsoft, it will never work." (Chairman of Walt Disney Television & Telecommunications)
On Tuesday 09 July 2002 16:51, Bo M. Maryniuck wrote:
regexps not allowed in Zope Sorry, this is FUD: I meant not allowed in Python Script Method. Excuse me... :-)
-- Sincerely yours, Bogdan M. Maryniuck "All language designers are arrogant. Goes with the territory..." (By Larry Wall)
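For the grouping described in the original question, plain string operations may be enough, which would sidestep the `re` restriction altogether. What follows is only a minimal sketch, assuming every id looks like `2002_July_08_pic_1.jpg` (date in the first three underscore-separated fields). The helper name `group_ids_by_date` is hypothetical, and in Zope the list of ids would presumably come from something like the folder's `objectIds()` rather than a literal list.

```python
def group_ids_by_date(ids):
    """Group ids such as '2002_July_08_pic_1.jpg' by their 'year_month_day' prefix.

    Assumes the first three underscore-separated fields are the date.
    Ids that do not fit that shape are collected under the key None.
    """
    groups = {}
    for obj_id in ids:
        parts = obj_id.split('_')
        if len(parts) >= 4:
            key = '_'.join(parts[:3])
        else:
            key = None
        groups.setdefault(key, []).append(obj_id)
    return groups


# Illustrative ids only; a real folder would supply its own.
ids = ['2002_July_07_pic_1.jpg', '2002_July_08_pic_1.jpg',
       '2002_July_07_pic_2.jpg', 'index_html']

for date, members in sorted(group_ids_by_date(ids).items(),
                            key=lambda kv: str(kv[0])):
    print('Photos for %s: %s' % (date, ', '.join(members)))
```

A template could then loop over the groups and print one header per date, without any pattern matching in DTML itself.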
Jorge, regexps and some other libraries have been deemed unsafe for use in Zope Python Scripts, and are not enabled by default. However, you can override this, at your own risk. I have ;)

In the Control Panel of the Root Folder, select the Products list, and click on PythonScripts. Click the README tab, and it gives you instructions on how to allow use of the re library, and any others.

Ben Avery
YouthNet UK
Ben Avery wrote:
regexps and some other libraries have been deemed unsafe for use in zope python scripts, and not included by default. however, you can override this, at your own risk. I have ;)
Ben, would you mind expanding on this? What dangers are there? Regexes are so handy, and if I turn them on I'd like to know what the risks are... TIA. Kirk
I'm afraid I'll have to open that one up to the audience. I'm pretty new to Zope, but I don't see why they're unsafe to use in your scripts, as long as you're not doing really stupid things with the results...
_______________________________________________ Zope maillist - Zope@zope.org http://lists.zope.org/mailman/listinfo/zope ** No cross posts or HTML encoding! ** (Related lists - http://lists.zope.org/mailman/listinfo/zope-announce http://lists.zope.org/mailman/listinfo/zope-dev )
Kirk Lowery wrote:
Ben, would you mind expanding on this? What dangers are there? Regexes are so handy, and if I turn them on I'd like to know what the risks are...
A badly written regex can easily run in an infinite loop, hanging a thread of your Zope server. Once the number of people who cause this regex to execute is equal to the number of threads on your Zope server, you have a hung server ;-) cheers, Chris
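The thread-exhaustion effect described above can be sketched without Zope at all. This is only an illustration under stated assumptions: a two-worker `ThreadPoolExecutor` stands in for Zope's worker threads, and `stuck_request` blocks on an event instead of running a real runaway regex. All names here are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, TimeoutError

POOL_SIZE = 2                  # stands in for the number of Zope worker threads
release = threading.Event()    # lets us "unstick" the slow requests at the end


def stuck_request():
    """Simulates a request whose regex never finishes: blocks until released."""
    release.wait()
    return 'slow done'


def quick_request():
    """A request that would normally return immediately."""
    return 'quick done'


pool = ThreadPoolExecutor(max_workers=POOL_SIZE)
stuck = [pool.submit(stuck_request) for _ in range(POOL_SIZE)]
quick = pool.submit(quick_request)   # queued: every worker is already busy

try:
    quick.result(timeout=0.2)
    starved = False
except TimeoutError:
    starved = True                   # the quick request could not even start

release.set()                        # free the workers again
quick_result = quick.result(timeout=5)
pool.shutdown()
print(starved, quick_result)
```

Once as many requests are stuck as there are workers, even trivial requests wait in the queue, which is the "hung server" described above.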
[My first attempt to send this failed to post it to the list] Kirk Lowery <klowery@wts.edu> wrote in news:3D2B0727.1030401@wts.edu:
Ben, would you mind expanding on this? What dangers are there? Regexes are so handy, and if I turn them on I'd like to know what the risks are...
Regular expressions can easily lead to unbounded CPU usage. Provided you don't let any untrusted users edit anything that can call regular expressions, and provided that you fully understand the issues involved, then you may be able to use regular expressions. If the regular expression comes from an untrusted source, or if you don't think it through carefully enough, then you may find that you can effectively block one thread of your Zope server. For the sort of use you proposed, this is unlikely to happen. The problem is that it is almost impossible to work out in advance which regular expressions will cause problems. Try this at a Python interactive prompt:
    import re
    s = 'a' + 'b' * 1024 + 'c'
    re.match('a.*.*b.*c', s)
    re.match('a.*.*b.*d', s)
The first match will complete instantly, but the second one will probably take a few seconds to fail.

(My memory may be a bit rusty on this next bit.) There are traditionally two ways to match regular expressions: Deterministic Finite-state Automata (DFA) and Non-deterministic Finite-state Automata (NFA). Most software uses NFA matching, which requires a small amount of memory to compile the regex, but requires a potentially unbounded time to do the match. DFA matching can match a regex in linear time (i.e. linear on the search string), but may require a very large amount of memory (and CPU) to compile the regular expression.

If someone produced a DFA engine for Python which had the option of limiting the size of the compiled expression (and throwing an exception for any that were too complex), then you could probably expose this safely in Zope. There may be other problems with DFAs though, e.g. I'm not sure if you can match groups easily with a DFA. DFA matches are best used when you have a fixed pattern so that the overhead for compilation is only hit once (e.g. parsers).

With a DFA engine you might implement a persistent regex object in the ZODB. Not all regexes would be compilable into the object, but once you managed to compile one you would know it would work.

-- Duncan Booth duncan@rcp.co.uk int month(char *p){return(124864/((p[0]+p[1]-p[2]&0x1f)+1)%12)["\5\x8\3" "\6\7\xb\1\x9\xa\2\0\4"];} // Who said my code was obscure?
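To see the difference between the two patterns without tying up a prompt, the experiment can be timed on a shorter string. This is a sketch only: the 150-character run of 'b' is an assumption chosen to keep the failing case quick, and `timed_match` is a hypothetical helper. Actual timings depend on the Python version and machine.

```python
import re
import time


def timed_match(pattern, text):
    """Return (match_or_None, elapsed_seconds) for re.match(pattern, text)."""
    start = time.perf_counter()
    result = re.match(pattern, text)
    return result, time.perf_counter() - start


# Much shorter than the 1024 'b's used in the post, so the failing case stays quick.
s = 'a' + 'b' * 150 + 'c'

ok, t_ok = timed_match('a.*.*b.*c', s)      # succeeds after only a little backtracking
fail, t_fail = timed_match('a.*.*b.*d', s)  # must exhaust every split before failing

print('matched:  %s in %.6fs' % (ok is not None, t_ok))
print('no match: %s in %.6fs' % (fail is None, t_fail))
```

Growing the run of 'b' should make the failing case slow down much faster than the succeeding one, which is the behaviour the post warns about.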
Yes, I know that is not 'real' code; I was just asking whether something like it could be done with DTML. Guess not. I had seen the external method solution from Runyaga, but was hoping something could be done with DTML only.

Thanks,
Jorge M.
On Tuesday 09 July 2002 17:15, Jorge O. Martinez wrote:
Yes I know that is not 'real' code that does not exist, I was just saying if something like it could be done with DTML. Guess not, I had seen the external method solution from Runyaga, but was hoping something could be done with DTML only.
You want to do this via an External Method, not via DTML. DTML is gone. -- Sincerely yours, Bogdan M. Maryniuck Fatal Error: Found [MS-Windows] System -> Repartitioning Disk for Linux... (By cbbrown@io.org, Christopher Browne)
Just to clarify: DTML is not "gone". This scares people when they hear it. DTML will be around probably forever. It's just not the right tool for this particular job. ----- Original Message ----- From: "Bo M. Maryniuck" <b.maryniuk@forbis.lt> To: "Jorge O. Martinez" <jmartinez@eMediaMillWorks.com>; <zope@zope.org> Sent: Tuesday, July 09, 2002 12:31 PM Subject: Re: [Zope] DTML, Zope and Regex On Tuesday 09 July 2002 17:15, Jorge O. Martinez wrote:
Yes I know that is not 'real' code that does not exist, I was just saying if something like it could be done with DTML. Guess not, I had seen the external method solution from Runyaga, but was hoping something could be done with DTML only.
You want do this via External Method, not via DTML. DTML is gone. -- Sincerely yours, Bogdan M. Maryniuck Fatal Error: Found [MS-Windows] System -> Repartitioning Disk for Linux... (By cbbrown@io.org, Christopher Browne)
Hi:

Thanks for clarifying. I know it's not gone, and I've seen several threads debating that. I just wonder if someone could expand on how enabling regexes can be unsafe for python scripts on Zope, as Ben was suggesting. I feel regexes ARE an incredibly useful tool, and wish they were available for DTML and python scripts by default.

I feel doing it via an external method complicates things unnecessarily IMHO (I am a Zope newbie, so I don't know the reasoning behind this, but if PHP, for example, allows their use and makes developers' lives easier, I don't see why Zope restricts it).

Thanks,
Jorge M.
Thanks for clarifying. I know it's not gone, and I've seen several threads debating that. Just wonder if someone could expand how enabling regexes can be unsafe for python scripts on Zope as Ben was suggesting. I feel regexes ARE an incredibly useful tool, and wish they were available for DTML and python scripts by default.
Actually, I don't think that it's as much that they're unsafe (although apparently you can create a regex that will infinitely recurse at least until you run out of stack space), but that the re module is a C module and returns types that are C-defined and can't be protected with normal security declarations. Nobody has bothered to wrap it in something that is usable in through-the-web code. (It would be a nice Product for someone to write.)
I feel doing it via an external method complicates things unnecessarily IMHO (I am a Zope newbie, so don't know the reasoning behind this, but if PHP, for example, allows its use and makes developers life easier, I don't see why Zope restricts it).
PHP has no notion of untrusted code. It isn't possible (as far as I know) to safely delegate the writing of PHP code "through the web", where you have assurances that the people writing the code can't bollix up your world too much. In Zope, it's not only possible, it's the default (although it of course can be changed). - C
Well, external methods are python scripts with no safety measures at all, so they are potentially much more unsafe than any use of regexps in a python script. So I'd say it's better to allow the re module in your python scripts (see my previous post) than to resort to external methods.

But I also haven't come across a reason to consider regexps unsafe. I'm sure it's been discussed here before - could someone point us to a post on this subject, please?
On Wed, Jul 10, 2002 at 03:17:14PM +0100, Ben Avery wrote:
well, external methods are python scripts with no safety measures at all,
For one thing, they live on the filesystem. If somebody has read/write access to your filesystem, you have much bigger problems than what they can do to your external methods, e.g. rm -f var/Data.fs. For another thing, you can control via Zope's security interface who has permission to add External Methods. So you can restrict them to trusted developers. At least, I think that's the idea... -- Paul Winkler home: http://www.slinkp.com "Muppet Labs, where the future is made - today!"
Paul Winkler wrote:
On Wed, Jul 10, 2002 at 03:17:14PM +0100, Ben Avery wrote:
well, external methods are python scripts with no safety measures at all,
For one thing, they live on the filesystem. If somebody has read/write access to your filesystem, you have much bigger problems than what they can do to your external methods, e.g. rm -f var/Data.fs.
I understand your concern with a situation like the above, but that is not exactly what I had in mind. I was thinking about matching/replacing strings and taking actions based on matches, not executing commands at the system level. Additionally, when I am talking about regex functionality, I think it would help if it were enabled within the context of Zope (inside Data.fs only) as a default, and not allowed to interact with outside stuff in the filesystem. Then people who wanted even more functionality could enable filesystem functionality at their own risk. For enhanced security, only members of some group could be given the right to execute regexes, or, even better, only certain folders could be enabled for that. Just some ideas...
For another thing, you can control via zope's security interface who has permission to add External Methods. So you can restrict them to trusted developers.
At least, I think that's the idea...
-- Jorge O. Martinez MIS Senior Associate FDCH-eMedia Inc. 2400 Forbes Blvd., Suite 200 Lanham, MD 20706 E-mail => jmartinez@eMediaMillWorks.com Phone => (301)731-1228 ext. 105 Fax => (301)731-0937
On Wed, Jul 10, 2002 at 01:58:39PM -0400, Jorge O. Martinez wrote:
For one thing, they live on the filesystem. If somebody has read/write access to your filesystem, you have much bigger problems than what they can do to your external methods, e.g. rm -f var/Data.fs.
I understand your concern with a situation like the above, but that is not exactly what I had in mind, I was thinking about matching/replacing strings, and take actions based on matches, not executing commands at the system level.
I know that. I was responding to the mistaken assertion that external methods have no security at all. They have much more security than python scripts. -- Paul Winkler home: http://www.slinkp.com "Muppet Labs, where the future is made - today!"
On Wed, Jul 10, 2002 at 03:17:14PM +0100, Ben Avery wrote:
well, external methods are python scripts with no safety measures at all, so are potentially much more unsafe than any use of regexps in a python script. So I'd say it's better to allow the re module in your python scripts (see my previous post) than resort to external methods.
but I also haven't come across a reason to consider regexps unsafe. I'm sure it's been discussed here before - could someone point us to a post on this subject, pls ?
As I understand it, the problem is not so much security, per se, but denial of service. That is, it is extremely easy to write regular expressions which take enormous amounts of time or memory to process. Worse, the processing time and space are extremely dependent on input, so that apparently well-tested code can suddenly become a liability when exposed to a less than friendly audience. (Think about a line-oriented regex that is furnished a multi-megabyte line.)

To say it another way, using regex does not make it more likely that you will be cracked. It does make it more likely that your system will appear to be unresponsive, and, if memory exhaustion occurs, dead.

Jim Penny
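One partial mitigation for the "multi-megabyte line" case is to refuse oversized input before it ever reaches the regex engine. The sketch below is an illustration only: `guarded_search` and `MAX_INPUT_CHARS` are hypothetical names, the cap value is arbitrary, and a size cap does nothing about patterns that misbehave on short input.

```python
import re

MAX_INPUT_CHARS = 10000   # hypothetical cap; pick one that fits your data


def guarded_search(pattern, text, max_chars=MAX_INPUT_CHARS):
    """Run re.search only if the input is within the size cap.

    Raises ValueError for oversized input instead of handing it to the
    regex engine. This bounds input size only; it does not make a
    badly-behaved pattern safe.
    """
    if len(text) > max_chars:
        raise ValueError('input of %d chars exceeds cap of %d'
                         % (len(text), max_chars))
    return re.search(pattern, text)


date_pattern = r'\d{4}_\w+_\d{2}'
print(guarded_search(date_pattern, '2002_July_08_pic_1.jpg').group(0))

try:
    guarded_search(date_pattern, 'x' * (MAX_INPUT_CHARS + 1))
except ValueError as exc:
    print('rejected:', exc)
```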
Jorge O. Martinez wrote:
Jim Penny wrote:
On Wed, Jul 10, 2002 at 03:17:14PM +0100, Ben Avery wrote:
well, external methods are python scripts with no safety measures at all, so are potentially much more unsafe than any use of regexps in a python script. So I'd say it's better to allow the re module in your python scripts (see my previous post) than resort to external methods.
but I also haven't come across a reason to consider regexps unsafe. I'm sure it's been discussed here before - could someone point us to a post on this subject, pls ?
As I understand it, the problem is not so much security, per se, but denial of service. That is, it is extremely easy to write regular expressions which take enormous amounts of time or memory to process.
Oh, come on.

    my_bigasslist = []
    i = 0
    while 1:
        i = i + 1
        my_bigasslist.append('bla' * i)

Gets Zope to use >>100M in less than 2 secs on a lowly PII 350.
Worse, the processing time and space is extremely dependent on input, so that apparently well-tested code can suddenly become a liability when exposed to a less than friendly audience. (Think about a line-oriented regex that is furnished a multi-megabyte line.)

    if inputvar == 'killmyserver':
        my_bigassarray = []
        i = 0
        while 1:
            i = i + 1
            my_bigassarray.append('bla' * i)
    else:
        return 'whoa, I was lucky'

To say it another way, using regex does not make it more likely that you will be cracked. It does make it more likely that your system will appear to be unresponsive, and, if memory exhaustion occurs, dead.
While the examples above wouldn't be written by anybody non-malicious in his right mind, I nonetheless think these arguments are dubious (mind you, I know you just cited them). The arguments Chris brought up in another post seem more convincing, but I just wanted to make sure that the reasoning you stated gets a rebuttal. cheers, oliver
On Wednesday 10 Jul 2002 4:49 pm, Oliver Bleutgen wrote:
As I understand it, the problem is not so much security, per se, but denial of service. That is, it is extremely easy to write regular expressions which take enormous amounts of time or memory to process.
Oh, come on.

    my_bigasslist = []
    i = 0
    while 1:
        i = i + 1
        my_bigasslist.append('bla' * i)

Gets zope to use >>100M in less than 2 secs on a lowly PII 350.
It doesn't matter how easy it is to write a program that exhibits the problem. What matters is how easy it is to write a program that provably cannot exhibit the problem. The issue with regular expressions is similar to the problems that cause buffer overflow vulnerabilities in C programs. Even experts find it hard to write a non-trivial program that is completely free from problems. That doesn't mean that C or regular expressions do not have their uses, but I am pleased with the restriction that you cannot use them TTW.
Toby Dickenson wrote:
On Wednesday 10 Jul 2002 4:49 pm, Oliver Bleutgen wrote:
As I understand it, the problem is not so much security, per se, but denial of service. That is, it is extremely easy to write regular expressions which take enormous amounts of time or memory to process.
Oh, come on.

    my_bigasslist = []
    i = 0
    while 1:
        i = i + 1
        my_bigasslist.append('bla' * i)

Gets zope to use >>100M in less than 2 secs on a lowly PII 350.
It doesn't matter how easy it is to write a program that exhibits the problem. What matters is how easy it is to write a program that provably cannot exhibit the problem.
The issue with regular expressions is similar to the problems that cause buffer overflow vulnerabilities in C programs. Even experts find it hard to write a non-trivial program that is completely free from problems.
That doesn't mean that C or regular expressions do not have their uses, but I am pleased with the restriction that you cannot use them TTW.
Well, if *you* are concerned that *you* *yourself* might shoot yourself in the foot when using regex, the solution would be simple: Don't use them. Easy. Together with the fact that I am quite sure that *you* are not in great danger of doing something very stupid with regex, I conclude that you have users whom you wouldn't trust with the power to use regex in python scripts.

Ok, maybe this is a problem - maybe not. But then it would be more logical IMO to find a way to make python scripts more secure without sacrificing usability that much. Maybe the ability to impose resource limits on scripts individually, like for memory consumption and processing time, if that's possible?

Btw. there seems to be something not quite optimal w.r.t. some kind of resource limit that seems to be present right now. I ran the code I posted just for the fun of it and forgot about it. After some time I got an out-of-memory exception (don't remember the exact name), but apparently that didn't cause python to release the memory again. My machine was completely unusable after that - segfaults on nearly everything I tried on the CLI until I killed that zope. Is that a bug?

I'd say limiting the abilities of python scripts concerning the ability to break out of their zope sandbox should be enough. And the modules which are allowed to be imported should be measured by this criterion and probably some other stuff I'm absolutely not qualified to comment on, like what Chris said about modules returning non-python types.

Again, I'm far from religious about this issue, it's just that I think the reasons often brought up about restrictions of the script(python) object and dtml should be chosen more carefully.

cheers, oliver
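The idea of per-script resource limits can be sketched in plain Python as a cooperative budget. This is not how Zope works and is only an illustration under assumptions: the task must be written as a generator that yields between units of work, and `run_with_budget`, `BudgetExceeded`, `runaway` and `polite` are all hypothetical names. A task that never yields (for example, one stuck inside a single regex match) cannot be stopped this way.

```python
import time


class BudgetExceeded(Exception):
    """Raised when a task uses more steps or wall-clock time than allowed."""


def run_with_budget(task, max_steps, max_seconds):
    """Drive a generator-based task, stopping it once a limit is exceeded.

    `task` is a generator that yields after each unit of work; the value
    it returns becomes the result. Cooperative only: a task that never
    yields cannot be interrupted this way.
    """
    deadline = time.monotonic() + max_seconds
    steps = 0
    while True:
        try:
            next(task)
        except StopIteration as done:
            return done.value
        steps += 1
        if steps > max_steps or time.monotonic() > deadline:
            task.close()
            raise BudgetExceeded('stopped after %d steps' % steps)


def runaway():
    """The memory-eating loop from the thread, rewritten to yield each iteration."""
    big_list = []
    i = 0
    while True:
        i += 1
        big_list.append('bla' * i)
        yield


def polite(n):
    """A well-behaved task: sums 0..n-1, yielding after each addition."""
    total = 0
    for i in range(n):
        total += i
        yield
    return total


print(run_with_budget(polite(10), max_steps=100, max_seconds=1.0))
try:
    run_with_budget(runaway(), max_steps=100, max_seconds=1.0)
except BudgetExceeded as exc:
    print('killed:', exc)
```

Real enforcement would need support from the interpreter or the operating system; the sketch only shows the shape of a "limit consumption, not packages" policy.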
I was agreeing with Toby, until it dawned on me that string.* is available unrestricted. Yes, my regexes may be vulnerable to a DOS attack if someone foists a 4M string at me. But so are string.index and string.rindex and (even worse) string.lower. Besides, as Oliver points out, limiting access to re doesn't mean I can't write code that wantonly consumes all CPU and memory. His example is artificial but it could easily be modified to take parameters from the HTTP REQUEST and still do stupid things.

If the issue is resource (CPU or memory or disk) consumption, then trying to limit package availability is never going to be a 100% solution. To limit resource consumption, you must (wait for it....) limit resource consumption. In other words, requests need CPU timeouts and memory quotas.

So to rephrase the original question: Assuming I'm willing to risk the DOS attacks, is there any other security risk to opening up regexes for Zope use? Is there some way a hacker can assume control of my Zope server or change its content because I decided to utilize regexes in my Python scripts? You don't have to tell me how, of course. Just let me know if it is possible.
-----Original Message----- From: zope-admin@zope.org [mailto:zope-admin@zope.org]On Behalf Of Toby Dickenson Sent: Wednesday, July 10, 2002 9:12 AM To: Oliver Bleutgen; zope@zope.org Subject: Re: [Zope] DTML, Zope and Regex
On Wednesday 10 Jul 2002 4:49 pm, Oliver Bleutgen wrote:
As I understand it, the problem is not so much security, per se, but denial of service. That is, it is extremely easy to write regular expressions which take enormous amounts of time or memory to process.
Oh, come on.

    my_bigasslist = []
    i = 0
    while 1:
        i = i + 1
        my_bigasslist.append('bla' * i)

Gets zope to use >>100M in less than 2 secs on a lowly PII 350.
It doesn't matter how easy it is to write a program that exhibits the problem. What matters is how easy it is to write a program that provably cannot exhibit the problem.
The issue with regular expressions is similar to the problems that cause buffer overflow vulnerabilities in C programs. Even experts find it hard to write a non-trivial program that is completely free from problems.
That doesn't mean that C or regular expressions do not have their uses, but I am pleased with the restriction that you cannot use them TTW.
On Wed, Jul 10, 2002 at 09:56:32AM -0700, Charlie Reiman wrote:
I was agreeing with Toby, until it dawned on me that string.* is available unrestricted. Yes, my regexes may be vulnerable to a DOS attack if someone foists a 4M string at me. But so are string.index and string.rindex and (even worse) string.lower. Besides, as Oliver points out, limiting access to re doesn't mean I can't write code that wantonly consumes all CPU and memory. His example is artificial but it could easily be modified to take parameters from the HTTP REQUEST and still do stupid things.
Yes, but at least each is linear w.r.t. input size. regexes can be exponential. Damn, I am trying to remember: it feels to me that they can be factorial (but this would be hard to do accidentally).
If the issue is resource (CPU or memory or disk) consumption, then trying to limit package availability is never going to be a 100% solution. To limit resource consumption, you must (wait for it....) limit resource consumption. In other words, requests need CPU timeouts and memory quotas.
True -- it is not, and was never intended to be, a 100% solution. It was an engineering tradeoff. And I suspect that the needs of Zope hosting providers were weighed heavily. They would want to be able to look at a user's code that was taking a lot of resources and quickly make a decision on whether to continue to have him as a customer. Regexes would certainly make that more difficult. I don't know any of this for certain; I am as far outside Zope decision-making circles as can be.
So to rephrase the original question: Assuming I'm willing to risk the DOS attacks, is there any other security risk to opening up regexs for Zope use? Is there some way a hacker can assume control of my Zope server or change its content because I decided to utilize regexes in my Python scripts?
Not to my knowledge. In fact, I doubt it; the regex compilation process is completely uncontrollable by input, and I would be surprised if there were any problems in the match algorithm that could be exploited by input (although I seem to recall dimly problems with Unicode).
You don't have to tell me how, of course. Just let me know if it is possible.
Jim Penny
Jim Penny wrote:
On Wed, Jul 10, 2002 at 09:56:32AM -0700, Charlie Reiman wrote:
I was agreeing with Toby, until it dawned on me that string.* is available unrestricted. Yes, my regexes may be vulnerable to a DOS attack if someone foists a 4M string at me. But so are string.index and string.rindex and (even worse) string.lower. Besides, as Oliver points out, limiting access to re doesn't mean I can't write code that wantonly consumes all CPU and memory. His example is artificial but it could easily be modified to take parameters from the HTTP REQUEST and still do stupid things.
Yes, but at least each is linear w.r.t. input size. regexes can be exponential. Damn, I am trying to remember: it feels to me that they can be factorial (but this would be hard to do accidentally).
If the issue is resource (CPU or memory or disk) consumption, then trying to limit package availability is never going to be a 100% solution. To limit resource consumption, you must (wait for it....) limit resource consumption. In other words, requests need CPU timeouts and memory quotas.
True -- it is not, and was never intended to be, a 100% solution. It was an engineering tradeoff. And I suspect that the needs of Zope hosting providers were weighed heavily. They would want to be able to look at a user's code that was taking a lot of resources and quickly make a decision on whether to continue to have him as a customer. Regexes would certainly make that more difficult. I don't know any of this for certain; I am as far outside Zope decision-making circles as can be.
I think I am beginning to understand the scope of the decision to exclude regex support: more security for the future Zope ISPs vs. less convenience for the future Zope developers. However, don't you all think that potential Zope developers may be discouraged when they learn they have to contact their ISP to install an external method or product if they have something that requires a simple regex in their DTML/TAL code? Compare a developer who is working on an Apache/PHP solution, and has all the functionality PHP offers, including regex support (with the restrictions the admin imposes on users via php.ini), without having to ask the ISP for anything special (except if they need something more specialized like ImageMagick support).

Looks like the security issue may be stepping on the usability issue's toes, which ultimately may interfere with wider adoption, as developers with access to their own boxes will be more likely to go for Zope than developers relying on ISPs.

Wouldn't it be better to somehow limit how much 'damage' developers can do in their own work area (via the Monster module, or zoped.ini for example), and give them enough rope to hang themselves, but not to crash the system? Don't know if that is possible, just an idea.
So to rephrase the original question: Assuming I'm willing to risk the DOS attacks, is there any other security risk to opening up regexs for Zope use? Is there some way a hacker can assume control of my Zope server or change its content because I decided to utilize regexes in my Python scripts?
Not to my knowledge. In fact, I doubt it; the regex compilation process is completely uncontrollable by input, and I would be surprised if there were any problems in the match algorithm that could be exploited by input (although I seem to recall dimly problems with Unicode).
You don't have to tell me how, of course. Just let me know if it is possible.
Jim Penny
_______________________________________________ Zope maillist - Zope@zope.org http://lists.zope.org/mailman/listinfo/zope ** No cross posts or HTML encoding! ** (Related lists - http://lists.zope.org/mailman/listinfo/zope-announce http://lists.zope.org/mailman/listinfo/zope-dev )
-- Jorge O. Martinez MIS Senior Associate FDCH-eMedia Inc. 2400 Forbes Blvd., Suite 200 Lanham, MD 20706 E-mail => jmartinez@eMediaMillWorks.com Phone => (301)731-1228 ext. 105 Fax => (301)731-0937
Looks like the security issue may be stepping on the usability issue's toes, which ultimately may interfere with wider adoption, as developers with access to their own boxes will be more likely to go for Zope than developers relying on ISPs.
I think the problem is not having anybody with the determination to focus on TTW scripting usability and lobby for some cohesive proposal on dev.zope.org that would make it better for some future revision of Zope. Regexes aren't the only thing that is annoying to not have in TTW scripts, and thought needs to be given to what gets included by default, and somebody needs to drive that effort. "Not it!" ;-) - C
Chris McDonough wrote:
Looks like the security issue may be stepping on the usability issue's toes, which ultimately may interfere with wider adoption, as developers with access to their own boxes will be more likely to go for Zope than developers relying on ISPs.
I think the problem is not having anybody with the determination to focus on TTW scripting usability and lobby for some cohesive proposal on dev.zope.org that would make it better for some future revision of Zope. Regexes aren't the only thing that is annoying to not have in TTW scripts, and thought needs to be given to what gets included by default, and somebody needs to drive that effort. "Not it!" ;-)
On a different subject, but still on the usability issue for developers, I was thinking it would be nice to have an easier-to-use comment system for DTML code. For example, for:
<dtml-if foo> something here </dtml-if>
now you have to use the rather long <dtml-comment> </dtml-comment>. It would probably be nice if you only had to do:
<//dtml-if foo> or <#dtml-if foo>
And that would comment out all the code within that dtml tag. We could still use the <dtml-comment> tag for multi-tag commenting out.
Anyhow, I know I am off the topic here, I just brought it up before I forget. Thanks to all who replied; sure helps enhance ZopeZen ;-)
Regards,
Jorge M.
- C
You can, more or less. DTML will disregard unknown tags, as will most HTML rendering engines (XML is another matter). So go ahead and change <dtml-if> to <xdtml-if> and watch it get ignored. Note that this is not a true dtml-comment, as the code between the tags will still appear in your output, but it's a handy trick for finding logic problems. But I would like to have a shorter alias for dtml-comment, too. Truthfully, what I would really like is a 'Comments' tab in the ZMI for every object, allowing me to store administrative docs in the objects. And it would be really great if I had a chocolate ice cream cone right about now, preferably in a waffle cone. There. Now we are well and truly off-topic.
-----Original Message----- From: Jorge O. Martinez [mailto:jmartinez@eMediaMillWorks.com] Sent: Wednesday, July 10, 2002 11:31 AM To: Chris McDonough Cc: Jim Penny; Charlie Reiman; zope@zope.org Subject: Re: [Zope] DTML, Zope and Regex
On Wed, Jul 10, 2002 at 01:32:39PM -0400, Jorge O. Martinez wrote:
I think I am beginning to understand the scope of the decision to exclude regex support: more security for the future Zope ISP's vs less convenience for the future Zope developers; however, don't you all think that potential Zope developers may be discouraged when they know they have to contact their ISP to install an external method or product
If you have to contact your ISP to install an external method or product, I suggest looking for another ISP.
Wouldn't it be better to somehow limit how much 'damage' developers can do in their own work area (via the Monster module, or zoped.ini for example), and give them enough rope to hang themselves, but not to crash the system. Don't know if that is possible, just an idea.
It's a nice idea. I don't know if it's possible either... or how much work it would take. Compared to "let's disable re in python scripts", it will surely be an enormous amount of work... so I don't expect we'll see this anytime soon. Feel free to prove me wrong. :) -- Paul Winkler home: http://www.slinkp.com "Muppet Labs, where the future is made - today!"
[Charlie Reiman]
I was agreeing with Toby, until it dawned on me that string.* is available unrestricted. Yes, my regexes may be vulnerable to a DOS attack if someone foists a 4M string at me. But so are string.index and string.rindex and (even worse) string.lower. Besides, as Oliver points out, limiting access to re doesn't mean I can't write code that wantonly consumes all CPU and memory. His example is artificial but it could easily be modified to take parameters from the HTTP REQUEST and still do stupid things.
Heck, if you want to drag the machine down, you do not need to import any modules. Try this:
    str='this will really do it!'
    for n in range(100000):
        str=str+str
500 MB gone in a few seconds...
Cheers, Tom P
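A small worked calculation, not part of the original message, shows why the loop above never gets anywhere near 100000 iterations: the string doubles on every pass, so its length after k passes is the starting length times 2**k. The function name doublings_until is invented for this sketch, and "500 MB" is taken here to mean 500 * 1024 * 1024 bytes at one byte per character. Nothing large is actually allocated.

```python
def doublings_until(start_len, limit_bytes):
    """Count how many doublings take a string of start_len bytes to limit_bytes."""
    size, steps = start_len, 0
    while size < limit_bytes:
        size *= 2   # same growth as str = str + str, without allocating anything
        steps += 1
    return steps


seed = 'this will really do it!'   # 23 characters
print(doublings_until(len(seed), 500 * 1024 * 1024))  # prints 25
```

Under those assumptions the 23-character seed passes 500 MB on the 25th doubling, which is why the damage is done within the first couple of dozen iterations.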
On Wed, Jul 10, 2002 at 05:49:43PM +0200, Oliver Bleutgen wrote:
Jim Penny wrote:
On Wed, Jul 10, 2002 at 03:17:14PM +0100, Ben Avery wrote:
well, external methods are python scripts with no safety measures at all, so are potentially much more unsafe than any use of regexps in a python script. So I'd say it's better to allow the re module in your python scripts (see my previous post) than resort to external methods.
but I also haven't come across a reason to consider regexps unsafe. I'm sure it's been discussed here before - could someone point us to a post on this subject, pls ?
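For reference, a sketch of what the grouping task from the opening post might look like as ordinary Python once re is available to the script. The id layout is assumed from the single example id given there (2002_July_08_pic_1.jpg), and the names ID_PATTERN and group_by_date, as well as the header wording, are hypothetical.

```python
import re
from collections import OrderedDict

# Assumed id layout, based on the example id in the opening post:
# YEAR_Month_DD_type_N.ext, e.g. 2002_July_08_pic_1.jpg
ID_PATTERN = re.compile(r'^(\d{4})_([A-Za-z]+)_(\d{1,2})_')


def group_by_date(ids):
    """Map a header such as 'Photos for July 8' to the ids carrying that date."""
    groups = OrderedDict()
    for obj_id in ids:
        found = ID_PATTERN.match(obj_id)
        if found is None:
            continue  # skip ids that do not follow the assumed layout
        _year, month, day = found.groups()
        header = 'Photos for %s %d' % (month, int(day))
        groups.setdefault(header, []).append(obj_id)
    return groups


if __name__ == '__main__':
    sample = ['2002_July_07_pic_1.jpg', '2002_July_08_pic_1.jpg',
              '2002_July_08_pic_2.jpg', 'index_html']
    for header, members in group_by_date(sample).items():
        print(header, members)
```

The pattern is anchored and has no nested quantifiers, so its matching time should grow only linearly with the length of the id.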
As I understand it, the problem is not so much security, per se, but denial of service. That is, it is extremely easy to write regular expressions which take enormous amounts of time or memory to process.
See http://www.usenix.com/publications/login/1999-4/reg_exp.html for a real-world example. As is noted in the article, a 1700-fold improvement here, a 1700-fold improvement there, can start to add up! Regexes are worst-case exponential in time. On a side note (warning: diversion), you might also consider the Perl Apocalypse 5 http://www.perl.com/lpt/a/2002/06/04/apo5.html
Worse, the processing time and space are extremely dependent on input, so that apparently well-tested code can suddenly become a liability when exposed to a less than friendly audience. (Think about a line-oriented regex that is furnished a multi-megabyte line.)
    if inputvar == 'killmyserver':
        my_bigassarray = []
        i = 0
        while 1:
            # grows without bound until memory runs out
            i = i + 1
            my_bigassarray.append('bla' * i)
    else:
        return 'whoa, I was lucky'
To say it another way, using regex does not make it more likely that you will be cracked. It does make it more likely that your system will appear to be unresponsive, and, if memory exhaustion occurs, dead.
This is not exactly what I had in mind when I said "apparently well-tested code". By that phrase I meant that the code, by a combination of inspection and testing, was reasonably expected not to blow up or take excessive amounts of time. Because regexes have worst-case exponential behavior (as I recall, in both space and time), and because it is reasonably easy to introduce that kind of behavior accidentally and without malice, it seems to me to be a reasonable engineering tradeoff to prohibit regexes in TTW programming. This does not say that we can prohibit every kind of abuse, and it does not say that regexes are not valuable tools. It does say that they are somewhat dangerous tools that can have a difficult-to-predict impact on performance. Jim Penny
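To make the worst-case behavior described above concrete, here is a minimal timing sketch. The pattern (a+)+$ is a standard textbook illustration of nested quantifiers, not something taken from this thread, and the names EVIL and time_match are made up. The assumption is a backtracking engine such as the one in CPython's re module; absolute timings will vary by machine and version, so none are quoted.

```python
import re
import time

# Nested quantifiers: many ways to split a run of 'a's between the inner
# and outer '+', all of which get tried when the overall match fails.
EVIL = re.compile(r'(a+)+$')


def time_match(n):
    """Return the seconds spent failing to match n 'a's followed by one 'b'."""
    subject = 'a' * n + 'b'
    start = time.perf_counter()
    EVIL.match(subject)          # always fails because of the trailing 'b'
    return time.perf_counter() - start


if __name__ == '__main__':
    for n in (10, 14, 18, 20):
        print(n, round(time_match(n), 4))
```

On a backtracking engine the time tends to roughly double with each extra character, which is the "well tested on short input, dead on long input" effect the message warns about.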
cheers, oliver
On Wednesday 10 July 2002 22:26, Chris Withers wrote:
Chris McDonough wrote:
Just to clarify: DTML is not "gone". This scares people when they hear it. DTML will be around probably forever.
joy!
It's not a joy. It's SLOW. -- Sincerely yours, Bogdan M. Maryniuck In most countries selling harmful things like drugs is punishable. Then howcome people can sell Microsoft software and go unpunished? (By hasku@rost.abo.fi, Hasse Skrifvars)
DTML will be around probably forever.
joy!
It's not a joy. It's SLOW.
Maybe, but who cares? It's not _that_ slow, and it's very rapid from a developer-hours POV to assemble uncomplicated sites with DTML if you take account of its limitations. (And hey, I'm running Zope on a dual 2.2 GHz Xeon server... frankly, page rendering times are just not something I notice :-) I can see the point of ZPT for more complex applications, but it's really going to take me some time to get my head around that syntax. Therefore for now, DTML is perfectly fine. I'm happy to hear it's going to remain. Julian.
On Friday 12 July 2002 04:05, Julian Melville wrote:
DTML will be around probably forever.
joy!
It's not a joy. It's SLOW.
Maybe, but who cares?
8-(...) Me. If You have a really complex project, You'll feel it, very much.
It's not _that_ slow,
It IS that slow. Try comparing a PythonScript loop and a DTML loop with databases.
and it's very rapid from a developer-hours POV to assemble uncomplicated sites with DTML if you take account of it's limitations.
ZPT?
I can see the point of ZPT for more complex applications, but it's really going to take me some time to get my head around that syntax. Therefore for now, DTML is perfectly fine. I'm happy to hear it's going to remain.
Ah, that's the answer for both of us! :-) ZPT for complex projects, and DTML for _simple_and_or_small_. Here I agree with You fully.
-- Sincerely yours, Bogdan M. Maryniuck Now I know someone out there is going to claim, "Well then, UNIX is intuitive, because you only need to learn 5000 commands, and then everything else follows from that! Har har har!" (Andy Bates in comp.os.linux.misc, on "intuitive interfaces", slightly defending Macs.)
"Bo M. Maryniuck" wrote:
going to take me some time to get my head around that syntax. Therefore for now, DTML is perfectly fine. I'm happy to hear it's going to remain.
Ah, that's answer for both of us! :-) ZPT for complex project, and DTML for _simple_and_or_small_. Here I agree with You fully.
ZPT is really pretty simple, quick, clean and explicit. Try the tutorial:
http://www.zope.org/Documentation/Articles/ZPT1
http://www.zope.org/Documentation/Articles/ZPT2
http://www.zope.org/Documentation/Articles/ZPT3
...you may be pleasantly surprised. My take would be to use ZPT for everything, even ZSQL methods (which, sadly, you can't currently do...) cheers, Chris
participants (14)
- Ben Avery
- Bo M. Maryniuck
- Charlie Reiman
- Chris McDonough
- Chris Withers
- Duncan Booth
- Jim Penny
- Jorge O. Martinez
- Julian Melville
- Kirk Lowery
- Oliver Bleutgen
- Paul Winkler
- Thomas B. Passin
- Toby Dickenson