Using RegEx in Script(Python)
Hi,
I wanted to know if the above can be done? What I need is a function that replaces every character of a string that is not in [a-zA-Z1-9] with an underscore. I want to use this to automatically create an Object-Id from a title, to create a new Object.
If this is not possible directly within a Script(Python), can it be done using an ExternalMethod? I suppose yes.
Andreas
-- You'll be called to a post requiring ability in handling groups of people.
On Fri, Jun 27, 2003 at 01:09:27AM +0200, Andreas Pakulat wrote:
I wanted to know if the above can be done? What I need is a function that replaces every character of a string, that is not in [a-zA-Z1-9] with an underscore. I want to use this to automatically create an Object-Id from a title, to create a new Object.
If this is not possible directly within a Script(Python), can it be done using an ExternalMethod? I suppose yes.
In order to use regular expressions in Python you need to 'import re'.
After that you can use 're.sub(pattern, repl, string[, count])' to substitute parts of a string.
For more information on regexps in Python take a look at http://www.python.org/doc/current/lib/node99.html#l2h-732
Regards,
Steffen
-- To say 'freedom reigns here' is always a mistake or even a lie: freedom does not reign (Erich Fried)
On 27.Jun 2003 - 01:53:27, Steffen Hausmann wrote:
In order to use regular expressions in python you need to 'import re'.
After that you can use 'sub(pattern, repl, string[, count])' to substitute a string.
For more information on regexps in python take a look at http://www.python.org/doc/current/lib/node99.html#l2h-732
The problem is that 'import re' leads to a BasicAuth dialog which cannot be satisfied. I mean, I'm already logged in to the ZMI, but I still get a BasicAuth dialog if I execute the script that tries to import re. I think that regexps are not completely included in Zope: I looked at some parts of the Python code and did not find the sub function at the point where all the other re functions are made "safe".
I have a workaround that iterates over the string as a list and changes everything that is not alphanumeric into '_'. As the strings I replace won't get very long (max 100 chars), this should be fast enough.
I also tried the external method, but I think I did something wrong; the following returns the unchanged string:
re.sub('[^a-zA-Z0-9]','WS/04','_')
Andreas
-- "Life, loathe it or ignore it, you can't like it." -- Marvin, "Hitchhiker's Guide to the Galaxy"
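The loop-based workaround mentioned above is not shown in the thread. A minimal sketch of what it might look like follows; the name make_id and the ALLOWED constant are hypothetical, and the code deliberately avoids imports on the assumption that plain string operations are available where 'import re' is not.

```python
# Hypothetical helper, not the poster's actual code.
# Characters that may appear unchanged in the generated id.
ALLOWED = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

def make_id(title):
    """Replace every character outside a-z, A-Z, 0-9 with an underscore."""
    chars = []
    for ch in title:
        if ch in ALLOWED:
            chars.append(ch)
        else:
            chars.append("_")
    return "".join(chars)

print(make_id("WS/04"))  # WS_04
```

For titles of at most about 100 characters, a single pass like this does very little work, which fits the expectation that speed is not a concern here.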
On 27.Jun 2003 - 02:48:24, Andreas Pakulat wrote:
I also tried the external method, but I think I did something wrong, the following returns the unchanged string:
re.sub('[^a-zA-Z0-9]','WS/04','_')
It works right, but I made a mistake: I had the replacement and the original string interchanged :(
Andreas
-- Try to get all of your posthumous medals in advance.
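The mix-up comes down to the argument order of re.sub, which is (pattern, replacement, string). A short sketch of the corrected call, written as it might appear in an External Method module (the function name title_to_id is an assumption made for illustration):

```python
import re

def title_to_id(title):
    """Replace every non-alphanumeric character with an underscore."""
    # Argument order: pattern, replacement, string to search.
    return re.sub(r'[^a-zA-Z0-9]', '_', title)

print(title_to_id('WS/04'))  # WS_04

# With replacement and string swapped, the pattern is applied to '_':
# the underscore matches and is replaced by 'WS/04', so the call appears
# to hand back the original title unchanged.
print(re.sub('[^a-zA-Z0-9]', 'WS/04', '_'))  # WS/04
```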
Yes, you can use regular expressions in an External Method. I think you could also do it in a script by using the string.translate method. See the Python standard lib docs.
-- Paul Winkler http://www.slinkp.com Look! Up in the sky! It's POLY IDEAS X! (random hero from isometric.spaceninja.com)
It can be done in an External Method, but you can also enable Script(Python) to use regular expressions. See the instructions in the Product code.
If you're looking to have a "clean-zope-id" method, we use the following. A simple regex solution can sometimes forget to fix things like leading underscores or double underscores. I actually do this w/o regexes using translate(), but regexes might be faster. Feel free to benchmark and say so. ;)

#!/usr/bin/env python2.1
"""ConvertStringToID

Converts a string into a Zope-safe ID.

This removes all non-identifier safe characters. It replaces most with
underscores, while trying to make the ID match a sensible choice
(eg "Bill's House" -> "bills_house", not "bill_s_house").

The output is always lowercase, and any leading underscores are removed
(as they would be illegal in Zope).
"""

import string

# 256-entry translation table, indexed by character code:
# '.', digits and letters are kept (uppercase folded to lowercase),
# everything else becomes '_'.
tt = ('_' * 46 + '._' + '0123456789' + '_' * 7 +
      'abcdefghijklmnopqrstuvwxyz' + '_' * 6 +
      'abcdefghijklmnopqrstuvwxyz' + '_' * 133)

def ConvertStringToID(s, maxlen=None):
    """
    Convert String to ID

    s = string to convert
    maxlen = maximum length of ID

    returns string.
    """

    # translate most things to underscore. remove punctuation below w/o translating
    s = string.translate(s, tt, '!@#$%^&*()-=+,\'"')

    # remove ALL double-underscores
    while s.find("__") > -1:
        s = s.replace('__', '_')

    # when we use py2.2.2, this and below can simply be s = s.strip("_"). yeah!
    # trim underscores off front
    while s.startswith("_"):
        s = s[1:]

    # trim underscores off end
    while s.endswith("_"):
        s = s[:-1]

    # trim to maxlength
    if maxlen and len(s) > maxlen:
        s = s[:maxlen]

    return s

if __name__ == '__main__':
    assert ConvertStringToID("____A Lover's % Tale (Of 2 Cities).doc_") == "a_lovers_tale_of_2_cities.doc"

HTH.
-- Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton Independent Knowledge Management Consultant
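Since the message above invites a regex comparison, here is a rough sketch of the same cleanup steps done with re in current Python 3. It is not the poster's code: the name convert_string_to_id is hypothetical, and it assumes the same rules as the translate() version (delete the listed punctuation, keep '.', digits and ASCII letters, collapse everything else to single underscores, lowercase, trim).

```python
import re

# Punctuation that is deleted outright rather than turned into '_'.
_DELETE = re.compile(r'''[!@#$%^&*()=+,'"-]''')
# Any run of characters other than ASCII letters, digits and '.'.
_UNSAFE = re.compile(r'[^A-Za-z0-9.]+')

def convert_string_to_id(s, maxlen=None):
    """Sketch of a regex-based equivalent of the translate() approach."""
    s = _DELETE.sub('', s)
    # Replacing whole runs with one '_' also collapses double underscores.
    s = _UNSAFE.sub('_', s)
    s = s.lower().strip('_')
    # Trim to the maximum length, if one was given.
    if maxlen and len(s) > maxlen:
        s = s[:maxlen]
    return s

print(convert_string_to_id("____A Lover's % Tale (Of 2 Cities).doc_"))
# a_lovers_tale_of_2_cities.doc
```

Whether this is actually faster than translate() is not claimed here; that would need measuring, for example with the standard timeit module.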
On 30.Jun 2003 - 19:21:07, Joel Burton wrote:
If you're looking to have a "clean-zope-id" method, we use the following. A simple regex solution can sometimes forget to fix things like leading underscores, or getting rid of double underscores or such. I actually do this w/o regexes using translate(), but regexs might be faster. Feel free to benchmark and say so. ;)
Thanks for that snippet. It looks really good and I'll give it a try, but I don't think that I can benchmark it in the near future, as the site won't go online that soon. Also, the strings I get are fairly well-formatted, normally containing only " " or "/".
Andreas
-- Things will be bright in P.M. A cop will shine a light in your face.
participants (6)
- Andreas Pakulat
- Dennis Allison
- Joel Burton
- Paul Winkler
- Sergey Volobuev
- Steffen Hausmann