[Ed Colmar]
Hey Tom
Thanks for the reply...
Those backslashes are for escaping the special characters (\w and .). Do they need to be doubled in this case?
Yes, they are for escaping the special characters once they get to the regular expression, but they have to get there first. They have to be doubled, or an alternative is

    HTMLFILE = r'/\w*\.html'
    htmlfile = re.compile(HTMLFILE)

Here the "r" tells Python to use a "raw" string and not to treat the backslashes as escapes, so they are written only once (at least it used to be this way - I'm not quite sure about 2.2).
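A minimal sketch of the point about doubling versus raw strings, using the pattern and URL from this thread. The variable names (doubled, raw, raw_doubled and so on) are made up for illustration. It suggests that the doubled form and the raw form are the same string, while doubling the backslashes inside a raw string gives a different pattern that looks for literal backslashes:

```python
import re

# Two spellings of the same regular expression /\w*\.html
doubled = "/\\w*\\.html"        # ordinary string, backslashes doubled
raw = r"/\w*\.html"             # raw string, backslashes written once

# Doubling inside a raw string is a different pattern: it asks for
# literal backslash characters in the text being searched.
raw_doubled = r"/\\w*\\.html"

same_pattern = (doubled == raw)

url = "http://www.the.net/bigfolder/somepage.html"
raw_hit = re.search(raw, url)                  # finds "/somepage.html"
raw_doubled_hit = re.search(raw_doubled, url)  # None: the URL has no backslashes

print(same_pattern)
print(raw_hit.group())
print(raw_doubled_hit)
```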
This still is not working for me
    ### I want: http://www.the.net/bigfolder/ ###
    import re
    url = "http://www.the.net/bigfolder/somepage.html"
    htmlfile = re.compile("/\\w*\\.html")
    m = htmlfile.match(url)
    if m:
        folder_url = htmlfile.sub(url, "/")
I'm also trying different variations to try and get a match. None of these are working either:

    htmlfile = re.compile("/.*$")      (this one should really be working yes?)
    htmlfile = re.compile("[a-z]*$")
    htmlfile = re.compile("\w*$")
The only match I can make is this (which will match anything):

    htmlfile = re.compile(".*$")
I suggest you do

    print url
    matches = htmlfile.findall(url)
    print matches

or

    from pprint import pprint
    pprint(matches)

This will show you exactly what the match found. You can best work this out in regular Python, then copy the working code into your Zope script.

Regular expressions are notoriously hard to get working right (not Python's fault, that's just how they are), so don't feel bad. You need to get more systematic about debugging - check every step of the way to make sure you understand what is going on, and read the docs for the re library.

Cheers,
Tom P
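The step-by-step check suggested above can be sketched as follows. This is one possible reading of the symptoms, not a confirmed diagnosis of the Zope script: match() only tries the pattern at the very start of the string, whereas search() and findall() scan the whole string, and sub() takes the replacement first and the string second. The URL and pattern are the ones from this thread; the names anchored and found are made up for illustration, and the sketch uses the print() function form.

```python
import re

url = "http://www.the.net/bigfolder/somepage.html"
htmlfile = re.compile(r"/\w*\.html")

# match() is anchored at position 0, and the URL starts with "http",
# not "/", so this is None.
anchored = htmlfile.match(url)

# search() and findall() look anywhere in the string.
found = htmlfile.search(url)
matches = htmlfile.findall(url)

# sub(replacement, string): the replacement comes first.
folder_url = htmlfile.sub("/", url)

print(anchored)        # None
print(found.group())   # /somepage.html
print(matches)         # ['/somepage.html']
print(folder_url)      # http://www.the.net/bigfolder/
```

The same anchoring would explain why ".*$" was the only variation that matched: it is the only one of those patterns that can match starting from the first character of the URL.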