[Zope] Looping URL's

Tue Oct 12 14:53:12 EDT 2004

On Tue, 12.10.2004, Edward Pollard wrote at 20:01:
> We've been having a continual problem using Zope to serve as a 
> campus-wide CMS. Non technical people are plaguing the server with 
> malformed references, creating looping URLs that cause our search 
> engine to fall over dead. I presume you all know what I'm talking 
> about... the a/b/a/b/a/b URL's.
> 
> We've focused on user-training, but we really need to find a way to 
> make the server more bulletproof. I'm considering throwing a call into 
> the template that compares the request URL with the absolute_url and 
> redirects/logs when a discrepancy is found.
> 
> Does anyone have any comment on this approach, or a better 
> recommendation?

I'd parse and correct the links on upload. Just derive a product
from ZPT and use something like this to find the links:

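import re
# Capture (href, link text) for every <a ...>...</a> element.
a = re.compile(r"<a .*?href=\"(.*?)\".*?>(.*?)</a>",
               re.DOTALL | re.IGNORECASE)

# Called from a method of the ZPT-derived class:
a.findall(self.document_src())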

This should give you a list of tuples
where the first element is the href argument
and the second element is whatever is inside
the <a>...</a>
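
As a minimal sketch of what that list looks like, here is the same
pattern run against a plain string instead of self.document_src().
The sample markup and the name A_TAG are made up for illustration:

```python
import re

# Same pattern as above; group 1 is the href, group 2 the link text.
A_TAG = re.compile(r"<a .*?href=\"(.*?)\".*?>(.*?)</a>",
                   re.DOTALL | re.IGNORECASE)

# Hypothetical page source with one absolute and one relative link.
sample = ('<p><a href="/docs/index_html">Docs</a> and '
          '<A class="x" HREF="b/a/page">Loop</A></p>')

links = A_TAG.findall(sample)
for href, text in links:
    print(href, "->", text)
```

Note that the pattern only sees double-quoted href values, so links
written with single quotes would need a second pattern.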

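You should check whether the href starts with a
scheme (http:, https:, mailto:, ...)
or with /.
Every other URL is either bad or even misleading, because a relative
href is resolved against whatever URL the page was reached through.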
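
The check could look roughly like this. It is only a sketch: the
function name is_unambiguous and the scheme pattern are my own
choices, not anything Zope provides.

```python
import re

# A URL scheme: a letter followed by letters, digits, "+", "." or "-",
# terminated by a colon (http:, https:, mailto:, ...).
SCHEME = re.compile(r"^[a-z][a-z0-9+.-]*:", re.IGNORECASE)

def is_unambiguous(href):
    """Return True if href starts with a scheme or with "/"."""
    return href.startswith("/") or SCHEME.match(href) is not None

for href in ("http://example.org/", "mailto:a@example.org",
             "/a/b", "b/a/page", "../page"):
    print(href, is_unambiguous(href))
```

Under this rule a fragment-only link such as "#top" is flagged as
well, so you may want to let those through explicitly.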

You could check the remaining hrefs via self.restrictedTraverse(href),
and if you get the object, use

"/"+object.absolute_url(1)

to get the right URL.

In most cases, a silent replace should do.
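
A rough sketch of such a silent replace follows. To keep it runnable
outside Zope, the lookup is passed in as a callable named resolve,
which is an assumption of this example: inside the product it would
wrap self.restrictedTraverse(href) and return
"/"+object.absolute_url(1), or None when traversal fails. The names
fix_links and known are hypothetical too, and the pattern is a
variant of the one above that matches only up to the href value.

```python
import re

# Group 1: the tag up to the opening quote, group 2: the href value,
# group 3: the closing quote.
HREF = re.compile(r"(<a\s[^>]*?href=\")(.*?)(\")",
                  re.DOTALL | re.IGNORECASE)
SCHEME = re.compile(r"^[a-z][a-z0-9+.-]*:", re.IGNORECASE)

def fix_links(source, resolve):
    """Rewrite relative hrefs to the path returned by resolve(href)."""
    def _replace(match):
        href = match.group(2)
        # Links with a scheme or a leading "/" are left alone.
        if href.startswith("/") or SCHEME.match(href):
            return match.group(0)
        canonical = resolve(href)
        # Unresolvable links are left alone; you may prefer to log them.
        if canonical is None:
            return match.group(0)
        return match.group(1) + canonical + match.group(3)
    return HREF.sub(_replace, source)

# Stand-in for the Zope lookup: maps a looping href to its real path.
known = {"b/a/page": "/a/page"}
fixed = fix_links('<a href="b/a/page">Loop</a> '
                  '<a href="http://x.org/">X</a>', known.get)
print(fixed)
```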

Maybe this is something you can use to start with.

Regards
Tino