We've been having a continual problem using Zope as a campus-wide CMS. Non-technical people are plaguing the server with malformed references, creating looping URLs that cause our search engine to fall over dead. I presume you all know what I'm talking about... the a/b/a/b/a/b URLs.

We've focused on user training, but we really need to find a way to make the server more bulletproof. I'm considering putting a call into the template that compares the request URL with the absolute_url and redirects/logs when a discrepancy is found.

Does anyone have any comment on this approach, or a better recommendation?

---
Edward J. Pollard, B.Sc
Webmaster, University of Lethbridge
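The comparison described above could be sketched roughly as follows. This is plain Python rather than Zope code: canonical_redirect and its view_name parameter are hypothetical names, and it assumes the template can supply the request URL and the object's absolute_url() as strings, and that absolute_url() returns the canonical (non-acquired) location.

```python
from urllib.parse import urlsplit

def canonical_redirect(request_url, canonical_url, view_name=""):
    """Return the URL to redirect to, or None if the request is already canonical.

    request_url   -- the URL the client asked for (assumed to come from the request)
    canonical_url -- what absolute_url() reports for the object (assumption)
    view_name     -- optional published method name, e.g. "index_html"
    """
    req_path = urlsplit(request_url).path.rstrip("/")
    canon_path = urlsplit(canonical_url).path.rstrip("/")

    # The request may address the object itself or the object plus its view.
    allowed = {canon_path}
    if view_name:
        allowed.add(canon_path + "/" + view_name)

    if req_path in allowed:
        return None
    # Discrepancy: the caller can log it and redirect to the canonical URL.
    return canonical_url
```

A template or script would call this once per request and issue a redirect only when the result is not None, so canonical requests pay just one string comparison.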
Edward Pollard wrote:
Does anyone have any comment on this approach, or a better recommendation?
I use absolute URLs because this looping issue drove me mad some years ago. Switching domains (the only real reason for relative URLs) does not happen as often as people believe... ;)

Cheers, Maik
On Tue, 12.10.2004 at 20:01, Edward Pollard wrote:
Does anyone have any comment on this approach, or a better recommendation?
I'd parse and correct on upload. Just derive a product from ZPT and use something like this to correct links:

    import re
    a = re.compile(r"<a .*?href=\"(.*?)\".*?>(.*?)</a>", re.DOTALL | re.IGNORECASE | re.MULTILINE)
    a.findall(self.document_src())

This should give you a list of tuples where the first element is the href value and the second element is whatever is inside the <a>...</a>.

You should check whether the href starts with a scheme (http:, https:, mailto:, ...) or with /. Every other URL is either bad or at least misleading. You could check via self.restrictedTraverse(href), and if you get the object, use "/" + object.absolute_url(1) to get the right URL. In most cases, a silent replace should do.

Maybe this is something you can use to start with.

Regards
Tino
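As a rough, standalone sketch of the check-and-replace step described above: is_safe and fix_links are hypothetical names, and the resolve callable stands in for the restrictedTraverse(href) plus absolute_url(1) lookup, which is not available outside Zope. It is assumed to return a site-relative path without a leading slash, or None when nothing is found.

```python
import re

# Opening <a ...> tag up to and including the quoted href value.
HREF = re.compile(r'(<a\s[^>]*?href=")(.*?)(")', re.DOTALL | re.IGNORECASE)
# A URL scheme such as http:, https: or mailto:
SCHEME = re.compile(r'^[a-z][a-z0-9+.-]*:', re.IGNORECASE)

def is_safe(href):
    """True if the href starts with a scheme or with '/'."""
    return href.startswith("/") or bool(SCHEME.match(href))

def fix_links(html, resolve):
    """Silently replace relative hrefs with root-relative ones.

    resolve(href) is an assumed callable returning a site-relative
    path (like absolute_url(1) would) or None if the target is unknown.
    """
    def repl(match):
        href = match.group(2)
        if is_safe(href):
            return match.group(0)      # leave good links untouched
        path = resolve(href)
        if path is None:
            return match.group(0)      # cannot resolve: leave as-is
        return match.group(1) + "/" + path + match.group(3)
    return HREF.sub(repl, html)
```

Unresolvable links are left alone here; logging them instead would give the webmaster a list of pages that need attention.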
Better recommendation? Who knows? This is different:

I put a folder menu in a DTML document in a format suitable for passing to a Python script. The menu document looks like this:

    <dtml-var expr="MenuGen([
        ['index_html', 'Home'],
        ['people/', 'People'],
    ])">

The Python script (MenuGen) gets the url and makes it absolute:

    theURL = context[url].absolute_url()

I have a form to edit the menu document so that users can't get the format wrong. The form only allows links to its immediate child templates and folders.

If someone puts a dud link in the body of a page, for example from people/index_html to people/teachers/ where teachers does not have an index_html document, I guess I would have your problem. So I have a folder generation mechanism that makes it easy for users to fill out a form to get a fully functional folder.

My experience is that training won't work!

Cliff

Edward Pollard wrote:
Does anyone have any comment on this approach, or a better recommendation?
I had a similar problem a few months ago. One spider in particular was consuming significant amounts of bandwidth looking for URLs like this. The solution I used was to put a base href tag in the header template.

--
David

On Oct 12, 2004, at 11:01 AM, Edward Pollard wrote:
We've been having a continual problem using Zope to serve as a campus-wide CMS. Non technical people are plaguing the server with malformed references, creating looping URLs that cause our search engine to fall over dead. I presume you all know what I'm talking about... the a/b/a/b/a/b URL's.
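To illustrate why a base href can stop a spider from descending forever, here is a small simulation using only the standard library. The crawl function, the example.edu URLs, and the choice of the folder's canonical URL as the base are all assumptions for illustration; the thread does not say what value the base href was given.

```python
from urllib.parse import urljoin

def crawl(start_url, href, canonical_base=None, limit=5):
    """Follow the same relative link repeatedly; return the distinct URLs visited.

    Without canonical_base the link is resolved against the current page URL,
    as a browser or spider does when there is no <base href> tag.
    With canonical_base it is always resolved against that fixed URL.
    """
    seen = []
    url = start_url
    while url not in seen and len(seen) < limit:
        seen.append(url)
        url = urljoin(canonical_base or url, href)
    return seen

# No base tag: every hop adds another a/b/ until the limit is reached.
looping = crawl("http://example.edu/a/b/", "a/b/")

# Fixed base: the second hop resolves to an already-seen URL, so the crawl ends.
bounded = crawl("http://example.edu/a/b/", "a/b/",
                canonical_base="http://example.edu/a/b/")
```

The malformed link still exists in the second case, but it resolves to the same URL every time, so the set of URLs a spider can reach stays finite.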
participants (5)
- Cliff Ford
- David Siedband
- Edward Pollard
- Maik Jablonski
- Tino Wildenhain