Urs van Binsbergen writes:
> ... There are, however, two things about it I do not like very much:
>
> 1) URL trailing slash handling:
> http://example.com/some_doc and http://example.com/some_doc/ are both valid URLs to access the method or document some_doc in the given root folder. In file-based publishing (as with Apache) the second URL would be invalid, because some_doc is not a folder.
> http://example.com/some_folder and http://example.com/some_folder/ are both valid URLs to access the folder some_folder in the root folder. Apache would allow the first URL but redirect to the second, because some_folder is not a document, it is a folder.
> 2) recursive acquisition:
> http://example.com/some_folder/some_folder/some_folder/some_folder/ is a valid URL to access the folder some_folder in the root folder.
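The multiplicity described under 2) can be illustrated with a small sketch. This is not Zope's traversal code; it merely mimics the acquisition fallback (a name not found on the current object is searched for along the chain of its containers) using nested dicts, with the names taken from the examples above:

```python
def traverse(root, path):
    """Resolve URL path segments against nested dicts, acquisition-style.

    Each node is a dict of children.  A segment that is not a child of
    the current node is looked up on the ancestors (the acquisition
    chain), innermost first -- which is exactly why a folder can be
    reached "through itself" any number of times.
    """
    chain = [root]                      # containment chain, innermost last
    node = root
    for segment in path.strip("/").split("/"):
        for candidate in reversed(chain):
            if isinstance(candidate, dict) and segment in candidate:
                node = candidate[segment]
                chain.append(node)
                break
        else:
            raise KeyError(segment)     # a real server would answer 404
    return node

site = {"some_folder": {"some_doc": "a document"}}
a = traverse(site, "/some_folder")
b = traverse(site, "/some_folder/some_folder/some_folder/some_folder")
print(a is b)   # True: both URLs reach the very same folder
```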
---
> WHY do I dislike these two things?
> a) Philosophically: As the name "UNIQUE resource locator" already says: it is generally not good to have the same content available via different locators.

Maybe your philosophical argument is weakened when you learn that URL stands for "*Uniform* Resource Locator".
It is a uniform syntax (!) to locate a resource accessible through a wide variety of protocols. It is quite common to have the same resource accessed through different URLs: often the same resource can be accessed both via HTTP and FTP; often the same (local) resource can be accessed with the "file", the "ftp", and the "http" protocol; often the same resource can be accessed via both "ftp" and "webdav" (which is HTTP based).
> b) Technically: Working with relative links becomes unreliable and dangerous. Problem #1 causes a relative URL to sometimes work and sometimes not, depending on whether the visitor accesses "foo/bar/" or "foo/bar".

Only when you do strange things. Usually, Zope sets the HTML "base" tag, so that it does not matter whether the user uses "foo/bar/" or "foo/bar".
> Problem #2 makes relative links the door to infinite recursion. A simple link like <a href="foo/">clickme</a> becomes the trap where dumb spiders lose themselves in an infinite loop (this was discussed recently on this list under the subject "htdig indexing problem").

When you use relative links in the same way you are forced to use them in a file-system-based publishing environment, there will be no infinite recursion: simply avoid relative links containing a "/" not preceded by "..", and use an absolute URL otherwise.
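The rule in that last sentence can be checked mechanically. A minimal sketch (the function name is my own, not part of any Zope or standard API) that accepts a relative link only when every "/" in it is preceded by "..":

```python
def is_fs_safe_relative(href):
    """True if href cannot trigger recursive acquisition.

    Rule: a relative link may contain a "/" only when the path part
    standing before it is ".."; absolute URLs and root-relative paths
    are left alone, since they restart from a fixed point.
    """
    if "://" in href or href.startswith("/"):
        return True                 # absolute: not affected by the trap
    segments = href.split("/")
    # every segment that stands before a "/" must be ".."
    return all(segment == ".." for segment in segments[:-1])

print(is_fs_safe_relative("bar"))         # True
print(is_fs_safe_relative("../bar"))      # True
print(is_fs_safe_relative("foo/"))        # False: the spider trap above
print(is_fs_safe_relative("foo/bar"))     # False
```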
> Experiences?
> Since there are lots of Zope sites out there and I have not found much discussion on this matter so far, am I perhaps putting too much weight on it?

I feel you do.
> Workarounds ...
> - work with <base href=...>

This is done automatically, unless your pages are strange.
> ... Other workarounds I was told:
> - (for problem 2): put an access-restricted subfolder with the same name into every folder
> - (for problem 2): disallow access to any some_folder/some_folder combination in a robots.txt

You may also learn about SiteAccess AccessRules (see the documentation on Zope.org).
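The second workaround quoted above, written out as a robots.txt (the folder names are placeholders for whatever doubled paths your site can produce):

```
User-agent: *
# keep spiders out of acquisition loops
Disallow: /some_folder/some_folder
```

Disallow matches by prefix, so this single line also covers arbitrarily deep repetitions such as /some_folder/some_folder/some_folder/.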
> Solution? ...
> - if the request URL has a trailing slash and the invoked object is not a folder: respond 404 (even if generic Zope would serve an object then)

While a file system folder is a very narrow concept, there are many folder variants in Zope. In fact, most objects in Zope can act like a folder (in the sense that they support a default presentation called "index_html").
Forget about the trailing "/" problem. Give your pages an HTML "head" element (as you should anyway) and do not include a "base" tag; Zope will then put such a tag in when it has modified the URL.
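Concretely, the recommendation amounts to markup like this (page and link names are placeholders):

```html
<!-- What you author: a head element, but no base tag of your own -->
<html>
  <head>
    <title>some_doc</title>
  </head>
  <body>
    <a href="other_doc">a safe relative link</a>
  </body>
</html>
```

When Zope serves such a page under a URL it had to adjust (e.g. ".../some_folder" instead of ".../some_folder/"), it inserts an appropriate "base" tag into the head, so the relative link above keeps resolving correctly.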
> - if the acquisition path invoked by the request URL contains the same object multiple times: respond 404

Use a SiteAccess AccessRule in your root folder.
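The test such an AccessRule would perform can be sketched without the SiteAccess API itself (which I will not reproduce here). This version rejects any path in which the same name occurs twice; that is slightly stricter than "the identical object multiple times", but it is close, and sites rarely have legitimate URLs of that shape:

```python
def has_repeated_segment(path):
    """True if the same name occurs more than once in the URL path."""
    segments = [s for s in path.strip("/").split("/") if s]
    return len(segments) != len(set(segments))

# An access rule would answer 404 when this returns True:
print(has_repeated_segment("/some_folder/some_doc"))              # False
print(has_repeated_segment("/some_folder/some_folder/some_doc"))  # True
```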
> Does this make sense?

Maybe for you. I would not go this way.
Dieter