ZServer should redirect for folder objects?
Hi,

I've been doing some work with Zope and I've noticed strange behavior with regards to ZServer and relative links.

When a client requests an object that is a folder and the request URL does not have a trailing slash, the HTTP server should return an HTTP 301 Redirect to the correct URL with the slash, right?

As an example, if I'm connecting to an Apache server:

[headers not relevant to this discussion have been removed]

    $ lynx -head -dump http://some.apache.server/admin
    HTTP/1.1 301 Moved Permanently
    Server: Apache/1.3.14 (Unix)
    Location: http://some.apache.server/admin/
    Content-Type: text/html

    $ lynx -head -dump http://some.apache.server/admin/
    HTTP/1.1 200 OK
    Server: Apache/1.3.14 (Unix)
    Content-Type: text/html

This is why if you send a request for http://some.apache.server/admin, when your client is done loading the URL, it will actually show http://some.apache.server/admin/.

However, ZServer doesn't seem to follow this behavior:

    $ lynx -head -dump http://some.zope.zserver/admin
    HTTP/1.0 200 OK
    Server: Zope/Zope 2.3.3 (binary release, python 1.5.2, solaris-2.6-sparc) ZServer/1.1b1
    Content-Type: text/html
    Content-Location: http://some.zope.zserver/admin/

    $ lynx -head -dump http://some.zope.zserver/admin/
    HTTP/1.0 200 OK
    Server: Zope/Zope 2.3.3 (binary release, python 1.5.2, solaris-2.6-sparc) ZServer/1.1b1
    Content-Type: text/html

Note that ZServer is returning:

a) an HTTP 200 response
b) a "Content-Location" header

I looked up RFC 1945 (the HTTP/1.0 specification), and could not find any mention of "Content-Location". Only "Location" is defined. However, I also looked up RFC 2616 (HTTP/1.1), and there is mention of "Content-Location" there. But ZServer tells the client that it's talking HTTP/1.0, and yet returns a header from HTTP/1.1, so I believe this is incorrect behavior?

This is important for relative links -- try this in ZServer. Make a /test directory, and put an <A HREF> tag linking to another file in that directory. E.g.

    /test/index_html:
    <A HREF="relative.html">relative link</A>

Now, try loading up http://some.zope.zserver/test. Since ZServer does not redirect you, the URL remains as http://some.zope.zserver/test, and your link is incorrectly rendered by your client as http://some.zope.zserver/relative.html.

But if you load http://some.zope.zserver/test/, the link comes out as http://some.zope.zserver/test/relative.html, which is right.

Comments? Thanks!

--- John Lim
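The difference described above follows from the usual rules for resolving a relative reference against the URL of the enclosing document. A minimal sketch, assuming Python's standard `urllib.parse.urljoin` as a stand-in for what a browser does (it is not part of the original post), with the example host names used above:

```python
from urllib.parse import urljoin

# URL the client keeps when the server answers 200 without redirecting.
without_slash = "http://some.zope.zserver/test"
# URL the client ends up with after a redirect to the trailing-slash form.
with_slash = "http://some.zope.zserver/test/"

link = "relative.html"

# Without the slash, "test" is the last path segment and gets replaced.
print(urljoin(without_slash, link))
# With the slash, "test/" is treated as a directory and the link stays inside it.
print(urljoin(with_slash, link))
```

The first call yields the broken link from the example and the second yields the intended one.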
John Lim writes:
I've been doing some work with Zope and I've noticed strange behavior with regards to ZServer and relative links.
When a client requests an object that is a folder and the request URL does not have a trailing slash, the HTTP server should return an HTTP 301 Redirect to the correct URL with the slash, right?
Hmm? Apache may do this, but I did not see a clause in the HTTP spec that would require this behaviour.

Zope does not do it, but sets a "base" tag. This is another approach, good for Zope but difficult for Apache... Usually, this allows relative links to be resolved correctly. But the magic only works if your "index_html" has a "head" tag.

Dieter
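How a "base" tag changes link resolution can be sketched roughly as follows. This is only an illustration of client-side behaviour under simple assumptions (the first `<base href>` wins, otherwise the document URL is used); the class and function names are hypothetical and nothing here is Zope code.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class BaseHrefFinder(HTMLParser):
    """Records the href of the first <base> tag seen, if any."""

    def __init__(self):
        super().__init__()
        self.base_href = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling us.
        if tag == "base" and self.base_href is None:
            self.base_href = dict(attrs).get("href")


def resolve_link(document_url, html, link):
    """Resolve link against <base href> if present, else the document URL."""
    finder = BaseHrefFinder()
    finder.feed(html)
    return urljoin(finder.base_href or document_url, link)


page_with_base = (
    '<html><head><base href="http://some.zope.zserver/test/"></head>'
    '<body><a href="relative.html">relative link</a></body></html>'
)
page_without_head = '<a href="relative.html">relative link</a>'

# Both pages are fetched from the URL without the trailing slash.
print(resolve_link("http://some.zope.zserver/test", page_with_base, "relative.html"))
print(resolve_link("http://some.zope.zserver/test", page_without_head, "relative.html"))
```

With the base tag the link resolves inside /test/ even though the request URL had no slash; without it, the link falls back to the request URL and breaks.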
Hi!
I've been doing some work with Zope and I've noticed strange behavior with regards to ZServer and relative links.
When a client requests an object that is a folder and the request URL does not have a trailing slash, the HTTP server should return an HTTP 301 Redirect to the correct URL with the slash, right?
Hmm? Apache may do this, but I did not see a clause in the HTTP spec that would require this behaviour.
For caches it might also be good, as otherwise the two URLs might count as two different objects. That's why Apache is doing it, I think.
Zope does not do it, but sets a "base" tag.
which prevents mirroring the site in some cases... at least it used to... but that's another story... ;-)

cheers,
Christian
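The Apache-style behaviour discussed in this thread can be sketched as a small decision function. This is a simplified illustration, not Apache's or ZServer's code: the function name and the `is_folder` flag are hypothetical, and query strings are ignored.

```python
def respond_to_request(url, is_folder):
    """Return (status, headers) for a request, redirecting folder URLs
    that lack a trailing slash to the slash form."""
    if is_folder and not url.endswith("/"):
        # Only the slash form is ever answered with content, so a cache
        # sees a single URL for the folder's page.
        return 301, {"Location": url + "/"}
    return 200, {}


print(respond_to_request("http://some.apache.server/admin", is_folder=True))
print(respond_to_request("http://some.apache.server/admin/", is_folder=True))
```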
Hi :-)

--On Thursday, 26 July 2001 05:44 +0200 Christian Scholz <cs@comlounge.net> wrote:
Hi!
I've been doing some work with Zope and I've noticed strange behavior with regards to ZServer and relative links.
When a client requests an object that is a folder and the request URL does not have a trailing slash, the HTTP server should return an HTTP 301 Redirect to the correct URL with the slash, right?
Hmm? Apache may do this, but I did not see a clause in the HTTP spec that would require this behaviour.
For Caches it might also be good as otherwise both objects might count as two different objects.. That's why apache is doing it, I think.
It's recommended to link to folder objects with a trailing slash anyway. The redirect above is even disable-able (what a word ;) in Apache, if someone wants that. (For a reason not transparent to me, M$IE always strips trailing slashes from history links, but that's another story.)
Zope does not do it, but sets a "base" tag.
You can patch the implementation of absolute_url() so it does not return the full host (or call it as absolute_url(relative=1)). (Does anyone know why absolute_url doesn't do this by default?)
which prevents from mirroring the site in some cases.. at least it used to be.. but that's another story.. ;-)
This can be a feature :-) (together with the images without suffix :)

Regards
Tino
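To illustrate what a URL without the host part looks like, and why it survives a move between servers, here is a small sketch. It is not Zope's implementation of absolute_url(); the helper name `host_relative` is made up for this example and it relies only on the standard `urllib.parse` module.

```python
from urllib.parse import urljoin, urlsplit


def host_relative(url):
    """Drop scheme and host, keeping the path with its leading slash."""
    return urlsplit(url).path or "/"


base_href = host_relative("http://dev:8080/test/")
print(base_href)

# The same host-relative base resolves correctly on whichever server
# the page is actually served from.
print(urljoin(urljoin("http://dev:8080/test", base_href), "relative.html"))
print(urljoin(urljoin("http://productionserver/test", base_href), "relative.html"))
```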
At 11:44 PM 7/25/2001, Christian Scholz wrote:
Apache may do this, but I did not see a clause in the HTTP spec that would require this behaviour.
Yup, I noticed that too while looking in the HTTP RFCs. I was trying to see what is "right behavior" but apparently there's no "fixed" answer. However, what about the Content-Location: header that ZServer returns? Minor issue (not sure if it even makes a difference), but shouldn't ZServer either use the Location: header if it returns "HTTP/1.0", or use Content-Location: but return "HTTP/1.1"?
For Caches it might also be good as otherwise both objects might count as two different objects.. That's why apache is doing it, I think.
That makes sense.
Zope does not do it, but sets a "base" tag.
which prevents from mirroring the site in some cases.. at least it used to be.. but that's another story.. ;-)
You hit the nail right on the head :)

The reason this problem was encountered was because the webmaster of the site I'm working for uses wget to slurp the entire Zope site. We use Zope as a developmental server, from which wget creates a "mirror" loaded onto the production server.

We had to put in a <BASE dummy=dummy> to prevent ZServer from putting in the <BASE HREF> tag for us, because if not wget will download the HTML files with the BASE still pointing to the dev. server. We don't want that, since we're going to publish the downloaded files on the prod. server with a different URL.

I know this can be fixed after wget downloads the entire site -- use sed, perl, <insert your fave text replacement tool> etc. to parse the HTML files, but we'd rather not go that way.

Right now, what we do while rendering links on the fly (DTML methods, Python etc) is to check whether the object being linked to is a folder or not. If it is, we tack on the trailing slash ourselves ;)

--- John Lim
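The workaround described in the last paragraph above might look roughly like this in plain Python. It is only a sketch: `render_link` and its `is_folder` argument are hypothetical, and how a real site decides whether an object is a folder is left out.

```python
from html import escape


def render_link(url, title, is_folder):
    """Render an anchor tag, appending a trailing slash for folder objects."""
    if is_folder and not url.endswith("/"):
        url += "/"
    return '<a href="%s">%s</a>' % (escape(url, quote=True), escape(title))


print(render_link("/test", "Test folder", is_folder=True))
print(render_link("/test/relative.html", "A page", is_folder=False))
```

Because every generated link to a folder already ends in a slash, the client never requests the slashless form and the missing redirect does not matter for links the site renders itself.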
We had to put in a <BASE dummy=dummy> to prevent ZServer from putting in the <BASE HREF> tag for us, because if not wget will download the HTML files with the BASE still pointing to the dev. server. We don't want that, since we're going to publish the downloaded files on the prod. server with a different URL.
A better way to do this involves VirtualHostMonster. It's better because it takes care of <base href>, absolute_url and <dtml-var BASEn> all in one go.

Put a VHM instance in your Zope root, then instead of...

    wget http://dev:8080/foo

...use...

    wget http://dev:8080/VirtualHostBase/http/productionserver:80/foo

(or something like that; I typed it from memory)

I hope this helps.

Toby Dickenson
tdickenson@geminidataloggers.com
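The URL pattern in the wget example above can be assembled with a small helper. This sketch only reproduces the shape of the URL as quoted in this message (which was typed from memory), so the exact path layout should be checked against the VirtualHostMonster documentation; `vhm_url` and its parameters are hypothetical names.

```python
def vhm_url(zope_base, scheme, host, port, path):
    """Build a request URL of the form
    <zope_base>/VirtualHostBase/<scheme>/<host>:<port>/<path>."""
    return "%s/VirtualHostBase/%s/%s:%d/%s" % (
        zope_base.rstrip("/"),
        scheme,
        host,
        port,
        path.lstrip("/"),
    )


print(vhm_url("http://dev:8080", "http", "productionserver", 80, "/foo"))
```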
At 02:54 AM 7/26/2001, Toby Dickenson wrote:
A better way to do this involves VirtualHostMonster. Its better because it takes care of <base href> absolute_url and <dtml-var BASEn> all in one go
Put a VHM instance in your Zope root, then instead of...
wget http://dev:8080/foo
....use....
wget http://dev:8080/VirtualHostBase/http/productionserver:80/foo
Thanks, I've tried it -- it works, sort of.

Now, since Zope prints a <BASE HREF> pointing to the production server, the mirroring client will get confused when resolving embedded links and items, e.g. images, since it will try to fetch them off the production server. Of course, I suppose you could force your client (wget or pavuk etc.) to ignore the <BASE HREF> or something.

And now, back to the trailing slash issue. :)

--- John Lim
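If the mirroring tool cannot be told to ignore the <BASE HREF>, one fallback is to strip the tag from the downloaded files afterwards. The following is a rough sketch with a regular expression, assuming simple, well-formed base tags; `strip_base_tags` is a hypothetical helper, not a wget or pavuk feature.

```python
import re

# Matches <base ...> in any letter case; \b keeps it from matching <basefont>.
BASE_TAG = re.compile(r"<base\b[^>]*>", re.IGNORECASE)


def strip_base_tags(html):
    """Remove every <base ...> tag from an HTML string."""
    return BASE_TAG.sub("", html)


page = '<head><BASE HREF="http://productionserver/foo/"><title>t</title></head>'
print(strip_base_tags(page))
```

Note that once the tag is gone, relative links resolve against the location of the mirrored file again, so the trailing slash question comes back.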
Hi,

--On Wednesday, 25 July 2001 18:15 -0400 John Lim <zope@mail.jleh.com> wrote:
At 11:44 PM 7/25/2001, Christian Scholz wrote:
Apache may do this, but I did not see a clause in the HTTP spec that would require this behaviour.
Yup, I noticed that too while looking in the HTTP RFCs. I was trying to see what is "right behavior" but apparently there's no "fixed" answer.
Sure, because this is only common behavior of Apache, not required by any standard.
However, what about the Content-Location: header that ZServer returns? Minor issue (not sure if it even makes a difference), but shouldn't ZServer either use the Location: header if it returns "HTTP/1.0", or use Content-Location: but return "HTTP/1.1"?
For Caches it might also be good as otherwise both objects might count as two different objects.. That's why apache is doing it, I think.
It's also so that relative references work. Apache does not have acquisition :-) Normally, with a carefully designed site, there is no problem with redirects. Nobody should link to folder URLs without a trailing slash. Unfortunately someone came up with this idea instead of forcing the learning curve. The result is that people never learn the importance and meaning of the / in URLs. (It matters even with the redirect, since the redirect costs an extra request!)
That makes sense.
Zope does not do it, but sets a "base" tag.
which prevents from mirroring the site in some cases.. at least it used to be.. but that's another story.. ;-)
You hit the nail right on the head :) The reason this problem was encountered was because the webmaster of the site I'm working for uses wget to slurp the entire Zope site. We use Zope as a developmental server, from which wget creates a "mirror" loaded onto the production server.
Why do people always mirror? I don't see the advantage in it. If you think about caching, you will see you don't need mirroring at all, but can transparently influence which elements are completely dynamic and which are somewhat static by sending different Expires: headers. At least squid can be configured to not let people force a reload from outside.

Regards
Tino
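The idea of marking some responses as static and others as dynamic through the Expires: header can be sketched like this. The lifetimes are made-up values for illustration, `expires_header` is a hypothetical helper, and how the header is attached to a Zope response is not shown.

```python
import time
from email.utils import formatdate

STATIC_MAX_AGE = 24 * 60 * 60  # hypothetical: cache static pages for one day
DYNAMIC_MAX_AGE = 0            # hypothetical: dynamic pages expire immediately


def expires_header(now, max_age_seconds):
    """Return an HTTP-date string max_age_seconds after now (epoch seconds)."""
    return formatdate(now + max_age_seconds, usegmt=True)


now = time.time()
print("Expires:", expires_header(now, STATIC_MAX_AGE))
print("Expires:", expires_header(now, DYNAMIC_MAX_AGE))
```

A caching proxy in front of the server can then keep the static elements and pass requests for the dynamic ones through.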
Hi!
Why do people always mirror? I dont see the advantage in it. If you think about caching, you will see you dont need mirroring at all but can transparently influence which element is complete dynamic and which is somewhat static by sending different expires: header. At least squid can be configured to not let people force reload from outside.
For example: one of my customers wants a snapshot of the site for the archive on CD-ROM... So how do I do that? (without giving them a full Zope install on it, which is sometimes not even possible as there are SQL databases etc. involved...)

And in general I think the "you don't need it anyway" answer is always the solution ;-)

I also think that there have been some more problems with the base tag, I just don't remember them anymore ;-) (but I remember that we came to the conclusion that it's not possible without it anyway, as there is some structural difference between the Zope object model and the web object model (if you can call it that way... let's say directory structure ;-)).

cheers,
Christian

--
COM.lounge http://comlounge.net/
communication & design info@comlounge.net
Hi :-)

--On Thursday, 26 July 2001 15:51 +0200 Christian Scholz <cs@comlounge.net> wrote:
Hi!
Why do people always mirror? I dont see the advantage in it. If you think about caching, you will see you dont need mirroring at all but can transparently influence which element is complete dynamic and which is somewhat static by sending different expires: header. At least squid can be configured to not let people force reload from outside.
For example: One of my customer wants some snapshot of the site for the archive on CD-ROM... So how do I do that? (without giving them a full Zope install on it which is sometimes even not possible as there are sql databases etc. involved..)
This is hard anyway. Imagine you have many forms or even have content negotiation running :) So it might be easier to put a complete Zope on the CD ;)
And in general I think the "you don't need it anyway" answer is always the solution ;-) I also think that there has been some more problems with the base tag I just don't remember them anymore ;-)
Sure, maybe it is possible to patch absolute_url() so it does not render the full hostname (relative=1) + leading slash.
(but I remember that we came to the conclusion that it's not possible without it anyway as there is some structural difference between the Zope object model and the web object model (if you can call it that way.. let's say directory structure ;-).
Let's say object path expressed as URL ;-))

Cheers
Tino :)
Hi!
Hi :-)
--On Thursday, 26 July 2001 15:51 +0200 Christian Scholz <cs@comlounge.net> wrote:
Hi!
Why do people always mirror? I dont see the advantage in it. If you think about caching, you will see you dont need mirroring at all but can transparently influence which element is complete dynamic and which is somewhat static by sending different expires: header. At least squid can be configured to not let people force reload from outside.
For example: One of my customer wants some snapshot of the site for the archive on CD-ROM... So how do I do that? (without giving them a full Zope install on it which is sometimes even not possible as there are sql databases etc. involved..)
This is hard anyway. Imagine you have many forms or even have content- negotiation running :) So it might be more easy to put a complete Zope on the CD ;)
Well, they don't need the forms for the archive, not even working forms. They are more interested in the content... But the point is anyway that it should work regardless of whether it makes sense or not in some situations (IMHO) ;)
And in general I think the "you don't need it anyway" answer is always the solution ;-) I also think that there has been some more problems with the base tag I just don't remember them anymore ;-)
Sure, may be it is possible to patch absolute_url() so it does not render the full hostname (relative=1) + leading slash.
Hm, in ZWiki or WikiForNow I always remove the relative=1 from wiki_base_url() in order to make it work again with the virtual site root. But I haven't thought about it more than that and what it means for the base tag... (I also looked once into traverse() but somehow it didn't tell me that much... ;-)
(but I remember that we came to the conclusion that it's not possible without it anyway as there is some structural difference between the Zope object model and the web object model (if you can call it that way.. let's say directory structure ;-).
Lets say object path expressed as URL ;-))
Well, in the case of Zope yes, in the case of Apache not really... (though the HTTP spec likes to talk about itself as an object protocol... ;-)

-- christian

--
COM.lounge http://comlounge.net/
communication & design info@comlounge.net
Hi ;)

--On Thursday, 26 July 2001 17:21 +0200 Christian Scholz <cs@comlounge.net> wrote:
Hi!
Hi :-)
--On Thursday, 26 July 2001 15:51 +0200 Christian Scholz <cs@comlounge.net> wrote:
Hi!
Why do people always mirror? I dont see the advantage in it. If you think about caching, you will see you dont need mirroring at all but can transparently influence which element is complete dynamic and which is somewhat static by sending different expires: header. At least squid can be configured to not let people force reload from outside.
For example: One of my customer wants some snapshot of the site for the archive on CD-ROM... So how do I do that? (without giving them a full Zope install on it which is sometimes even not possible as there are sql databases etc. involved..)
This is hard anyway. Imagine you have many forms or even have content- negotiation running :) So it might be more easy to put a complete Zope on the CD ;)
Well, they don't need the forms for the archive and even not running forms. They are more interested in the content.. But the point is anyway that it should work regardless if it makes sense or not in some situations (IMHO) ;)
Sure, but I think that's not a problem of the server but an issue for the mirror tool. Wget might be a bit simplistic for this task. There are other options, like pavuk for example.

Regards :-)
Tino
Hi!
Sure, but I think thats not a problem of the server but an issue for the mirror tool. Wget might be a bit simplicistic for this task. There are other options like pavuk for example.
OK, I think we're repeating the same discussion now as before, so we might stop here (and pavuk wasn't working either... it's really a "problem" of Zope's advanced object model, and thus changing all the other tools is not really an option ;-) (hm, didn't I say I wanted to stop? ;-)

cheers,
Christian

--
COM.lounge http://comlounge.net/
communication & design info@comlounge.net
Dieter Maurer wrote:
Zope does not do it, but sets a "base" tag. This is another approach, good for Zope but difficult for Apache... Usually, this allows to correctly resolve relative links. But the magic only works, if your "index_html" has a "head" tag.
Hopefully this will go away never to return in the 'New Religion' :-S

Chris
Chris Withers writes:
Dieter Maurer wrote:
Zope does not do it, but sets a "base" tag. This is another approach, good for Zope but difficult for Apache... Usually, this allows to correctly resolve relative links. But the magic only works, if your "index_html" has a "head" tag.
Hopefully this will go away never to return in the 'New Religion' :-S
Why? It is much better than broken relative links in "index_html" or the nice ":method" actions.

You know:

* relative links are better than absolute ones because you can relocate your site without changes
* working relative links (due to a correct "base" tag) are better than broken ones.

Dieter
Participants (6): Chris Withers, cs@comlounge.net, Dieter Maurer, John Lim, Tino Wildenhain, Toby Dickenson