Zope 2.7.0 b3 regressions
Hi!
Migrating a CMF Site from Zope 2.6 to Zope 2.7.0 b3, I stumbled over these two Zope 2.7 issues:
1.) absolute_url(relative=1) behaves different
----------------------------------------------
'relative' was changed from 'relative to site object' to 'relative to server root'. This is an API change and breaks Products like CMF.
See <http://zope.org/Collectors/Zope/809>
I propose to revert this change.
2.) reindexIndex not 100% backwards compatible
----------------------------------------------
CMF's CatalogTool inherits from ZCatalog and overrides catalog_object. Zope's reindexIndex fails because CMF doesn't implement the new catalog_object API.
See <http://zope.org/Collectors/Zope/1134>
I propose to add a capability check in reindexIndex.
I'd volunteer to fix these issues in CVS as proposed, but maybe the people who made these changes still feel responsible for their code or there are objections regarding the proposed fixes.
Any feedback is welcome.
Cheers, Yuppie
Eep. Maybe CMF's overridden catalog should just be given a reindexIndex method instead of doing a capability check in Zope? More broadly, is it worth embedding the capabilities check (which can never, ever go away) into Zope itself or would it be better to change CMF to deal with the API change?
On Wed, 2003-12-03 at 06:53, Yuppie wrote:
Hi!
Migrating a CMF Site from Zope 2.6 to Zope 2.7.0 b3, I stumbled over these two Zope 2.7 issues:
1.) absolute_url(relative=1) behaves different ----------------------------------------------
'relative' was changed from 'relative to site object' to 'relative to server root'. This is an API change and breaks Products like CMF.
See <http://zope.org/Collectors/Zope/809>
I propose to revert this change.
2.) reindexIndex not 100% backwards compatible ----------------------------------------------
CMF's CatalogTool inherits from ZCatalog and overrides catalog_object. Zope's reindexIndex fails because CMF doesn't implement the new catalog_object API.
See <http://zope.org/Collectors/Zope/1134>
I propose to add a capability check in reindexIndex.
I'd volunteer to fix these issues in CVS as proposed, but maybe the people who made these changes still feel responsible for their code or there are objections regarding the proposed fixes.
Any feedback is welcome.
Cheers, Yuppie
Hi! Chris McDonough wrote:
Eep. Maybe CMF's overridden catalog should just be given a reindexIndex method instead of doing a capability check in Zope? More broadly, is it worth embedding the capabilities check (which can never, ever go away) into Zope itself or would it be better to change CMF to deal with the API change?
Why can't the capabilities check go away in a future release? We could add a deprecation warning in reindexIndex in case it detects the old API. And of course CMF has to implement the new API. This is on the todo list: <http://collector.zope.org/CMF/206> But is it worth to have a CMF 1.4.3 release just to fix this issue? Cheers, Yuppie
On Wed, 2003-12-03 at 07:33, Yuppie wrote:
Why can't the capabilities check go away in a future release? We could add a deprecation warning in reindexIndex in case it detects the old API.
That's true.
And of course CMF has to implement the new API. This is on the todo list: <http://collector.zope.org/CMF/206>
But is it worth to have a CMF 1.4.3 release just to fix this issue?
Probably not, at least if your Zope checkin mentions the reason for the capabilities test and the deprecation warning and maybe the earliest date after which the capabilities check could be removed. It would be good to put this in the code itself, so we know why the capabilities check exists next year when reading the code... does that make sense to you? - C
Chris McDonough wrote:
On Wed, 2003-12-03 at 07:33, Yuppie wrote:
But is it worth to have a CMF 1.4.3 release just to fix this issue?
Probably not, at least if your Zope checkin mentions the reason for the capabilities test and the deprecation warning and maybe the earliest date after which the capabilities check could be removed. It would be good to put this in the code itself, so we know why the capabilities check exists next year when reading the code... does that make sense to you?
Sounds good. I'll make the Zope checkin regarding this issue within the next days. Yuppie
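The capability check with a deprecation warning agreed on above could look roughly like the following. This is only a sketch under stated assumptions, not the code that was checked in: the two catalog classes, the helper names and the `update_metadata` keyword are stand-ins chosen for illustration, since the thread does not spell out the new catalog_object signature.

```python
import inspect
import warnings


class NewStyleCatalog:
    """Hypothetical catalog whose catalog_object takes the newer keyword."""

    def __init__(self):
        self.calls = []

    def catalog_object(self, obj, uid=None, idxs=None, update_metadata=1):
        self.calls.append((uid, tuple(idxs or ()), update_metadata))


class OldStyleCatalog(NewStyleCatalog):
    """Hypothetical subclass overriding catalog_object with the older signature."""

    def catalog_object(self, obj, uid=None, idxs=None):
        self.calls.append((uid, tuple(idxs or ()), None))


def accepts_update_metadata(catalog):
    """Capability check: does catalog_object accept 'update_metadata'?"""
    params = inspect.signature(catalog.catalog_object).parameters
    if 'update_metadata' in params:
        return True
    # A **kwargs catch-all also counts as accepting the keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


def reindex_one(catalog, obj, uid, index_name):
    """Reindex a single index, falling back to the old API with a warning."""
    if accepts_update_metadata(catalog):
        catalog.catalog_object(obj, uid, idxs=[index_name], update_metadata=0)
    else:
        # Kept only for catalogs that still override catalog_object with the
        # old signature; the comment and warning record why the check exists.
        warnings.warn(
            'catalog_object of %s does not support the new API'
            % type(catalog).__name__,
            DeprecationWarning,
            stacklevel=2,
        )
        catalog.catalog_object(obj, uid, idxs=[index_name])
```

Checking the signature rather than catching TypeError avoids hiding unrelated TypeErrors raised inside catalog_object; that is a design choice of this sketch, not something the thread prescribes.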
1.) absolute_url(relative=1) behaves different ----------------------------------------------
'relative' was changed from 'relative to site object' to 'relative to server root'. This is an API change and breaks Products like CMF.
See <http://zope.org/Collectors/Zope/809>
I propose to revert this change.
+1 from me. The original proposal was to implement a separate method for this; maybe this should be done instead. I thought the point chosen for breaking backwards compatibility and getting rid of all the old cruft was Zope3, not Zope2.7, wasn't it? ;-) Cheers, Clemens
From: "Yuppie" <schubbe@web.de>
1.) absolute_url(relative=1) behaves different ----------------------------------------------
'relative' was changed from 'relative to site object' to 'relative to server root'. This is an API change and breaks Products like CMF.
See <http://zope.org/Collectors/Zope/809>
I propose to revert this change.
+1. I added a comment to the issue and am copying it here FYI:
There is no reason to change absolute_url. If you use CMF, there is the portal_url tool (which could use some enhancing, btw); if not, you need to make your own set of methods. Methods you will need sooner or later are (the exact naming is optional):
virtualRootPath(ob): Returns the path to the virtual root. For example: ('', 'mysite.com')
physicalPath(ob): Returns the full path to the object. For example: ('', 'mysite.com', 'path', 'to', 'object') //Note: This already exists as obj.getPhysicalPath()...
virtualPath(ob): Returns the path from the virtual root. For example: ('', 'path', 'to', 'object') (Note that it starts with a '')
path2url(path): For the above paths returns '/mysite.com', '/mysite.com/path/to/object' and '/path/to/object' respectively.
After you have these, you can forget about absolute_url. :) I have yet to encounter a specific need to return 'http://www.mysite.com/<anything>', because the browser will typically add that automatically anyway when you return '/<anything>'.
These methods SHOULD really be a part of the VirtualHostMonster, but they aren't. People are welcome to fix this. :-)
Since the VirtualHostMonster determines the virtual root dynamically per request, I don't know how to implement virtualRootPath and virtualPath... I guess you would need to get the virtual root from the request, somehow. You'll still end up with the situation that you want the virtual root for objects that are under virtual roots other than the one the current request is using, but that is an unsolvable problem with the current VirtualHostMonster.
//Lennart
Yuppie!
No, no and 3 times no! The fix was done by Evan and is CORRECT. absolute_url() does not (and should not!) know anything about CMF or portals or whatever else! It MUST however return correct results in all possible VH situations and this is what the fix addresses.
Please forget about my attempt to correct the situation by adding a new method - that was nonsense.
The real problem is - and this is stated in the original report - that absolute_url(1) did return WRONG RESULTS when inside-out vhosting was in use. This has bitten me on several occasions when customers deployed their sites with the CMF portal NOT living in the root of the vhost (as opposed to the root of Zope) and SERIOUS breakage occurred all over their sites.
If you need anything CMF specific use the portal_url tool. I do not see why a basic infrastructure method like absolute_url() should know anything about portals at all.
So -100 for unfixing things you don't seem to properly understand.
Stefan
--On Wednesday, 3 December 2003 12:53 +0100 Yuppie <schubbe@web.de> wrote:
1.) absolute_url(relative=1) behaves different ----------------------------------------------
'relative' was changed from 'relative to site object' to 'relative to server root'. This is an API change and breaks Products like CMF.
See <http://zope.org/Collectors/Zope/809>
I propose to revert this change.
-- The time has come to start talking about whether the emperor is as well dressed as we are supposed to think he is. /Pete McBreen/
If you need anything CMF specific use the portal_url tool. I do not see why a basic infrastructure method like absolute_url() should know anything about portals at all.
I have to admit I did not look deeply, but Stefan's notion that absolute_url is a basic infrastructure method that should not have to know about portals is correct. The portal_url tool was specifically created to provide you with paths and URLs that are relevant to the CMF site - if there are any problems the URLTool needs to be extended or fixed to address that. jens
Hi Stefan! Stefan H. Holek wrote:
No, no and 3 times no! The fix was done by Evan and is CORRECT. absolute_url () does not (and should not!) know anything about CMF or portals or whatever else!
'relative to site object' is quoted from the API documentation of absolute_url(), see <http://www.zope.org/Documentation/Books/ZopeBook/2_6Edition/AppendixB.stx>. 'site object' in this context is the Zope application object and has nothing to do with a CMF Site or whatever else.
It MUST however return correct results in all possible VH situations and this is what the fix addresses.
Yes. But the correct result is what the API documentation defines. To get what you want you have to add BASEPATH1 defined as "the externally visible path to the root Zope folder" alias 'Zope application object' alias 'site object'. Look for example at OFS/dtml/main.dtml
The real problem is - and this is stated in the original report - that absolute_url(1) did return WRONG RESULTS when inside-out vhosting was in use. This has bitten me on several occasions when customers deployed their sites with the CMF portal NOT living in the root of the vhost (as opposed to the root of Zope) and SERIOUS breakage occurred all over their sites.
That's exactly the scenario where I discovered the API change. But it didn't fix anything, it broke at least the icon paths. Cheers, Yuppie
From: "Stefan H. Holek" <stefan@epy.co.at>
It MUST however return correct results in all possible VH situations and this is what the fix addresses.
Well, the problem is determining what the heck is "correct". :-) Which is why you need a whole bunch of methods. absolute_url seems confused on which of those it is, and the whole idea that you call a method called absolute_url with a parameter called relative is just beyond bizarre. :-)
Please forget about my attempt to correct the situation by adding a new method - that was nonsense.
No. That's the right solution. And not one, but a whole bunch. :-) But I don't mind "fixing" absolute_url too. I just think it's not properly defined what it should return, and hence it should be deprecated and replaced with something else, that doesn't have this problem.
Please excuse my impatience, but sometimes I just think it's obvious that I'm right, and that people don't listen. My experience of this is that I'm wrong in at least half of the cases, so that is probably what has happened now too. However, I took a look at the issue, and ended up with the following new methods on Traversable:
    getVirtualRoot__roles__=None # Public
    def getVirtualRoot(self):
        try:
            req = self.REQUEST
            rpp = req.get('VirtualRootPhysicalPath', ('',))
            return rpp
        except AttributeError:
            return ('',)

    getVirtualPath__roles__=None # Public
    def getVirtualPath(self):
        root = self.getVirtualRoot()[1:] # No point in including the root
        path = self.getPhysicalPath()[1:]
        for each in root:
            if path[0] == each:
                path = path[1:]
            else:
                break
        # And then we add the root again:
        return ('',) + path

    path2url__roles__=None # Public
    def path2url(self, path):
        return '/'.join(path)
I will check this into head this evening, and unless people scream tomorrow I will check it into the 2.7 branch. With the above methods there will be little use of absolute_url, and you will be able to rely on the answers. Portal_url may need to rely on these methods, I'll look into that later.
Quick, brutal, efficient, and usually dead wrong. That's me. :-)
//Lennart
Hi Lennart! Lennart Regebro wrote:
def getVirtualRoot(self):
[...]
def getVirtualPath(self):
How are these related to URLPATHn, BASEPATHn? I'm too lazy to figure it out myself ;)
Quick, brutal, efficient, and usually dead wrong. That's me. :-)
//Lennart
Quick? <http://mail.zope.org/pipermail/zope-dev/2001-December/014601.html> Please be careful with method names that might already be in use in some products. Google says Silva uses a getVirtualRoot() method. Why not using REQUEST variables? Cheers, Yuppie
[...]
Please be careful with method names that might already be in use in some products. Google says Silva uses a getVirtualRoot() method. Why not using REQUEST variables?
... which is defined in an "adapter"-style class which is not implementing Traversable itself, so there is no conflict here. But thanks for checking this, anyway. :) Cheers, Clemens
From: "Clemens Robbenhaar" <robbenhaar@espresto.com>
Please be careful with method names that might already be in use in some products. Google says Silva uses a getVirtualRoot() method.
... which is defined in an "adapter"-style class which is not implementing Traversable itself, so there is no conflict here.
But thanks for checking this, anyway. :)
And even if it was, your implementation would reasonably override Traversable, and there would be no breakage...
From: "Yuppie" <schubbe@web.de>
Quick?
Yeah, yeah. I was fast once I actually did it. :-)
Please be careful with method names that might already be in use in some products. Google says Silva uses a getVirtualRoot() method.
And EasyPublisher uses all of these already.
Why not using REQUEST variables?
Because it would be wrong, ugly, inconsistent with getPhysicalPath and/or complicated to implement? :-)
To explain them I'll use Evan's examples (without testing, so I could be wrong):
http://localhost:8080/temp_folder/test
    getVirtualRoot(): ('',)
    getVirtualPath(): ('', 'temp_folder', 'test')
    getPhysicalPath(): ('', 'temp_folder', 'test')
http://localhost:8080/VirtualHostBase/http/www.example.com:80/temp_folder/te...
    getVirtualRoot(): ('',)
    getVirtualPath(): ('', 'temp_folder', 'test')
    getPhysicalPath(): ('', 'temp_folder', 'test')
http://localhost:8080/VirtualHostBase/http/www.example.com:80/temp_folder/Vi...
    getVirtualRoot(): ('', 'temp_folder')
    getVirtualPath(): ('', 'test')
    getPhysicalPath(): ('', 'temp_folder', 'test')
http://localhost:8080/VirtualHostBase/http/www.example.com:80/temp_folder/Vi...
    getVirtualRoot(): ('', 'temp_folder')
    getVirtualPath(): ('', 'test')
    getPhysicalPath(): ('', 'temp_folder', 'test')
I think.... :-)
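The path arithmetic behind these examples can be tried outside Zope. The sketch below is a standalone approximation of the proposed getVirtualPath and path2url, written as plain functions over tuples; the function names and the extra guard for a physical path shorter than the virtual root are assumptions added here, not part of the proposal.

```python
def virtual_path(physical_path, virtual_root):
    """Strip the virtual root's leading elements from a physical path.

    Both arguments are tuples in getPhysicalPath() style, starting with ''.
    """
    root = virtual_root[1:]
    path = physical_path[1:]
    for name in root:
        # Guard added in this sketch: stop if the path runs out early.
        if path and path[0] == name:
            path = path[1:]
        else:
            break
    return ('',) + path


def path2url(path):
    """Turn a path tuple into a slash-separated path string."""
    return '/'.join(path)


physical = ('', 'temp_folder', 'test')

# No virtual root configured: the virtual path equals the physical path.
print(path2url(virtual_path(physical, ('',))))

# Virtual root at /temp_folder: only the remainder is visible.
print(path2url(virtual_path(physical, ('', 'temp_folder'))))
```

The first call prints /temp_folder/test and the second prints /test, matching the tuples listed in the examples above.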
Lennart Regebro wrote:
I will check this into head this evening, and unless people scream tomorrow I will check it into the 2.7 branch.
Please hold off. I've been meaning to revisit this for a while, and I have a bit of time to do so today and tomorrow. Also, virtual hosting is properly the domain of the request object, not the object being traversed. This is why the modified absolute_url() uses REQUEST.physicalPathToURL.
Yuppie wrote:
'relative to site object' is quoted from the API documentation of absolute_url()
The API documentation is incorrect, and the docstring in the method is correct:
'''Return a canonical URL for this object based on its physical containment path, possibly modified by virtual hosting. If the optional 'relative' argument is true, only return the path portion of the URL.'''
"Relative" in this context refers to the concept of a "relative path" as used in rfc1808, not to a relationship with a Zope object. It is meant for use in situations such as redirection to a secure page from an insecure one (eg. 'https://example.com' + target.absolute_url(1)) where you would otherwise have to generate the complete URL and then break it apart.
Current behavior looks like this:
http://localhost:8080/temp_folder/test
    absolute_url( ): http://localhost:8180/temp_folder/test
    absolute_url(1): temp_folder/test
http://localhost:8080/VirtualHostBase/http/www.example.com:80/temp_folder/te...
    absolute_url( ): http://www.example.com/temp_folder/test
    absolute_url(1): temp_folder/test
http://localhost:8080/VirtualHostBase/http/www.example.com:80/temp_folder/Vi...
    absolute_url( ): http://www.example.com/test
    absolute_url(1): test
http://localhost:8080/VirtualHostBase/http/www.example.com:80/temp_folder/Vi...
    absolute_url( ): http://www.example.com/foo/test
    absolute_url(1): foo/test
This is entirely consistent, predictable, and easy to explain. The problem you are encountering is almost certainly due to a use of absolute_url where it shouldn't be used, or is used incorrectly.
Cheers,
Evan @ 4-am
Hi Evan! Evan Simpson wrote:
Yuppie wrote:
'relative to site object' is quoted from the API documentation of absolute_url()
The API documentation is incorrect, and the docstring in the method is correct:
'''Return a canonical URL for this object based on its physical containment path, possibly modified by virtual hosting. If the optional 'relative' argument is true, only return the path portion of the URL.'''
"Relative" in this context refers to the concept of a "relative path" as used in rfc1808, not to a relationship with a Zope object. It is meant for use in situations such as redirection to a secure page from an insecure one (eg. 'https://example.com' + target.absolute_url(1)) where you would otherwise have to generate the complete URL and then break it apart.
You introduced that concept in Zope 2.7. The method docstring is part of your change. Before Zope 2.7, absolute_url was defined differently, worked differently and was used differently in products maintained by ZC.
Current behavior looks like this:
http://localhost:8080/temp_folder/test absolute_url( ): http://localhost:8180/temp_folder/test absolute_url(1): temp_folder/test
http://localhost:8080/VirtualHostBase/http/www.example.com:80/temp_folder/te...
absolute_url( ): http://www.example.com/temp_folder/test absolute_url(1): temp_folder/test
http://localhost:8080/VirtualHostBase/http/www.example.com:80/temp_folder/Vi...
absolute_url( ): http://www.example.com/test absolute_url(1): test
http://localhost:8080/VirtualHostBase/http/www.example.com:80/temp_folder/Vi...
absolute_url( ): http://www.example.com/foo/test absolute_url(1): foo/test
This is entirely consistent, predictable, and easy to explain. The problem you are encountering is almost certainly due to a use of absolute_url where it shouldn't be used, or is used incorrectly.
I don't think the old API was better. I'm just saying that you changed the API in a way that is not backwards compatible. I encountered the problem with a plain new CMF Site. And the use of absolute_url is consistent with Zope 2.6 API and implementation. So I don't think it's a problem of incorrect use. Cheers, Yuppie
Yuppie wrote:
You introduced that concept in Zope 2.7. The method docstring is part of your change. Before Zope 2.7, absolute_url was defined differently, worked differently and was used differently in products maintained by ZC. [snip] I don't think the old API was better. I'm just saying that you changed the API in a way that is not backwards compatible. I encountered the problem with a plain new CMF Site.
Gotcha. Grepping Zope's source and the Products I had to hand showed only one use of absolute_url(1), in Draft.py, so I hoped that making the implementation sane wouldn't affect too much.
Looking at the 1.4 branch of CMF, I see it in three places:
1. DiscussionTool.py uses it when looking up replies. This looks like a non-issue for new or properly converted discussions in 1.4.
2. SkinsTool.py uses it to construct skin cookies.
3. Any caller of URLTool that passes 'relative=1' to it. I can only find one of these, namely getIcon() in DynamicType.py.
Is #3 likely to be the cause of the problem you're seeing? Can you be more specific about the circumstances of the problem?
Cheers,
Evan @ 4-am
Hi Evan! Evan Simpson wrote:
Gotcha. Grepping Zope's source and the Products I had to hand showed only one use of absolute_url(1), in Draft.py, so I hoped that making the implementation sane wouldn't affect too much.
Looking at the 1.4 branch of CMF, I see it in three places:
1. DiscussionTool.py uses it when looking up replies. This looks like a non-issue for new or properly converted discussions in 1.4.
2. SkinsTool.py uses it to construct skin cookies.
3. Any caller of URLTool that passes 'relative=1' to it. I can only find one of these, namely getIcon() in DynamicType.py.
Is #3 likely to be the cause of the problem you're seeing? Can you be more specific about the circumstances of the problem?
Yes. getIcon() is the cause of the problem I see:
To access the ZMI I use this Apache rule:
ProxyPass /zope27 http://localhost:8080/VirtualHostBase/http/example.org:80/VirtualHostRoot/_v...
getIcon() for a folder in myCMFSite returns 'zope27/myCMFSite/folder_icon.gif' (was 'myCMFSite/folder_icon.gif' in Zope 2.6)
OFS/dtml/main.dtml adds BASEPATH1, so the URL is '/zope27/zope27/myCMFSite/folder_icon.gif' (would be '/zope27/myCMFSite/folder_icon.gif' in Zope 2.6)
Zope doesn't know anything about the name 'zope27' and returns 'Resource not found'.
The icon URLs are also broken inside the CMF interface, so we would need a CMF 1.4.3 release to get this working with Zope 2.7.
Grepping the products on my disk I found some files using absolute_url(1), especially in CMFDeployment. I have no idea if your change fixes or breaks these products.
Please let me know if you need further information.
Cheers, Yuppie
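The doubled prefix follows directly from joining BASEPATH1 with a path that already contains the virtual-host prefix. The sketch below reproduces that arithmetic with the values quoted in the message; the helper name is hypothetical, and the BASEPATH1 value of '/zope27' and the slash-joined concatenation are assumptions inferred from the resulting URLs rather than taken from main.dtml itself.

```python
def zmi_icon_src(basepath1, icon_path):
    """Build an icon URL the way the ZMI listing is described as doing:
    BASEPATH1, a slash, then whatever getIcon() returned."""
    return basepath1 + '/' + icon_path


BASEPATH1 = '/zope27'  # assumed value under the ProxyPass rule above

icon_zope26 = 'myCMFSite/folder_icon.gif'         # getIcon() result in Zope 2.6
icon_zope27 = 'zope27/myCMFSite/folder_icon.gif'  # getIcon() result in 2.7.0 b3

print(zmi_icon_src(BASEPATH1, icon_zope26))  # /zope27/myCMFSite/folder_icon.gif
print(zmi_icon_src(BASEPATH1, icon_zope27))  # /zope27/zope27/myCMFSite/folder_icon.gif
```

With the 2.7.0 b3 result the prefix appears twice, which is the URL Zope cannot resolve.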
Yuppie wrote:
Yes. getIcon() is the cause of the problem I see:
To access the ZMI I use this Apache rule: ProxyPass /zope27 http://localhost:8080/VirtualHostBase/http/example.org:80/VirtualHostRoot/_v...
getIcon() for a folder in myCMFSite returns 'zope27/myCMFSite/folder_icon.gif' (was 'myCMFSite/folder_icon.gif' in Zope 2.6)
OFS/dtml/main.dtml adds BASEPATH1, so the URL is '/zope27/zope27/myCMFSite/folder_icon.gif' (would be '/zope27/myCMFSite/folder_icon.gif' in Zope 2.6)
Based on this, and on a lot of back-burner pondering, I'm now thinking that the proper fix for this is the one you suggest. It makes sense for the relative version of the absolute path to omit BASE1, the URL of the virtual root, returning the semantics of absolute_url(1) to "path relative to the virtual root". Use cases that need a hostname-relative URL can use BASEPATH1 + absolute_url(1). Cheers, Evan @ 4-am
Evan,
absolute_url(1) was broken (by my definition of broken) basically since the introduction of VHM, which means the better part of 2 years. Naturally, there is code now that relies on this (broken) behavior. This does however not mean it should not be fixed!
The ugly part is that the behavior of absolute_url(1) changes suddenly when the Vhost configuration starts to include inside-out parts (_vh_xyz). This means that it is possible to break (seemingly) working code by reconfiguring Apache. :-( I had some very bad experiences with big packages like CPS2 that suddenly exploded in my face at the worst possible time (deployment at the customer's site).
The idiom '/'+absolute_url(1) to get the path part of an object's URL is *very* common, and as luck will have it *works absolutely fine* as long as inside-out hosting is not present. So this error usually goes undetected and creeps all over people's code. I'd be willing to bet that it is possible to break other packages like, say, Plone simply by changing Vhost configs as well ;-).
Note that this is one of my main points: It will be of little use to document usage of BASEPATH1+absolute_url(1) when '/'+absolute_url(1) appears to work (until it is far too late). Once you have a big package poisoned like this, all you can basically do is monkey-patch absolute_url() which is what I had to do on several occasions.
So by my definition, the URL (relative or not) should *always* include eventual _vh_xyz parts. If what one really needs is related to the physical layout of the ZODB, there is always getPhysicalPath().
URLs are in fact just some whack attributes of objects, and objects can have more than one URL at any time, depending on Vhost configs *only*. URLs are a function of the current REQUEST (traversal) and do represent little information with regard to an object's location in the ZODB.
I see the main issue here in that the concepts of URL and physical location are not well separated (CMF's getIcon() attempting to use URLs to locate objects, for example). Should this be your last word on this, I am with Lennart in that we have to think about a whole new class of API methods for URL information.
Regards, Stefan
P.S.: I have written a bunch of regression tests for absolute_url behavior over the weekend and if nobody tells me otherwise am going to check them into Products/SiteAccess/tests.
On Monday, Dec 8, 2003, at 07:53 Europe/Vienna, Evan Simpson wrote:
Yuppie wrote:
Yes. getIcon() is the cause of the problem I see:
To access the ZMI I use this Apache rule:
ProxyPass /zope27 http://localhost:8080/VirtualHostBase/http/example.org:80/VirtualHostRoot/_vh_zope27
getIcon() for a folder in myCMFSite returns 'zope27/myCMFSite/folder_icon.gif' (was 'myCMFSite/folder_icon.gif' in Zope 2.6)
OFS/dtml/main.dtml adds BASEPATH1, so the URL is '/zope27/zope27/myCMFSite/folder_icon.gif' (would be '/zope27/myCMFSite/folder_icon.gif' in Zope 2.6)
Based on this, and on a lot of back-burner pondering, I'm now thinking that the proper fix for this is the one you suggest. It makes sense for the relative version of the absolute path to omit BASE1, the URL of the virtual root, returning the semantics of absolute_url(1) to "path relative to the virtual root". Use cases that need a hostname-relative URL can use BASEPATH1 + absolute_url(1).
Cheers,
Evan @ 4-am
-- The time has come to start talking about whether the emperor is as well dressed as we are supposed to think he is. /Pete McBreen/
On Mon, 2003-12-08 at 09:35, Stefan H. Holek wrote:
[...] So by my definition, the URL (relative or not) should *always* include eventual _vh_xyz parts. If what one really needs is related to the physical layout of the ZODB, there is always getPhysicalPath().
+1
URLs are in fact just some whack attributes of objects, and objects can have more than one URL at any time, depending on Vhost configs *only*. URLs are a function of the current REQUEST (traversal) and do represent little information with regard to an object's location in the ZODB.
+1
I see the main issue here in that the concepts of URL and physical location are not well separated (CMF's getIcon() attempting to use URLs to locate objects for example).
IMHO, this is broken behaviour. If you try to use an URL to locate an object, the only sane behaviour is to feed this URL to an URL api (probably in the REQUEST object) to get it mapped to a physical path.
Should this be your last word on this I am with Lennart in that we have to think about a whole new class of API methods for URL information.
I think this should be done anyway, because of backward compatibility problems. Really, I think it's ok if at some point we simply say "hey, now we'll use this new API because that old API was broken and people relied on the broken behaviour". This is certainly better than pulling people's rug under their feet. We could then start deprecating the old API and eventually pull it away, if the arrival of Zope3 doesn't obviate it anyway :-)
P.S.: I have written a bunch of regression tests for absolute_url behavior over the weekend and if nobody tells me otherwise am going to check them into Products/SiteAccess/tests.
+5 Yes, please! As the author of ASP404, I'd really like to be able to rely on Zope's virtual hosting behaviour. -- Ideas don't stay in some minds very long because they don't like solitary confinement.
Leonardo Rochael Almeida wrote:
IMHO, this is broken behaviour. If you try to use an URL to locate an object, the only sane behaviour is to feed this URL to an URL api (probably in the REQUEST object) to get it mapped to a physical path.
(Un)RestrictedTraverse can do this, right? Or does that require the full physical Path/URL? On Mon, 2003-12-08 at 09:35, Stefan H. Holek wrote:
P.S.: I have written a bunch of regression tests for absolute_url behavior over the weekend and if nobody tells me otherwise am going to check them into Products/SiteAccess/tests.
I think this is a good idea. In any case I'd like them, to adapt them to the getXXPath API's, which I think I'll check in tomorrow. They are really trivial methods, but it would be good to have unit tests for API documentation. :) //Lennart
On Mon, Dec 08, 2003 at 12:35:38PM +0100, Stefan H. Holek wrote:
Note that this is one of my main points: It will be of little use to document usage of BASEPATH1+absolute_url(1) when '/'+absolute_url(1) appears to work (until it is far too late).
As a frequent (ab)user of '/'+absolute_url(1), which did indeed bite me when I deployed to an "inside out" apache setup, I thought I'd try this out...
I think you meant BASEPATH1+'/'+absolute_url(1)? I put this in a page template called test_abs_url:
    <p>Typical relative path using absolute_url(1):
    <span tal:replace="python:'/' + here.absolute_url(1)" />
    </p>
    <p>BASEPATH1 is:
    <span tal:replace="request/BASEPATH1" />
    </p>
    <p>Better relative path using BASEPATH1 and absolute_url(1):
    <span tal:replace="python:request['BASEPATH1']+here.absolute_url(1)" />
    </p>
If I visit this at http://localhost:8080/ctimi/about/test_abs_url, I get:
    Typical relative path using absolute_url(1): /ctimi/about
    BASEPATH1 is:
    Better relative path using BASEPATH1 and absolute_url(1): ctimi/about
    (note, no leading slash)
If I visit http://localhost:18080/VirtualHostBase/http/www.foobar.com:80/VirtualHostRoo... I get this:
    Typical relative path using absolute_url(1): /ctimi/about
    BASEPATH1 is: /foo
    Better relative path using BASEPATH1 and absolute_url(1): /fooctimi/about
... definitely not right. But your point is made - '/'+absolute_url(1) is clearly inadequate too.
If I change the template to use request['BASEPATH1']+'/'+here.absolute_url(1), then I get this:
    Typical relative path using absolute_url(1): /ctimi/about
    BASEPATH1 is: /foo
    Better relative path using BASEPATH1 and absolute_url(1): /foo/ctimi/about
... which looks correct to me.
-- Paul Winkler http://www.slinkp.com Look! Up in the sky! It's CHEESY ENGINEER! (random hero from isometric.spaceninja.com)
Paul Winkler wrote:
As a frequent (ab)user of '/'+absolute_url(1), which did indeed bite me when i deployed to an "inside out" apache setup, I thought I'd try this out... I think you meant BASEPATH1+'/'+absolute_url(1)?
I would like to know: 1. Exactly what is an "inside out" apache setup. 2. What is the result you want?
On Mon, Dec 08, 2003 at 05:58:12PM +0100, Lennart Regebro wrote:
Paul Winkler wrote:
As a frequent (ab)user of '/'+absolute_url(1), which did indeed bite me when i deployed to an "inside out" apache setup, I thought I'd try this out... I think you meant BASEPATH1+'/'+absolute_url(1)?
I would like to know: 1. Exactly what is an "inside out" apache setup.
See the "About" tab on VHM. "Inside out" is mentioned numerous times in this thread.
2. What is the result you want?
The result that I got by doing BASEPATH1+'/'+absolute_url(1) as described in my previous message. I thought that was clear. BASEPATH1 does not have a trailing slash, and absolute_url(1) does not have a leading slash, so if you visit _vh_foo/bar you will get foobar instead of foo/bar. Therefore, BASEPATH1 + absolute_url(1) does not work. You have to insert the slash. -- Paul Winkler http://www.slinkp.com Look! Up in the sky! It's LATE HYDROXY WATERBOY! (random hero from isometric.spaceninja.com)
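The three join variants discussed here can be compared side by side in plain Python. This is a minimal sketch using the values reported from the page template ('/foo' for BASEPATH1 under inside-out hosting, an empty string without it, and 'ctimi/about' for absolute_url(1)); the helper name is hypothetical.

```python
def host_relative_path(basepath1, relative_url):
    """Join BASEPATH1 and absolute_url(1) with the slash neither one carries."""
    return basepath1 + '/' + relative_url


relative_url = 'ctimi/about'  # absolute_url(1) as reported in both setups

# Plain setup: BASEPATH1 is the empty string.
print(host_relative_path('', relative_url))      # /ctimi/about

# Inside-out setup (_vh_foo): BASEPATH1 is '/foo'.
print(host_relative_path('/foo', relative_url))  # /foo/ctimi/about

# The two variants that go wrong under inside-out hosting:
print('/foo' + relative_url)                     # /fooctimi/about
print('/' + relative_url)                        # /ctimi/about (prefix lost)
```

Only the slash-joined form gives a usable path in both setups.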
Paul Winkler wrote:
See the "About" tab on VHM.
OK, good.
"Inside out" is mentioned numerous times in this thread.
Yeah, I know, but I was getting confused as to what it actually meant.
The result that I got by doing BASEPATH1+'/'+absolute_url(1) as described in my previous message. I thought that was clear.
Now it is. :) Thanks. //Lennart
On Monday 08 December 2003 11:35, Stefan H. Holek wrote:
Note that this is one of my main points: It will be of little use to document usage of BASEPATH1+absolute_url(1) when '/'+absolute_url(1) appears to work (until it is far too late).
We can fix this social problem by providing an easy way for product developers to run their development zope server with the virtual path equivalent to an inside-out hosting configuration. Easy means not needing apache/squid. Our staging server uses an inside-out virtual host configuration (to simplify ssl certificate management) so we hit all these problems early enough to fix the damage cheaply. -- Toby Dickenson
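One way to approximate an inside-out setup without apache/squid may be to request the VHM-style URL directly from the development server. The sketch below only builds such a URL as a string; vhm_test_url is a hypothetical helper, the layout is modeled on the VirtualHostBase/VirtualHostRoot/_vh_ paths quoted elsewhere in this thread, and the host names and ports are placeholders. The "About" tab on VHM remains the authority on the exact syntax.

```python
def vhm_test_url(zope_base, protocol, host, port, vh_prefix, object_path):
    """Build a VHM-style URL that simulates an inside-out prefix.

    Hypothetical helper; all argument values are assumptions made for
    illustration. vh_prefix is the path the outside world would see in
    front of the Zope root (each segment becomes a _vh_ segment).
    """
    segments = ['VirtualHostBase', protocol, '%s:%d' % (host, port),
                'VirtualHostRoot']
    # Each visible prefix segment is marked with _vh_.
    segments += ['_vh_' + p for p in vh_prefix.strip('/').split('/') if p]
    # The remaining segments are the ordinary path to the object.
    segments += [p for p in object_path.strip('/').split('/') if p]
    return zope_base.rstrip('/') + '/' + '/'.join(segments)


print(vhm_test_url('http://localhost:8080', 'http', 'www.foobar.com', 80,
                   'foo', 'ctimi/about'))
```

Visiting a URL of this shape during development should expose code that silently relies on '/'+absolute_url(1).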
Stefan H. Holek wrote at 2003-12-8 12:35 +0100:
... The ugly part is that the behavior of absolute_url(1) changes suddenly when the Vhost configuration starts to include inside-out parts (_vh_xyz). This means that it is possible to break (seemingly) working code by reconfiguring Apache. :-(
Maybe, my contribution has not been read. Thus, I try again: "/" + "absolute_url(1)" should implement the notion of "absolute-path URL reference" (see RFC2396 section 5). This means, that the receiving browser should be able to retrieve the object correctly given this URL reference. -- Dieter
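How a browser resolves an absolute-path reference can be illustrated with Python's standard urllib.parse.urljoin, which follows the generic URI resolution rules. The page URL below is an assumed example of what a visitor behind an inside-out /foo prefix would see; it is not taken from an actual deployment.

```python
from urllib.parse import urljoin

# Assumed public URL of the page, behind an inside-out '/foo' prefix.
page_url = 'http://www.foobar.com/foo/ctimi/about/test_abs_url'

# '/' + absolute_url(1) as it behaved in Zope <= 2.6: the prefix is missing.
naive = urljoin(page_url, '/ctimi/about')

# A reference that includes the prefix resolves to the intended object.
prefixed = urljoin(page_url, '/foo/ctimi/about')

print(naive)     # http://www.foobar.com/ctimi/about
print(prefixed)  # http://www.foobar.com/foo/ctimi/about
```

An absolute-path reference replaces the whole path of the base URL, so any prefix that is not part of the reference is lost on the client side.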
On Mon, Dec 08, 2003 at 08:24:04PM +0100, Dieter Maurer wrote:
Maybe, my contribution has not been read. Thus, I try again:
"/" + "absolute_url(1)" should implement the notion of "absolute-path URL reference" (see RFC2396 section 5).
This means, that the receiving browser should be able to retrieve the object correctly given this URL reference.
Yup. But while we're on the subject... Why doesn't absolute_url(1) include a leading slash? I don't think I've ever seen a use of absolute_url(1) that didn't have to add the slash. What was the rationale originally? -- Paul Winkler http://www.slinkp.com Look! Up in the sky! It's ZOOMING SOLITARY PICNINC CHLAMYDIA OOZE! (random hero from isometric.spaceninja.com)
On Monday 08 December 2003 21:21, Paul Winkler wrote:
On Mon, Dec 08, 2003 at 08:24:04PM +0100, Dieter Maurer wrote:
Maybe, my contribution has not been read. Thus, I try again:
"/" + "absolute_url(1)" should implement the notion of "absolute-path URL reference" (see RFC2396 section 5).
This means, that the receiving browser should be able to retrieve the object correctly given this URL reference.
Yup. But while we're on the subject... Why doesn't absolute_url(1) include a leading slash? I don't think I've ever seen a use of absolute_url(1) that didn't have to add the slash. What was the rationale originally?
Because <dtml-var BASEPATH1>/<dtml-var "absolute_url(1)"> looks nicer than without the slash ? -- Toby Dickenson
Toby Dickenson wrote:
Because
<dtml-var BASEPATH1>/<dtml-var "absolute_url(1)">
looks nicer than without the slash
?
OT: Seeing as that would actually have to be written <dtml-var "REQUEST.BASEPATH1" html_quote>/<dtml-var "absolute_url(1)" html_quote> to get anywhere close to reliable and secure behavior, calling dtml-anything "nice" seems to be a bit moot--bug 813 lives on. ...but anyway, I have no opinion on the absolute_url api one way or the other, its been so carry on...
Hi Stefan! Stefan H. Holek wrote:
absolute_url(1) was broken (by my definition of broken) basically since the introduction of VHM, which means the better part of 2 years. Naturally, there is code now that relies on this (broken) behavior. This does however not mean it should not be fixed!
AFAICT 'inside-out' hosting was used long before the introduction of VHM. Using your definition of broken absolute_url(1) was broken since the introduction of CGI scripts, which means longer than Zope has its current name. [...]
The idiom '/'+absolute_url(1) to get the path part of an object's URL is *very* common, and as luck will have it *works absolutely fine* as long as inside-out hosting is not present. So this error usually goes undetected and creeps all over people's code. I'd be willing to bet that it is possible to break other packages like, say, Plone simply by changing Vhost configs as well ;-).
Note that this is one of my main points: It will be of little use to document usage of BASEPATH1+absolute_url(1) when '/'+absolute_url(1) appears to work (until it is far too late).
I can see why you think the API should be changed. But do you really think it is the Right Thing to break existing products of people who read the API documentation and tested their products carefully to fix the products of people who trusted their intuition? Cheers, Yuppie
Evan Simpson wrote:
Lennart Regebro wrote:
I will check this into head this evening, and unless people scream tomorrow I will check it into the 2.7 branch.
Please hold off. I've been meaning to revisit this for a while, and I have a bit of time to do so today and tomorrow. Also, virtual hosting is properly the domain of the request object, not the object being traversed. This is why the modified absolute_url() uses REQUEST.physicalPathToURL.
BTW: I'm missing a REQUEST variable that represents the URL requested by the browser. 'PATH_INFO' doesn't show the virtual URL: '/VirtualHostBase/http/example.org:80/VirtualHostRoot/_vh_test/path/to/object' 'URL' might be changed by __before_publishing_traverse__ or __browser_default__: 'http://example.org/test/path/to/object/index_html/view' So I think it would be great if VHM would add a variable like 'REQUESTED_URL' (should have a better name) that isn't further modified on traversal. 'http://example.org/test/path/to/object' Just my 2 cents. Cheers, Yuppie
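The relationship between the PATH_INFO value and the URL the browser asked for can be sketched as a pure string transformation. This is not VHM code: requested_url is a hypothetical function, and the default-port handling and _vh_ stripping are simplifying assumptions.

```python
DEFAULT_PORTS = {'http': '80', 'https': '443'}


def requested_url(path_info):
    """Derive the browser-visible URL from a VHM-style PATH_INFO.

    Illustrative sketch only. It handles the simple layout
    /VirtualHostBase/<protocol>/<host>:<port>/.../VirtualHostRoot/<rest>
    and strips the _vh_ marker from every segment after VirtualHostRoot.
    """
    parts = [p for p in path_info.split('/') if p]
    if (len(parts) < 4 or parts[0] != 'VirtualHostBase'
            or 'VirtualHostRoot' not in parts):
        raise ValueError('not a VirtualHostBase path: %r' % path_info)
    protocol, hostport = parts[1], parts[2]
    host, _, port = hostport.partition(':')
    # Keep the port only when it is not the protocol's default.
    if port and port != DEFAULT_PORTS.get(protocol):
        host = host + ':' + port
    visible = parts[parts.index('VirtualHostRoot') + 1:]
    visible = [p[len('_vh_'):] if p.startswith('_vh_') else p
               for p in visible]
    return protocol + '://' + host + '/' + '/'.join(visible)


print(requested_url(
    '/VirtualHostBase/http/example.org:80/VirtualHostRoot'
    '/_vh_test/path/to/object'))
# http://example.org/test/path/to/object
```

A request variable holding this value would stay fixed during traversal, unlike 'URL'.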
On Wed, 03 Dec 2003 20:43:23 +0100 Yuppie <schubbe@web.de> wrote: [snip]
So I think it would be great if VHM would add a variable like 'REQUESTED_URL' (should have a better name) that isn't further modified on traversal. 'http://example.org/test/path/to/object'
+1 I actually had a case recently where using traverse_subpath in a python script screwed up CookieCrumbler (which uses REQUEST.URL to determine where it should redirect). Basically it ate the end of the URL value so the login form redirected to the wrong place. The evil hack fix was actually a REQUEST.set('URL', ...), in the python script before any unauthorized errors could be raised. Perhaps it should be called "ACTUAL_URL" or "ORIGINAL_URL". This would be the thing that CookieCrumbler would redirect to... -Casey
Stefan H. Holek wrote at 2003-12-3 15:25 +0100:
No, no and 3 times no! The fix was done by Evan and is CORRECT. absolute_url () does not (and should not!) know anything about CMF or portals or whatever else!
Right...
It MUST however return correct results in all possible VH situations and this is what the fix addresses.
But, when "absolute_url(1)" behaves as Yuppie describes, it does not behave correctly. Let's look at it from a semantic point of view: HTTP knows two kinds of absolute URL references, universal absolute URLs (containing the protocol and the server) and server relative absolute URLs (starting with a "/"). The first notion is supported by "absolute_url()". For unknown reasons, "absolute_url(1)" only almost realizes the second notion (it lacks the leading "/"). But, we became familiar with this deficiency. When you accept that "absolute_url(1)" should come near to the notion of server relative absolute URL, then it *must* return an URL with respect to the currently active site root. Otherwise, the browser using this URL will interpret it wrongly. It may still work due to acquisition, but this is more by accident. Thus, I agree with Yuppie. If "absolute_url(1)" behaves as he describes, then it is wrong and there is no longer an easy way to implement the notion of "server relative absolute URL reference". -- Dieter
After reading this paragraph for the third time I realized you have a very good point here. But <quote by="Evan Simpson"> "Relative" in this context refers to the concept of a "relative path" as used in rfc1808, not to a relationship with a Zope object. It is meant for use in situations such as redirection to a secure page from an insecure one (eg. 'https://example.com' + target.absolute_url(1)) where you would otherwise have to generate the complete URL and then break it apart. </quote> So, what do we want "relative" to mean for absolute_url? Stefan On Mittwoch, Dez 3, 2003, at 20:14 Europe/Vienna, Dieter Maurer wrote:
When you accept that "absolute_url(1)" should come near to the notion of server relative absolute URL, then it *must* return an URL with respect to the currently active site root. Otherwise, the browser using this URL will interpret it wrongly. It may still work due to acquisition, but this is more by accident. -- The time has come to start talking about whether the emperor is as well dressed as we are supposed to think he is. /Pete McBreen/
Stefan H. Holek wrote at 2003-12-8 19:14 +0100:
... So, what do we want "relative" to mean for absolute_url?
I spelled it out more precisely in a recent post (after rereading RFC2396). In my view "'/' + obj.absolute_url(1)" should implement the "absolute path" relative URL for "obj" (as defined by RFC2396, section 5). -- Dieter
On Mon, Dec 08, 2003 at 07:14:28PM +0100, Stefan H. Holek wrote:
After reading this paragraph for the third time I realized you have a very good point here.
But
<quote by="Evan Simpson"> "Relative" in this context refers to the concept of a "relative path" as used in rfc1808, not to a relationship with a Zope object. It is meant for use in situations such as redirection to a secure page from an insecure one (eg. 'https://example.com' + target.absolute_url(1)) where you would otherwise have to generate the complete URL and then break it apart. </quote>
So, what do we want "relative" to mean for absolute_url?
Speaking for myself, what I really want is something that always works for the client, since that's who generally cares about URLs. I like the behavior of Evan's fix. For looking up objects in zope, we should use other methods such as getPhysicalPath(). Lennart's path2url() would be handy.

But I don't think it's worth the backward compatibility pain to change absolute_url(1) now. We've been waiting ages for a stable Zope 2.7, IMHO it's too late for something this problematic to change. So my proposal is this:

1) Implement distinct methods for client-useable virtual paths vs. server-useable containment paths, as Lennart proposes.
2) absolute_url(1) in Zope 2.7 should continue to work as in <= 2.6.
3) As soon as we decide conclusively what is going to happen with absolute_url(1) in zope 2.8, we can have it log a deprecation warning. (Hmm, that could really annoy admins. A deprecation warning on every call to getIcon(), ugh. Other ideas?)
4) In zope 2.8, we have the new behavior, whatever we decide that is. We also have to be sure to update the Help docs and the long-suffering API Reference.

-- Paul Winkler http://www.slinkp.com Look! Up in the sky! It's LIUTENANT ON WHEELS! (random hero from isometric.spaceninja.com)
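The deprecation step in point 3 could be sketched with Python's standard warnings module. The class below is a hypothetical stand-in, not Zope's real implementation, and the message text and example.org base are assumptions made for illustration.

```python
import warnings


class Item:
    """Hypothetical stand-in for a Zope object; not the real API."""

    def __init__(self, path):
        self._path = path

    def absolute_url(self, relative=0):
        if relative:
            # stacklevel=2 attributes the warning to the caller's line,
            # so the report points at the code that needs changing.
            warnings.warn(
                'absolute_url(relative=1) may change in a future release',
                DeprecationWarning,
                stacklevel=2,
            )
            return self._path
        return 'http://example.org/' + self._path


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    Item('ctimi/about').absolute_url(1)

print(caught[0].category.__name__)  # DeprecationWarning
```

Regarding the worry about noise: the warnings module has filter actions such as "default" and "once" that are meant to limit repeated reports of the same warning, which might reduce the per-call flood. Whether that is enough in practice for something called as often as getIcon() would need testing.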
participants (13): Casey Duncan, Chris McDonough, Clemens Robbenhaar, Dieter Maurer, Evan Simpson, Jamie Heilman, Jens Vagelpohl, Lennart Regebro, Leonardo Rochael Almeida, Paul Winkler, Stefan H. Holek, Toby Dickenson, Yuppie