catalogObject changes (Zope-2_6-branch)
Hi! Yesterday I spent some hours tracking down why catalog Metadata and catalog Indexes are getting out of sync in my CMF Site. I found that checkin <http://cvs.zope.org/Zope/lib/python/Products/ZCatalog/Catalog.py.diff?r1=1.98.6.10&r2=1.98.6.11> including this new condition <code> if not idxs: # if the caller specifies that we should update only a # specific set of indexes, we don't do a metadata update. self.updateMetadata(object, uid) </code> Could anybody tell me what's the rationale behind that checkin? - it breaks backwards compatibility - it's only in Zope-2_6-branch, not in Zope-2_7-branch or HEAD - I can't see what this 'if' is good for anyway Cheers, Yuppie
Yuppie wrote at 2003-10-3 10:14 +0200:
Yesterday I spent some hours tracking down why catalog Metadata and catalog Indexes are getting out of sync in my CMF Site.
I found that checkin <http://cvs.zope.org/Zope/lib/python/Products/ZCatalog/Catalog.py.diff?r1=1.98.6.10&r2=1.98.6.11>
including this new condition <code> if not idxs: # if the caller specifies that we should update only a # specific set of indexes, we don't do a metadata update. self.updateMetadata(object, uid) </code>
Could anybody tell me what's the rationale behind that checkin?
- it breaks backwards compatibility - it's only in Zope-2_6-branch, not in Zope-2_7-branch or HEAD - I can't see what this 'if' is good for anyway
I can ;-) The "idxs" argument is often provided to update only special indexes: e.g. workflow indexes or "AllowedRolesAndUsers" and when you update selected indexes from the "Indexes" tab. In these cases, it is often not necessary to update the Metadata. Metadata updates are often monstruous. In one of our applications, Metadata updates are responsible for a 500 kB transaction when a single workflow state changes. I am now advocating not to use Metadata at all. They are only relevant when you process large result sets (e.g. for sorting or statistics). Dieter
The system is making the assumption that if you update indexes individually, you know what you're doing. Passing an index name into catalogObject was always meant to do as much or as little work as you specified with respect to the names of the indexes; it was only by mistake that metadata got updated when indexes were specified. The new (correct) behavior may break some applications that depended on the bug, and I'm sorry it broke yours, but it was indeed a bug. And, doh, yes, I need to forward-port the fix to the 2.7 branch and the HEAD. :-( On Fri, 2003-10-03 at 16:49, Dieter Maurer wrote:
Yuppie wrote at 2003-10-3 10:14 +0200:
Yesterday I spent some hours tracking down why catalog Metadata and catalog Indexes are getting out of sync in my CMF Site.
I found that checkin <http://cvs.zope.org/Zope/lib/python/Products/ZCatalog/Catalog.py.diff?r1=1.98.6.10&r2=1.98.6.11>
including this new condition <code> if not idxs: # if the caller specifies that we should update only a # specific set of indexes, we don't do a metadata update. self.updateMetadata(object, uid) </code>
Could anybody tell me what's the rationale behind that checkin?
- it breaks backwards compatibility - it's only in Zope-2_6-branch, not in Zope-2_7-branch or HEAD - I can't see what this 'if' is good for anyway
I can ;-)
The "idxs" argument is often provided to update only special indexes: e.g. workflow indexes or "AllowedRolesAndUsers" and when you update selected indexes from the "Indexes" tab. In these cases, it is often not necessary to update the Metadata.
Metadata updates are often monstruous. In one of our applications, Metadata updates are responsible for a 500 kB transaction when a single workflow state changes.
I am now advocating not to use Metadata at all. They are only relevant when you process large result sets (e.g. for sorting or statistics).
Dieter
_______________________________________________ Zope-Dev maillist - Zope-Dev@zope.org http://mail.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope )
Hi Chris! Chris McDonough wrote:
The system is making the assumption that if you update indexes individually, you know what you're doing. Passing an index name into catalogObject was always meant to do as much or as little work as you specified with respect to the names of the indexes; it was only by mistake that metadata got updated when indexes were specified. The new (correct) behavior may break some applications that depended on the bug, and I'm sorry it broke yours, but it was indeed a bug.
Thanks for clarifying this! Now I understand why you changed it this way. But I still don't agree you fixed a bug. From my point of view this was an API change. 1.) The interface (IZCatalog.py) a.) "If provided, idxs specifies the names of indexes to update." No different behavior regarding metadata update is mentioned. b.) There is no other interface method to update metadata, so catalog_object is responsible for keeping metadata in sync. 2.) The use cases a.) Change of many object attributes or big attributes like 'body'. Everything should be updated, works like before. b.) Dieter described this already:
The "idxs" argument is often provided to update only special indexes: e.g. workflow indexes or "AllowedRolesAndUsers" and when you update selected indexes from the "Indexes" tab. In these cases, it is often not necessary to update the Metadata.
c.) Change of metadata. Before you 'fixed' catalog_object, it was possible to edit and reindex metadata like Description without reindexing the whole PDF / Word body of a file. Now this expensive task has to be perfomed after each metadata change. 3.) The existing code a.) CMF. Not badly broken, but the metadata of new content items is now wrong. This line in TypesTool.py doesn't work as expected: ob.reindexObject(idxs=['portal_type', 'Type']) b.) SilvaMetadata. The reindex code seems to be broken: <http://cvs.infrae.com/SilvaMetadata/Binding.py?rev=1.39&content-type=text/vnd.viewcvs-markup> Adding an 'update_metadata' flag would have been backwards compatible and useful for each use case.
And, doh, yes, I need to forward-port the fix to the 2.7 branch and the HEAD. :-(
I saw you were working on sunday. At least this time you mentioned the change in CHANGES.txt... Cheers, Yuppie
OK, I can see your point of view. Option 1: Add an update_metadata flag to the catalogObject method with a default of True on all branches. Option 2: Do nothing, but add the "updateMetadata" method to the ZCatalog interface. Option 1 would only take effect when 2.6.3 was released (if ever) and 2.7 final would have the new flag. Because since the behavior change has already been released in 2.6.2, it might be better to do option 2 and fix the places in CMF/SilvaMetadata that specify indexes, leaving it the way it is. Thanks, - C On Mon, 2003-10-06 at 06:24, Yuppie wrote:
Hi Chris!
Chris McDonough wrote:
The system is making the assumption that if you update indexes individually, you know what you're doing. Passing an index name into catalogObject was always meant to do as much or as little work as you specified with respect to the names of the indexes; it was only by mistake that metadata got updated when indexes were specified. The new (correct) behavior may break some applications that depended on the bug, and I'm sorry it broke yours, but it was indeed a bug.
Thanks for clarifying this! Now I understand why you changed it this way. But I still don't agree you fixed a bug. From my point of view this was an API change.
1.) The interface (IZCatalog.py)
a.) "If provided, idxs specifies the names of indexes to update." No different behavior regarding metadata update is mentioned.
b.) There is no other interface method to update metadata, so catalog_object is responsible for keeping metadata in sync.
2.) The use cases
a.) Change of many object attributes or big attributes like 'body'. Everything should be updated, works like before.
b.) Dieter described this already:
The "idxs" argument is often provided to update only special indexes: e.g. workflow indexes or "AllowedRolesAndUsers" and when you update selected indexes from the "Indexes" tab. In these cases, it is often not necessary to update the Metadata.
c.) Change of metadata. Before you 'fixed' catalog_object, it was possible to edit and reindex metadata like Description without reindexing the whole PDF / Word body of a file. Now this expensive task has to be perfomed after each metadata change.
3.) The existing code
a.) CMF. Not badly broken, but the metadata of new content items is now wrong. This line in TypesTool.py doesn't work as expected:
ob.reindexObject(idxs=['portal_type', 'Type'])
b.) SilvaMetadata. The reindex code seems to be broken: <http://cvs.infrae.com/SilvaMetadata/Binding.py?rev=1.39&content-type=text/vnd.viewcvs-markup>
Adding an 'update_metadata' flag would have been backwards compatible and useful for each use case.
And, doh, yes, I need to forward-port the fix to the 2.7 branch and the HEAD. :-(
I saw you were working on sunday. At least this time you mentioned the change in CHANGES.txt...
Cheers, Yuppie
_______________________________________________ Zope-Dev maillist - Zope-Dev@zope.org http://mail.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope )
Hi! Chris McDonough wrote:
Option 1: Add an update_metadata flag to the catalogObject method with a default of True on all branches.
Option 2: Do nothing, but add the "updateMetadata" method to the ZCatalog interface.
Option 1 would only take effect when 2.6.3 was released (if ever) and 2.7 final would have the new flag. Because since the behavior change has already been released in 2.6.2, it might be better to do option 2 and fix the places in CMF/SilvaMetadata that specify indexes, leaving it the way it is.
I agree backwards compatibilty is broken anyway. Whatever the solution is, products using the idxs argument for metadata changes have to be changed to work with 2.6.2. Regarding migration, I agree option 2 would be better. Regarding the interface, I still prefer option 1 or even option 1 *and* 2. The fact that zcat.catalog_object(obj) does update metadata while zcat.catalog_object(obj, idxs=zcat.indexes()) doesn't, isn't intuitive at all. At least this difference has to be mentioned in the interface definition. Just my 2 cents. I can live with both options. Cheers, Yuppie
This is broken in 2.6.2 and I would consider it a bug in that release. I would support an improved api such as: catalog_object(obj, idxs=[], metadata=1) The way it is currently breaks at least one of my applications as well (ironically, this was an application the Chris was also working on). I am happy to make the change if we agree it is the right thing. -Casey On Tuesday 07 October 2003 07:13 am, Yuppie wrote:
Hi!
Chris McDonough wrote:
Option 1: Add an update_metadata flag to the catalogObject method with a default of True on all branches.
Option 2: Do nothing, but add the "updateMetadata" method to the ZCatalog interface.
Option 1 would only take effect when 2.6.3 was released (if ever) and 2.7 final would have the new flag. Because since the behavior change has already been released in 2.6.2, it might be better to do option 2 and fix the places in CMF/SilvaMetadata that specify indexes, leaving it the way it is.
I agree backwards compatibilty is broken anyway. Whatever the solution is, products using the idxs argument for metadata changes have to be changed to work with 2.6.2.
Regarding migration, I agree option 2 would be better.
Regarding the interface, I still prefer option 1 or even option 1 *and* 2. The fact that
zcat.catalog_object(obj)
does update metadata while
zcat.catalog_object(obj, idxs=zcat.indexes())
doesn't, isn't intuitive at all. At least this difference has to be mentioned in the interface definition.
Just my 2 cents. I can live with both options.
Cheers, Yuppie
_______________________________________________ Zope-Dev maillist - Zope-Dev@zope.org http://mail.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope )
Hi! Casey Duncan wrote:
This is broken in 2.6.2 and I would consider it a bug in that release. I would support an improved api such as:
catalog_object(obj, idxs=[], metadata=1)
I'm sure you mean catalog_object(obj, uid, idxs=[], metadata=1) if we are talking about the ZCatalog interface.
I am happy to make the change if we agree it is the right thing.
+1
On Tuesday 07 October 2003 07:13 am, Yuppie wrote:
I agree backwards compatibilty is broken anyway. Whatever the solution is, products using the idxs argument for metadata changes have to be changed to work with 2.6.2.
Regarding migration, I agree option 2 would be better.
I forgot about people skipping 2.6.2 because they don't want to adjust their code. So I agree adding the flag is the Right Thing also regarding migration. Cheers, Yuppie
On Tuesday 07 October 2003 10:41 am, Yuppie wrote:
Hi!
Casey Duncan wrote:
This is broken in 2.6.2 and I would consider it a bug in that release. I would support an improved api such as:
catalog_object(obj, idxs=[], metadata=1)
I'm sure you mean catalog_object(obj, uid, idxs=[], metadata=1) if we are talking about the ZCatalog interface.
No I figured while we were at it, we'd break compatibilty for a completely different reason ;^) Seriously, I meant what you said... -Casey
On Tue, 2003-10-07 at 10:52, Casey Duncan wrote:
This is broken in 2.6.2 and I would consider it a bug in that release. I would support an improved api such as:
catalog_object(obj, idxs=[], metadata=1)
The way it is currently breaks at least one of my applications as well (ironically, this was an application the Chris was also working on).
I am happy to make the change if we agree it is the right thing.
+1 Changing the behaviour of existing APIs in a backward incompatible way is a Bad Thing (TM). IMHO, The only acceptable exception is if the API had different behaviour from the documentation to begin with (i.e. the API was broken). And I suggest rushing up a 2.6.3 (even if just a beta or rc) to discourage people from migrating to 2.6.2. Cheers, Leo -- Ideas don't stay in some minds very long because they don't like solitary confinement.
On Tue, 2003-10-07 at 09:52, Casey Duncan wrote:
This is broken in 2.6.2 and I would consider it a bug in that release. I would support an improved api such as:
catalog_object(obj, idxs=[], metadata=1)
The way it is currently breaks at least one of my applications as well (ironically, this was an application the Chris was also working on).
FWIW, those changes have been in there for two months on the 2.6 branch. This includes 2.6.2b3, b4, b5, and b6. The breakage couldn't have been *that* bad. ;-) - C
On Tuesday 07 October 2003 12:08 pm, Chris McDonough wrote:
On Tue, 2003-10-07 at 09:52, Casey Duncan wrote:
This is broken in 2.6.2 and I would consider it a bug in that release. I would support an improved api such as:
catalog_object(obj, idxs=[], metadata=1)
The way it is currently breaks at least one of my applications as well (ironically, this was an application the Chris was also working on).
FWIW, those changes have been in there for two months on the 2.6 branch. This includes 2.6.2b3, b4, b5, and b6. The breakage couldn't have been *that* bad. ;-)
I guess that means I should start using releases ;^) Nah.... -Casey
OK, I checked in changes on the 2.6, 2.7, and HEAD branches that add an update_metadata keyword argument to Catalog's catalogObject and ZCatalog's catalog_object. The argument defaults to true. I also added some tests, and made sure that the reindexIndex method of ZCatalog took advantage of this feature. - C On Tue, 2003-10-07 at 12:08, Chris McDonough wrote:
On Tue, 2003-10-07 at 09:52, Casey Duncan wrote:
This is broken in 2.6.2 and I would consider it a bug in that release. I would support an improved api such as:
catalog_object(obj, idxs=[], metadata=1)
The way it is currently breaks at least one of my applications as well (ironically, this was an application the Chris was also working on).
FWIW, those changes have been in there for two months on the 2.6 branch. This includes 2.6.2b3, b4, b5, and b6. The breakage couldn't have been *that* bad. ;-)
- C
_______________________________________________ Zope-Dev maillist - Zope-Dev@zope.org http://mail.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope )
On Tue, 2003-10-07 at 16:57, Chris McDonough wrote:
OK, I checked in changes on the 2.6, 2.7, and HEAD branches that add an update_metadata keyword argument to Catalog's catalogObject and ZCatalog's catalog_object. The argument defaults to true. I also added some tests, and made sure that the reindexIndex method of ZCatalog took advantage of this feature.
Cool! Can we get some form of 2.6.3 beta out to encourage people not to migrate to 2.6.2? Cheers, Leo
On Tue, 2003-10-07 at 21:37, Leonardo Rochael Almeida wrote:
On Tue, 2003-10-07 at 16:57, Chris McDonough wrote:
OK, I checked in changes on the 2.6, 2.7, and HEAD branches that add an update_metadata keyword argument to Catalog's catalogObject and ZCatalog's catalog_object. The argument defaults to true. I also added some tests, and made sure that the reindexIndex method of ZCatalog took advantage of this feature.
Cool! Can we get some form of 2.6.3 beta out to encourage people not to migrate to 2.6.2?
Or maybe a HotFix? -- Ideas don't stay in some minds very long because they don't like solitary confinement.
Leonardo Rochael Almeida wrote:
Cool! Can we get some form of 2.6.3 beta out to encourage people not to migrate to 2.6.2?
Or maybe a HotFix?
This isn't hotfix-related so I'd say no. Would be cool to crank out a 2.6.3 though... Chris
participants (6)
-
Casey Duncan -
Chris McDonough -
Chris Withers -
Dieter Maurer -
Leonardo Rochael Almeida -
Yuppie