[Zope] Preventing duplicates in ZCatalog

Oliver Bleutgen myzope@gmx.net
Tue, 22 Apr 2003 19:46:36 +0200


Wankyu Choi wrote:

Many people know the ZCatalog better than I do, but at least we can
try to get the ball rolling ;).



> Here's a very simple concept I'd like to implement in my applications using
> ZCatalog:
> 
> 	** Every entry in the built-in ZCatalog should be unique.  No
> duplicates!**

The question here is: unique with respect to what?


> 
> However, every object's reference could get duplicated ( that is, inserted
> more than once ) when the object is accessed via acquisition. Say, when I
> have two folders, "dir1", "dir2" and an object "obj1" in the dir1 folder:
> 
> 	* Accessing obj1 via /dir1/obj1 catalogs it with the uid
> "/dir1/obj1".
>  
> 	* Accessing it again via "/dir1/obj1" skips cataloging the object
> or recatalogs it with the same uid.
> 
> 	* Accessing it via "/dir2/obj1" or "/dir2/dir1/obj1" duplicates the
> catalog entry with a different uid.
>
> [Explanation snipped]
> 
> I want to make sure that no duplicates get into any catalog from the source
> code level even when using VHM or when users access objects by way of
> acquisition.
> 
> ( The problem is easy to reproduce. In a CMF site, create a couple of
> folders and in one of them, create a news item and publish it. It'd appear
> on top of the news_slot box. Revisit the news item via acquisition, that is,
> putting the other folder on top of the item and edit it. Voila, the
> news_slot box shows two references for the news item. )
> 
> [snip]
> 
> In short, can I catalog objects using acquisition-safe unique keys, say
> their oid's or auto-generated md5 hash or something?

I can't follow you here, because above you said that you want to
implement that 'Accessing it via "/dir2/obj1" or "/dir2/dir1/obj1"
duplicates the catalog entry with a different uid.'

That isn't acquisition-safe.

> 
> ( I wonder, if it's doable with little problem, why the designer of ZCatalog
> made it in such a way that object's physical paths work as uids, which may
> lead to the above-mentioned problems since they can't be unique when objects
> are acquired with different urls... please enlighten me if there's something
> I should be aware of in this regard, Casey. Terribly sorry for another
> newbish stupid question :-)

Here is where there seems to be a misunderstanding, at least as far as
*I* understand the ZCatalog:

It's up to the object to pass a uid to the catalog if it wants to, and
the choice of what to use as the uid is up to the programmer who
implements that (i.e. it's not a (Z)Catalog implementation problem). All
cases that I have seen in stock Zope/CMF use getPhysicalPath() for
that, and this is also the default if no uid is passed. Therefore I
can't see how your example with the news item could happen.
CMFCatalogAware shouldn't do this, and I couldn't reproduce it with a
custom type of mine, which uses CMFCatalogAware like NewsItems do, AFAIK.
Additionally, getPhysicalPath() is also the answer to your question
about an acquisition-safe unique key, because it provides exactly that:
acquisition-safe, Zope-wide uniqueness. It also doesn't take VHMs or
the hostname into account, so I really can't see what is happening there.
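
To make the point about getPhysicalPath() concrete, here is a minimal
sketch in plain Python. It is a toy model under my own assumptions, not
Zope code: Node, physical_path() and ToyCatalog are hypothetical
stand-ins for an object, its getPhysicalPath() and a uid-keyed catalog.
It only shows that a uid derived from containment stays the same however
the object is reached, while a uid derived from the request URL does not.

```python
# Toy model only: these classes imitate the idea behind getPhysicalPath()
# and a uid-keyed catalog. They are hypothetical and not Zope's API.


class Node:
    """An object that knows the container it is physically stored in."""

    def __init__(self, name, container=None):
        self.name = name
        self.container = container  # physical parent, None for the root

    def physical_path(self):
        # Walk the containment chain, never the URL used to reach the object.
        parts = []
        node = self
        while node is not None:
            parts.append(node.name)
            node = node.container
        return "/".join(reversed(parts))


class ToyCatalog:
    """Maps a uid to one record; the same uid overwrites, it never duplicates."""

    def __init__(self):
        self.records = {}

    def catalog_object(self, obj, uid=None):
        # Assumption mirrored from the text: fall back to the physical
        # path when the caller passes no uid.
        if uid is None:
            uid = obj.physical_path()
        self.records[uid] = obj
        return uid


root = Node("")
dir1 = Node("dir1", root)
dir2 = Node("dir2", root)
obj1 = Node("obj1", dir1)

by_path = ToyCatalog()  # uid taken from containment
by_url = ToyCatalog()   # uid taken from whatever URL was requested

for url in ["/dir1/obj1", "/dir2/obj1", "/dir2/dir1/obj1"]:
    by_path.catalog_object(obj1)
    by_url.catalog_object(obj1, uid=url)

print(sorted(by_path.records))
print(sorted(by_url.records))
```

In this sketch by_path ends up with a single entry keyed "/dir1/obj1",
while by_url collects one entry per URL. If duplicates show up in a real
catalog, I would therefore look for code that passes something other
than the physical path as the uid.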

Another question: when you say you _index_ your read count, do you
mean storing it as metadata in the catalog?


cheers,
oliver