[ZODB-Dev] Unique Object ID
Casey Duncan
casey at zope.com
Thu Jun 5 00:32:37 EDT 2003
On Wednesday 04 June 2003 04:49 pm, Johan Dahlin wrote:
> We've been running into an issue here while developing IndexedCatalog.
> IndexedCatalog stores information about its objects (which always
> inherit from a special class) in an OOBTree.
> They are keyed by the id of the object. In the current version of
> IndexedCatalog we have been using id() of the object (i.e. the memory
> address) as the id of the object.
>
> Recently we have found out that it's not very reliable, since it might
> create conflicts if the same memory address is returned when creating
> objects. So we need to find a reliable way of creating an object id.
> There are basically two options that I can think of:
>
> 1) use _p_oid
Bad idea: these can change if you move objects between databases, so there is
a pretty strong possibility of collisions.
> We must first store the object in a temporary location somewhere under
> the root, commit(1), pull it back, and get the oid. I think this is really
> bad for performance. Is it possible to get a _p_oid in another way? Or
> store the object in a connection and not a root?
>
> 2) Using a counter, increase it for each object
>
> I believe this can create problems with multiple connections and
> conflict errors. Create an object in connection A, create another one in
> connection B, commit A, commit B. *boom*
This should work if you use a BTrees.Length.Length object, which has automatic
conflict resolution. It's a pretty clever and elegant piece of code too; give
it a look. There are also tests in the BTrees code that you can look at to
see how to test your code for conflicts.
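A minimal pure-Python sketch of the merge rule that Length applies when two
transactions change the same counter: the resolved value is the sum of both
committed values minus the value they both started from. The function name
resolve_length is hypothetical and only illustrates the arithmetic; with the
real class you would call change(1) on the Length object and call the object
itself to read the value.

```python
def resolve_length(old, committed, new):
    """Merge two concurrent changes that both started from `old`.

    Illustrates the rule Length uses for conflict resolution:
    keep the delta from each side.
    """
    return committed + new - old


# Connections A and B both start from a counter value of 10
# and each record one new object.
old = 10
seen_by_a = old + 1
seen_by_b = old + 1

# Instead of raising a conflict, the two increments are merged.
merged = resolve_length(old, seen_by_a, seen_by_b)
```

Both increments survive the merge. Note, though, that each connection saw 11
before committing, so a value read inside the transaction is not by itself
guaranteed to be unique; that is one reason to check that an id is free
before using it.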
> Am I missing something, or is there another way of doing this?
> Comments, suggestions highly appreciated.
BTW: Zope Catalog uses (in 2.7 head) random.randint(-2000000000, 2000000000)
to generate record ids. It then double-checks that the id hasn't already been
taken. This should probably also be done if you use a counter just to be
safe. Performance here isn't really an issue, since relative to writing to
the database, generating a random number is not expensive.
Actually, Catalog uses a combination of random and sequential ids. That way, if
many objects are added at once, they tend to cluster in the BTree data
structure, minimizing the number of nodes and buckets that need to be touched.
Have a look at the catalogObject method of Catalog.py in the Zope head.
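A rough sketch of that idea, not the actual Catalog.py code: pick a random
starting id, hand out the following ids sequentially, and fall back to a new
random id whenever the candidate is already taken. The class name
RidAllocator and the batch size are assumptions for illustration, and a plain
dict stands in for the IOBTree so the example runs without ZODB.

```python
import random


class RidAllocator:
    """Hands out integer record ids: a random base, then sequential runs."""

    def __init__(self, records, batch=4000, rng=None):
        self.records = records  # mapping rid -> record (an IOBTree in ZODB)
        self.batch = batch      # illustrative run length, an assumption
        self.rng = rng or random.Random()
        self._next = 0

    def _random_rid(self):
        return self.rng.randint(-2000000000, 2000000000)

    def add(self, record):
        rid = self._next
        # Start a fresh random run at the beginning and at batch boundaries.
        if rid % self.batch == 0:
            rid = self._random_rid()
        # Double-check that the id is free; retry with a random one if not.
        while rid in self.records:
            rid = self._random_rid()
        self.records[rid] = record
        # The next object gets the neighbouring id, so bulk adds cluster.
        self._next = rid + 1
        return rid


records = {}
allocator = RidAllocator(records)
rids = [allocator.add("doc%d" % i) for i in range(100)]
```

Because consecutive adds mostly land on neighbouring keys, a bulk load
touches few buckets, while the random base keeps separate writers away from
each other.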
Also, if you use an integer rid, then you can use IOBTrees too, which are
optimized for integer keys.
ZCatalog uses paths to uniquely identify objects in Zope. It keeps BTrees of
path->rid and rid->path so that the indexes can just use the integer
rids. Perhaps you could also generate something like a path. Each object in
the database must have a unique path to access it. Perhaps that can be used
as a key in your catalog.
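The two-way mapping could look something like the sketch below. The names
PathRegistry, path_to_rid and rid_to_path are hypothetical, the rids are
handed out by a simple counter to keep the example short, and dicts stand in
for the BTrees so that it runs without ZODB.

```python
class PathRegistry:
    """Maps object paths to integer rids and back."""

    def __init__(self):
        self.path_to_rid = {}  # would be a path->rid BTree in ZODB
        self.rid_to_path = {}  # would be a rid->path BTree in ZODB
        self._next = 0

    def register(self, path):
        """Return the rid for `path`, allocating one on first use."""
        rid = self.path_to_rid.get(path)
        if rid is None:
            rid = self._next
            self._next += 1
            self.path_to_rid[path] = rid
            self.rid_to_path[rid] = path
        return rid

    def unregister(self, path):
        """Forget `path`, keeping both mappings in step."""
        rid = self.path_to_rid.pop(path)
        del self.rid_to_path[rid]


registry = PathRegistry()
first = registry.register("/site/folder/doc1")
second = registry.register("/site/folder/doc2")
again = registry.register("/site/folder/doc1")
```

Indexes then only ever store the integer rid, and the path is looked up
through the rid->path mapping when results are returned.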
-Casey