[Catalog/ZCatalog] external RID references: bad idea?
Hi.

Faced with a need to make large numbers of references to objects within other objects, wanting to get certain information about the linked objects without waking them up in the ZODB, and wanting to store the references efficiently, I thought that the Catalog might be an avenue to explore, since it has solved some similar issues. So I studied it. Interface declarations will make these kinds of questions a bit more straightforward in the future, but for now...

My question: would it be a bad idea to store an object reference (elsewhere) as that object's catalog RID, and then use the RID to get the UID, the metadata, and/or the actual object as needed, using standard ZCatalog methods (getobject, getMetadataForRID, getIndexDataForRID, getpath, etc.)?

[If this isn't too bad an idea, I will have to subclass ZCatalog to add a method that calls _catalog.hasuid with a given object's physical path; this returns the RID.]

[If this is actually approaching a good idea, I would propose that ZC add a clear, interface-happy getRID to Catalog and ZCatalog.]

Risks I see:

The primary one is that, unless I get ZC's blessing on this and a clear addition of the needed getRID function to the (nonexistent?) Catalog/ZCatalog interfaces, this is a hack. I'd be relying on the internal workings of the catalog staying the same.

The secondary risk, which I hope is not a problem, is that I'd have to wake up the catalog every time I needed to touch any reference in my objects. I'm hoping this won't matter, because the catalog is such an integral part of the inner workings that it will always be "awake".

I would really appreciate any thoughts or concerns about this approach.

Thanks
Gary
Gary Poster wrote:
Hi.
Faced with a need to make large numbers of references to objects within other objects, wanting to get certain information about the linked objects without waking them up in the ZODB, etc., and wanting to store the references efficiently, I thought that the Catalog might be an avenue to explore, since it had solved some somewhat similar issues. So I studied it. Interface declarations will make these kinds of questions a bit more straightforward in the future, but for now...
[snip]

Another option to explore might be to store the oid (a unique 8-byte string generated for each object by the ZODB) as the reference, along with some cached metadata that you'll look up often. You can use the ZODB connection object (stored in a persistent object's _p_jar attribute) to look up another object by oid; it acts like a big dictionary. All persistent objects in a ZODB have a _p_oid attribute that contains the oid value.

Perhaps describing your application in more detail would yield better ideas...

-- Casey Duncan | Kaivo, Inc. | cduncan@kaivo.com
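A minimal sketch of the "oid plus cached metadata" reference described above. So that it runs without ZODB installed, `FakeConnection` is a hypothetical stand-in for a real connection; only the `_p_oid` and `_p_jar` attributes and a `get(oid)` lookup mirror what the message describes. `OidRef` and `Person` are illustrative names, not part of Zope.

```python
import struct


class FakeConnection:
    """Hypothetical stand-in for a ZODB connection (not the real class)."""

    def __init__(self):
        self._objects = {}
        self._counter = 0

    def add(self, obj):
        # Hand out an 8-byte oid and remember the object under it.
        self._counter += 1
        obj._p_oid = struct.pack(">Q", self._counter)
        obj._p_jar = self
        self._objects[obj._p_oid] = obj

    def get(self, oid):
        # Look an object up by oid, dictionary-style.
        return self._objects[oid]


class Person:
    def __init__(self, name):
        self.name = name


class OidRef:
    """A reference stored as an oid plus a small cache of metadata."""

    def __init__(self, target, fields):
        self.oid = target._p_oid
        # Copy the frequently needed values so the target need not be loaded.
        self.cached = {field: getattr(target, field) for field in fields}

    def resolve(self, jar):
        # Only here is the referenced object actually fetched.
        return jar.get(self.oid)


conn = FakeConnection()
creator = Person("Example Composer")
conn.add(creator)

ref = OidRef(creator, ["name"])
cached_name = ref.cached["name"]   # available without touching the target
resolved = ref.resolve(conn)       # full object, fetched on demand
```

Note that nothing in this sketch refreshes the cached values when the target changes; that would remain the application's job.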
From: "Casey Duncan" <cduncan@kaivo.com>
Gary Poster wrote:
Hi.
Faced with a need to make large numbers of references to objects within other objects, wanting to get certain information about the linked objects without waking them up in the ZODB, etc., and wanting to store the references efficiently, [snip]
Another option to explore might be to store the oid (a unique 8 byte string generated for each object by the ZODB) as the reference along with some cached metadata that you'll lookup often.
[snip]

Thanks! I will explore it. I haven't delved into the ZODB code much yet. At first glance, the advantages I see to the catalog RID possibility are two: the metadata caching mechanism is already built, and objects are already supposed to inform the catalog when they need to be reindexed (keeping the metadata fresh). The advantages of the OID are that (as you informed me) the interface elements I need are already in place (and therefore presumably somewhat stable), and that an object can probably be referenced successfully before it is cataloged, maybe at an earlier stage in the object's life. I'll look into it more.
Perhaps describing your application in more detail would yield better ideas...
OK. Here goes. I put a dashed line where I think you can probably stop reading and get the gist. ;-)

While I keep an eye to contributing back to the community by making my solutions as flexible as possible, I'm putting a super-bibliography for musicians, especially vocalists, into Zope. It stores objects describing compositions, books, texts, recordings, publications, people, topics, and other items. On a simple level, I need the kind of referencing I describe for connecting people objects as creators to other objects; for connecting any object to another (particularly topics) in a "describes" relationship; for connecting same-class objects in a parent-child relationship; and other similar tasks. (Obviously, I'm coming from a bit of an RDBMS background on this, but I'm enjoying the better modeling possible with the ZODB, among other things.)

When a composition object displays, for instance, it needs to know both the name and address of all of its creators, ideally without waking up the creators yet. Similarly, a person needs to know its back links: what objects claim me as a creator? Rather than caching a page or an object, I have decided it will be best to cache the relationships and metadata somehow.

----------------- Here there be dragons -------------------

The modelling for compositions is particularly complex, at least to me, since I include instruments needed, if any, and voices needed, if any; the voices themselves have high and low range extremes I am keeping track of, and even multiple options for those. If they are published, each song might be transposed by a given number of half steps (producing a new set of high and low extremes for the composition). If the composition's parent is published and transposed, that will produce yet another set of high and low extremes. Displaying and searching by range extremes thus becomes quite complex, and a very strong candidate for caching.
Even so, expecting my code to keep the cached information fresh when the relationships are so far-flung makes me nervous: I think I'll only be able to cache so far down the chain, and rely on live checks (or at least secondary cached metadata checks) for the rest. I'm figuring I'm going to need a new pluggable index, based on the work in PathIndex, for the complicated range searches and some other needs; an interlinking class that manages inter-object back and forward links behind the scenes for caching and getting the cached metadata I described; and some simple subclasses that will represent each of the data types. I have plans from there as well, but those are first steps. OK, I'm stopping there; hoping that is enough, or more than enough; and hoping it is useful.
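The range arithmetic described above (a voice's low and high extremes shifted by a song's transposition, then again by its parent's) can be sketched as follows. Representing pitches as MIDI note numbers is an assumption made for illustration, and `VoiceRange` and `effective_range` are hypothetical names.

```python
from typing import NamedTuple


class VoiceRange(NamedTuple):
    low: int   # lowest pitch, as a MIDI note number (assumption)
    high: int  # highest pitch, as a MIDI note number (assumption)

    def transposed(self, half_steps):
        # Transposition moves both extremes by the same number of half steps.
        return VoiceRange(self.low + half_steps, self.high + half_steps)


def effective_range(base, *transpositions):
    # Successive transpositions (song, then parent publication) simply add up.
    return base.transposed(sum(transpositions))


original = VoiceRange(low=60, high=79)            # C4 to G5
as_published = effective_range(original, -2)       # song printed a whole step down
in_parent = effective_range(original, -2, 3)       # parent then transposed up 3
```

Because each derived range is a pure function of the base range and the chain of transpositions, it is the kind of value that could be cached and recomputed only when a link in the chain changes.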
Thanks. Gary
----- Original Message ----- From: "Dieter Maurer" <dieter@handshake.de>
Gary Poster writes:
.... I think it would be a bad idea.
Mainly, because "rid"s are not persistently associated with objects. If someone calls "manage_catalogReindex", then all your rids change.
Dieter
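To illustrate Dieter's point, here is a toy model (not the real Catalog implementation) in which record ids are handed out in cataloging order. `ToyCatalog` and its methods are hypothetical; the only behaviour taken from the thread is that a full reindex reassigns rids.

```python
class ToyCatalog:
    """Toy stand-in for a catalog: maps rid -> path and path -> rid."""

    def __init__(self):
        self.clear()

    def clear(self):
        self.paths = {}      # rid -> path
        self.uids = {}       # path -> rid
        self._next_rid = 0

    def catalog_object(self, path):
        # Assign the next free rid the first time a path is cataloged.
        if path not in self.uids:
            rid = self._next_rid
            self._next_rid += 1
            self.uids[path] = rid
            self.paths[rid] = path
        return self.uids[path]

    def reindex(self):
        # Clear everything and re-catalog every known path, here in path order.
        known = sorted(self.uids)
        self.clear()
        for path in known:
            self.catalog_object(path)


catalog = ToyCatalog()
catalog.catalog_object("/people/a")
stored_rid = catalog.catalog_object("/people/c")   # reference kept elsewhere
catalog.catalog_object("/people/b")

path_before = catalog.paths[stored_rid]
catalog.reindex()
path_after = catalog.paths[stored_rid]             # now points at another object
```

The externally stored rid silently resolves to a different path after the reindex, which is exactly the failure mode that makes rids unsafe as long-lived references.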
oh. good call. darn darn darn darn darn. thank you very much.

OK, back to Casey Duncan's idea then: ZODB oids. I'll try to do my digging in the code tomorrow, but anybody see any problem with using them as references? That loses the built-in metadata (and metadata refreshing) of the catalog, but gives me...gives me...well, gives me a smaller-size reference than a full path. That really doesn't solve the main problems that the catalog references helped me with. And if the OIDs were completely reliable, ZC presumably would have used them for space reasons rather than full path info as the catalog's UID...or maybe using the OID "wakes up" the object...

Argh. Back to the drawing board. Thank you very much, Dieter and Casey.

Gary
On Wed, 15 Aug 2001 21:27:17 -0400, "Gary Poster" <garyposter@earthlink.net> wrote:
I'll try to do my digging in the code tomorrow, but anybody see any problem with using them as references? That loses the built in metadata (and metadata refreshing) of the catalog, but gives me...gives me...well, gives me a smaller-size reference than a full path.
Yes, there is a problem. OIDs are only unique within a single storage.

* If some objects are exported and reimported, their OIDs will change.
* You will get duplicate OIDs in the same Zope if you are using a mounted storage.

An alternative that has not been mentioned so far is storing a real object reference. The Zope management interface restricts inter-object references to a tree structure. However, there is no such restriction in the underlying ZODB.

This may well be easier if you can live without managing your relationships as if they were folders, and without using Zope's security mechanisms to control access to the referred-to objects.

(I suggest discussing this deeper on zope-dev)

Toby Dickenson tdickenson@geminidataloggers.com
Toby Dickenson wrote:
On Wed, 15 Aug 2001 21:27:17 -0400, "Gary Poster" <garyposter@earthlink.net> wrote:
I'll try to do my digging in the code tomorrow, but anybody see any problem with using them as references? That loses the built in metadata (and metadata refreshing) of the catalog, but gives me...gives me...well, gives me a smaller-size reference than a full path.
Yes, there is a problem. OIDs are only unique within a single storage.
* If some objects are exported and reimported, their OIDs will change.
* You will get duplicate OIDs in the same Zope if you are using a mounted storage.
An alternative that has not been mentioned so far is storing a real object reference. The Zope management interface restricts inter-object references to a tree structure. However, there is no such restriction in the underlying ZODB.
This may well be easier if you can live without managing your relationships as if they were folders, and without using Zope's security mechanisms to control access to the referred-to objects.
(I suggest discussing this deeper on zope-dev)
Toby Dickenson tdickenson@geminidataloggers.com
I totally agree. Use object references whenever you can because then you are totally insulated from the ZODB underneath. However, perhaps a better description of what you are trying to do might lend some insights... -- | Casey Duncan | Kaivo, Inc. | cduncan@kaivo.com `------------------>
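A minimal sketch of the direct-reference approach with forward and back links kept in step. Plain Python classes are used so it runs without ZODB; in a real application these would presumably subclass `persistent.Persistent`, so that the ZODB stores the attributes as references. `Item` and `link_creator` are hypothetical names.

```python
class Item:
    """Illustrative stand-in for a composition, person, topic, and so on."""

    def __init__(self, title):
        self.title = title
        self.creators = []   # forward links: who created this item
        self.created = []    # back links: what this item (a person) created


def link_creator(work, person):
    # Record both directions at once so back links never go stale.
    if person not in work.creators:
        work.creators.append(person)
        person.created.append(work)


song = Item("Example Song")
person = Item("Example Composer")

link_creator(song, person)
link_creator(song, person)   # linking twice has no further effect

creator_titles = [p.title for p in song.creators]
works_by_person = [w.title for w in person.created]
```

Reading `p.title` here touches the referenced object, so this sketch does not by itself address the wish to avoid waking up creators; cached metadata would still have to be layered on top.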
----- Original Message ----- From: "Toby Dickenson" <tdickenson@devmail.geminidataloggers.co.uk> <snip good ideas>
(I suggest discussing this deeper on zope-dev)
Thank you very much. I joined zope-dev, cataloged what we had all discussed so far, and posted there. Sorry for the slow response time: network problems for the past three days. Hopefully behind me. Gary
Gary Poster writes:
OK, back to Casey Duncan's idea then: ZODB oids.
I would expect that OIDs are reliable (in the sense that they do not change once assigned), but

* they are a very low-level feature, difficult to access from most parts of Zope
* they may not be unique

Think of "mountable storages": each storage will have its own OIDs, interpreted in its own local context.

Dieter
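The uniqueness problem can be shown with a toy model of two storages that each issue their own 8-byte oids. `ToyStorage` is a hypothetical class written for this sketch and does not reflect how real ZODB storages allocate oids internally.

```python
import struct


class ToyStorage:
    """Toy storage: issues oids from its own private counter."""

    def __init__(self, name):
        self.name = name
        self._objects = {}
        self._counter = 0

    def store(self, obj):
        # Each storage counts independently of every other storage.
        self._counter += 1
        oid = struct.pack(">Q", self._counter)
        self._objects[oid] = obj
        return oid

    def load(self, oid):
        return self._objects[oid]


main = ToyStorage("main")
mounted = ToyStorage("mounted")

oid_in_main = main.store("composition in the main storage")
oid_in_mounted = mounted.store("person in the mounted storage")

# The same bytes name two different objects, depending on the storage asked.
oids_collide = oid_in_main == oid_in_mounted
from_main = main.load(oid_in_main)
from_mounted = mounted.load(oid_in_main)
```

A bare oid kept as a reference is therefore only meaningful together with knowledge of which storage it came from.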
participants (4)
- Casey Duncan
- Dieter Maurer
- Gary Poster
- Toby Dickenson