RE: [Zope] [Q] Design decisions: the use of ZCatalogs versus a backend DB?
-----Original Message----- From: Darran Edmundson [mailto:Darran.Edmundson@anu.edu.au]
Toying with ZCatalogs, I am starting to wonder if many of the uses I envisioned are not better served with a backend database. For example, for a publication database where nearly all of the instance data needs to be indexed (author, title, journal, year), this results in near total duplication of the instance data within the catalog.
Yep. Of course, if you use an RDBMS, you have duplicate data between Zope and it.
It seems that ZCatalogs are meant to be used if (i) only a fraction of the instance data needs to be indexed, or (ii) the dataset is small enough that one can live with the duplication.
I'm not sure I understand; indexing requires duplicate data of some kind. And it's not one-to-one duplication: the catalog goes to great lengths to store that information efficiently.
As another example, consider a catalog of URLs (descriptor, link, creation_date) gleaned from a URLClass ZClass. Here we might have hundreds of URLClass instances peppered throughout user folders and all of the data needs to be indexed. Is the overhead of object creation and duplication within the catalog better served with a simple backend database?
Actually, in this case there is no duplication. URL information is generated dynamically from Zope; there is no universal URL-to-object mapping. The information is distributed and implicit within the structure of the object system and how you access it. But I see what you mean. The thing is, you can't really have an index without a corpus of information *to* index, so there is always duplication of information; otherwise you would have to do real-time 'grepping' through the objects. -Michel
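As a rough illustration of Michel's point, here is a small sketch in plain Python. It is not ZCatalog's implementation; the `records` data and the `grep_search`, `build_index`, and `index_search` names are made up for the example. It shows that a field index is a second, lookup-friendly copy of the indexed values, and that the alternative is scanning every object on every query:

```python
from collections import defaultdict

# Hypothetical publication records standing in for Zope objects.
records = {
    1: {"author": "Smith", "title": "Solitons in fibres", "year": 1998},
    2: {"author": "Jones", "title": "Optical solitons", "year": 1999},
    3: {"author": "Smith", "title": "Nonlinear waves", "year": 1999},
}


def grep_search(records, field, value):
    """No index: inspect every record on every query."""
    return sorted(rid for rid, rec in records.items() if rec[field] == value)


def build_index(records, field):
    """Build a field index mapping each value to the ids that hold it.

    The values are copied out of the records; that copy is the
    duplication being discussed in this thread.
    """
    index = defaultdict(set)
    for rid, rec in records.items():
        index[rec[field]].add(rid)
    return index


def index_search(index, value):
    """Indexed lookup: one dictionary access, no scan of the records."""
    return sorted(index.get(value, ()))


author_index = build_index(records, "author")
```

Both `grep_search(records, "author", "Smith")` and `index_search(author_index, "Smith")` return the same ids; the difference is only in how much work each query does and in the extra copy of the author values that the index holds.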
----- Original Message ----- From: "Michel Pelletier" <michel@digicool.com> To: "'Darran Edmundson'" <Darran.Edmundson@anu.edu.au>; "zope" <zope@zope.org> Sent: Monday, January 24, 2000 11:23 AM Subject: RE: [Zope] [Q] Design decisions: the use of ZCatalogs versus a backend DB?
-----Original Message----- From: Darran Edmundson [mailto:Darran.Edmundson@anu.edu.au]
It seems that ZCatalogs are meant to be used if (i) only a fraction of the instance data needs to be indexed, or (ii) the dataset is small enough that one can live with the duplication.
I'm not sure I understand; indexing requires duplicate data of some kind. And it's not one-to-one duplication: the catalog goes to great lengths to store that information efficiently.
I think what Darran is actually asking about here is the metadata table. Darran, you don't *have* to duplicate things in the catalog via the metadata table if you don't want to. (As Michel points out, you will have some duplication of data for the index itself, but an RDBMS would have the same issue.) The metadata table allows very fast, efficient access to selected pieces of information about the cataloged object. This is good, because you don't have to "wake up" all of the objects in the search (possibly filling up Zope's cache). However, if you need to access all of the information in the object, you may as well leave the metadata empty and get the information directly from the object. (I don't remember off the top of my head which method you use to get at the object that is pointed to by the catalog.) Thus, the only duplication is the index itself, which is no more overhead than you're likely to get out of an RDBMS. Kevin
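The trade-off Kevin describes can be sketched with a toy catalog in plain Python. This is only an illustration under stated assumptions, not the ZCatalog API: `ToyCatalog`, `storage`, `load_object`, and `build` are hypothetical names, and the `loads` list simply counts how often a full object is "woken up".

```python
# Stand-in for the object database: path -> full object (hypothetical data).
storage = {
    "/pubs/a": {"author": "Smith", "title": "Solitons in fibres", "year": 1998},
    "/pubs/b": {"author": "Jones", "title": "Optical solitons", "year": 1999},
    "/pubs/c": {"author": "Smith", "title": "Nonlinear waves", "year": 1999},
}
loads = []  # one entry per full object that gets loaded


def load_object(path):
    """Simulate waking up an object; record that it happened."""
    loads.append(path)
    return storage[path]


class ToyCatalog:
    """Author index plus an optional metadata table of copied fields."""

    def __init__(self, metadata_fields=()):
        self.metadata_fields = tuple(metadata_fields)
        self.index = {}     # author -> set of record ids
        self.metadata = {}  # record id -> tuple of copied field values
        self.paths = {}     # record id -> path used to load the object

    def catalog_object(self, rid, path, obj):
        self.index.setdefault(obj["author"], set()).add(rid)
        self.paths[rid] = path
        self.metadata[rid] = tuple(obj[f] for f in self.metadata_fields)

    def search(self, author):
        return sorted(self.index.get(author, ()))


def build(metadata_fields):
    cat = ToyCatalog(metadata_fields)
    for rid, path in enumerate(sorted(storage)):
        cat.catalog_object(rid, path, storage[path])
    return cat


# With a "title" metadata column, titles come straight from the catalog.
with_meta = build(["title"])
titles_from_metadata = [with_meta.metadata[rid][0]
                        for rid in with_meta.search("Smith")]
loads_after_metadata = len(loads)

# With empty metadata, each hit is loaded from storage instead.
no_meta = build([])
titles_from_objects = [load_object(no_meta.paths[rid])["title"]
                       for rid in no_meta.search("Smith")]
```

In this sketch the metadata variant answers the query without touching `storage`, at the cost of copying the titles into the catalog; the empty-metadata variant stores nothing beyond the index but loads one object per hit.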
participants (2): Kevin Dangoor, Michel Pelletier