ZCatalog Related Questions
I'm working on a site which manages news articles for a web publication. Currently I have an Article ZClass which has some properties about the article. In addition, Article has ObjectManager behavior. When an article is created, key properties of the object are defined and the article content itself is stored as a DTMLMethod within the Article. So far this has worked fine. However, I'm now looking into how I can add searching to my site. I've played around a little bit with ZCatalogs but some things still puzzle me.
One of the things I would like to do is have a full text search of all my articles. However, it seems that the way I've structured things would require me to index all my DTMLMethods, whether they were contained in an article folder or not. I can't seem to tell the system that I want to catalog only the DTMLMethods which are contained inside objects with a meta_type of 'Article', for example. It seems that to get the behavior that I want I should store the article content in a property of my Article ZClass. Then, I could just index that property value and I would only get data from that property. What other ways are there to do this sort of thing?
Another question I have concerns the difference between Catalog Indexes and Catalog Metadata. From what I've read, it seems that items in Indexes are things you can specify in a search in order to find objects which have values corresponding to the index fields. Metadata is information stored in the cataloged object itself and is immediately available rather than having to get the real object from the catalog object. Is this correct?
If I wanted to do a full text search, do I just define one index field with a value of TextIndex? How does the TextIndex get populated? If I define a TextIndex and do a find of all 'Articles', what will end up in the TextIndex?
Finally, is it generally true that a web site would define one ZCatalog (typically named 'Catalog') which contains data from all the different items that I might want to have cataloged? For example, I have Article data that I'm interested in and I have Issue data that I'm interested in. They don't have the same set of properties. Do I just define one ZCatalog and define indexes for all of the properties that I'm interested in from both objects?
I've read the information on ZCatalogs from zope.org but I still have questions. Is there a place where I can see some additional examples of ZCatalog usage?
Thanks.
James W. Howe mailto:jwh@allencreek.com Allen Creek Software, Inc. pgpkey: http://ic.net/~jwh/pgpkey.html Ann Arbor, MI 48103
----- Original Message ----- From: James W. Howe <jwh@allencreek.com>
One of the things I would like to do is have a full text search of all my articles. However, it seems that the way I've structured things would require me to index all my DTMLMethods, whether they were contained in an article folder or not. I can't seem to tell the system that I want to catalog only the DTMLMethods which are contained inside objects with a meta_type of 'Article', for example.
I think that a good way to handle this would be to give your Article ZClass a 'PrincipiaSearchSource' method. This method would read and return the contents which you want indexed. One side benefit of this is that you can pre-process the contents to remove/replace stuff which you don't want in your index. PrincipiaSearchSource is the default method used to populate TextIndexes (IIRC). Cheers, Evan @ 4-am
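A minimal sketch of the idea, written as plain Python rather than as a real ZClass. The Article class here is a hypothetical stand-in that keeps its content in a string attribute (a real Article would read it from its contained DTML Method), and the tag-stripping rules are only an assumption about what one might want to keep out of the index.

```python
import re


class Article:
    """Hypothetical stand-in for an Article object; not a real ZClass."""

    meta_type = "Article"

    def __init__(self, title, article_content):
        self.title = title
        # Assumption: the raw source is held as a string for this sketch.
        self.article_content = article_content

    def PrincipiaSearchSource(self):
        """Return the text that should be handed to a full-text index."""
        text = self.article_content
        # Drop DTML tags such as <dtml-var standard_html_header>.
        text = re.sub(r"</?dtml-[^>]*>", " ", text)
        # Drop any remaining HTML tags.
        text = re.sub(r"<[^>]+>", " ", text)
        # Collapse runs of whitespace left behind by the removals.
        return " ".join(text.split())


article = Article(
    "Zope news",
    "<dtml-var standard_html_header><p>Zope  2 is out.</p>",
)
print(article.PrincipiaSearchSource())
```

The pre-processing step is where unwanted markup gets removed, so only the readable words of the article reach the index.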
At 10:13 AM 2/3/00 -0600, Evan Simpson wrote:
----- Original Message ----- From: James W. Howe <jwh@allencreek.com>
One of the things I would like to do is have a full text search of all my articles. However, it seems that the way I've structured things would require me to index all my DTMLMethods, whether they were contained in an article folder or not. I can't seem to tell the system that I want to catalog only the DTMLMethods which are contained inside objects with a meta_type of 'Article', for example.
I think that a good way to handle this would be to give your Article ZClass a 'PrincipiaSearchSource' method. This method would read and return the contents which you want indexed. One side benefit of this is that you can pre-process the contents to remove/replace stuff which you don't want in your index.
PrincipiaSearchSource is the default method used to populate TextIndexes (IIRC).
I just started looking into PrincipiaSearchSource. I've seen it mentioned but I never knew what it was used for. Can you give me a little more information on how I might use this?
Suppose, for example, that instances of my Article ZClass always have a DTMLMethod called 'article_content'. I want to catalog this information but always in the context of the Article object. For example, if the article_content contained the string "Zope", I would want to be able to do a full text search and have the Catalog return an Article instance which had "Zope" in the text of its "article_content" DTMLMethod. If I wanted to index this stuff, would I just define a PrincipiaSearchSource method which returned the (possibly massaged) contents of the "article_content" method? When I did a catalog search, would I just ask for items which had a meta_type of 'Article' and a PrincipiaSearchSource which contained the string I was looking for?
Thanks for your help.
James W. Howe mailto:jwh@allencreek.com Allen Creek Software, Inc. pgpkey: http://ic.net/~jwh/pgpkey.html Ann Arbor, MI 48103
----- Original Message ----- From: "James W. Howe" <jwh@allencreek.com> To: <zope@zope.org> Sent: Thursday, February 03, 2000 9:46 AM Subject: [Zope] ZCatalog Related Questions
One of the things I would like to do is have a full text search of all my articles. However, it seems that the way I've structured things would require me to index all my DTMLMethods, whether they were contained in an article folder or not. I can't seem to tell the system that I want to catalog only the DTMLMethods which are contained inside objects with a meta_type of 'Article', for example. It seems that to get the behavior that I want I should store the article content in a property of my Article ZClass. Then, I could just index that property value and I would only get data from that property. What other ways are there to do this sort of thing?
I'm currently doing it using a property, but the PrincipiaSearchSource method that Evan mentioned should work out. The way you describe it in your other message (making a PrincipiaSearchSource method that returns the possibly massaged article_content) should work. One thing to note, though, is that there was some discussion a while back about ZCatalog not calling DTML methods in a completely useful way. I don't remember the final conclusion, but I do know that I've had some trouble using a DTML method as an index. Since your ZClasses are based on your own Python base classes, you can add your own PrincipiaSearchSource in there, if you want. You can also add code to automatically add/remove your items from the catalog. (Just look at CatalogAwareness.py in the ZCatalog directory.)
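The "automatically add/remove your items" idea can be sketched roughly as follows. This is a toy, not the code in CatalogAwareness.py: ToyCatalog and SelfCataloging are hypothetical names, the catalog is just a dictionary, and the method names index_object and unindex_object are assumptions chosen for the illustration.

```python
class ToyCatalog:
    """Hypothetical in-memory stand-in for a catalog; not the ZCatalog API."""

    def __init__(self):
        self.objects = {}  # uid -> cataloged object

    def catalog_object(self, obj, uid):
        self.objects[uid] = obj

    def uncatalog_object(self, uid):
        # Removing an unknown uid is treated as a no-op in this sketch.
        self.objects.pop(uid, None)


class SelfCataloging:
    """Mixin whose instances know how to add and remove themselves."""

    def __init__(self, uid, catalog):
        self.uid = uid
        self.catalog = catalog

    def index_object(self):
        self.catalog.catalog_object(self, self.uid)

    def unindex_object(self):
        self.catalog.uncatalog_object(self.uid)

    def reindex_object(self):
        # Simplest possible refresh: drop the old entry, then add it again.
        self.unindex_object()
        self.index_object()


class Article(SelfCataloging):
    meta_type = "Article"


catalog = ToyCatalog()
article = Article("/news/zope-released", catalog)

article.index_object()      # would be called when the article is created
print(sorted(catalog.objects))

article.unindex_object()    # would be called just before it is deleted
print(sorted(catalog.objects))
```

With hooks like these in the base class, there is no need to run a separate find to pick up new articles; each object keeps its own catalog entry current.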
Another question I have concerns the difference between Catalog Indexes and Catalog Metadata. From what I've read, it seems that items in Indexes are things you can specify in a search in order to find objects which have values corresponding to the index fields. Metadata is information stored in the cataloged object itself and is immediately available rather than having to get the real object from the catalog object. Is this correct?
Yes. This has the advantage of not activating the main object in the cache (which is useful if you're searching lots of objects or large objects).
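To make the distinction concrete, here is a toy model of the two tables. It is only a sketch under stated assumptions: ToyCatalog, its method names, and the exact-match indexes are all hypothetical and much simpler than what ZCatalog does. Indexes answer "which objects match?", while metadata is a copy of selected values kept in the catalog so results can be displayed without loading the objects themselves.

```python
class ToyCatalog:
    """Hypothetical catalog with exact-match indexes and a metadata table."""

    def __init__(self, indexes, metadata):
        self.index_names = list(indexes)
        self.metadata_names = list(metadata)
        self.indexes = {name: {} for name in self.index_names}  # value -> uids
        self.records = {}                                       # uid -> metadata

    def catalog_object(self, obj, uid):
        # Indexes: remember which uids carry which value.
        for name in self.index_names:
            value = getattr(obj, name, None)
            if value is not None:
                self.indexes[name].setdefault(value, set()).add(uid)
        # Metadata: copy the values now so they can be read back later
        # without touching the original object.
        self.records[uid] = {
            name: getattr(obj, name, None) for name in self.metadata_names
        }

    def search(self, **query):
        uids = set(self.records)
        for name, value in query.items():
            uids &= self.indexes[name].get(value, set())
        # Each result is a lightweight record, not the object itself.
        return [dict(self.records[uid], uid=uid) for uid in sorted(uids)]


class Article:
    meta_type = "Article"

    def __init__(self, title, author):
        self.title = title
        self.author = author


catalog = ToyCatalog(indexes=["meta_type", "author"], metadata=["title"])
catalog.catalog_object(Article("Zope 2 released", "jwh"), "/news/a1")
catalog.catalog_object(Article("ZCatalog tips", "kevin"), "/news/a2")

results = catalog.search(meta_type="Article", author="kevin")
print(results)
```

In this sketch "author" can be searched but is not returned, and "title" is returned but cannot be searched, which mirrors the index/metadata split described above.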
If I wanted to do a full text search, do I just define one index field with a value of TextIndex? How does the TextIndex get populated? If I define a TextIndex and do a find of all 'Articles', what will end up in the TextIndex?
Basically, you tell ZCatalog which name to look up on each object, and it indexes the result in a TextIndex. So, if you specify PrincipiaSearchSource as the thing to look up, the index will index all of the words contained in or returned by PrincipiaSearchSource for all Articles (since you specified that meta_type). As mentioned above, instead of doing a find you can also make your base class act like CatalogAware (or even subclass CatalogAware if it does what you want).
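The lookup-then-index step can be illustrated with a toy inverted index. This is a sketch, not how TextIndex is implemented: ToyTextIndex is a hypothetical class, word splitting is a plain regular expression, and the Article and Issue classes are made up for the example. The index looks up one name on each object, calls it if it is callable, and files the object under every word in the result.

```python
import re


class ToyTextIndex:
    """Hypothetical word index that reads one named attribute per object."""

    def __init__(self, attribute_name):
        self.attribute_name = attribute_name
        self.words = {}  # lowercased word -> set of uids

    def index_object(self, uid, obj):
        value = getattr(obj, self.attribute_name, None)
        if value is None:
            return  # objects without the attribute are simply skipped
        if callable(value):
            value = value()  # methods are called to obtain the text
        for word in re.findall(r"\w+", str(value).lower()):
            self.words.setdefault(word, set()).add(uid)

    def search(self, word):
        return sorted(self.words.get(word.lower(), set()))


class Article:
    meta_type = "Article"

    def __init__(self, text):
        self.text = text

    def PrincipiaSearchSource(self):
        return self.text


class Issue:
    meta_type = "Issue"
    volume = 3  # has no PrincipiaSearchSource, so it adds no words


index = ToyTextIndex("PrincipiaSearchSource")
index.index_object("/news/a1", Article("Zope makes publishing objects easy"))
index.index_object("/news/a2", Article("Searching with a catalog"))
index.index_object("/issues/3", Issue())

print(index.search("zope"))
```

The Issue object shows why differing property sets need not conflict in this model: an object that lacks the indexed name contributes nothing to that index.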
Finally, is it generally true that a web site would define one ZCatalog (typically named 'Catalog') which contains data from all the different items that I might want to have cataloged? For example, I have Article data that I'm interested in and I have Issue data that I'm interested in. They don't have the same set of properties. Do I just define one ZCatalog and define indexes for all of the properties that I'm interested in from both objects?
That's up to you. You can certainly do it that way. The properties that are unique to Issues won't really cause a problem for Articles and vice versa. To date, I've had all of my objects indexed in a single catalog. I'm breaking some of them out now, however, because when you change something that modifies the catalog from within a Version, that whole catalog is then locked against changes. I have some things that only site administrators can modify and other things that users can modify... so, if an administrator makes a change in a Version, the whole site Catalog is locked, and the users can't do their thing.
I've read the information on ZCatalogs from zope.org but I still have questions. Is there a place where I can see some additional examples of ZCatalog usage?
I haven't seen too many examples of using it. I'm sure you'll see it heavily used in the Portal Toolkit. Kevin
participants (3): Evan Simpson, James W. Howe, Kevin Dangoor