On Monday 08 September 2003 05:36 pm, Ed Leafe wrote:
On Monday, September 8, 2003, at 05:31 PM, J Cameron Cooper wrote:
I'm a bit on the other side of the fence from most of the rest of the folks who have replied. My take on the matter is that unless you have a very compelling reason or need to use an external RDBMS you should keep everything in Zope.
I'm curious as to how to best find information when storing data in Zope. I come from a RDBMS background, so writing SQL statements to find the data that matches a user's request is a no-brainer for me. I would be interested in how one would store hundreds or thousands of records of data in Zope, and then find just the one or two that match the given criteria. Do you have to iterate through the entire branch of the ZODB, or is there another, simpler way that I'm missing?
Create a very simple Python product to hold the data structure you need and make it catalog path aware. Store the items in a BTree folder so you can have hundreds of thousands of them in one folder with no problem. Since the items are catalog aware, their catalog entries stay correct on any change to them, so you can query the catalog for the data you want; just set up the fields you want to index. The whole thing is an OODB, so design it like you would an in-memory OO program: make your data structure, index what you want from it for faster access, then use that index to filter for the data you want.

This approach works well in general. However, in a lot of cases there are more specific methods that will outperform it by a huge margin. For example, if your company has a hierarchy of products and you want to find specific types, it is often better to store the data in that hierarchy. Say you have a games hierarchy, and inside it you have action, rpg, strategy, sim, etc., and you may break it up further. You can then search the ZCatalog on a PathIndex for games, games/action, etc. to get all the items in that group.

Also, while Zope presents the db as hierarchical, it really is not. If you want an item to be in multiple places without copying it, that is trivial to do from a Python product. You just store the object reference like you normally would in a Python program; just make sure to remove the acquisition wrapper before storage. This gives you one real copy of the object in the db, but in many locations.

If you often perform a particular type of query, and that query is expensive to make, design a data structure that works well for it. Then use something like an observer pattern to keep that data structure up to date as changes are made. What you end up with is something like a cached search result as far as speed goes, but it is all completely dynamic since you update the data structures in real time.
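To make the index-then-filter idea concrete, here is a rough sketch in plain Python. It deliberately does not use the Zope API: MiniCatalog, Game, and the sample paths are all made up for illustration, standing in for a ZCatalog with one field index and a path index, and for a catalog-aware product that reindexes itself on change.

```python
from collections import defaultdict


def ancestors(path):
    """'games/action/blaster' -> ['games', 'games/action']."""
    parts = path.split("/")
    return ["/".join(parts[:depth]) for depth in range(1, len(parts))]


class MiniCatalog:
    """Toy stand-in for a catalog (hypothetical, not the ZCatalog API)."""

    def __init__(self, indexed_fields):
        # field name -> value -> set of item paths
        self.field_index = {f: defaultdict(set) for f in indexed_fields}
        # ancestor path -> set of item paths stored below it
        self.path_index = defaultdict(set)
        # item path -> {field: value} as currently indexed
        self.indexed = {}

    def index_object(self, path, obj):
        self.unindex_object(path)  # drop stale entries first
        values = {f: getattr(obj, f) for f in self.field_index}
        for field, value in values.items():
            self.field_index[field][value].add(path)
        for parent in ancestors(path):
            self.path_index[parent].add(path)
        self.indexed[path] = values

    def unindex_object(self, path):
        values = self.indexed.pop(path, None)
        if values is None:
            return
        for field, value in values.items():
            self.field_index[field][value].discard(path)
        for parent in ancestors(path):
            self.path_index[parent].discard(path)

    def search(self, path=None, **criteria):
        # Intersect index lookups; no iteration over the stored objects.
        hits = [self.field_index[f].get(v, set()) for f, v in criteria.items()]
        if path is not None:
            hits.append(self.path_index.get(path, set()))
        if not hits:
            return sorted(self.indexed)
        return sorted(set.intersection(*hits))


class Game:
    """Hypothetical 'catalog aware' record: it reindexes itself on edit."""

    def __init__(self, catalog, path, genre):
        self.catalog, self.path, self.genre = catalog, path, genre
        catalog.index_object(path, self)

    def edit(self, **changes):
        for name, value in changes.items():
            setattr(self, name, value)
        self.catalog.index_object(self.path, self)


catalog = MiniCatalog(indexed_fields=["genre"])
blaster = Game(catalog, "games/action/blaster", "action")
quest = Game(catalog, "games/rpg/quest", "rpg")
empire = Game(catalog, "games/strategy/empire", "strategy")

rpg_hits = catalog.search(genre="rpg")            # field index lookup
action_hits = catalog.search(path="games/action")  # path index lookup
all_games = catalog.search(path="games")
```

In real Zope the catalog does this bookkeeping for you; the point of the sketch is only that a query becomes a couple of dictionary lookups and a set intersection instead of a walk over every stored object.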
OODBs are harder in some ways to design for than an RDB, but with a good design you can often make them run far faster.
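As one example of that kind of design, the observer idea mentioned above might look roughly like this in plain Python. Product, InStockByGenre, and the "in stock by genre" query are invented for illustration and are not part of Zope; the sketch assumes every change goes through update() so observers are always notified.

```python
class Product:
    """Hypothetical record that notifies its observers after each change."""

    def __init__(self, name, genre, in_stock):
        self.name, self.genre, self.in_stock = name, genre, in_stock
        self._observers = []

    def attach(self, observer):
        self._observers.append(observer)
        observer(self)  # let the observer record the current state

    def update(self, **changes):
        for attr, value in changes.items():
            setattr(self, attr, value)
        for observer in self._observers:
            observer(self)


class InStockByGenre:
    """Precomputed answer to 'which products of genre X are in stock?'."""

    def __init__(self):
        self.by_genre = {}  # genre -> set of product names
        self._placed = {}   # product name -> genre it is filed under

    def __call__(self, product):
        # Remove the product from wherever it was filed, then refile it.
        old_genre = self._placed.pop(product.name, None)
        if old_genre is not None:
            self.by_genre[old_genre].discard(product.name)
        if product.in_stock:
            self.by_genre.setdefault(product.genre, set()).add(product.name)
            self._placed[product.name] = product.genre

    def query(self, genre):
        # One dictionary lookup; no search is run at query time.
        return sorted(self.by_genre.get(genre, ()))


view = InStockByGenre()
blaster = Product("blaster", "action", in_stock=True)
quest = Product("quest", "rpg", in_stock=False)
blaster.attach(view)
quest.attach(view)

before = view.query("rpg")   # quest is out of stock at this point
quest.update(in_stock=True)  # the observer refiles it immediately
after = view.query("rpg")
```

The query cost is paid a little at a time on each write, which is the trade-off being suggested: slightly more work per change in exchange for answers that are always current and cheap to read.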