Casey Duncan wrote at 2004-1-23 18:12 -0500:
... I'd be interested to know what the specific reasons are. I have plans about improving ZCatalog in various ways, and it's always interesting to hear other outside opinions and use cases that can be used to inform future improvements.
We had extremely bulky BTree buckets holding the metadata information. This caused huge transaction sizes (a workflow state change resulted in a transaction of about 500 kB). Of course, this was a configuration problem: "summary" and "bobobase_modification_time" were part of the catalog's MetaData and my colleagues used "summary" extensively (each summary was several kB in size) ...

Tim already optimized the BTrees package a lot. But intersection may still gain from more optimizations. I used code like this:

    found = intersect(tree, set)

where "tree" is an "OOBTree" and "set" usually had a single element (but could have more, of course). I found that this is often extremely slow -- much, much slower than

    if len(set) == 1:
        key = set[0]
        if tree.has_key(key):
            found = set
        else:
            found = OOSet()
    else:
        found = intersect(tree, set)

In a fully optimized intersection, the difference should be very small.

Path index searches are slow. It helped (for us) to reverse the order in which intersections are done (lower level path components tend to be more specific, leading to smaller intermediate intersection sets).

Colleagues suggested caching catalog results. I will implement that soon (however, not for "ZCatalog" itself but for our "HaufeQuery", which is similar to your "CatalogQuery", just using query objects instead of query strings).

"ZCatalog" should have an easy way to freely use "and", "or" and "not" to combine subqueries to indexes -- similar to your "CatalogQuery" (or our "HaufeQuery").

-- Dieter
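
For illustration, here is a minimal sketch of the two intersection ideas above. It uses plain Python dicts and sets as stand-ins for "OOBTree" and "OOSet" (an assumption, so that it runs without the BTrees package), and the function names are hypothetical. Sorting by size is a generic substitute for the path-component reordering described above, not the change that was actually made.

```python
def intersect_keys(mapping, keys):
    """Return the subset of `keys` that are also keys of `mapping`.

    Hypothetical helper: `mapping` stands in for an OOBTree,
    `keys` for a (usually tiny) OOSet.
    """
    # Probe the mapping once per key instead of walking the mapping,
    # so the work grows with len(keys), not with len(mapping).
    return {key for key in keys if key in mapping}


def intersect_smallest_first(sets):
    """Intersect several sets, starting with the smallest one.

    Starting small keeps the intermediate results small, which is
    the effect attributed above to handling the more specific path
    components first.
    """
    ordered = sorted(sets, key=len)
    if not ordered:
        return set()
    result = set(ordered[0])
    for other in ordered[1:]:
        if not result:
            break  # an empty intermediate result cannot grow again
        result &= other
    return result


if __name__ == "__main__":
    tree = {i: str(i) for i in range(100000)}
    print(intersect_keys(tree, {42}))
    print(intersect_smallest_first([set(range(1000)), {3, 5000}, set(range(2, 50))]))
```

With a single-element key set, "intersect_keys" performs one membership test, which is the behaviour the hand-written special case above achieves.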