Andy McKay wrote:
Any cataloguing and un-cataloguing of an object is expensive; c'mon, you are changing all the indices, vocabulary and so on. You never notice it normally for 1 - 10 things, but run an import script of 10,000 and catalog each object as it gets added (rather than all of them at the end) and you'll notice the difference. (This script was cataloguing 250,000 mail messages, one at a time. Big no-no.)
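(The difference Andy describes can be sketched with a toy stand-in for the catalog; this is illustrative Python only, and `ToyCatalog` and its methods are hypothetical, not the real ZCatalog API:)

```python
class ToyCatalog:
    """Toy catalog that rebuilds its whole sorted index on every
    (re)index call, so cataloguing per-object during an import does
    n rebuilds instead of one."""

    def __init__(self):
        self.objects = []   # stored objects
        self.index = []     # sorted index over the objects
        self.rebuilds = 0   # how many times the index was rebuilt

    def add(self, key):
        """Store an object without touching the index."""
        self.objects.append(key)

    def catalog_object(self, key):
        """Store and immediately reindex -- the slow import pattern."""
        self.add(key)
        self._rebuild()

    def refresh(self):
        """Index everything in one pass -- the fast import pattern."""
        self._rebuild()

    def _rebuild(self):
        self.index = sorted(self.objects)
        self.rebuilds += 1


# Slow: catalog each message as it is imported.
slow = ToyCatalog()
for i in range(1000):
    slow.catalog_object(i)

# Fast: import everything first, then catalog once at the end.
fast = ToyCatalog()
for i in range(1000):
    fast.add(i)
fast.refresh()

print(slow.rebuilds, fast.rebuilds)   # 1000 rebuilds vs. 1
```

Both end up with the same index; the only difference is how many times the indexing work was repeated along the way.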
Perhaps I expressed myself poorly. What I am watching out for is evidence that adding, indexing, reindexing, or retrieving *a single object* (or a small set of objects) takes longer if there are more objects stored/indexed already. In other words, does the time to store/index/reindex/retrieve an object change (for the worse) depending on whether there are 10,000 objects, 100,000 objects, or 10,000,000 objects stored/cataloged in the ZODB/ZCatalog?

Previously, it came to light that searching performance suffered depending on a combination of the total number of objects and the size of the result set (irrespective of the batch size, apparently); that has apparently been fixed, and searching performance now scales with the number of cataloged objects.

So, are there any non-linear gotchas waiting for me?

Michael Bernstein.