Re: [Zope-dev] 100k+ objects, or...Improving Performance of BTreeFolder...

10 Dec 2001


I'm not sure whether your current work or future plans already take this
into account, but in case you were unaware: you do not need to persistently
store objects in the ZODB in order to index them in a ZCatalog.  All that is
required is that the object to be cataloged be
accessible via a URL path.  ZSQL methods can be set up to be 
URL-traversable, and to wrap a class around the returned row.  To load the 
items into the catalog, you can use a PythonScript or similar to loop over 
a multi-row query, passing the objects directly to the catalog along with a 
path that matches the one they'll be retrievable from.  This approach would 
eliminate the need for BTreeFolder altogether, although of course it 
requires access to the RDBMS for retrievals.  Since no per-record objects are
written to the ZODB, this should reduce the number of writes and allow for
bigger subtransactions in a given quantity of memory.
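
A minimal sketch of the loading loop described above, offered as an
illustration under stated assumptions rather than a tested recipe.  The only
call meant to correspond to ZCatalog is catalog_object(obj, uid); RowProxy,
load_rows, StubCatalog, the /customers/by_id path and the customer_id column
are all hypothetical names.  The stub catalog stands in for a real ZCatalog so
the example runs outside Zope, and the syntax is current Python rather than
what a PythonScript of this era would accept.

```python
class RowProxy:
    """Hypothetical wrapper exposing one result row's columns as attributes."""

    def __init__(self, row):
        self.__dict__.update(row)


def load_rows(catalog, rows, base_path, key):
    """Catalog each row under the path it would later be traversable at."""
    count = 0
    for row in rows:
        obj = RowProxy(row)
        # The uid handed to the catalog must match the URL path the
        # traversable query method would serve this record from.
        path = "%s/%s" % (base_path, row[key])
        catalog.catalog_object(obj, path)
        count += 1
    return count


class StubCatalog:
    """Stand-in for a ZCatalog (assumption: only catalog_object is needed)."""

    def __init__(self):
        self.entries = {}

    def catalog_object(self, obj, uid):
        self.entries[uid] = obj


# Rows as a multi-row query might return them (made-up sample data).
rows = [
    {"customer_id": 101, "name": "A. Example"},
    {"customer_id": 102, "name": "B. Example"},
]
catalog = StubCatalog()
loaded = load_rows(catalog, rows, "/customers/by_id", "customer_id")
```

Nothing here is stored persistently apart from the catalog's own index
entries; the proxy objects exist only for the duration of the loop.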


At 07:36 PM 12/9/01 -0800, sean.upton@uniontrib.com wrote:
...
Interesting FYI for those looking to support lots of cataloged objects in
ZODB and Zope (Chris W., et al)... I'm working on a project to put ~350k
Cataloged objects (customer database) in a single BTreeFolder-derived
container; these objects are 'proxy' objects which each expose a single
record in a relational dataset, and allow about 8 fields to be indexed (2 of
which are TextIndexes).
...
- Also, I want to make it clear that if I had a data access API that needed
more than simple information about my datasets (i.e. I was trying to do
reporting on patterns, like CRM-ish types of applications), I would likely
wrap a function around indexes done in the RDB, not in Catalog.  My project requires
no reporting functionality, and thus really needs no indexes, other than for
finding a record for customer service purposes and account validation
purposes.  The reason, however, that I chose ZCatalog was for full text
indexing that I could control/hack/customize easily.  My slightly uninformed
belief now is that for big datasets or "enterprise" applications (whatever
that means), I would use a hybrid set of (faster) indexes using the RDB's
indexes where appropriate (heavily queried fields), and ZCatalog for
TextIndexes (convenient).   I'm sure inevitable improvements to ZCatalog
(there seems to be community interest in such) will help here.

Phillip J. Eby