The proxy objects exist in the ODB so that each record can be cataloged. So far, I've worked with sets of up to 20k objects indexed in a Catalog using several field and globbing text indexes without issue.

As for cataloging them (and hoping for the best), I will likely use sub-transactions in the method in my container that mass-rebuilds (and thus reindexes) the BTreeFolder's contained objects. The indexing will use some very small fields from the relational database, obtained via methods that fetch them from the database. Each record in the database is about 400 bytes in total, so fairly small, and I don't plan on storing much metadata in the Catalog, but I imagine that the search index will be at least 100MB...

The fields are small, mostly customer information (name, address, that sort of thing), and the main reason I want to catalog them is that I want customer service reps to be able to do a globbing full-text search on a portion of a full-name field that isn't always consistent (e.g. "John Doe" vs. "Doe, John" or all kinds of other variants).

Sean

-----Original Message-----
From: Chris Withers [mailto:chrisw@nipltd.com]
Sent: Thursday, November 08, 2001 12:13 AM
To: sean.upton@uniontrib.com
Cc: zope@zope.org
Subject: Re: [Zope] BTreeFolders + Catalog + lots of objects?

sean.upton@uniontrib.com wrote:
Has anyone used the BTreeFolder product to store hundreds-of-thousands or millions of objects?
Nope.
I'm developing an internal CRM system that will contain somewhere between 300k-500k records, stored in a back-end relational datastore, and exposed via metadata proxy objects (1-per-record) sitting in a container subclassed from BTreeFolder;
Is there any particular reason the proxy objects need to live in the ZODB?
I am wondering if anybody has done anything similar to this, in terms of number of objects stored in a BTreeFolder, and the type of storage that they used. I'm also wondering about anyone using ZCatalog for such a large number of indexed objects.
I'm having infinite amounts of fun (not!) attempting to index 40,000 word documents. It largely depends on what types of indexes you will be using on these objects and what size the objects themselves are. If the objects are anything more than extremely simple and small proxies, I reckon you'll run into problems with BTreeFolder. Likewise, if your indexing is anything other than simple Field indexing (and with that number of objects, even that may be enough to cause trouble), you won't successfully index that many objects. I'd go for sticking the lot in your relational backend.

cheers,

Chris (who's had his faith in ZODB scalability systematically destroyed over the last couple of months :-( )
I'm also thinking of overriding BTreeFolder.manage_main_listing() with a user interface that allows users to simply type an object id into a text box, instead of listing them (the string object ids correspond to a long integer from 1..n); it wouldn't work very well to list half a million objects in an HTML form select control... Ideally, I'd do batching of some sort, but I'm having trouble figuring out how I would do that.
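Since the ids are just the strings for the integers 1..n, one way to batch them is to compute each page's id range arithmetically instead of enumerating the folder. The following is only a rough sketch of that idea in plain Python; the function names (`page_count`, `batch_ids`) and the page-number convention are made up for illustration and are not part of BTreeFolder or Zope. It also assumes the ids are contiguous, so any deleted records would still need an existence check when the page is rendered.

```python
def page_count(total, batch_size):
    """Number of pages needed to show `total` ids, `batch_size` per page."""
    if total < 0 or batch_size < 1:
        raise ValueError("total must be >= 0 and batch_size >= 1")
    return (total + batch_size - 1) // batch_size


def batch_ids(total, batch_size, page):
    """Return the string ids on a 1-based `page`, assuming ids run "1".."total".

    Returns an empty list when the page lies past the last id.
    """
    if page < 1:
        raise ValueError("page must be >= 1")
    if page > page_count(total, batch_size):
        return []
    first = (page - 1) * batch_size + 1
    last = min(first + batch_size - 1, total)
    # Only this page's ids are materialized, never the whole folder.
    return [str(i) for i in range(first, last + 1)]
```

A listing method could then look up just the ids returned by `batch_ids(total, 50, page)` and render previous/next links from `page_count`, so the cost of a page view does not grow with the number of stored objects.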
Anyone have any thoughts?
Sean
=========================
Sean Upton
Senior Programmer/Analyst
SignOnSanDiego.com
The San Diego Union-Tribune
619.718.5241
sean.upton@uniontrib.com
=========================