[Zope] Efficient Processing Of Large ZCatalog Queries

VanL vlindberg@verio.net
Thu, 17 Oct 2002 14:48:21 -0600


Hello,

I have a Zope setup that runs a ZCatalog query, fetches the actual 
object for each result (i.e., it does not use the result records the 
query returns), and then does some processing on each object.

In pseudocode, whenever I do a ZCatalog Query, I do the following:

return [myFunction(brain.getObject()) for brain in catalog.searchResults(myquery)]

The problem is that some of the result sets will be quite large, up to 
10,000 objects.  Doing a dtml-in over a result set this size does not 
seem feasible; the browser times out, for one thing.

I have avoided batch processing so far, because I understand (perhaps 
incorrectly) that batch processing delays the evaluation of later 
batches until each batch is viewed.  As I am running a script on the 
results, often via a cron job, I want all of the objects to be processed.
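
To make the concern concrete, here is a minimal sketch of the behaviour I am after: every result is processed, but only the first hundred outputs are kept for display.  The names process_all and display_limit and the stand-in data are hypothetical; in the real script, results would be the ZCatalog result set and process would wrap myFunction and the object lookup.

```python
def process_all(results, process, display_limit=100):
    """Process every result; keep only the first `display_limit` outputs.

    `results` is any iterable (a stand-in for the catalog result set) and
    `process` is a stand-in for myFunction plus the object lookup.
    """
    shown = []   # outputs kept for the browser
    total = 0    # count of everything processed
    for item in results:
        output = process(item)      # always runs, even past the display limit
        if total < display_limit:
            shown.append(output)
        total += 1
    return shown, total


if __name__ == "__main__":
    # Stand-in data: 10,000 integers instead of catalog results.
    shown, total = process_all(range(10000), lambda x: x * 2)
    print(len(shown), total)
```

What I do not know is whether batching in Zope can be made to behave like this, or whether the later batches are simply never evaluated.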

If this understanding is incorrect, please let me know.

So my questions are as follows:

1.  Is my understanding of batch processing correct?  Can I have the 
browser display only the first hundred results, but still have all 
5,000-10,000 results processed?

2.  Is there a way to stream responses as they come?  More specifically, 
would it perhaps be feasible to grab the result set and process them one 
at a time, displaying partial results?  I tried using 
REQUEST.RESPONSE.write, but the results all seemed to come out at the 
same time anyway.

3.  Finally, what factors influence the speed of a ZCatalog query?  Does 
it depend on the total number of indexed objects, or is the query time 
roughly constant?  Currently I do one query and then process the 
results.  If I reworked this so that behind the scenes it was doing n 
queries, I would obviously be slowing myself down, but by how much?
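
For question 2, the kind of loop I have in mind looks roughly like the sketch below.  It is a simplified illustration, not my actual script: stream_results and report_every are hypothetical names, and write stands in for REQUEST.RESPONSE.write.

```python
def stream_results(results, process, write, report_every=100):
    """Process results one at a time, writing output as we go.

    `write` is any callable taking a string; in Zope it would be
    REQUEST.RESPONSE.write (an assumption for this sketch).
    """
    count = 0
    for item in results:
        output = process(item)
        write("%s\n" % (output,))           # emit this result immediately
        count += 1
        if count % report_every == 0:
            write("-- %d processed so far --\n" % count)
    write("done: %d processed\n" % count)
    return count


if __name__ == "__main__":
    import sys
    # Stand-in data and processing; sys.stdout.write plays the response.
    stream_results(range(5), lambda x: x * x, sys.stdout.write, report_every=2)
```

Whether the browser actually shows the partial output presumably also depends on buffering somewhere between Zope and the browser, which is part of what I am asking.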

I have created a system for managing data, but I seem to be running into 
some difficulty now that I have put in the data.  For smaller numbers 
(up to 5000 main records), Zope seems to work fine, but I'm having 
trouble scaling up.  Can anyone help?
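
In case it helps narrow things down, this is a rough sketch of how the three stages could be timed separately, to see whether the query itself or the per-object work dominates.  The function and argument names are hypothetical stand-ins, not Zope API.

```python
import time


def time_stages(run_query, load_object, process):
    """Time the query, the object loading, and the processing separately.

    All three arguments are stand-ins: `run_query` returns the result set,
    `load_object` turns one result into the real object, and `process`
    is the per-object work (myFunction in my case).
    """
    timings = {}

    start = time.perf_counter()
    results = list(run_query())          # force the whole result set
    timings["query"] = time.perf_counter() - start

    start = time.perf_counter()
    # Note: this keeps every loaded object in memory at once.
    objects = [load_object(r) for r in results]
    timings["load"] = time.perf_counter() - start

    start = time.perf_counter()
    for obj in objects:
        process(obj)
    timings["process"] = time.perf_counter() - start

    timings["count"] = len(results)
    return timings


if __name__ == "__main__":
    # Stand-in stages: integers instead of catalog results and objects.
    print(time_stages(lambda: range(10000), lambda r: r * 2, lambda o: o + 1))
```

I have not run this against the real catalog yet, so I cannot say which stage is the slow one.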

I'm happy to answer any questions to illuminate the problem more clearly.

Thanks In Advance,

Van Lindberg