[Zope] Efficient Processing Of Large ZCatalog Queries
VanL
vlindberg@verio.net
Thu, 17 Oct 2002 14:48:21 -0600
Hello,
I have a Zope setup that does a ZCatalog query, fetches the real object
for each hit (i.e., it does not work with the result records the query
returns), and then does some processing on each fetched object.
In pseudocode, whenever I do a ZCatalog Query, I do the following:
    return [myFunction(getObject(x)) for x in catalog.search(myquery)]
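To make the pseudocode concrete, here is a minimal, self-contained sketch of the pattern. FakeBrain, FakeCatalog, and my_function are hypothetical stand-ins so the example runs without Zope; the method names searchResults and getObject are meant to mirror the ZCatalog ones, but the filtering logic is an assumption for illustration only.

```python
class FakeBrain:
    """Hypothetical stand-in for one catalog result record."""

    def __init__(self, obj):
        self._obj = obj

    def getObject(self):
        # In a real catalog this step loads the full object, which is
        # the expensive part when there are thousands of hits.
        return self._obj


class FakeCatalog:
    """Hypothetical stand-in for a catalog; not the real ZCatalog."""

    def __init__(self, objects):
        self._objects = list(objects)

    def searchResults(self, **query):
        # Assumption: a hit is any object whose fields equal the query values.
        return [
            FakeBrain(obj)
            for obj in self._objects
            if all(obj.get(key) == value for key, value in query.items())
        ]


def my_function(obj):
    """Placeholder for the per-object processing step."""
    return obj["title"].upper()


def process_all(catalog, **query):
    """Fetch every hit's object and process it, as in the pseudocode."""
    return [my_function(brain.getObject())
            for brain in catalog.searchResults(**query)]


catalog = FakeCatalog([
    {"title": "a", "kind": "doc"},
    {"title": "b", "kind": "img"},
    {"title": "c", "kind": "doc"},
])
results = process_all(catalog, kind="doc")
```

The point of the sketch is that every hit is fetched and processed before anything is returned, so nothing reaches the caller until the whole list has been built.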
The problem is that some of the query responses will be quite large -- up
to 10,000 objects returned. Doing a dtml-in over a result set this size
does not seem to be feasible -- the browser times out, for one thing.
I have avoided batch processing so far, because I understand (perhaps
incorrectly) that batch processing delays the evaluation of later batch
results until the batch is viewed. As I am running a script on the
results, often via a cron job, I want all of the objects to be processed.
If this understanding is incorrect, please let me know.
So my questions are as follows:
1. Is my understanding about batch processing correct or incorrect?
Can I have the browser only display the first hundred results, but
have all 5,000-10,000 results processed?
2. Is there a way to stream responses as they come? More specifically,
would it perhaps be feasible to grab the result set and process them one
at a time, displaying partial results? I tried using
REQUEST.RESPONSE.write, but the results all seemed to come out at the
same time anyway.
3. Finally, what factors influence the speed of a ZCatalog query? Does
it depend on the total number of indexed objects, or is the query time
roughly constant? Currently I do one query and then process the results.
If I reworked this so that behind the scenes it was doing n queries, I
would obviously be slowing myself down, but by how much?
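To illustrate what I am asking for in questions 1 and 2, here is a rough sketch in plain Python, independent of Zope. The function names, the display_limit and chunk_size parameters, and the write callable are all hypothetical; in my setup write would be something like REQUEST.RESPONSE.write, and the sketch does not address whether anything between the server and the browser buffers the output.

```python
def process_and_summarize(items, func, display_limit=100):
    """Process every item, but keep only the first few results for display."""
    shown = []
    total = 0
    for item in items:
        result = func(item)          # every item is processed
        total += 1
        if len(shown) < display_limit:
            shown.append(result)     # only the first display_limit are kept
    return shown, total


def stream_in_chunks(items, func, write, chunk_size=50):
    """Process items one at a time, handing results to write() in chunks."""
    buffer = []
    count = 0
    for item in items:
        buffer.append(str(func(item)))
        count += 1
        if len(buffer) >= chunk_size:
            write("\n".join(buffer) + "\n")
            buffer = []
    if buffer:
        # Flush whatever is left after the last full chunk.
        write("\n".join(buffer) + "\n")
    return count


# Usage with stand-in data: 250 items, show the first 100.
shown, total = process_and_summarize(range(250), lambda x: x * 2)

# Usage with a list's append method standing in for a response writer.
chunks = []
written = stream_in_chunks(range(5), str, chunks.append, chunk_size=2)
```

The first function is the "display a hundred, process everything" behaviour from question 1; the second is the "partial results as they come" behaviour from question 2, under the assumption that each write call can actually reach the client before the next one.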
I have created a system for managing data, but I am running into
difficulty now that the data has been loaded. For smaller numbers
(up to 5,000 main records), Zope seems to work fine, but I'm having
trouble scaling up. Can anyone help?
I'm happy to answer any questions to illuminate the problem more clearly.
Thanks In Advance,
Van Lindberg