Re: Zope Mailing Lists and ZCatalog
Andy Dawkins wrote:
Michel
In case you are not aware, we at NIP currently host a complete, publicly available archive of the Zope mailing lists.
Yep.
We are using ZCatalog to index all the messages from the mailing list archives. To give you an idea of the numbers, the Zope mailing list alone contains over 30,000 messages.
The problem we have is getting that many objects into the Catalog. If we load the objects into the ZODB and then catalog them, the machine either runs out of memory or, if we lower the subtransaction threshold, it runs out of hard drive space.
This is because you are indexing more content than you have virtual memory plus tmp space to store the transaction in. Zope is transactional, as I'm sure you know, so it has to store the transaction somewhere so it can roll it back if necessary, and memory plus tmp storage is where that goes (subtransactions are swapped out to tmp).
If we use CatalogAware to catalog the objects as they are imported, the Catalog explodes to stupid sizes because CatalogAware doesn't support subtransactions.
Subtransactions are a storage mechanism and really don't have anything to do with CatalogAware. If you have a subtransaction threshold set, then subtransactions will be used for any cataloging operation, CatalogAware or not.
We could solve these issues by regularly packing the database during the import, but it isn't a perfect solution.
I'm not sure what you mean by these last two paragraphs; it seems like you have two problems:
1) you are mass indexing and running out of memory
2) you are indexing lots of content quickly and your database is growing
The answer to 1 is to not mass index but to index incrementally over time. The answer to 2 is to use a storage that does not store old revisions, like Berkeley storage.
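Michel's first answer, indexing incrementally instead of all at once, amounts to cataloging in bounded batches and committing between them so the pending transaction stays small. A minimal sketch of that pattern in plain Python (the dict "catalog" and the `commit` callback are stand-ins for ZCatalog's `catalog_object` and Zope's transaction machinery, not the real API):

```python
# Sketch of incremental (batched) indexing. The dict catalog and the
# commit callback are stand-ins for ZCatalog and the ZODB transaction
# machinery -- this is the pattern, not the Zope API.

def index_in_batches(catalog, messages, batch_size=500, commit=None):
    """Catalog messages in small batches, committing after each batch
    so no transaction ever holds more than batch_size updates."""
    pending = 0
    for uid, message in messages:
        catalog[uid] = message          # stand-in for catalog_object(message, uid)
        pending += 1
        if pending >= batch_size:
            if commit:
                commit()                # stand-in for committing the transaction
            pending = 0
    if pending and commit:
        commit()                        # flush the final partial batch

# Usage with a plain dict as the "catalog":
catalog = {}
commits = []
msgs = [("msg%05d" % i, "body %d" % i) for i in range(1200)]
index_in_batches(catalog, msgs, batch_size=500,
                 commit=lambda: commits.append(len(catalog)))
print(len(catalog), len(commits))  # 1200 messages cataloged in 3 commits
```

The point is simply that the batch size, not the total message count, bounds how much pending transaction state exists at any moment.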
Also, as messages arrived over time, the Catalog would once again explode dramatically,
Basically, we (NIP) would like to know if you (Michel/DC) are planning to improve ZCatalog/CatalogAware, whether you are planning a successor to ZCatalog, or basically any information that could be useful to us regarding the current development status and priority of ZCatalog/CatalogAware.
There isn't anything wrong with the Catalog (for this particular problem); or at least, there isn't anything in the Catalog to fix that would solve your problem. We've had customers index well over 50,000 objects; you just have to understand the resource constraints and work within them: for example, don't mass index, use storages that scale to high-write environments, etc.
Thanks in advance for your assistance.
NP. -Michel
On Fri, 4 Aug 2000, Michel Pelletier wrote:
Andy Dawkins wrote:
The problem we have is getting that many objects into the Catalog. If we load the objects into the ZODB and then catalog them, the machine either runs out of memory or, if we lower the subtransaction threshold, it runs out of hard drive space.
Don't lower the subtransaction threshold too much; because of the way the BTree works, you wind up generating a *lot* more disk writes than you would think. I can catalog 61K records (with a small amount of data in each record, though) on a machine with 256MB of memory. More memory is the easiest solution...
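The tradeoff behind "don't lower it too much" is easy to see with rough arithmetic: each (sub)transaction commit writes out the BTree buckets touched since the last commit, so the commit count, and with it the write overhead, scales inversely with the threshold. A back-of-the-envelope sketch (the 61K figure is from the message above; the thresholds are purely illustrative):

```python
# Back-of-the-envelope: how many (sub)transaction commits a full
# catalog run needs at a given subtransaction threshold.
RECORDS = 61_000  # roughly the 61K records mentioned above

def commits_needed(records, threshold):
    """Ceiling division: one commit per full batch, plus a final partial one."""
    return -(-records // threshold)

for threshold in (10_000, 1_000, 100):
    print("threshold %6d -> %4d commits" % (threshold, commits_needed(RECORDS, threshold)))
```

Dropping the threshold from 10,000 to 100 multiplies the number of commits (and the bucket rewrites that come with each) by nearly a hundred, which is why a too-low threshold trades a memory problem for a disk-space problem.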
If we use CatalogAware to catalog the objects as they are imported, the Catalog explodes to stupid sizes because CatalogAware doesn't support subtransactions.
Subtransactions are a storage mechanism and really don't have anything to do with CatalogAware. If you have a subtransaction threshold set, then subtransactions will be used for any cataloging operation, CatalogAware or not.
I've imported my whole 61K-object folder tree, and the resulting Data.fs file was about twice the size of the zexp file. That hardly sounds like "exploded", so maybe there's something odd in the way you are doing the import? You definitely don't want to be committing transactions or subtransactions too often.
Also, as messages arrived over time, the Catalog would once again explode dramatically,
This is definitely an issue for something like archiving a mailing list. It sounds like, in the current state of things, you really want to move to a non-transactional storage for the catalog.
There isn't anything wrong with the Catalog (for this particular problem); or at least, there isn't anything in the Catalog to fix that would solve your problem. We've had customers index well over 50,000 objects; you just have to understand the resource constraints and work within them: for example, don't mass index, use storages that scale to high-write environments, etc.
There has, however, been at least one posting from DC about the technology that underlies the Catalog: the BTree. Apparently there *is* some tuning that can be done to make the BTree generate fewer object updates when modifications take place (something about parent objects getting updated unnecessarily, my hazy memory says). Is any active work being done on the BTree? --RDM
I've been working on a Mailman archive/search interface in Zope. I chose not to do the search mechanisms in Zope because I was under the impression that ZCatalog is great for object indexing but would not be ideal for mass text indexing with 100K+ objects and 100MB+ of text.
The comments below seem to indicate that the only problems are with mass indexing and transactional storage, both of which would be mitigated by moving to an incremental indexing scheme. But wouldn't you run into performance problems on searches, and on getting enough available memory to power up the catalog search? I guess what I'm looking for is a maxim on catalog usage in terms of number of objects/indexes and a machine's specs?
Curious,
Kapil
BTW, a demo of my Mailman search interface is at http://sindev.dyndns.org/TGrounds/archive_search
Michel Pelletier wrote:
Andy Dawkins wrote:
Michel
In case you are not aware, we at NIP currently host a complete, publicly available archive of the Zope mailing lists.
Yep.
We are using ZCatalog to index all the messages from the mailing list archives. To give you an idea of the numbers, the Zope mailing list alone contains over 30,000 messages.
The problem we have is getting that many objects into the Catalog. If we load the objects into the ZODB and then catalog them, the machine either runs out of memory or, if we lower the subtransaction threshold, it runs out of hard drive space.
This is because you are indexing more content than you have virtual memory plus tmp space to store the transaction in. Zope is transactional, as I'm sure you know, so it has to store the transaction somewhere so it can roll it back if necessary, and memory plus tmp storage is where that goes (subtransactions are swapped out to tmp).
If we use CatalogAware to catalog the objects as they are imported, the Catalog explodes to stupid sizes because CatalogAware doesn't support subtransactions.
Subtransactions are a storage mechanism and really don't have anything to do with CatalogAware. If you have a subtransaction threshold set, then subtransactions will be used for any cataloging operation, CatalogAware or not.
We could solve these issues by regularly packing the database during the import, but it isn't a perfect solution.
I'm not sure what you mean by these last two paragraphs; it seems like you have two problems:
1) you are mass indexing and running out of memory
2) you are indexing lots of content quickly and your database is growing
The answer to 1 is to not mass index but to index incrementally over time. The answer to 2 is to use a storage that does not store old revisions, like Berkeley storage.
Also, as messages arrived over time, the Catalog would once again explode dramatically,
Basically, we (NIP) would like to know if you (Michel/DC) are planning to improve ZCatalog/CatalogAware, whether you are planning a successor to ZCatalog, or basically any information that could be useful to us regarding the current development status and priority of ZCatalog/CatalogAware.
There isn't anything wrong with the Catalog (for this particular problem); or at least, there isn't anything in the Catalog to fix that would solve your problem. We've had customers index well over 50,000 objects; you just have to understand the resource constraints and work within them: for example, don't mass index, use storages that scale to high-write environments, etc.
Thanks in advance for your assistance.
NP.
-Michel
_______________________________________________
Zope-Dev maillist - Zope-Dev@zope.org
http://lists.zope.org/mailman/listinfo/zope-dev
** No cross posts or HTML encoding! **
(Related lists - http://lists.zope.org/mailman/listinfo/zope-announce http://lists.zope.org/mailman/listinfo/zope )
participants (3)
- Kapil Thangavelu
- Michel Pelletier
- R. David Murray