----- Original Message ----- From: "Ken Ara" <feedreader@yahoo.com>
Some code you can use as a starting point to produce KWIC (key word in context) can be found here: http://zope.org/Members/Ioan/SiteSearch. It's old but works for me under Zope 2.7.3.
This implementation requires the searchable text to be stored in Catalog metadata, which seems to be a bad thing, but I have never really understood just how bad.
We have experimented with storing compressed and uncompressed metadata (up to 20k bytes per ZCatalog record). This worked fine for us (in terms of retrieval speed and ZODB size) until we hit about 500k records (we used our own KWIC scripts, not SiteSearch). At that point retrieval times grew to unacceptable levels (over 2 seconds per search) and the ZODB became unwieldy (5 GB+). By then we had about 5 million objects in the ZODB (after packing).

I would suggest that storing lots of metadata is workable if:

1) you don't have a lot of records
2) you don't store too much metadata per record
3) you have lots (1 GB+) of RAM
4) you have fast disks
5) you have a fast CPU

We currently have a ZCatalog with a single ZCTextIndex which holds about 1 million records (ZODB size is under 3 GB). Our retrieval speed, including KWIC processing, is under 1.5 seconds per search. We have very little metadata (less than 100 bytes per record), and access the final result set objects to get the data we need for KWIC processing and result set display.

HTH

Jonathan
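
For anyone unfamiliar with KWIC, the idea of building the snippets from the text of each result object at display time (rather than from catalog metadata) can be sketched roughly as below. This is only an illustration in plain Python, not the SiteSearch code and not the scripts described above; the function name `kwic`, its `width` parameter, and the bracket/ellipsis formatting are all made up for the example, and it assumes you already have each result's searchable text as a string.

```python
import re


def kwic(text, keyword, width=30):
    """Return one context snippet per whole-word, case-insensitive match.

    Hypothetical helper for illustration: `width` is the number of
    characters of context kept on each side of the matched keyword.
    """
    pattern = re.compile(r"\b%s\b" % re.escape(keyword), re.IGNORECASE)
    snippets = []
    for match in pattern.finditer(text):
        # Clamp the context window to the bounds of the text.
        start = max(match.start() - width, 0)
        end = min(match.end() + width, len(text))
        left = text[start:match.start()]
        right = text[match.end():end]
        # Mark truncation so the reader can tell the snippet is partial.
        prefix = "..." if start > 0 else ""
        suffix = "..." if end < len(text) else ""
        snippets.append(
            "%s%s[%s]%s%s" % (prefix, left, match.group(0), right, suffix)
        )
    return snippets
```

Calling `kwic(body_text, "catalog", width=40)` on the text of each object in the final result set would give a list of short snippets with the matched word in brackets. Because only the objects actually shown on the results page need to be loaded, the catalog records themselves can stay small.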