Anselm Lingnau <lingnau@tm.informatik.uni-frankfurt.de> wrote:
I'm using Zope to re-vamp a web site, one of whose most important features is an archive of a reasonably busy mailing list, which is accessed using home-grown Perl CGI code. I've written Python code to let users browse the archive sorted by users, subject etc., but now I'm looking at allowing text searches. The »old« instance of the web site used Glimpse and a simple CGI script (in Perl) to do this across the whole site (including the mail archive) and ideally this would be what I'm after for the new version as well.
However, the mail archive now weighs in at about 45 MB in individual text files (one per message), and I don't really see myself putting this into the ZODB so I can use ZCatalog. ZCatalog, however, looks good for indexing the rest of the site (I haven't done this yet). Is there a reasonable way of interfacing Glimpse with the Zope searching machinery so I could again have one-stop searching of the whole site? (It would probably be straightforward to search just the mail archive by calling out to Glimpse and massaging the results.)
You could probably use ZCatalog in conjunction with LocalFS to accomplish this; I think LocalFS was recently revved to allow cataloguing. Note that the actual mass-indexing process is going to be *painful*, as ZCatalog is designed for incremental indexing rather than bulk loads. I think I would write a script which walked the hierarchy, calling a method to index one (or a few) messages at a time. This script might also need to pack the database at intervals; the catalog is a bit space-inefficient across multiple index/reindex operations.

Tres.
--
===============================================================
Tres Seaver                                tseaver@digicool.com
Digital Creations     "Zope Dealers"       http://www.zope.org
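
A rough sketch of the batch-indexing script described above, in plain Python. It only shows the control flow: walk the archive, index a few messages at a time, and pack at intervals. The names `index_archive`, `index_message`, `commit`, and `pack`, along with the batch sizes, are hypothetical and not part of Zope, ZCatalog, or LocalFS; the caller would have to supply callables that talk to the real catalog and database.

```python
import os


def index_archive(root, index_message, commit, pack,
                  batch_size=100, pack_every=10):
    """Walk `root` and index every file found, a batch at a time.

    index_message(path), commit() and pack() are hypothetical callables
    supplied by the caller; nothing here touches Zope directly.
    Returns the number of files passed to index_message.
    """
    pending = 0   # files indexed since the last commit
    batches = 0   # full batches committed so far
    total = 0

    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a stable order
        for name in sorted(filenames):
            index_message(os.path.join(dirpath, name))
            pending += 1
            total += 1
            if pending >= batch_size:
                commit()
                pending = 0
                batches += 1
                # Pack every `pack_every` batches to reclaim space
                # left behind by repeated index operations.
                if batches % pack_every == 0:
                    pack()

    if pending:
        commit()  # flush the final partial batch
    return total
```

In a Zope setting, `index_message` would presumably wrap whatever call adds one message to the catalog (ZCatalog's `catalog_object` is the likely candidate, though that is an assumption here), `commit` would commit the current transaction, and `pack` would pack the database. Keeping the batches small keeps each transaction small, which is the point of indexing "one (or a few) messages at a time".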