[Zope] Making lots of external data searchable?
Tres Seaver
tseaver@digicool.com
Sat, 02 Dec 2000 11:43:03 -0500
Anselm Lingnau <lingnau@tm.informatik.uni-frankfurt.de> wrote:
> I'm using Zope to re-vamp a web site, one of whose most
> important features is an archive of a reasonably busy mailing
> list, which is accessed using home-grown Perl CGI code. I've
> written Python code to let users browse the archive sorted by
> users, subject etc., but now I'm looking at allowing text
> searches. The »old« instance of the web site used Glimpse and a
> simple CGI script (in Perl) to do this across the whole site
> (including the mail archive) and ideally this would be what I'm
> after for the new version as well.
>
> However, the mail archive now weighs in at about 45 MB in
> individual text files (one per message), and I don't really see
> myself putting this into the ZODB so I can use ZCatalog.
> ZCatalog, however, looks good for indexing the rest of the site
> (I haven't done this yet). Is there a reasonable way of
> interfacing Glimpse with the Zope searching machinery so I
> could again have one-stop searching of the whole site? (It
> would probably be straightforward to search just the mail
> archive by calling out to Glimpse and massaging the results.)
You could probably use ZCatalog in conjunction with LocalFS to
accomplish this; I think LocalFS was recently revved to allow
cataloguing.
Note that the actual mass-indexing process is going to be *painful*,
as ZCatalog is designed for incremental indexing, not bulk loads. I think I
would write a script which walked the hierarchy, calling a method
to index one (or a few) messages at a time. This script might
also need to pack the database at intervals; the catalog is a
bit space-inefficient across multiple index/reindex operations.
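
A minimal sketch of such a walking script, in plain Python. It only
shows the batching and periodic-pack logic; index_one, commit and pack
are hypothetical hooks supplied by the caller, not Zope or LocalFS
APIs, and the batch sizes are arbitrary assumptions. In a real setup
they would wrap whatever calls catalogue one message, commit the
transaction, and pack the database.

```python
import os


def iter_batches(root, batch_size):
    """Yield lists of file paths under root, at most batch_size per list."""
    batch = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        for name in sorted(filenames):
            batch.append(os.path.join(dirpath, name))
            if len(batch) >= batch_size:
                yield batch
                batch = []
    if batch:
        yield batch  # final, possibly short, batch


def index_archive(root, index_one, commit, pack,
                  batch_size=100, pack_every=10):
    """Index every file under root a few messages at a time.

    index_one(path), commit() and pack() are hypothetical caller-supplied
    hooks; nothing here talks to Zope directly.
    Returns the number of files passed to index_one.
    """
    total = 0
    batches_done = 0
    for batch in iter_batches(root, batch_size):
        for path in batch:
            index_one(path)
            total += 1
        commit()  # keep each transaction small
        batches_done += 1
        if batches_done % pack_every == 0:
            pack()  # reclaim space left behind by repeated index writes
    return total
```

Committing after every batch keeps each transaction small, and packing
only every few batches keeps the pack cost from dominating the run.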
Tres.
--
===============================================================
Tres Seaver tseaver@digicool.com
Digital Creations "Zope Dealers" http://www.zope.org