Re: [Zope-dev] Request for a Pluggin Index (NameIndex)

11 Jun 2001

Matt Hamilton wrote:
...
I would like to help if I had time :)  I think the most efficient way of
doing what you want is to construct an index based on a 'suffix trie'. This
essentially allows matching of arbitrary substrings very quickly; the only
problem is that it takes up a fair amount of space.  The upside is that it
can be updated and incrementally added to quite easily (unlike many
inverted list implementations).
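
To make the suffix trie idea concrete, here is a minimal sketch in Python. It is not the pluggable index API from Zope 2.4; the class name SuffixTrie and its add/search methods are hypothetical, chosen only for illustration. It inserts every suffix of each indexed string, which is why lookups of arbitrary substrings are fast and why the structure takes up a fair amount of space.

```python
class SuffixTrie:
    """Naive suffix trie mapping substrings to document ids.

    Hypothetical illustration only; not a Zope PluginIndex.
    """

    def __init__(self):
        self.root = self._new_node()

    @staticmethod
    def _new_node():
        # Each node holds its child nodes and the ids of all documents
        # containing the substring spelled out by the path to this node.
        return {"children": {}, "ids": set()}

    def add(self, doc_id, text):
        """Incrementally index one string under doc_id."""
        text = text.lower()
        for start in range(len(text)):
            node = self.root
            # Walk (and create) the path for the suffix text[start:].
            for ch in text[start:]:
                node = node["children"].setdefault(ch, self._new_node())
                node["ids"].add(doc_id)

    def search(self, substring):
        """Return the set of doc ids whose text contains substring."""
        node = self.root
        for ch in substring.lower():
            node = node["children"].get(ch)
            if node is None:
                return set()
        return set(node["ids"])


index = SuffixTrie()
index.add(1, "Hamilton")
index.add(2, "Withers")
matches = index.search("ilt")
```

Adding a new string only touches the paths for its own suffixes, so nothing already indexed has to be rebuilt. The cost is that a string of length n can create on the order of n*n/2 nodes.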
I confess I have not had the chance to look at the pluggable index types
in 2.4, but would really like to as I would like to port over some
indexing code I was working on for another project that allows compressed
storage of inverted lists for indexes.  On average you can store a 32-bit
document id/ref in around 4 bits, which means you save a lot of space and
can keep stopwords in the lexicon (as an example try searching for 'to be
or not to be' in an index that removes stopwords :).  Not only do you save
space, but due to the way the inverted list is read and decompressed you
save time on disk access for large indexes as there is less to physically
read.
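
The message does not say which compression scheme is meant, so the following is only a sketch of one common approach, assumed here for illustration: store the gaps between sorted document ids rather than the ids themselves, and write each gap with an Elias gamma code. The function names are hypothetical, and bits are kept as a string of '0' and '1' characters for readability rather than packed into bytes.

```python
def gamma_encode(n):
    """Elias gamma code for an integer n >= 1, as a string of '0'/'1'."""
    if n < 1:
        raise ValueError("gamma codes are defined for n >= 1")
    binary = bin(n)[2:]
    # (len - 1) zeros announce how many bits follow the leading 1.
    return "0" * (len(binary) - 1) + binary


def encode_postings(doc_ids):
    """Encode strictly increasing positive doc ids as gamma-coded gaps."""
    bits = []
    previous = 0
    for doc_id in doc_ids:
        bits.append(gamma_encode(doc_id - previous))
        previous = doc_id
    return "".join(bits)


def decode_postings(bits):
    """Inverse of encode_postings: read gaps sequentially, rebuild ids."""
    doc_ids = []
    previous = 0
    pos = 0
    while pos < len(bits):
        zeros = 0
        while bits[pos] == "0":
            zeros += 1
            pos += 1
        gap = int(bits[pos:pos + zeros + 1], 2)
        pos += zeros + 1
        previous += gap
        doc_ids.append(previous)
    return doc_ids


postings = [3, 4, 7, 8, 9, 12]
encoded = encode_postings(postings)
decoded = decode_postings(encoded)
```

In this toy list the six ids take 12 bits in total instead of 6 x 32. Small gaps get short codes, so a frequent term such as a stopword, whose postings are dense, compresses best. How close a real index gets to 4 bits per id depends on the actual gap distribution and on the coding scheme used.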
Wow Matt, you seem to know what you're talking about :-)

If you get a chance to implement the index I asked about, please gimme a shout,
I'd love to try it out...

cheers,

Chris

PS: Whereabouts in the UK are you?

Chris Withers