[Zope] relevance ranking in ZCTextIndex or equivalent

Jonathan dev101 at magma.ca
Wed May 31 11:16:36 EDT 2006


----- Original Message ----- 
From: "Miles Waller" <miles at jamkit.com>
To: <zope at zope.org>
Sent: Wednesday, May 31, 2006 10:59 AM
Subject: [Zope] relevance ranking in ZCTextIndex or equivalent


> Hi,
>
> I'm planning to implement a text search where
>
> (match against the title)
>  ranks more highly than
> (match in the description)
>  ranks more highly than
> (matches against the body text).
>
> Titles and descriptions are short bits of text, so results in these
> categories can be ranked just by the frequency that the word appears in
> that part of the text.  Matches against the body text should ideally be
> ranked more like ZCTextIndex rather than plain frequency.
>
> My ideas are:
>
> - do three separate searches, and then concatenate the result sets
> together.
> problem: making sure there are no duplicates in the list without parsing
> all the results in their entirety.
>
> - hijack the 'scoring' part of the index, so those results with matches
> in the title can have their scores artificially heightened to achieve
> the ordering i want
> problem: it's completely opaque without a lot of study whether this
> would achieve what i want.  i'd also need to index the items so the
> index knew what was in the title, which could be a problem.
>
> - index title, description and text separately, and then use dieter's
> AdvancedQuery product to do the query and combine results
> problem: is it possible to get at the scores when the documents are
> returned from the index to be able to order them?  are the scores
> returned separately, or will each query overwrite the last one?
>
> Has anyone ever tried to do this - or got any pointers - at all?

Definitely a non-trivial task, but here are some ideas to get you pointed in 
the right (I hope) direction:

Try googling, or looking in the Zope source, for:

data_record_normalized_score_
BaseIndex.py
OkapiIndex.py
SetOps.py
okascore.c
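
For the first idea (three searches concatenated with no duplicates) and the
weighted-score variant, here is a minimal sketch in plain Python. It assumes
each per-field search has already been reduced to a mapping of document id to
score; how you pull those mappings out of the catalog is not shown. The
function and argument names are made up for illustration and are not part of
Zope or ZCTextIndex.

```python
def merge_tiered(title_hits, description_hits, body_hits):
    """Concatenate result sets, best tier first, keeping each docid once.

    Each argument is a hypothetical mapping of document id -> score.
    A document is placed in the highest tier in which it matched.
    """
    seen = set()
    ordered = []
    for hits in (title_hits, description_hits, body_hits):
        # Highest score first within a tier; docid breaks ties.
        ranked = sorted(hits.items(), key=lambda item: (-item[1], item[0]))
        for docid, _score in ranked:
            if docid in seen:
                continue  # already listed in a higher tier
            seen.add(docid)
            ordered.append(docid)
    return ordered


def merge_weighted(weighted_hits):
    """Combine scores as a weighted sum instead of strict tiers.

    weighted_hits is a list of (hits, weight) pairs, where hits is a
    mapping of document id -> score. The weights are assumptions you
    would have to tune; nothing here normalizes scores across indexes.
    """
    combined = {}
    for hits, weight in weighted_hits:
        for docid, score in hits.items():
            combined[docid] = combined.get(docid, 0) + weight * score
    # Highest combined score first; docid breaks ties.
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))
```

The tiered version guarantees that any title match outranks any
description-only match. The weighted version only does so if the weights are
far enough apart for the score ranges involved.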


Good Luck!

Jonathan 


