[Zope-dev] ZCatalog text index search bugs?

Dieter Maurer dieter@handshake.de
Tue, 13 Jun 2000 23:18:31 +0200 (CEST)


R. David Murray writes:
 > I am very confused.
 > 
 > I'm looking at the SearchIndex source under 2.1.4 (2.1.6 seems to be
 > the same).  In Lexicon.py the 'query' method defines the default_operator
 > to be 'or'.  I can't see that TextIndex overrides this when it calls
 > it.
 > 
 > But the response to PR 1141 (against 2.1.6) in the collector says:
 > 
 >           The TextIndex search does an AND, not an OR, of the search
 >           words: if you ask it to find "foo bar", it returns only
 >           objects matching *both* "foo" and "bar", rather than object
 >           matching *either* "foo" or "bar" (which Jason expected).
This is definitely not the case!

 > Indeed, if you do a search that includes a word that is not on an
 > item, the item is not returned.  So how is that working?
That is a bug I discovered and analysed yesterday.
  The index lookup is done by "index[word]". This raises
  a KeyError exception if "word" is not in the index. The exception
  aborts the search, which then returns without any hits.
  This behaviour is correct for "and" but of course
  wrong for "or".
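
To illustrate with a toy example: the sketch below uses a plain dict in
place of the real index. It is not the SearchIndex code, and all names in
it are made up. It only assumes, as described above, that a KeyError from
"index[word]" ends the whole search with an empty result.

```python
# Toy stand-in for a text index: word -> set of document ids.
# Hypothetical data and function names, for illustration only.
index = {"foo": {1, 2}, "bar": {2, 3}}

def or_search_buggy(index, words):
    """'or' search where one unknown word aborts the whole search."""
    hits = set()
    try:
        for word in words:
            hits |= index[word]      # raises KeyError for an unknown word
    except KeyError:
        return set()                 # search aborted: no hits at all
    return hits

def or_search_fixed(index, words):
    """'or' search that treats an unknown word as matching nothing."""
    hits = set()
    for word in words:
        hits |= index.get(word, set())
    return hits

def and_search(index, words):
    """'and' search: an unknown word correctly yields no hits."""
    result = None
    for word in words:
        docs = index.get(word, set())
        result = docs if result is None else result & docs
    return result or set()
```

With "baz" missing from the index, or_search_buggy(index, ["foo", "baz"])
gives the same empty result as and_search, while or_search_fixed still
returns the documents containing "foo".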

After getting several complaints about reporting bugs that were
already fixed in CVS, I checked out the CVS version today.
The relevant code has undergone quite a few modifications.
I will take some time to see whether the bug remains.
It is already too late today; maybe I will look tomorrow.

 > ....

 > So I think 'or' searching is broken, and that text indexes being
 > a default 'and' search is just an accident <grin>.
You are right!

 > ....

 > (*) I recall reading that the 'near' operator, which is used if
 > the splitter breaks up a word in the search string, is not really
 > supported and that the 'and' operator is used instead.)
The "near" operator in 2.1.6 raises an exception, which means
the search returns no hits.

I posted a patch some days ago that fixes near searches,
except for objects that cannot be stored in ZODB, like
LocalFS objects. For such objects, it uses an "and" to
approximate "near".
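
The idea of falling back from "near" to "and" can be sketched roughly as
follows. This is not the patch itself: the data layout, the "max_gap"
parameter and the function name are assumptions made for the example, and
a document without position data merely stands in for an object the
"near" check cannot be run against.

```python
# Hypothetical toy data: word -> document ids, and per-document word positions.
docs_by_word = {"zope": {1, 2, 3}, "catalog": {1, 2, 3}}
positions = {
    1: {"zope": [0], "catalog": [1]},   # words adjacent
    2: {"zope": [0], "catalog": [9]},   # words far apart
    # document 3 has no position data
}

def near_search(docs_by_word, positions, left, right, max_gap=1):
    """Return documents where both words occur within max_gap positions.

    Documents without position data are kept whenever they contain both
    words, i.e. "near" degrades to "and" for them.
    """
    candidates = docs_by_word.get(left, set()) & docs_by_word.get(right, set())
    hits = set()
    for doc in candidates:
        doc_positions = positions.get(doc)
        if doc_positions is None:
            hits.add(doc)               # fallback: plain "and"
            continue
        left_pos = doc_positions.get(left, ())
        right_pos = doc_positions.get(right, ())
        if any(abs(a - b) <= max_gap for a in left_pos for b in right_pos):
            hits.add(doc)
    return hits
```

Here document 1 matches because the words are adjacent, document 2 is
rejected, and document 3 is accepted through the "and" fallback.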

 > ....

 > If I can reproduce this in 2.2.0b I'll file it in the collector.
I will watch for any posts from you.
There is no need for both of us to do the same work.


Dieter