Re: [Zope-dev] ZCatalog text index search bugs?
R. David Murray writes:
I am very confused.
I'm looking at the SearchIndex source under 2.1.4 (2.1.6 seems to be the same). In Lexicon.py the 'query' method defines the default_operator to be 'or'. I can't see that TextIndex overrides this when it calls it.
But the response to PR 1141 (against 2.1.6) in the collector says:
The TextIndex search does an AND, not an OR, of the search words: if you ask it to find "foo bar", it returns only objects matching *both* "foo" and "bar", rather than objects matching *either* "foo" or "bar" (which Jason expected).
This is definitely not the case!
Indeed, if you do a search that includes a word that is not on an item, the item is not returned. So how is that working?
That is a bug I discovered and analysed yesterday. The index lookup is done by "index[word]". This raises a KeyError exception if "word" is not in the index. The exception aborts the search, which then returns without any hits. This behaviour is correct for "and" but of course wrong for "or".
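To make the failure mode concrete, here is a minimal sketch using a toy dictionary index. It is not the SearchIndex code; the names toy_index, search_or_buggy, search_or_fixed and search_and are hypothetical and only illustrate why an aborted "or" search looks like an "and" search.

```python
# Toy inverted index: word -> set of document ids.
# (An assumption for illustration, not the real SearchIndex data structures.)
toy_index = {
    "foo": {1, 2},
    "bar": {2, 3},
}

def search_or_buggy(index, words):
    """'or' search that looks up words with index[word]."""
    try:
        hits = set()
        for word in words:
            hits |= index[word]  # raises KeyError for an unknown word
        return hits
    except KeyError:
        # The exception aborts the whole search: no hits at all.
        return set()

def search_or_fixed(index, words):
    """'or' search where an unknown word contributes an empty result."""
    hits = set()
    for word in words:
        hits |= index.get(word, set())
    return hits

def search_and(index, words):
    """'and' search: an unknown word correctly empties the result."""
    result = None
    for word in words:
        docs = index.get(word, set())
        result = set(docs) if result is None else result & docs
    return result or set()
```

With this toy index, searching for "foo baz" with search_or_buggy returns nothing, which is what search_and would return, while search_or_fixed still returns the documents containing "foo".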
After getting several complaints about reporting bugs already fixed in CVS, I checked out the CVS version today. The relevant code has undergone quite a few modifications. It will take some time to see whether the bug remains. It is already too late today; maybe I will look tomorrow.
....
So I think 'or' searching is broken, and that text indexes being a default 'and' search is just an accident <grin>.
You are right!
....
(*) I recall reading that the 'near' operator, which is used if the splitter breaks up a word in the search string, is not really supported and that the 'and' operator is used instead.)
The "near" operator in 2.1.6 raises an exception. This means no hits.
I posted a patch some days ago that fixes near searches, except for objects that cannot be stored in the ZODB, such as LocalFS objects. For such objects, it uses an "and" to approximate "near".
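As a rough illustration of approximating "near" with "and", here is a sketch under stated assumptions. It is not the posted patch. The position table, the max_distance parameter and the function name near are all hypothetical, and a None entry stands in for an object whose word positions cannot be obtained.

```python
# Hypothetical data, for illustration only.
# word -> set of document ids containing the word
index = {
    "zope": {1, 2, 3},
    "catalog": {1, 2, 3},
}
# doc id -> {word: [positions]}; None means positions are unavailable
positions = {
    1: {"zope": [0], "catalog": [1]},
    2: {"zope": [0], "catalog": [40]},
    3: None,
}

def near(index, positions, left, right, max_distance=5):
    """Return docs where left and right occur within max_distance words.

    Documents without position data are kept whenever they contain
    both words, i.e. "and" is used to approximate "near" for them.
    """
    candidates = index.get(left, set()) & index.get(right, set())
    hits = set()
    for doc in candidates:
        doc_positions = positions.get(doc)
        if doc_positions is None:
            hits.add(doc)  # fall back to plain "and"
            continue
        close = any(
            abs(a - b) <= max_distance
            for a in doc_positions.get(left, [])
            for b in doc_positions.get(right, [])
        )
        if close:
            hits.add(doc)
    return hits
```

Here document 1 matches because the words are adjacent, document 2 is dropped because they are far apart, and document 3 is kept only because it contains both words.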
....
If I can reproduce this in 2.2.0b I'll file it in the collector.
I will watch for any posts from you. There is no need for both of us to do the same work.
Dieter
Dieter Maurer wrote:
R. David Murray writes:
I am very confused.
The TextIndex search does an AND, not an OR, of the search words: if you ask it to find "foo bar", it returns only objects matching *both* "foo" and "bar", rather than objects matching *either* "foo" or "bar" (which Jason expected).
This is definitely not the case!
Indeed, if you do a search that includes a word that is not on an item, the item is not returned. So how is that working?
That is a bug I discovered and analysed yesterday. The index lookup is done by "index[word]". This raises a KeyError exception if "word" is not in the index. The exception aborts the search, which then returns without any hits. This behaviour is correct for "and" but of course wrong for "or".
This is fixed in 2.2b1, where the index not finding the word should no longer raise an exception and the search should not abort. Can you confirm that when you get a chance?
--
-Michel Pelletier
http://www.zope.org/Members/michel/MyWiki
Visit WikiCentral for the latest Zen:
http://www.zope.org/Members/WikiCentral