Subject: Re: [Zope-dev] Comment on CVS change
Casey Duncan <cduncan@kaivo.com> wrote:
"R. David Murray" wrote:
To: zope-checkins@zope.org
Date: Mon, 19 Mar 2001 19:47:31 -0500 (EST)
From: chrism@digicool.com (Chris McDonough)
Subject: [Zope-Checkins] CVS: Zope2 - UnTextIndex.py:1.33.2.9

Update of /cvs-repository/Zope2/lib/python/SearchIndex
In directory korak:/home/chrism/sandboxes/2_3Branch/lib/python/SearchIndex

Modified Files:
    Tag: zope-2_3-branch
    UnTextIndex.py
Log Message:
    Changed default query operator to 'AND' instead of 'OR' (way improved search results).
This strikes me as a Very Bad Thing. Not the idea of having AND be the default query operator, but the fact of *changing* the default. If I remember correctly, the default is documented (insofar as it is documented) as being AND, but the default has been OR for so long that I'm sure there are many sites that will break if this change is committed to a release. Worse, the behavior of a site as seen by end users will change. Finally, it is my experience that most search engines use OR as the default operator. I prefer AND myself, but OR appears to be something of a de facto standard.
I'm willing to be convinced, though <grin>.
A way to set the default would be cool, but might open up a can of worms.
I'm a little torn between being alarmed at this and agreeing with it. I think if this is done there should be a TTW interface in ZCatalog to switch between AND and OR as the default. That would be the best solution IMHO. Also a _documented_ way to switch it in Catalog as well.
I've gotta weigh in here, too; the breakage induced by this change will be large. Given that what *real* users expect is *neither* a Boolean "AND" *nor* a Boolean "OR", but instead a DWIM/Googlesque "affinity" search, I don't think the win is clear enough to warrant the breakage:

* "AND" searches return *small* result sets; non-programmers will be surprised by the often-empty lists they get back, and won't have any clue how to broaden their search. False negatives suck.

* "OR" searches return *long* result sets; non-programmers will be unhappy that the items they want are buried in the muck. False positives suck.

Given that any site manager can override the policy trivially, using only two lines of DTML, should we really be switching the (admittedly arbitrary) existing policy embedded in the core?

Tres.

P.S.

  <dtml-let search_terms="_.string.split( search_text )"
            search_with_and="_.string.join( search_terms, ' and ' )">
    <dtml-var searchCatalog>
  </dtml-let>

OK, so it is two and a half lines.

--
===============================================================
Tres Seaver tseaver@digicool.com
Digital Creations "Zope Dealers" http://www.zope.org
I've gotta weigh in here, too; the breakage induced by this change will be large. Given that what *real* users expect is *neither* a Boolean "AND" *nor* a Boolean "OR", but instead a DWIM/Googlesque "affinity" search, I don't think the win is clear enough to warrant the breakage:
* "AND" searches return *small* result sets; non-programmers will be surprised by the often-empty lists they get back, and won't have any clue how to broaden their search. False negatives suck.
I think most people are getting used to narrowing their search by adding terms. Google, Yahoo, Lycos and the like have trained them to do this. I think the idea that folks, even nonprogrammers, don't know to do this in the post-1995 world may be a little flawed.
Given that any site manager can override the policy trivially, using only two lines of DTML, should we really be switching the (admittedly arbitrary) existing policy embedded in the core?
No, I suppose not. I'll change it back. :-( Not happy about it.
Chris McDonough wrote:
I've gotta weigh in here, too; the breakage induced by this change will be large. Given that what *real* users expect is *neither* a Boolean "AND" *nor* a Boolean "OR", but instead a DWIM/Googlesque "affinity" search, I don't think the win is clear enough to warrant the breakage:
I think long term, the Catalog machinery should support such "affinity" searching.
* "AND" searches return *small* result sets; non-programmers will be surprised by the often-empty lists they get back, and won't have any clue how to broaden their search. False negatives suck.
I think most people are getting used to narrowing their search by adding terms. Google, Yahoo, Lycos and the like have trained them to do this. I think the idea that folks, even nonprogrammers, don't know to do this in the post-1995 world may be a little flawed.
I rarely find myself using any explicit boolean operators when I use Google. And even when it returns 657,340,269 pages, the ones I wanted tend to be in the top 30. I think "OR" searching is fine if the result scoring can be done intelligently somehow.
Given that any site manager can override the policy trivially, using only two lines of DTML, should we really be switching the (admittedly arbitrary) existing policy embedded in the core?
No, I suppose not. I'll change it back. :-( Not happy about it.
I still say a toggle in the Catalog management interface is the best solution. -- | Casey Duncan | Kaivo, Inc. | cduncan@kaivo.com `------------------>
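As a rough illustration of what a settable default operator means in practice, here is a minimal Python sketch over a toy inverted index. The class `ToyTextIndex`, its `default_operator` argument, and the helper `build` are hypothetical names made up for this example; they are not part of ZCatalog or UnTextIndex, and the sketch ignores explicit operators in the query entirely.

```python
# Hypothetical toy index; none of these names come from Zope's Catalog code.
class ToyTextIndex:
    def __init__(self, default_operator="or"):
        if default_operator not in ("and", "or"):
            raise ValueError("default_operator must be 'and' or 'or'")
        self.default_operator = default_operator
        self._postings = {}  # word -> set of document ids containing it

    def index(self, doc_id, text):
        """Record every whitespace-separated word of text under doc_id."""
        for word in text.lower().split():
            self._postings.setdefault(word, set()).add(doc_id)

    def search(self, query):
        """Combine the posting sets of all query words with the default operator."""
        sets = [self._postings.get(w, set()) for w in query.lower().split()]
        if not sets:
            return set()
        if self.default_operator == "and":
            return set.intersection(*sets)
        return set.union(*sets)


def build(default_operator):
    """Build the same three-document index with the given default operator."""
    idx = ToyTextIndex(default_operator)
    idx.index(1, "zope catalog search")
    idx.index(2, "zope text index")
    idx.index(3, "python search engine")
    return idx


or_hits = build("or").search("zope search")
and_hits = build("and").search("zope search")
print(sorted(or_hits))   # [1, 2, 3]
print(sorted(and_hits))  # [1]
```

The same query returns three documents under OR and one under AND, which is the behavior change end users would see if the default were switched underneath an existing site.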
Casey Duncan wrote:
I rarely find myself using any explicit boolean operators when I use Google. And even when it returns 657,340,269 pages, the ones I wanted tend to be in the top 30. I think "OR" searching is fine if the result scoring can be done intelligently somehow.
It's pretty simple. The default operator should continue to be 'OR', but the result sorting should give precedence to results that satisfy the 'AND' condition. No, I don't know how to get it to do this. Didn't Catalog results used to have a 'score' attribute for something like this? HTH, Michael Bernstein.
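One way to picture the suggestion above is to keep OR semantics but sort hits by how many distinct query terms each document matches, so documents that satisfy the AND condition float to the top. The following Python sketch assumes a plain dict of posting sets; `ranked_or_search` and `postings` are hypothetical names, and this is not how Catalog computes (or computed) any 'score' attribute.

```python
def ranked_or_search(postings, query):
    """Return (doc_id, matched_term_count) pairs, best matches first.

    postings maps each word to the set of document ids containing it.
    """
    counts = {}
    for term in set(query.lower().split()):
        for doc_id in postings.get(term, ()):
            counts[doc_id] = counts.get(doc_id, 0) + 1
    # Sort by matched-term count (descending), then by doc id for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


postings = {"zope": {1, 2}, "search": {1, 3}, "catalog": {1}}
results = ranked_or_search(postings, "zope search")
print(results)  # [(1, 2), (2, 1), (3, 1)]
```

Document 1 matches both terms and is listed first; documents 2 and 3 match one term each and still appear, so nothing is lost relative to a plain OR search.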
On Wed, 21 Mar 2001, Tres Seaver wrote:
Given that any site manager can override the policy trivially, using only two lines of DTML, should we really be switching the (admittedly arbitrary) existing policy embedded in the core? [... <dtml-let search_terms="_.string.split( search_text )" search_with_and="_.string.join( search_terms, ' and ' )"> <dtml-var searchCatalog> </dtml-let>
OK, so it is two and a half lines.
Well, actually it's a *lot* more than that, unless you have both a regular search and an 'advanced search' box. Consider what happens if the user enters the following search string:

    foo or bar

The code above produces the search "foo and or and bar". I'm actually not sure what Catalog will do with that, but I doubt it is what the searcher intended.

Even if you have regular and advanced search boxes, having the default operator in one be the opposite of the default operator in the other would be a bad thing from a user interface design standpoint, IMO.

Making the default operator settable strikes me as simpler/better design than writing a method that will do the transformation correctly, since the latter basically requires duplicating the top level of the text index search input parser.

--RDM
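The problem described above is easy to reproduce in Python with the same split-and-join approach as the DTML snippet. The sketch below also shows a slightly more careful join that inserts the default operator only between adjacent search terms. The function names are hypothetical, and the `OPERATORS` set is an assumption: it covers only 'and' and 'or', whereas the real text index parser presumably handles more syntax, which is exactly why duplicating it outside the index is unattractive.

```python
# Assumption: only these two words are operators; a real query parser
# likely recognises more syntax than this sketch does.
OPERATORS = {"and", "or"}


def naive_and_join(search_text):
    """Python equivalent of the split/join approach: blindly join with ' and '."""
    return " and ".join(search_text.split())


def operator_aware_join(search_text, default="and"):
    """Insert the default operator only between two adjacent non-operator terms."""
    out = []
    prev_was_term = False
    for token in search_text.split():
        is_op = token.lower() in OPERATORS
        if not is_op and prev_was_term:
            out.append(default)
        out.append(token)
        prev_was_term = not is_op
    return " ".join(out)


print(naive_and_join("foo or bar"))        # foo and or and bar
print(operator_aware_join("foo or bar"))   # foo or bar
print(operator_aware_join("foo bar baz"))  # foo and bar and baz
```

Even this small fix has to know which words are operators, so it starts to mirror the parser it sits in front of.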
participants (5): Casey Duncan, Chris McDonough, Michael R. Bernstein, R. David Murray, Tres Seaver