RE: [Zope] Need a list of words not indexed by Catalog
near the end of lib/python/SearchIndex/TextIndex.py is a list called 'stop_words'

[Zope Dev] It would be good to move this out of the .py file into an editable, internationalizable resource file.

-----Original Message-----
From: Jason Spisak [mailto:webmaster@hiretechs.com]
Sent: Monday, September 13, 1999 9:31 AM
To: zope@zope.org
Subject: [Zope] Need a list of words not indexed by Catalog

Can we get a public list of the words not indexed by the Catalog? I just spent a few hours being tortured trying to find out why a guy in the database of 10,000 whose name is 'Max Many' doesn't come up on a last name search, and guess what, the Catalog doesn't index his last name: the word 'many'. Surprise!

This is the third gotcha word we've experienced (don't try to find C/C++ in a document), and it makes people doubt the software. (I explain about the size of indexes and such, and I understand it, but every surprise word you can't find is another step backward.)

You might publish a list of words you'll never find in the Catalog, just in case one of them happens to be a very big part of someone's business.

All my best,
--
Jason Spisak
webmaster@hiretechs.com

_______________________________________________
Zope maillist - Zope@zope.org
http://www.zope.org/mailman/listinfo/zope
(To receive general Zope announcements, see:
http://www.zope.org/mailman/listinfo/zope-announce
For developer-specific issues, zope-dev@zope.org -
http://www.zope.org/mailman/listinfo/zope-dev )
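For readers unfamiliar with stop words, here is a minimal sketch of the effect described above. It is not the TextIndex.py code: the word list is a made-up sample (only 'many' is known from this thread to be affected), and the names SAMPLE_STOP_WORDS, split_words, index_terms and dropped_terms are hypothetical.

```python
import re

# Illustrative subset only; NOT the actual contents of 'stop_words'
# in lib/python/SearchIndex/TextIndex.py.
SAMPLE_STOP_WORDS = {"a", "and", "many", "of", "the"}


def split_words(text):
    """Lower-case the text and split it into alphanumeric words (toy splitter)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def index_terms(text, stop_words=SAMPLE_STOP_WORDS):
    """Return the words a toy text index would actually store."""
    return [w for w in split_words(text) if w not in stop_words]


def dropped_terms(text, stop_words=SAMPLE_STOP_WORDS):
    """Return the words that are silently skipped because they are stop words."""
    return [w for w in split_words(text) if w in stop_words]


if __name__ == "__main__":
    print(index_terms("Max Many"))    # words that can be searched for
    print(dropped_terms("Max Many"))  # words that can never be found
```

A helper like dropped_terms is one way to produce the "words you'll never find" report asked for above, assuming the stop word list is reachable from the calling code.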
Terrel Shumway wrote:
near the end of lib/python/SearchIndex/TextIndex.py is a list called 'stop_words'
[Zope Dev] It would be good to move this out of the .py file into an editable, internationalizable resource file.
Agreed! And then there's the *multi* lingual issue too. What if I have Dutch and English on my site?

Regards,
Martijn
-----Original Message-----
From: Martijn.Faassen@vet.uu.nl [mailto:Martijn.Faassen@vet.uu.nl] On Behalf Of Martijn Faassen
Sent: Tuesday, September 14, 1999 11:35 AM
Cc: 'zope-dev@zope.org'
Subject: [Zope-dev] Re: [Zope] Need a list of words not indexed by Catalog
Terrel Shumway wrote:
near the end of lib/python/SearchIndex/TextIndex.py is a list called 'stop_words'
[Zope Dev] It would be good to move this out of the .py file into an editable, internationalizable resource file.
Agreed! And then there's the *multi* lingual issue too. What if I have Dutch and English on my site?
Together with French and German and Chinese and Antarctic? In other words: it would be a good idea to make stopwords/indexing configurable.

The second question is on what basis it should be configurable: on a document basis or automatically (based on a language property/header info/metainfo) or manually (on a folder basis).

Hm, this is only the surface - because if you think a bit more about it you would want the index/catalog to yield multilingual results from a single search :-(. I do not want to think about what this implies. It seems like you run into a _lot_ of complexities with multilingual issues, and still these are real issues for many of us.

Rik
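As a rough sketch of the configurable, per-language idea discussed in this thread: stop words are read from editable text resources (one word per line) and chosen by a language code attached to each document. Everything here is an assumption made for illustration: the resource format, the tiny word lists, and the names parse_stop_words, RESOURCES, STOP_WORDS and terms_for. None of it is an existing Zope API.

```python
import io


def parse_stop_words(stream):
    """Read one stop word per line; blank lines and '#' comments are skipped."""
    words = set()
    for line in stream:
        word = line.split("#", 1)[0].strip().lower()
        if word:
            words.add(word)
    return words


# Hypothetical resource files, inlined as strings so the sketch is self-contained.
RESOURCES = {
    "en": "# English sample\nthe\nand\nmany\n",
    "nl": "# Dutch sample\nde\nhet\neen\n",
}

STOP_WORDS = {
    lang: parse_stop_words(io.StringIO(text)) for lang, text in RESOURCES.items()
}


def terms_for(text, language):
    """Return indexable words, using the stop list for the document's language.

    An unknown language code falls back to an empty stop list, so nothing
    is dropped.
    """
    stop = STOP_WORDS.get(language, set())
    return [w for w in text.lower().split() if w not in stop]


if __name__ == "__main__":
    print(terms_for("Max Many", "en"))
    print(terms_for("Max Many", "nl"))
```

This only covers choosing a stop list per document. It does not address the harder question raised above of getting multilingual results from a single search.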
participants (3)
- Martijn Faassen
- Rik Hoekstra
- Terrel Shumway