15 Aug
2001
1:39 p.m.
On Tue, 14 Aug 2001 sean.upton@uniontrib.com wrote:
Numbers should be accommodated too: a search for a 2000 Mercedes C230 ( http://classifieds.signonsandiego.com/results?searchTextWeighted=2000+Mercedes+C230 ) would find matches for Mercedes, but 2000 and C230 would be stop-words in the default behavior in Splitter.c; I had to modify the source to use isalnum() instead of isalpha() so that they would not be caught as stop-words...
One can argue about whether or not pure numbers should be indexed by default (I think they should be), but it is very hard for me to imagine a situation in which treating a word that is a mixture of alphabetic and numeric characters as a stop word is the correct behavior, at least in English. --RDM
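To make the isalpha()/isalnum() distinction discussed above concrete, here is a minimal sketch in C. It is not the code from Splitter.c: the helper name is_indexable() and the rule "drop a word if any character fails the test" are assumptions made only to illustrate the behavior described in the quoted message.

```c
#include <ctype.h>

/* Pointer to a <ctype.h>-style classifier such as isalpha or isalnum. */
typedef int (*char_class_fn)(int);

/*
 * Hypothetical helper (not part of Splitter.c): returns 1 if every
 * character of `word` satisfies `accept`, otherwise 0. An empty word
 * is never indexable.
 */
static int is_indexable(const char *word, char_class_fn accept)
{
    if (*word == '\0')
        return 0;

    for (; *word != '\0'; word++) {
        /* Cast to unsigned char: passing a negative char to the
           <ctype.h> functions is undefined behavior. */
        if (!accept((unsigned char)*word))
            return 0;
    }
    return 1;
}
```

Under these assumptions, is_indexable("C230", isalpha) and is_indexable("2000", isalpha) both reject the word, while passing isalnum accepts both; "Mercedes" is accepted either way. Whether pure numbers should also pass is the separate policy question raised in the reply.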