Harry Wilkinson wrote:
I have two problems with getting ZCatalog to search for what I need:
1) Need to be able to search for words like 'J++' and 'C#' - this is relatively simple to do by editing Splitter.c a little and recompiling
2) Need to be able to search for single-letter words like 'C' - it is easy to modify Splitter.c to accommodate this, but doing so causes errors in GlobbingLexicon.py, even though the vocabulary is standard
So far I have solved problem (1) by changing the contents of Splitter.c, but that's a bit messy. Currently I don't know of an alternative though.
I have modified Splitter.c so it indexes the extra characters, and reduced the minimum word length to 1. This works fine when indexing, and I can see all the symbol-inclusive words and single-letter words in the vocabulary. Unfortunately, any search on a single-letter word gives an IndexError, "String out of range".
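For illustration only, the splitting behaviour described above can be sketched in Python. This is not the Splitter.c code; the `TOKEN` pattern, the `split_words` name, and the `min_length` parameter are hypothetical, and the sketch assumes a word is a run of ASCII letters or digits optionally followed by '+' or '#' characters.

```python
import re

# Hypothetical word pattern: letters/digits, then any trailing '+' or '#'.
TOKEN = re.compile(r"[A-Za-z0-9]+[+#]*")

def split_words(text, min_length=1):
    """Return the words in text that are at least min_length characters long."""
    return [w for w in TOKEN.findall(text) if len(w) >= min_length]

# With min_length=1, single-letter words such as 'C' are kept.
words = split_words("C, C# and J++")
```

Raising `min_length` to 2 in this sketch drops 'C' but still keeps 'C#' and 'J++'.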
This is because GlobbingLexicon never anticipated single-letter patterns. This is a bug. Try this (untested) quick patch:

Index: GlobbingLexicon.py
===================================================================
RCS file: /cvs-repository/Zope2/lib/python/SearchIndex/GlobbingLexicon.py,v
retrieving revision 1.9
diff -c -r1.9 GlobbingLexicon.py
*** GlobbingLexicon.py  2001/04/02 18:19:45     1.9
--- GlobbingLexicon.py  2001/07/26 05:21:48
***************
*** 221,226 ****
--- 221,229 ----
              if i == 0:
                  digrams.insert(i, (self.eow + pattern[i]) )
+                 if len(pattern) == 1:
+                     digrams.append( (pattern[i] + self.eow) )
+                     break
                  digrams.append((pattern[i] + pattern[i+1]))
              else:
                  try:
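To see why a one-letter pattern fails and what the patch changes, here is a simplified standalone sketch of the digram loop. It is not the actual GlobbingLexicon code: `EOW`, `WILDCARDS`, `make_digrams`, and the `patched` flag are hypothetical names, and the part of the loop outside the patched hunk is an assumption about how the surrounding code behaves.

```python
EOW = "$"               # assumed end-of-word marker, standing in for self.eow
WILDCARDS = ("*", "?")  # assumed wildcard characters

def make_digrams(pattern, patched=True):
    """Build the list of two-character digrams for a search pattern."""
    digrams = []
    for i in range(len(pattern)):
        if pattern[i] in WILDCARDS:
            continue
        if i == 0:
            digrams.insert(i, EOW + pattern[i])
            if patched and len(pattern) == 1:
                # The patch: a one-letter word ends right here.
                digrams.append(pattern[i] + EOW)
                break
            # Without the patch, pattern[i + 1] does not exist for a
            # one-letter pattern, so this line raises IndexError.
            digrams.append(pattern[i] + pattern[i + 1])
        else:
            try:
                if pattern[i + 1] not in WILDCARDS:
                    digrams.append(pattern[i] + pattern[i + 1])
            except IndexError:
                # Last character of the pattern: pair it with the marker.
                digrams.append(pattern[i] + EOW)
    return digrams
```

In this sketch, `make_digrams("C", patched=False)` raises IndexError, while the patched path returns a start digram and an end digram for the single letter. Longer patterns such as 'C#' take the same route either way.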
I am stuck on problem (2) and don't know how to avoid the errors arising in GlobbingLexicon.py without editing in some kind of hack to get around it.
That's exactly what this patch does.
I don't even know why GlobbingLexicon is getting involved in the search process since I am not trying to use wildcards and haven't elected to use a globbing vocabulary (AFAIK).
You must have somehow; GlobbingLexicon is never the default.

-Michel