[Zope-dev] Modifying Splitter.c to search on '+' & '#', and single
letter words
Harry Wilkinson
harryw@nipltd.com
Thu, 26 Jul 2001 10:38:55 +0100
This seems to work perfectly, thanks a lot :D
I am pretty sure I'm not using a globbing vocabulary: I tried deleting the
test ZCatalog I was using, creating a new one, and using the vocabulary it
gives me. Is ZCatalog meant to use GlobbingLexicon.py for all vocabularies?
Well thanks again :)
Harry
Michel Pelletier wrote:
> Harry Wilkinson wrote:
> >
> > I have two problems with getting ZCatalog to search for what I need:
> >
> > 1) Need to be able to search for words like 'J++' and 'C#'
> > - this is relatively simple to do by editing Splitter.c a little
> > and recompiling
> > 2) Need to be able to search for single-letter words like 'C'
> > - this is easy to modify Splitter.c to accommodate, but causes
> > errors in GlobbingLexicon.py, even though the vocabulary is standard
> >
> > So far I have solved problem (1) by changing the contents of Splitter.c,
> > but that's a bit messy. Currently I don't know of an alternative
> > though.
> >
> > I have modified Splitter.c so it indexes the extra characters, and
> > reduced the minimum word length to 1, which works fine when indexing,
> > and I can see all the symbol-inclusive words and single-letter words in
> > the vocabulary. Unfortunately, any search on a single-letter word gives
> > an IndexError, "String out of range".
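The indexing-side change described above can be illustrated in Python. This is only a sketch of the behaviour, not the Splitter.c code: the names `EXTRA_CHARS`, `MIN_WORD_LENGTH` and `split_words` are hypothetical, and the lowercasing and ASCII-only word characters are assumptions.

```python
import re

# Hypothetical settings mirroring the two changes described in the post:
# treat '+' and '#' as word characters, and allow one-letter words.
EXTRA_CHARS = "+#"
MIN_WORD_LENGTH = 1

# A word is a run of ASCII letters, digits, or one of the extra characters.
_WORD = re.compile(r"[A-Za-z0-9" + re.escape(EXTRA_CHARS) + r"]+")


def split_words(text, min_length=MIN_WORD_LENGTH):
    """Return the lowercased words of `text` that meet the length limit.

    Lowercasing is an assumption of this sketch, not something taken
    from the original Splitter.c.
    """
    return [w.lower() for w in _WORD.findall(text) if len(w) >= min_length]


if __name__ == "__main__":
    print(split_words("I know C, C# and J++"))
    # A minimum length of 2 drops the single-letter words again.
    print(split_words("I know C, C# and J++", min_length=2))
```

One side effect worth noting: with this rule a free-standing '+' or '#' (as in "a + b") is also indexed as a word of its own.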
>
> This is because GlobbingLexicon never anticipated single-letter
> patterns. This is a bug. Try this (untested) quick patch:
>
> Index: GlobbingLexicon.py
> ===================================================================
> RCS file: /cvs-repository/Zope2/lib/python/SearchIndex/GlobbingLexicon.py,v
> retrieving revision 1.9
> diff -c -r1.9 GlobbingLexicon.py
> *** GlobbingLexicon.py 2001/04/02 18:19:45 1.9
> --- GlobbingLexicon.py 2001/07/26 05:21:48
> ***************
> *** 221,226 ****
> --- 221,229 ----
>
>               if i == 0:
>                   digrams.insert(i, (self.eow + pattern[i]) )
> +                 if len(pattern) == 1:
> +                     digrams.append( (pattern[i] + self.eow) )
> +                     break
>                   digrams.append((pattern[i] + pattern[i+1]))
>               else:
>                   try:
>
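The failure and the fix can be reproduced outside Zope with a simplified model of the loop the patch touches. This is a sketch under stated assumptions, not the real GlobbingLexicon code: wildcard handling is left out, `EOW` stands in for `self.eow` with an arbitrary marker character, and the body of the `else:` branch (which the diff does not show) is guessed.

```python
EOW = "$"  # assumed end-of-word marker; stands in for self.eow


def pattern_digrams(pattern, patched=True):
    """Build the digram list for a wildcard-free search pattern.

    With patched=False the function behaves like the code before the
    patch and fails on one-letter patterns.
    """
    digrams = []
    for i in range(len(pattern)):
        if i == 0:
            digrams.insert(i, EOW + pattern[i])
            if patched and len(pattern) == 1:
                # The patch: a one-letter pattern has no pattern[i+1],
                # so close the word with the end-of-word marker and stop.
                digrams.append(pattern[i] + EOW)
                break
            # Unpatched, this line raises IndexError for len(pattern) == 1.
            digrams.append(pattern[i] + pattern[i + 1])
        else:
            # Assumed body of the else branch: pair each letter with the
            # next one, or with the marker at the end of the word.
            try:
                digrams.append(pattern[i] + pattern[i + 1])
            except IndexError:
                digrams.append(pattern[i] + EOW)
    return digrams


if __name__ == "__main__":
    print(pattern_digrams("C"))
    print(pattern_digrams("abc"))
    try:
        pattern_digrams("C", patched=False)
    except IndexError as exc:
        print("unpatched:", exc)
```

In this model the `i == 0` branch reads `pattern[i+1]` unconditionally, which is why only one-letter patterns trigger the error; longer patterns reach the end of the word inside the `try` block.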
> > I am stuck on problem (2) and don't know how to avoid the errors arising
> > in GlobbingLexicon.py without editing in some kind of hack to get around
> > it.
>
> That's exactly what this patch does.
>
> > I don't even know why GlobbingLexicon is getting involved in the
> > search process since I am not trying to use wildcards and haven't
> > elected to use a globbing vocabulary (AFAIK).
>
> You must have somehow; GlobbingLexicon is never the default.
>
> -Michel