[Zope] Re: Zope Splitter

sean.upton@uniontrib.com sean.upton@uniontrib.com
Wed, 26 Sep 2001 09:23:59 -0700


In Splitter.c, the function check_synstop() contains a line (I can't
remember the line number) that is part of the code that marks things as
stop words.  One of the things it does is evaluate isalpha() on the
characters of the string.  Change that to isalnum() to allow numbers, then
recompile; that's it.  If you also need shorter words indexed, change the
line that has the comment /* Single-letter words are stop words! */ in
Splitter.c, and change GlobbingLexicon.py (in get(), line 224 in
Zope 2.3.2) from

digrams.append((pattern[i] + pattern[i+1]))

to:

    if (len(pattern) == 1):
        digrams.append((pattern[i] + self.eow))
        break
    digrams.append((pattern[i] + pattern[i+1]))
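
To show what the two changes are for, here is a rough standalone Python sketch. It is not the Zope code: is_candidate(), pattern_digrams() and the "$" end-of-word marker are names I made up for illustration (the marker stands in for self.eow), and wildcard handling is left out entirely. The first function mimics the effect of going from isalpha() to isalnum(); the second shows why a one-character pattern needs the guard, since there is no pattern[i+1] to pair with.

```python
EOW = "$"  # assumed end-of-word marker, standing in for self.eow


def is_candidate(word, allow_digits=True, min_length=1):
    """Hypothetical stand-in for the stop-word test in the splitter.

    allow_digits=False behaves like the isalpha() check,
    allow_digits=True like the isalnum() replacement.
    """
    if len(word) < min_length:
        return False
    return word.isalnum() if allow_digits else word.isalpha()


def pattern_digrams(pattern, eow=EOW):
    """Build the list of two-character digrams for a search pattern."""
    digrams = []
    for i in range(len(pattern)):
        if i == 0:
            digrams.append(eow + pattern[i])
            if len(pattern) == 1:
                # The guard from the message: a one-character pattern has
                # no pattern[i + 1], so pair it with the end-of-word marker.
                digrams.append(pattern[i] + eow)
                break
            digrams.append(pattern[i] + pattern[i + 1])
        elif i + 1 < len(pattern):
            digrams.append(pattern[i] + pattern[i + 1])
        else:
            digrams.append(pattern[i] + eow)
    return digrams


if __name__ == "__main__":
    print(is_candidate("123", allow_digits=False))
    print(is_candidate("123", allow_digits=True))
    print(pattern_digrams("a"))
    print(pattern_digrams("123"))
```

Without the guard, the first iteration would index pattern[1] on a one-character pattern and raise an IndexError.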

Sean

-----Original Message-----
From: Dieter Maurer [mailto:dieter@handshake.de]
Sent: Tuesday, September 25, 2001 1:48 PM
To: tovesj@oclc.org
Cc: zope@zope.org
Subject: [Zope] Re: Zope Splitter


We want to have all questions go to the list....
Reply redirected....

Toves,Jenny writes:
 > I have been tasked with getting our textindex to index words that are all
 > digits (such as '123'). I saw an archive with a message
 > (<http://zope.nipltd.com/public/lists/zope-archive.nsf/ByKey/AE534A02934E8612>)
 > from you that looks like you have fixed a similar problem. Is modifying
 > the splitter code the only way to fix it? Did you get it compiled and
 > re-installed? Was the HowTo mentioned in the response to your message
 > helpful?
I updated the splitter in order to get UTF-8 strings indexed.
Someone else extended the splitter to recognize numbers
and words with fewer than 3 characters.

I had no problems at all when I modified and recompiled
the splitter (under Unix!). I just edited the code and
ran "make" in the corresponding directory. That was it...


Dieter
