[Zope-CVS] CVS: Products/ZCTextIndex - ZCTextIndex.py:1.1.2.15
Fred L. Drake, Jr.
fdrake@acm.org
Tue, 7 May 2002 17:38:00 -0400
Update of /cvs-repository/Products/ZCTextIndex
In directory cvs.zope.org:/tmp/cvs-serv11800
Modified Files:
Tag: TextIndexDS9-branch
ZCTextIndex.py
Log Message:
Splitter: Pre-compile the regex so SRE doesn't need to look it up all
the time. Not a biggie, but it seems to be a good idea.
StopWordRemover: Instead of testing the length of the word and
inclusion in the stop words dict as two separate tests, add all
1-character 8-bit strings to the dict and just use one test for
each word.
=== Products/ZCTextIndex/ZCTextIndex.py 1.1.2.14 => 1.1.2.15 ===
import re
+import string
import ZODB
from Persistence import Persistent
@@ -67,10 +68,12 @@
 class Splitter:
+    rx = re.compile(r"\w+")
+
     def process(self, lst):
         result = []
         for s in lst:
-            result += re.findall(r"\w+", s)
+            result += self.rx.findall(s)
         return result
class CaseNormalizer:
@@ -80,8 +83,10 @@
 class StopWordRemover:
-    dict = get_stopdict()
+    dict = get_stopdict().copy()
+    for c in range(255):
+        dict[chr(c)] = None
     def process(self, lst):
-        d = self.dict
-        return [w for w in lst if len(w) > 1 and not d.has_key(w)]
+        has_key = self.dict.has_key
+        return [w for w in lst if not has_key(w)]
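
For readers who want to try the two changes outside Zope, here is a minimal standalone sketch in current Python. It is not the committed code. get_stopdict() below is a hypothetical stand-in with a made-up three-word list, since the real stop word list is not part of this diff, and the class attribute is named "stop" rather than "dict". The sketch also assumes the intent stated in the log message (all 1-character 8-bit strings) and so uses range(256). The diff as committed uses range(255), which stops at chr(254). Note that in Python 3 chr() yields Unicode code points 0-255, not 8-bit byte strings.

```python
import re


def get_stopdict():
    # Hypothetical stand-in: the real stop word list is not shown in the diff.
    return {"the": None, "and": None, "of": None}


class Splitter:
    # Compiled once when the class body runs and shared by every instance,
    # so process() does not go through the re module's pattern lookup per call.
    rx = re.compile(r"\w+")

    def process(self, lst):
        result = []
        for s in lst:
            result += self.rx.findall(s)
        return result


class StopWordRemover:
    # Copy first so the dict returned by get_stopdict() is not mutated.
    stop = get_stopdict().copy()
    # Assumption: cover all 256 single-character values (the diff uses 255).
    for c in range(256):
        stop[chr(c)] = None

    def process(self, lst):
        # One membership test replaces the old "len(w) > 1 and not in dict".
        stop = self.stop
        return [w for w in lst if w not in stop]


words = Splitter().process(["a quick fox", "the end of it"])
kept = StopWordRemover().process(words)
print(kept)  # ['quick', 'fox', 'end', 'it']
```

The single-test version trades a slightly larger dict for one fewer check per word. It only matches the old behavior for one-character words that fall inside the range added to the dict.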