[Zope-CVS] CVS: Products/ZCTextIndex - Index.py:1.1.2.22

Jeremy Hylton jeremy@zope.com
Fri, 3 May 2002 14:51:44 -0400


Update of /cvs-repository/Products/ZCTextIndex
In directory cvs.zope.org:/tmp/cvs-serv13522

Modified Files:
      Tag: TextIndexDS9-branch
	Index.py 
Log Message:
Use a dict to store small maps inside _wordinfo.


=== Products/ZCTextIndex/Index.py 1.1.2.21 => 1.1.2.22 ===
         return d.keys(), freqs, scaled_int(math.sqrt(Wsquares))
 
+    DICT_CUTOFF = 10
+
     def _add_wordinfo(self, wid, f, docid):
+        # Store a wordinfo in a dict as long as there are fewer than
+        # DICT_CUTOFF docids in the dict.  Otherwise use an IIBTree.
+
+        # The pickle of a dict is smaller than the pickle of an
+        # IIBTree, substantially so for small mappings.  Thus, we use
+        # a dictionary until the mapping reaches DICT_CUTOFF elements.
+
+        # The cutoff is chosen based on the implementation
+        # characteristics of Python dictionaries.  The dict hashtable
+        # always has 2**N slots and is resized whenever it is 2/3
+        # full.  A pickled dict with 10 elts is half the size of an
+        # IIBTree with 10 elts, and 10 is just under 2/3 of 2**4
+        # (16 * 2/3 is about 10.67).  So choose 10 as the cutoff for
+        # now.
         try:
             map = self._wordinfo[wid]
         except KeyError:
-            map = IIBTree()
+            map = {}
+        if len(map) == self.DICT_CUTOFF:
+            map = IIBTree(map)
         map[docid] = f
         self._wordinfo[wid] = map
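
For readers who want to try the promotion rule outside of Zope, the following is a minimal standalone sketch of the logic in the patch above. It is not the committed code: `SortedMap` is a hypothetical stand-in for IIBTree so that the example runs without the BTrees package, `add_wordinfo` is a free function rather than a method, and the `type(m) is dict` guard is an addition that the patch does not have (it only avoids re-copying a map that has already been promoted).

```python
DICT_CUTOFF = 10  # same value as in the patch


class SortedMap(dict):
    """Hypothetical stand-in for IIBTree, used only so this sketch runs
    without BTrees installed.  It is a plain dict subclass and has none
    of the storage behaviour of a real IIBTree."""


def add_wordinfo(wordinfo, wid, f, docid):
    """Record frequency f for docid under word id wid.

    The per-word map starts as a plain dict.  When an insert arrives and
    the dict already holds DICT_CUTOFF entries, the map is first copied
    into the tree-like type, mirroring `map = IIBTree(map)` in the patch.
    """
    try:
        m = wordinfo[wid]
    except KeyError:
        m = {}
    # Assumption beyond the patch: only promote plain dicts, so an
    # already-promoted map is never copied a second time.
    if type(m) is dict and len(m) == DICT_CUTOFF:
        m = SortedMap(m)
    m[docid] = f
    wordinfo[wid] = m
```

With this sketch, a word seen in ten documents is still held in a plain dict; the insert for the eleventh document is the one that triggers the copy into the tree-like type.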