[Zope-Checkins] CVS: Zope/lib/python/BTrees - BTreeModuleTemplate.c:1.27 BTreeTemplate.c:1.33 Maintainer.txt:1.7

Tim Peters tim.one@comcast.net
Fri, 31 May 2002 16:01:16 -0400


Update of /cvs-repository/Zope/lib/python/BTrees
In directory cvs.zope.org:/tmp/cvs-serv25366

Modified Files:
	BTreeModuleTemplate.c BTreeTemplate.c Maintainer.txt 
Log Message:
Introduced a new BTREE_SEARCH macro for searching interior BTree nodes,
to squash delicate code duplication and for speed.

+ This is optimized in several subtle ways over the current method.  This
  is documented in Maintainer.txt, along with a correctness proof.

+ I'll replace all the BTree searches with this soon.  For now I just
  changed _BTree_get().  I *suspect* this also fixes a subtle bug
  introduced by the recent "don't ignore comparison error" patch:  if a
  comparison did trigger an exception, the function just returned without
  doing the PER_ALLOW_DEACTIVATION + PER_ACCESSED dance.  _BTree_get()
  does that dance again now.


=== Zope/lib/python/BTrees/BTreeModuleTemplate.c 1.26 => 1.27 ===
 #define BTREE(O) ((BTree*)(O))
 
+/* Use BTREE_SEARCH to find which child pointer to follow.
+ * RESULT   An int lvalue to hold the index i such that SELF->data[i].child
+ *          is the correct node to search next.
+ * SELF     A pointer to a BTree node.
+ * KEY      The key you're looking for, of type KEY_TYPE.
+ * ONERROR  What to do if key comparison raises an exception; for example,
+ *          perhaps 'return NULL'.
+ *
+ * See Maintainer.txt for discussion:  this is optimized in subtle ways.
+ * It's recommended that you call this at the start of a routine, waiting
+ * to check for self->len == 0 after.
+ */
+#define BTREE_SEARCH(RESULT, SELF, KEY, ONERROR) {          \
+    int _lo = 0;                                            \
+    int _hi = (SELF)->len;                                  \
+    int _i, _cmp;                                           \
+    for (_i = _hi >> 1; _i > _lo; _i = (_lo + _hi) >> 1) {  \
+        TEST_KEY_SET_OR(_cmp, (SELF)->data[_i].key, (KEY))  \
+            ONERROR;                                        \
+        if      (_cmp < 0) _lo = _i;                        \
+        else if (_cmp > 0) _hi = _i;                        \
+        else   /* equal */ break;                           \
+    }                                                       \
+    (RESULT) = _i;                                          \
+}
+
 typedef struct SetIteration_s
 {
   PyObject *set;


=== Zope/lib/python/BTrees/BTreeTemplate.c 1.32 => 1.33 ===
 _BTree_get(BTree *self, PyObject *keyarg, int has_key)
 {
-  int min, max, i, cmp, copied=1;
-  PyObject *r;
+  int min, copied=1;
+  PyObject *r = NULL;
   KEY_TYPE key;
   
   COPY_KEY_FROM_ARG(key, keyarg, copied);
@@ -30,38 +30,25 @@
 
   PER_USE_OR_RETURN(self, NULL);
 
+  BTREE_SEARCH(min, self, key, goto Error);
   if (self->len)
     {
-      for (min=0, max=self->len, i=max/2; max-min > 1; i=(min+max)/2)
-        {
-          TEST_KEY_SET_OR(cmp, self->data[i].key, key) return NULL;
-          if (cmp < 0) min=i;
-          else if (cmp == 0)
-            {
-              min=i;
-              break;
-            }
-          else max=i;
-        }
-      
       if (SameType_Check(self, self->data[min].child)) 
-        r=_BTree_get( BTREE(self->data[min].child), keyarg, 
-                      has_key ? has_key + 1: 0);
+        r = _BTree_get( BTREE(self->data[min].child), keyarg, 
+                        has_key ? has_key + 1: 0);
       else
-        r=_bucket_get(BUCKET(self->data[min].child), keyarg, 
-                      has_key ? has_key + 1: 0);
+        r = _bucket_get(BUCKET(self->data[min].child), keyarg, 
+                        has_key ? has_key + 1: 0);
     }
   else
     {  /* No data */
       UNLESS (has_key) 
-        {
-          PyErr_SetObject(PyExc_KeyError, keyarg);
-          r=NULL;
-        }
+        PyErr_SetObject(PyExc_KeyError, keyarg);
       else
-        r=PyInt_FromLong(0);
+        r = PyInt_FromLong(0);
     }
 
+Error:
   PER_ALLOW_DEACTIVATION(self);
   PER_ACCESSED(self);
   return r;


=== Zope/lib/python/BTrees/Maintainer.txt 1.6 => 1.7 ===
   an IIBTree has larger values than an OOBTree:  pickles store ints
   more efficiently than they can store arbitrary Python objects.
+
+
+The BTREE_SEARCH Macro
+======================
+For notational ease, consider a fixed BTree node x, and let
+
+    K(i) mean x->data.key[i]
+    C(i) mean all the keys reachable from x->data.child[i]
+
+For each i in 0 to x->len-1 inclusive,
+
+    K(i) <= C(i) < K(i+1)
+
+is a BTree node invariant, where we pretend that K(0) holds a key
+smaller than any possible key, and K(x->len) holds a key larger
+than any possible key.  (Note that K(x->len) doesn't actually exist,
+and K(0) is never used although space for it exists in non-empty
+BTree nodes.)
+
+When searching for a key k, then, the child pointer we want to follow
+is the one at index i such that K(i) <= k < K(i+1).  There can be
+only one such i, since the keys are strictly increasing.  And there is
+at *least* one such i provided the tree isn't empty.  For the moment,
+assume the tree isn't empty (we'll get back to that later).
+
+The macro's chief loop invariant is
+
+    K(lo) < k < K(hi)
+
+This holds trivially at the start, since lo is set to 0 ahd hi to
+x->len, and we pretend K(0) is minus infinity and K(len) is plus
+infinity.  Inside the loop, if K(i) < k we set lo to i, and if
+K(i) > k we set hi to i.  These obviously preserve the invariant.
+If K(i) == k, the loop breaks and sets the result to i, and since
+K(i) == k in that case i is obviously the correct result.
+
+What if the key isn't present?  lo and hi move toward each other,
+narrowing the range, until eventually lo+1 == hi.  At that point,
+i = (lo+hi)/2 = (lo+lo+1)/2 = lo + 1/2 = lo, so that:
+
+1. The loop's "i > lo" test is false, so the loop ends then.
+
+and
+
+2. The invariant still holds, so K(i) < k < K(i+1), and i is again
+   the correct answer.
+
+Can we get out of the loop too early?  No:  if hi = lo + d for some d
+greater than 1, then i = (lo+lo+d)/2 = lo + d/2, and d/2 is at least 1
+since d is at least 2:  i is strictly greater than lo then, and the
+loop continues.
+
+Can lo==hi?  Yes, but only if the node is empty.  Then i, lo and hi
+all start out as 0, and the loop exits immediately.  If the loop
+isn't empty, then lo and hi start out with different values.  Whenever
+lo and hi have different values, lo <= (lo + hi)/2 < hi, so i and lo
+are strictly smaller than hi, so setting either lo or hi to i leaves
+the new lo strictly smaller than the new hi.
+
+Can the loop fail to terminate?  No:  by the above, when lo < hi-1,
+lo < i=(lo+hi)/2 < hi, so setting either lo or hi to i leaves the
+new lo and hi strictly closer to each other than were the old lo and
+hi.
+
+Optimization points:
+
++ Division by 2 is done via shift rather via "/2".  These are
+  signed ints, and almost all C compilers treat signed int division
+  as truncating, and shifting is not the same as truncation for
+  signed int division.  The compiler has no way to know these values
+  aren't negative, so has to generate longer-winded code for "/2".
+  But we know these values aren't negative, and exploit it.
+
++ The order of _cmp comparisons matters.  We're in an interior
+  BTree node, and are looking at only a tiny fraction of all the
+  keys that exist.  So finding the key exactly in this node is
+  unlikely, and checking _cmp == 0 is a waste of time to the same
+  extent.  It doesn't matter whether we check for _cmp < 0 or
+  _cmp > 0 first, so long as we do both before worrying about
+  equality.
+
++ At the start of a routine, it's better to run this macro even
+  if x->len is 0 (check for that afterwards).  We just called a
+  function and so probably drained the pipeline.  If the first thing
+  we do then is read up self->len and check it against 0, we just
+  sit there waiting for the data to get read up, and then another
+  immediate test-and-branch, and for a very unlikely case (BTree
+  nodes are rarely empty).  It's better to get into the loop right
+  away so the normal case makes progress ASAP.