[Zope-Checkins] CVS: Zope/lib/python/BTrees - BucketTemplate.c:1.36 Maintainer.txt:1.9
Tim Peters
tim.one@comcast.net
Sat, 8 Jun 2002 15:46:19 -0400
Update of /cvs-repository/Zope/lib/python/BTrees
In directory cvs.zope.org:/tmp/cvs-serv7237
Modified Files:
BucketTemplate.c Maintainer.txt
Log Message:
For speed, and to squash delicate code duplication, introduced a
micro-optimized BUCKET_SEARCH macro. Change _bucket_get() to use
it (more later).
=== Zope/lib/python/BTrees/BucketTemplate.c 1.35 => 1.36 ===
#define BUCKETTEMPLATE_C "$Id$\n"
+/* Use BUCKET_SEARCH to find the index at which a key belongs.
+ * INDEX An int lvalue to hold the index i such that KEY belongs at
+ * SELF->keys[i]. Note that this will equal SELF->len if KEY
+ * is larger than the bucket's largest key. Else it's the
+ * smallest i such that SELF->keys[i] >= KEY.
+ * ABSENT An int lvalue to hold a Boolean result, true (!= 0) if the
+ * key is absent, false (== 0) if the key is at INDEX.
+ * SELF A pointer to a Bucket node.
+ * KEY The key you're looking for, of type KEY_TYPE.
+ * ONERROR What to do if key comparison raises an exception; for example,
+ * perhaps 'return NULL'.
+ *
+ * See Maintainer.txt for discussion: this is optimized in subtle ways.
+ * It's recommended that you call this at the start of a routine, waiting
+ * to check for self->len == 0 after (if an empty bucket is special in
+ * context; INDEX becomes 0 and ABSENT becomes true if this macro is run
+ * with an empty SELF, and that may be all the invoker needs to know).
+ */
+#define BUCKET_SEARCH(INDEX, ABSENT, SELF, KEY, ONERROR) { \
+ int _lo = 0; \
+ int _hi = (SELF)->len; \
+ int _i; \
+ int _cmp = 1; \
+ for (_i = _hi >> 1; _lo < _hi; _i = (_lo + _hi) >> 1) { \
+ TEST_KEY_SET_OR(_cmp, (SELF)->keys[_i], (KEY)) \
+ ONERROR; \
+ if (_cmp < 0) _lo = _i + 1; \
+ else if (_cmp == 0) break; \
+ else _hi = _i; \
+ } \
+ (INDEX) = _i; \
+ (ABSENT) = _cmp; \
+}
+
/*
** _bucket_get
**
@@ -41,45 +75,31 @@
static PyObject *
_bucket_get(Bucket *self, PyObject *keyarg, int has_key)
{
- int min, max, i, l, cmp, copied=1;
- PyObject *r;
- KEY_TYPE key;
-
- COPY_KEY_FROM_ARG(key, keyarg, copied);
- UNLESS (copied) return NULL;
-
- PER_USE_OR_RETURN(self, NULL);
-
- for (min=0, max=self->len, i=max/2, l=max; i != l; l=i, i=(min+max)/2)
- {
- TEST_KEY_SET_OR(cmp, self->keys[i], key) goto err;
- if (PyErr_Occurred()) goto err;
-
- if (cmp < 0) min=i;
- else if (cmp == 0)
- {
- if (has_key) r=PyInt_FromLong(has_key);
- else
- {
- COPY_VALUE_TO_OBJECT(r, self->values[i]);
- }
- PER_ALLOW_DEACTIVATION(self);
- PER_ACCESSED(self);
- return r;
- }
- else max=i;
+ int i, cmp;
+ KEY_TYPE key;
+ PyObject *r = NULL;
+ int copied = 1;
+
+ COPY_KEY_FROM_ARG(key, keyarg, copied);
+ UNLESS (copied) return NULL;
+
+ PER_USE_OR_RETURN(self, NULL);
+
+ BUCKET_SEARCH(i, cmp, self, key, goto Done);
+ if (has_key)
+ r = PyInt_FromLong(cmp ? 0 : has_key);
+ else {
+ if (cmp == 0) {
+ COPY_VALUE_TO_OBJECT(r, self->values[i]);
+ }
+ else
+ PyErr_SetObject(PyExc_KeyError, keyarg);
}
+Done:
PER_ALLOW_DEACTIVATION(self);
PER_ACCESSED(self);
- if (has_key) return PyInt_FromLong(0);
- PyErr_SetObject(PyExc_KeyError, keyarg);
- return NULL;
-
-err:
- PER_ALLOW_DEACTIVATION(self);
- PER_ACCESSED(self);
- return NULL;
+ return r;
}
static PyObject *
=== Zope/lib/python/BTrees/Maintainer.txt 1.8 => 1.9 ===
nodes are rarely empty). It's better to get into the loop right
away so the normal case makes progress ASAP.
+
+
+The BUCKET_SEARCH Macro
+=======================
+This has a different job than BTREE_SEARCH: the key 0 slot is
+legitimate in a bucket, and we want to find the index at which the
+key belongs. If the key is larger than the bucket's largest key, a
+new slot at index len is where it belongs, else it belongs at the
+smallest i with keys[i] >= the key we're looking for. We also need
+to know whether or not the key is present (BTREE_SEARCH didn't care;
+it only wanted to find the next node to search).
+
+The mechanics of the search are quite similar, though. The primary
+loop invariant changes to (say we're searching for key k):
+
+ K(lo-1) < k < K(hi)
+
+where K(i) means keys[i], and we pretend K(-1) is minus infinity and
+K(len) is plus infinity.
+
+If the bucket is empty, lo=hi=i=0 at the start, the loop body is never
+entered, and the macro sets INDEX to 0 and ABSENT to true. That's why
+_cmp is initialized to 1 (_cmp becomes ABSENT).
+
+Else the bucket is not empty, lo<hi at the start, and the loop body
+is entered. The invariant is obviously satisfied then, as lo=0 and
+hi=len.
+
+If K[i]<k, lo is set to i+1, preserving that K(lo-1) = K[i] < k.
+If K[i]>k, hi is set to i, preserving that K[hi] = K[i] > k.
+If the loop exits after either of those, _cmp != 0, so ABSENT becomes
+true.
+If K[i]=k, the loop breaks, so that INDEX becomes i, and ABSENT
+becomes false (_cmp=0 in this case).
+
+The same case analysis for BTREE_SEARCH on lo and hi holds here:
+
+a. (lo == i == hi) if and only if (lo == hi).
+b. (lo == i < hi) if and only if (lo+1 == hi).
+c. (lo < i < hi) if and only if (lo+1 < hi).
+
+So long as lo+1 < hi, we're in case #c, and either break with
+equality (in which case the right results are obviously computed) or
+narrow the range. If equality doesn't obtain, the range eventually
+narrows to cases #a or #b.
+
+To go from #c to #a, we must have lo+2==hi at the start, and
+K[i]=K[lo+1]<k. Then the new lo gets set to i+1 = lo+2 = hi, and the
+loop exits with lo=hi=i and _cmp<0. This is correct, because we
+know that k != K(i) (loop invariant! we actually know something
+stronger, that k < K(hi); since i=hi, this implies k != K(i)).
+
+Else #c eventually falls into case #b, lo+1==hi and i==lo. The
+invariant tells us K(lo-1) < k < K(hi) = K(lo+1), so if the key
+is present it must be at K(lo). i==lo in this case, so we test
+K(lo) against k. As always, if equality obtains we do the right
+thing, else case #b becomes case #a.
+
+When #b becomes #a, the last comparison was non-equal, so _cmp is
+non-zero, and the loop exits because lo==hi==i in case #a. The
+invariant then tells us K(lo-1) < k < K(lo), so the key is in fact
+not present, it's correct to exit with _cmp non-zero, and i==lo is
+again the index at which k belongs.
+
+Optimization points:
+
++ As for BTREE_SEARCH, shifting of signed ints is cheaper than
+ division.
+
++ Unlike as for BTREE_SEARCH, there's nothing special about searching
+ an empty bucket, and the macro computes thoroughly sensible results
+ in that case.
+
++ The order of _cmp comparisons differs from BTREE_SEARCH. When
+ searching a bucket, it's much more likely (than when searching a
+ BTree node) that the key is present, so testing __cmp==0 isn't a
+ systematic waste of cycles. At the extreme, if all searches are
+ successful (key present), on average this saves one comparison per
+ search, against leaving the determination of _cmp==0 implicit (as
+ BTREE_SEARCH does). But even on successful searches, __cmp != 0 is
+ a more popular outcome than __cmp == 0 across iterations (unless
+ the bucket has only a few keys), so it's important to check one
+ of the inequality cases first. It turns out it's better on average
+ to check K(i) < key (than to check K(i) > key), because when it
+ pays it narrows the range more (we get a little boost from setting
+ lo=i+1 in this case; the other case sets hi=i, which isn't as much
+ of a narrowing).