[ZODB-Dev] Updated BTree docs
Casey Duncan
casey at zope.com
Fri May 2 12:20:41 EDT 2003
Sorry, I forgot to mention that it takes a single argument, which is a minimum
value. It filters out any values less than this. I don't know what the use
case is for this feature; ZCatalog does not use it. If I were to guess, I
would say it is some sort of score threshold for eliminating results
below a certain score.
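
As a rough illustration of the filtering-plus-sorting part only, here is a
plain-Python model. It is a sketch, not the BTrees implementation: the name
by_value_model is made up for this example, the highest-first ordering is an
assumption, and it deliberately leaves out the type-dependent value
arithmetic Tim mentions below.

```python
def by_value_model(mapping, minimum):
    """Simplified model of the behaviour described above (assumption, not BTrees code).

    Drops entries whose value is below `minimum`, then returns
    (value, key) pairs ordered by value, highest first.
    """
    pairs = [(value, key) for key, value in mapping.items() if value >= minimum]
    pairs.sort(reverse=True)  # assumed ordering: best score first
    return pairs


# Hypothetical rid -> score mapping, as a text index might produce.
scores = {101: 7, 102: 0, 103: 12, 104: 3}
print(by_value_model(scores, 3))
```

With a minimum of 3, the entry for rid 102 is dropped and the rest come back
as (score, rid) pairs.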
On Friday 02 May 2003 10:54 am, Tim Peters wrote:
> [Casey Duncan]
> > I think I can elaborate on the following passage in the docs:
> >
> > "... and byValue(), which should probably be ignored (it's hard to
> > explain exactly what it does, and as a result it's almost never
> > used - best to consider it deprecated). "
> >
> > byValue() returns (value, key) pairs in sorted order by value.
>
> If that were true, I wouldn't have a problem explaining what it does <wink>.
> Examples to ponder:
[snip]
> So this is really some combination of sorting, filtering, and type-dependent
> value arithmetic, all rolled into one. byValue() isn't called in the Zope3
> codebase so far, and I see one use in the Zope2 codebase (in Catalog.py).
> I've never seen it called with an argument other than 0 in real life (in
> which specific case, and if no value is less than 0, it's easy to explain
> what it does).
I didn't say it made sense ;^)
> > ZCatalog uses this to sort "scored" results, such as from text indexes,
> > which start as a mapping of rid->score.
>
> I'd like a method that did only that much a lot better. Note that in
> ZCTextIndex we didn't sort the whole thing, instead we used an N-best
> priority queue to remember just the best N scoring items. Even running at
> Python speed, and using a dirt-dumb list for the queue, this was usually
> much faster than sorting the whole result sequence (at C speed) first (&
> that's generally true if N is much less than the # of items in the whole
> result sequence).
We could have a keysForBestValues(N) method that did this, I dunno. I actually
put the N-best sorting algorithm in ZCatalog for 2.6.1, ironically though it
is never used for TextIndexes... I had planned to change that in 2.7.
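
To make the N-best idea concrete, here is a minimal sketch in modern Python.
It is not the ZCTextIndex or ZCatalog code: keys_for_best_values is a
hypothetical stand-in for the keysForBestValues(N) idea, and it uses the
standard library's heapq.nlargest rather than the plain-list queue Tim
describes.

```python
import heapq


def keys_for_best_values(mapping, n):
    """Return the keys of the n highest-valued entries, best first.

    Hypothetical helper: it keeps only n candidates at a time
    instead of sorting every (key, value) pair in the mapping.
    """
    best = heapq.nlargest(n, mapping.items(), key=lambda item: item[1])
    return [key for key, _score in best]


# Hypothetical rid -> score mapping.
scores = {1: 0.2, 2: 0.9, 3: 0.5, 4: 0.7}
print(keys_for_best_values(scores, 2))
```

The result matches what you would get by sorting the whole mapping by score
and slicing off the first N keys; the difference is only in how much work is
done when N is much smaller than the number of results.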
> > I have actually been camping on some optimized code for this. The
> > current implementation is pretty lame. I came up with a new API,
> > keysByValue(), which returns the keys in order by value, which is
> > really all ZCatalog needs. The implementation I have for *IBTree
> > variants is 10x faster than the existing byValue implementation.
> >
> > I should probably get with you and/or Jim and discuss
> > generalizing this and integrating it into the BTrees module.
> > Basically I need to make it work for *OBTree variants and tangle
> > it up with the macros in there.
>
> Yup, more macros is exactly what BTrees need <wink>. If you lose your one
> use for the existing byValue() method then, I'd like to deprecate it for
> real, as I don't know of any other uses, it's not even tested, and is hard
> to explain.
Yup, I agree byValue should be deprecated in this case. It's a pretty weird
method.
-Casey