[Zope3-dev] N-Tuple Adapters
Phillip J. Eby
pje@telecommunity.com
Mon, 02 Jun 2003 19:25:38 -0400
At 05:42 PM 6/2/03 -0400, Shane Hathaway wrote:
>I've been pondering the idea of "n-tuple adapters". The idea is that some
>adapters require multiple independent inputs. Steve came up with the good
>name.
May I suggest "multiple dispatch", since that's a more well-known term in
the literature? Also, searching Google for "Python multiple dispatch" will
find you implementations you can play with, while searching for "n-tuple
adapters" won't. :) ("multimethod" is another good search term.)
>With n-tuple adapter registries, we might begin to solve a whole new class
>of problems. There could be numerous applications not only for Zope, but
>computing in general.
This is actually a very old, and, generally speaking, very solved
problem. See Lisp's CLOS, the Dylan language, and there's even a language
(Cecil I believe) that generalizes to "predicate dispatch" (selection of
code based on arbitrary query expressions, not just type).
>There's just one snag: how would we search for n-tuple adapters in a
>scaleable way, taking interface hierarchy into account? A naive
>implementation would perform a multi-dimensional search, where the number
>of dimensions corresponds with the number of items in the input tuple. To
>find an adapter that accepts (IFooFolder, IHTTPRequest), we have to search
>for a registered adapter that requires a set of interfaces, in the
>following order:
>
>1. (IFooFolder, IHTTPRequest)
>2. (IFolder, IHTTPRequest)
>3. (IFooFolder, IRequest)
>4. (IFolder, IRequest)
>
>The search is ordered from most specific to most general because we have
>to find the most specific adapter available. To complicate matters, we
>might prefer to reverse the order of #2 and #3, which is just as
>sensible. And this example uses a shallow interface hierarchy. A deeper
>hierarchy would require exponentially larger searches.
All you need are efficient sets and graphs - which I believe are included
with ZODB. :)
Here's a general approach to do n-way predicate dispatching. First, you
need "predicates" and "consequents". For the purposes we discuss here, we
can consider the "consequents" to be any Python object, forming the
"values" of our mapping, where the predicates form the "keys".
A predicate must be able to do two things:
1. Tell if it applies to a candidate input
2. Tell if it contains (i.e. logically implies) another predicate
(Note that we can create an N-way predicate class that does these things by
delegating to N sub-predicate instances.)
The basic algorithm is to ask all predicates if they apply to the candidate
input, and then topologically sort the applicable predicates by their
implication relationships. If the first two items in the sorted list are
equivalent, overlapping, or disjoint (i.e. they both imply each other, or
neither implies the other), then you have an ambiguous resolution order and
that's an error (at least in Dylan and Cecil; CLOS for example applies
additional rules to disambiguate).
As stated so far, this algorithm is order M (where M is the total number
of predicates) assuming that only a constant number of predicates match
each query. However, it can be improved upon if the "dispatcher" that
contains the predicates has a "feature graph" that maps from features of
the input data, to the predicates that require that feature. (This concept
is my invention, but like so many things I've invented, chances are good
somebody else came up with it first and I just don't know about it!)
For your example with interfaces, each "feature" would be something like
"the object in position 1 implements IFooFolder". We might represent that
as a tuple '(1,IFooFolder)'. Predicates 1 and 3 in your list would be
listed under that feature. The dispatcher would also need to be equipped
with a way to extract the features from an input query. In your interface
example, this would mean generating '(i,iface)' pairs for all positions 'i'
and all implemented interfaces 'iface'. For each position 'i', we union
the set of predicates that are registered for the implemented
interfaces. Then, we intersect the sets of predicates for each position,
resulting in a set of candidate predicates which can then be sorted.
Give the algorithm a try on paper, and you'll see that as long as you
include *all* the features of your input, you will end up with exactly the
right predicates.
Of course, this algorithm will take *longer* than a simple linear search
for applicability, if the number of predicates is small, due to the
complexity involved.
>I've been thinking about ways to deal with this, but I'd like some input
>on the general idea first. Do you think n-tuple adapters would be
>generally useful? Is there a way to think about them that would reduce or
>eliminate the deep search? Is there any research on this topic or
>something similar?
Yes, yes, and yes. :) Have a look at the book "San Francisco Design
Patterns", as well. It deals with n-way predicate dispatching (using SQL
in relational databases!) as a tool for implementing business rules and
many other interesting things.
Unfortunately for the general usefulness of multiple dispatching, efficient
implementations require that you define your requirements precisely. Is
feature extraction slower than feature comparison? Is it better to sort
for implication first and build a search tree, or are matches rare and the
set large? (In which case prefiltering is better.)
So far, I haven't actually *needed* an N-way dispatch that a linear search
wasn't sufficient for. Mainly this is because, if you think about it,
who's going to put in all those zillions of N-way rules? Chances are, your
brain can't handle it anyway. ;)
As you've already noticed, 1-way dispatch is *much* more common. 2-way is
often useful, typically for binary operations such as addition or
multiplication. By the time you hit 3-way, it's pretty uncommon, except
for the kind of business rule things covered in the SFDP book.
The thing is, what Zope does now may not even need 2-way dispatch. When
one or the other dimension dominates, it's not really 2-way dispatch, but
two 1-way dispatches. Let's take traversal for example. Since as you say,
you might use the same traversal for a given object across multiple
protocols, we can say that the "object" dimension dominates the "request"
dimension. This can then become a hierarchical search: find all adapters
that match the object, then find the "best" match for the request, from
among those adapters. Alternatively, adapt from the object to a generic
traversal interface, and then just give that handler the request. It's
always in a position to look up another adapter, if needed. (Perhaps by
adapting back from the protocol-specific request, to a generic request type
that fills its needs.)
So, again, defining the requirements precisely is important. For example,
if different protocols really need different traversers, why not just adapt
to different traversal interfaces?