[Zope3-dev] N-Tuple Adapters

Mon, 02 Jun 2003 19:25:38 -0400

At 05:42 PM 6/2/03 -0400, Shane Hathaway wrote:
>I've been pondering the idea of "n-tuple adapters".  The idea is that some 
>adapters require multiple independent inputs.  Steve came up with the good 
>name.

May I suggest "multiple dispatch", since that's a more well-known term in 
the literature?  Also, searching Google for "Python multiple dispatch" will 
find you implementations you can play with, while searching for "n-tuple 
adapters" won't.  :)  ("multimethod" is another good search term.)

>With n-tuple adapter registries, we might begin to solve a whole new class 
>of problems.  There could be numerous applications not only for Zope, but 
>computing in general.

This is actually a very old, and, generally speaking, very solved 
problem.  See Lisp's CLOS, the Dylan language, and there's even a language 
(Cecil I believe) that generalizes to "predicate dispatch" (selection of 
code based on arbitrary query expressions, not just type).

>There's just one snag: how would we search for n-tuple adapters in a 
>scaleable way, taking interface hierarchy into account?  A naive 
>implementation would perform a multi-dimensional search, where the number 
>of dimensions corresponds with the number of items in the input tuple.  To 
>find an adapter that accepts (IFooFolder, IHTTPRequest), we have to search 
>for a registered adapter that requires a set of interfaces, in the 
>following order:
>
>1. (IFooFolder, IHTTPRequest)
>2. (IFolder, IHTTPRequest)
>3. (IFooFolder, IRequest)
>4. (IFolder, IRequest)
>
>The search is ordered from most specific to most general because we have 
>to find the most specific adapter available.  To complicate matters, we 
>might prefer to reverse the order of #2 and #3, which is just as 
>sensible.  And this example uses a shallow interface hierarchy.  A deeper 
>hierarchy would require exponentially larger searches.

All you need are efficient sets and graphs - which I believe are included 
with ZODB.  :)

Here's a general approach to do n-way predicate dispatching.  First, you 
need "predicates" and "consequents".  For the purposes we discuss here, we 
can consider the "consequents" to be any Python object, forming the 
"values" of our mapping, where the predicates form the "keys".

A predicate must be able to do two things:

1. Tell if it applies to a candidate input
2. Tell if it contains (i.e. logically implies) another predicate

(Note that we can create an N-way predicate class that does these things by 
delegating to N sub-predicate instances.)

The basic algorithm is to ask all predicates if they apply to the candidate 
input, and then topologically sort the applicable predicates by their 
implication relationships.  If the first two items in the sorted list are 
equivalent, overlapping, or disjoint (i.e. they both imply each other, or 
neither implies the other), then you have an ambiguous resolution order and 
that's an error (at least in Dylan and Cecil; CLOS for example applies 
additional rules to disambiguate).

As stated so far, this algorithm is order M  (where M is the total number 
of predicates) assuming that only a constant number of predicates match 
each query.  However, it can be improved upon if the "dispatcher" that 
contains the predicates has a "feature graph" that maps from features of 
the input data, to the predicates that require that feature.  (This concept 
is my invention, but like so many things I've invented, chances are good 
somebody else came up with it first and I just don't know about it!)

For your example with interfaces, each "feature" would be something like 
"the object in position 1 implements IFooFolder".  We might represent that 
as a tuple '(1,IFooFolder)'.  Predicates 1 and 3 in your list would be 
listed under that feature.  The dispatcher would also need to be equipped 
with a way to extract the features from an input query.  In your interface 
example, this would mean generating '(i,iface)' pairs for all positions 'i' 
and all implemented interfaces 'iface'.  For each position 'i', we union 
the set of predicates that are registered for the implemented 
interfaces.  Then, we intersect the sets of predicates for each position, 
resulting in a set of candidate predicates which can then be sorted.

Give the algorithm a try on paper, and you'll see that as long as you 
include *all* the features of your input, you will end up with exactly the 
right predicates.

Of course, this algorithm will take *longer* than a simple linear search 
for applicability, if the number of predicates is small, due to the 
complexity involved.

>I've been thinking about ways to deal with this, but I'd like some input 
>on the general idea first.  Do you think n-tuple adapters would be 
>generally useful?  Is there a way to think about them that would reduce or 
>eliminate the deep search?  Is there any research on this topic or 
>something similar?

Yes, yes, and yes.  :)  Have a look at the book "San Francisco Design 
Patterns", as well.  It deals with n-way predicate dispatching (using SQL 
in relational databases!) as a tool for implementing business rules and 
many other interesting things.

Unfortunately for the general usefulness of multiple dispatching, efficient 
implementations require that you define your requirements precisely.  Is 
feature extraction slower than feature comparison?  Is it better to sort 
for implication first and build a search tree, or are matches rare and the 
set large?  (In which case prefiltering is better.)

So far, I haven't actually *needed* an N-way dispatch that a linear search 
wasn't sufficient for.  Mainly this is because, if you think about it, 
who's going to put in all those zillions of N-way rules?  Chances are, your 
brain can't handle it anyway.  ;)

As you've already noticed, 1-way dispatch is *much* more common.  2-way is 
often useful, typically for binary operations such as addition or 
multiplication.  By the time you hit 3-way, it's pretty uncommon, except 
for the kind of business rule things covered in the SFDP book.

The thing is, what Zope does now may not even need 2-way dispatch.  When 
one or the other dimension dominates, it's not really 2-way dispatch, but 
two 1-way dispatches.  Let's take traversal for example.  Since as you say, 
you might use the same traversal for a given object across multiple 
protocols, we can say that the "object" dimension dominates the "request" 
dimension.  This can then become a hierarchical search: find all adapters 
that match the object, then find the "best" match for the request, from 
among those adapters.  Alternatively, adapt from the object to a generic 
traversal interface, and then just give that handler the request.  It's 
always in a position to look up another adapter, if needed.  (Perhaps by 
adapting back from the protocol-specific request, to a generic request type 
that fills its needs.)

So, again, defining the requirements precisely is important.  For example, 
if different protocols really need different traversers, why not just adapt 
to different traversal interfaces?