[Zope3-checkins] CVS: Zope3/doc/schema - vocabularies.txt:1.1

Fred L. Drake, Jr. fred@zope.com
Thu, 26 Jun 2003 11:01:58 -0400


Update of /cvs-repository/Zope3/doc/schema
In directory cvs.zope.org:/tmp/cvs-serv23656

Added Files:
	vocabularies.txt 
Log Message:
Documentation on vocabularies, vocabulary fields, and how to work with them
in Zope 3.
This document is in reStructuredText format.


=== Added File Zope3/doc/schema/vocabularies.txt ===
=================
Vocabulary Fields
=================


Problem/Proposal
----------------

Vocabulary fields provide a field type for situations in which the set
of options may be highly dynamic or otherwise by decoupled from the
schema definition itself.  A vocabulary may be provided by (for
example) some object in the ZODB, a query to an external database, the
contents of a file maintained by another process, or static data
provided by a separately maintained piece of code.

A vocabulary field in a schema contains either a vocabulary object
(``IBaseVocabulary``) or a name of a vocabulary.  If a name is used,
the specific vocabulary to use will be supplied by a registry when the
schema field is bound to an instance of a content object.

**XXX** Need to mention tokens. Tokens are ASCII-encoded non-empty
strings that represent particular values. There is a 1-to-1
bidirectional mapping between tokens and values. One way to think of a
token is as an optimisation of getting a Browser ``id_string`` view on
a ``Term``. But, don't think that if doing so makes your head hurt.
Tokens effectively exist to allow any value to be encoded in an HTML
form.

**XXX** Vocabularies that remove values are a pain to support.  Need
to discuss this issue; can also occur when permissions change and
objects become invisible to the current user.  (Easy to have that
happen with catalog-based vocabularies.)


Vocabulary Flavors
------------------

When we look at potential vocabularies, we see that they come in many
flavors, and there's not always a whole lot that different types
share.  Some vocabularies are small and can easily be represented by a
simple list of values (such as the list of songs on an album); others
are large and aren't likely to be directly useful for many tasks
without interesting query interfaces (think *Oxford English
Dictionary*).  This later group includes everything for which no sane
user interface could present all the options at once: it doesn't have
to have millions of entries; 50 may be enough.

The very least a vocabulary can support is a membership test.  That
is, given a vocabulary object *v* and a candidate value *c*, *c*
``in`` *v* will return true iff value is a member of the vocabulary.
It may not be inexpensive or even meaningful to determine the size of
the vocabulary.  This is described by the ``IBaseVocabulary``
interface.

A more typical vocabulary would have a moderately limited number of
entries, but be easily iterable; lets say 10s to 1000s of values.  The
vobulary could consist of the people sharing a dorm room (two or three
usually!), or it could be the set of registered users of your site
(10, 100, or many, many thousand).  These cases are supported by the
IIterableVocabulary interface, which derives from ``IBaseVocabulary``.

In these cases, it will usually be easy to know how many entries are
in a vocabulary.  This will prove advantageous when the time comes to
build user interface components for your vocabularies.

**XXX** The comments on queryable vocabularies are out of date.  I'd
also like to remove the ``ISubsetVocabualary`` interface as it seems
not to be needed given the current (actual) arrangement of
``getQuery()`` and query views.

There is a vague notion of queryable vocabularies, and a stronger
notion of vocabularies which are subset of other vocabularies.  A
subset vocabulary would normally be used as the result of searching a
queryable vocabulary.  A subset would mix-in the ``ISubsetVocabulary``
interface and implement the ``getMasterVocabulary()`` method; this
would allow code working with the subset to always get back to the
full vocabulary.  The key ideas here are that a search result is
really just another vocabulary, and additional searches could be done
on either the results of a currently held search or the source of the
result.

Specific interfaces to perform queries on a vocabulary are not yet
defined.  These may prove to be less uniform than the other
interfaces, so it's too early to standardize them.  Additional
interface types may be defined to allow easy tests for specific kinds
of query support; these interfaces should be viewed as mix-ins, since
a single vocabulary may provide support for more than one type of
query.  For example, a dictionary may support both prefix-based
completion and part-of-speach or keyword searching.


Vocabulary Fields
-----------------

Vocabularies give us a way to talk about sets of possible values.  We
still need a way to use these in content schema.  For this, there are
a number of specific field types that can be employed.  These field
classes are all found in the ``zope.schema.vocabulary`` module; each
has a distinct interface in the ``zope.schema.interfaces`` module as
well.

These fields each add a single, required, keyword argument to the
constructor to that of the ``Field`` constructor.  This argument,
*vocabulary*, can be either a string which identifies a vocabulary for
the vocabulary registry, or an instance of a vocabulary.  (Vocabulary
registries are be discussed in `The Vocabulary Registry`_.)

In the simple case, we want to be able to talk about a field that
contains a value from a vocabulary, if a value for the field is
specified.  This simple case is supported by the ``VocabularyField``.
The field can be made optional by setting the *required* argument to
the constructor to ``False``.

The remaining four field types are all used to define fields for which
more than one value may be selected.  These fields combine options of
orderable vs. non-orderable and whether each value may be included
more than once (uniqueness).  These fields also implement
``IMinMaxLen``, allowing a schema to set the minumum and maximum
number of values for each field.

=========================  ==========  ==========
                           Ordered     Uniqueness
=========================  ==========  ==========
VocabularyBagField         No          No
VocabularyListField        Yes         No
VocabularySetField         No          Yes
VocabularyUniqueListField  Yes         Yes
=========================  ==========  ==========


Vocabulary Examples
-------------------

States
~~~~~~

Let's start with a simple case, without involving the rest of Zope.
This vocabulary represents the set of `state and territory
abbreviations`_ issued by the United States Postal Service.  The
values will be the two-letter abbreviations.

.. _state and territory abbreviations: http://www.usps.com/ncsc/lookups/abbreviations.html

This is the vocabulary implementation::

  from zope.schema.vocabulary import ITerm, IVocabulary
  from zope.interface import implements

  class State:
      __slots__ = 'value', 'title'
      implements(ITerm)

      def __init__(self, value, title):
          self.value = value
          self.title = title  # See ITerm

  # This table is based on information from the USPS:
  # http://www.usps.com/ncsc/lookups/abbreviations.html#states
  _states = {
      'AL': u'Alabama',
      'AK': u'Alaska',
      # ...
      'WY': u'Wyoming',
      }

  for value, title in _states.iteritems():
      _states[value] = State(value, title)

  class StateVocabulary(object):
      # Actually, we need only one of these. Using __slots__ will make
      # these objects small. We could also use __new__ to make this a
      # singleton.
      __slots__ = ()
      implements(IVocabulary)

      def __init__(self):
          pass

      def __contains__(self, value):
          return value in _states

      def __iter__(self):
          return _states.itervalues()

      def __len__(self):
          return len(_states)

Using this vocabulary is quite easy.  A simple schema for an address
might require the state to be specified::

  from zope.schema import VocabularyField

  class Address(Interface):
      """Schema for an address."""

      state = VocabularyField(
          vocabulary=StateVocabulary())

If a specific vocabulary is going to be used in many different schema,
a custom field class can be used.  (Note that we're talking about a
field class, not a field type: our example doesn't define a new
interface for the new class.)  Creating a custom field class that uses
this vocabulary is easy enough::

  from zope.schema import VocabularyField

  class StateSelectionField(VocabularyField):

      def __init__(self, **kw):
          super(StateSelectionField, self).__init__(
              vocabulary=StateVocabulary(), **kw)

(Note that by passing the *vocabulary* keyword argument to the
superclass constructor this way, we ensure a reasonable ``TypeError``
exception is generated if the caller tries to override the vocabulary
by passing the *vocabulary* argument themselves.)


Tab Completion
~~~~~~~~~~~~~~

**XXX** This example should probably be replaced since it's not
terribly relevant to Zope applications.

This example shows how a queryable vocabulary could be used to
implement tab-completion as might be found on a command line.

For this application, we need a vocabulary that's searchable by
prefix, and which we can actually display the contents of the
vocabulary.  Since these capabilities are already mixed into the
``IVocabulary`` interface, we can implement that one::

  from zope.schema.interfaces import ITerm, IVocabulary, IVocabularyQuery
  from zope.interface import implements, Interface


  class IPrefixQuery(IVocabularyQuery):
      """Interface for prefix queries."""

      def queryForPrefix(prefix):
          """Return a vocabulary that contains terms beginning with
          prefix."""


  class Term:
      implements(ITerm)

      def __init__(self, value):
          self.value = value


  class CompletionVocabulary(object):
      implements(IVocabulary)

      def __init__(self, values):
          # In practice, something more dynamic could be used to
          # get the list possible completions.
          # We force a _values to be a list so we can use .index().
          self._values = list(values)
          self._terms = map(Term, self._values)

      def __contains__(self, value):
          return value in self._values

      def __iter__(self):
          return iter(self._terms)

      def __len__(self):
          return len(self._values)

      def getQuery(self):
          return PrefixQuery(self)

      def getTerm(self, value):
          if value in self._values:
              return self._terms[self._values.index(value)]
          raise LookupError(value)


  class PrefixQuery:
      implements(IPrefixQuery)

      def __init__(self, vocabulary):
          self.vocabulary = vocabulary

      def queryForPrefix(self, prefix):
          L = [v for v in self.vocabulary._values if v.startswith(prefix)]
          if L:
              return CompletionVocabulary(L)
          else:
              raise LookupError("no entries matching prefix %r" % prefix)

A real application will probably want to use a more dynamic
determination of the possible vocabulary members, and may want to use
multiple vocabularies to represent different types of completion (for
example, a command shell may want different vocabulary implementations
to use for program names, hostnames, program arguments, etc.).  It may
also want to pass more information into the query method.

(It's not clear that this vocabulary would be particularly useful in a
Zope schema field, since there's no tab completion for web forms, but
it would be interesting for user-interface developers that want to
provide auto-completion support for some fields or command lines.)


Site Members
~~~~~~~~~~~~

**Simple case:** a vocabulary that tests for membership
(``__contains__()``).

**More interesting case:** vocabulary that supports selection of a
named member in a dialog (at least supporting
``IIterableVocabulary``).

**What would be really neat:** vocabulary that supports searching for
a member by lots of interesting criteria ("site managers", "editors",
"author of at least 20 news items", "authors of 5 most popular
articles").


Interfaces
----------

Module ``zope.schema.interfaces``:

``ITerm``
  The interface of objects returned by iterating over a vocabulary or
  requesting additional information about an term using the
  vocabulary's ``getTerm()`` method.  Instance must implement at least
  one attribute:

  ``value``
    The value that should be stored in content objects when this term
    is selected from the vocabulary.

``IBaseVocabulary``
  The interface of objects that represent vocabularies.  These objects
  support the containment test (``in``) and two additional methods:

  ``getQuery()``
    Return an object which is able to perform queries on the
    vocabulary, or ``None``.  The specific query interface should be
    represented by an interface implemented by the query object; that
    interface will be used to assemble the components of form widgets.
    (Read about *query views* in `Vocabulary Widget Support`_.)  If
    ``getQuery()`` returns ``None``, the vocabulary does not support
    queries.

  ``getTerm(value)``
    Return an ``ITerm`` instance for *value*.  If *value* is not a
    valid term in the vocabulary, ``LookupError`` is raised.

``IIterableVocabulary``
  Vocabulary which supports iteration over allowed values and
  determination of the size of the vocabulary using ``len()``.

``IVocabularyField``
  An ``IVocabularyField`` has the following attributes.  At least one
  of these attributes must be not be ``None``.

  ``vocabulary``
    An ``IBaseVocabulary`` object, or ``None``.

  ``vocabularyName``
    The name of the vocabulary that should be used for this field.

Module ``zope.schema.vocabulary``:

``IVocabularyRegistry``
  This is the interface for the vocabulary registry.  It defines a
  single method:

  ``get(object, name)``
    Return the vocabulary named *name* for the content object
    *object*.  This is only used when the field does not already
    provide the needed vocabulary.


Other APIs
----------

There are a handful of implementation classes and functions exposed in
the ``zope.schema`` package.

Module ``zope.schema``:

``VocabularyField``
  Class which implements ``IVocabularyField``; derived from
  ``IField``.  The constructor takes one required keyword argument,
  and passes the rest to the base class:

  ``vocabulary``
    An ``IBaseVocabulary`` object, or a name which may be used by an
    ``IVocabularyRegistry``.

Module ``zope.schema.vocabulary``:

``getVocabularyRegistry()``
  Return the currently-registered ``IVocabularyRegistry`` object.
  If one has not been registered, a default implementation will be
  provided.

``setVocabularyRegistry(registry)``
  Set the vocabulary registry.  This causes any previous registry
  (and any registrations it contains) to be discarded.

For more information on the vocabulary registry, see `The Vocabulary
Registry`_.

``SimpleVocabulary``
  Simple ``IVocabulary`` implementation intended for simple
  "one-off" vocabularies.  This class provides alternate
  constructors, easy support for marker interfaces, and can be
  readily extended by subclassing.


The Vocabulary Registry
-----------------------

The vocabulary registry is a separate registry, not dependent on the
Zope 3 component architecture, which allows indirection of vocabulary
lookup based on framework requirements.

This document describes the registry and discusses how Zope 3 uses it
to support vocabularies to be defined via ZCML.

Which Registry?
~~~~~~~~~~~~~~~

The ``zope.schema.vocabulary`` module provides a global registry of
vocabulary names mapped to factories.  Whenever a vocabulary field is
"bound" to a context, if the vocabulary is specified by name, the
registry is asked to provide an implementation for that name in that
context.

In Zope, there's a requirement that we support local vocabulary
registries as well as the global registry.  (The vocabulary service
has the same interface as the vocabulary registry,
``IVocabularyRegistry``, defined in ``zope.schema.interfaces``.)  The
default registry provided by ``zope.schema.vocabulary`` works well for
the global vocabulary service, but doesn't support local services.

Support for local vocabulary services is added in the
``zope.app.schema`` package (in the ``vocabulary`` module of that
package), which stores a reference to the global registry and adds an
intermediate registry which gets the registry to use based on the
service manager found using the context passed to the registry's
``get()`` method.

Note that there is not (currently) an implementation of a local
vocabulary service in Zope.

When a field is *bound* to a context (location), if the vocabulary was
specified by name rather than by instance, the field gets the
vocabulary registry using the ``getVocabularyRegistry()`` function in
the ``zope.schema.vocabulary`` package and calling the ``get()``
method of the registry.  What the ``zope.app.schema`` package does is
replace the simple registry with one that deals with the service
infrastructure, and registers the basic vocabulary registry as the
global vocabulary service.

Vocabularies and ZCML
~~~~~~~~~~~~~~~~~~~~~

Named, context-sensitive vocabularies can be specified in ZCML using
the **vocabulary** declaration from the main Zope XML namespace.
These registrations are always made in the global vocabulary service.
An example of the simplest registration is::

  <vocabulary
      name='my-vocab'
      factory='myproduct.vocabulary.MyVocabulary'
      />

The ``name`` attribute specifies a vocabulary name as a string; this
vocabulary will be available for vocabulary fields to look up by name,
so a schema field which refers to this vocabulary might look like
this::

  class MySchema(Interface):

      myfield = VocabularyField(title=_("Title"),
                                description=_("Field description"),
                                vocabulary="my-vocab")

The Python object ``myproduct.vocabulary.MyVocabulary`` must be an
object which can be called with a single positional argument: the
context object with which the field is being used.  This object might
not be a content object; it could be an *adding* or other special
object, so it's not a good idea to use it for anything other than it's
contextual binding.  ``myproduct.vocabulary.MyVocabulary`` must return
a vocabulary object which can be used to construct widgets for use in
forms; see `Vocabulary Widget Support`_ for more on that topic.

Vocabulary factories which can support additional keyword arguments
can be configured by specifying additional attributes in the
**vocabulary** element; these attributes will be passed along as
keyword parameters with string-valued arguments when the factory is
called.  For example, a vocabulary which can sort the values returned
by its iterator may accept the name of the sort key as a parameter::

  <vocabulary
      name='my-vocab'
      factory='myproduct.vocabulary.MyVocabulary'
      sortkey='title'
      />

(Imagine that sorting is performed on some attribute of the term
objects for this vocabulary, and the *sortkey* parameter gives the
name of that attribute.)


Vocabulary Widget Support
-------------------------

This describes how form widgets are created for vocabulary fields, and
tries to explain why there's so much indirection.

The widgets used select values from a vocabulary vary on three axes:

- field type (single select, bag, list, set, unique-list)

- vocabulary type

- vocabulary query type

To allow each of these axes to play a part in widget selection, an
involved process of indirection and assembly is used to create the
actual widgets.

Widget Construction
~~~~~~~~~~~~~~~~~~~

For "normal" widget construction, named views on the field are used to
locate either a *display* or *edit* widget.  Vocabulary fields are
similar; the fields offer both *display* and *edit* views, but the
factories for those views re-direct the request, loading a named view
of the vocabulary itself.

At this point, only the field type and the vocabulary type play a part
in widget selection.  The "base" widget is located as a named view on
the vocabulary, where the name is based on the type of view desired
(*display* or *edit*) and the type of field involved.  Since the
resulting widget needs to be a view on the field rather than the
vocabulary itself (for the rest of the form machinery), the bound
field is provided to the widget using the ``setField()`` method the
widget must provide.  The field needs to be available from the widget
as the ``context`` attribute, as it is for non-vocabulary-based
widgets.  The shared factories for all vocabulary field widgets take
care of making sure ``setField()`` is called properly.

For *edit* widgets, query support still needs to be added.  The
operations to do so are handled by the generic vocabulary widget
factories, but widget and vocabulary implementors need to understand
how this works.  As (will be) described in VocabularyFields,
vocabularies provide a ``getQuery()`` method which can return a query
object or ``None``.  If a query object is returned, it must implement
an interface for which a *query view* is defined for the specific type
of *edit* widget being used.  Query views are named views which
implement the ``IVocabularyQueryView`` defined in
``zope.app.interfaces.browser.form``.  The widget's ``setQuery()``
method is then called with the query object and the query view as
arguments; it will not be called if there is no query.  The widget is
responsible for calling the query view's ``setName()`` method, while
the shared factories call the query view's ``setWidget()`` method.

Names for Views
~~~~~~~~~~~~~~~

This table gives all the view names that are defined that widgets and
query views can be provided for.  Implementors need only provide
widgets and query views for the fields that are actually used for
their custom vocabularies, and some useful stock widgets are available
from the ``zope.app.browser.form.vocabularywidget`` module.

=========================  ================================  ================================  ================================
                           Display                           Edit                              Query
=========================  ================================  ================================  ================================
VocabularyField            field-display-widget              field-edit-widget                 widget-query-helper
VocabularyBagField         field-display-bag-widget          field-edit-bag-widget             widget-query-bag-helper
VocabularyListField        field-display-list-widget         field-edit-list-widget            widget-query-list-helper
VocabularySetField         field-display-set-widget          field-edit-set-widget             widget-query-set-helper
VocabularyUniqueListField  field-display-unique-list-widget  field-edit-unique-list-widget     widget-query-unique-list-helper
=========================  ================================  ================================  ================================