[Grok-dev] Re: Performance of OrderedContainer
Gary Poster
gary at zope.com
Thu Jun 19 21:08:09 EDT 2008
On Jun 19, 2008, at 5:18 PM, Sebastian Ware wrote:
> 18 jun 2008 kl. 10.11 skrev Martijn Faassen:
>
>> We also need to investigate whether blist could be made to make
>> sense i this role after all.
>
> According to the rejection notice in the PEP about Blists, the
> performance gain might not be all that huge. I don't know anything
> about this, but I just want to raise the issue so we don't turn
> OrderedContainer into a little beast...
>
> http://www.python.org/dev/peps/pep-3128/
While worth noting, this is only barely pertinent.
zc.blist is certainly not of interest to a ZODB ordered container
because of its raw list performance. The PersistentList uses Python's
C implementation, and will be significantly faster for standard list
operations. The Python 3000 PEP proposal was also after performance
gains that are not a current goal of the zc.blist implementation.
The blist pattern is of significant interest to users of the ZODB
because it allows small changes in a large sequence to only write to a
small number of persistent objects. That is the pertinent advantage
of my implementation. It is also one of the big advantages to the
ZODB BTrees, and a big reason why they are recommended over
PersistentMappings for large-sized mappings.
Writing large objects, especially when ZEO is involved, has bad
performance characteristics in the ZODB. Once you have a single large
object like a PersistentList or a PersistentMapping with many members,
making a single change means that the entire collection must be
written (and sent over the network, for ZEO).
The other advantage of the blist pattern, and of my implementation and
the Python 3000 implementation, is cheap copies. That may or may not
be pertinent to this discussion.
As you might expect, the rejected Python 3000 blist does not directly
help with the ZODB because good ZODB behavior was not a design goal.
The downside that is pertinent to both the Py3000 version and this one
is the complexity of the implementation. This complexity is hidden to
the user, since the API is the same as a list.
FWIW, I think that a pluggable ordering, such as the one in Martin's
Plone folder API, is a great option for a one-size-fits-all mapping.
Non-ordered mappings never get an overhead, but can be easily
converted to an ordered mapping. A collection of very many mappings
that are guaranteed to be relatively small, or medium sized and not
change very often, could use PersistentList for their ordering
implementation. Mappings that may get many members could use zc.blist,
or another approach or implementation that also divides up members in
buckets efficiently.
One-size-fits-all discussions for frameworks are always hard. Trade-
offs between speed, complexity, and storage are hard enough when you
are relatively sure of the use cases. It's nice that our component
architecture makes it easier to stitch things together, at least.
Gary
More information about the Grok-dev
mailing list