[Zope-dev] [BUG] Quadratic ZODB bloat caused by "PathIndex"

Dieter Maurer dieter@handshake.de
Thu, 20 Feb 2003 08:05:29 +0100


Zope 2.5.1

A "PathIndex" maps (pathsegment,level) onto the "IISet" of document ids
with "pathsegment" at "level" in their path.
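
To make the mapping concrete, here is a minimal sketch in plain Python.
It is only a model: a dict of built-in sets stands in for the persistent
mapping of "IISet" objects, and the helper name "index_path" and the
sample paths are made up for illustration, not taken from the PathIndex
source.

```python
from collections import defaultdict


def index_path(index, docid, path):
    """Record docid under every (pathsegment, level) pair of its path."""
    segments = [s for s in path.split("/") if s]
    for level, segment in enumerate(segments):
        index[(segment, level)].add(docid)


# Stand-in for the index: (pathsegment, level) -> set of document ids.
index = defaultdict(set)
index_path(index, 1, "/site/docs/a")
index_path(index, 2, "/site/docs/b")
index_path(index, 3, "/site/news/c")
```

In this toy data the entry for ("site", 0) ends up holding every
document id, while ("docs", 1) holds only two of them.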

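An "IISet" is a single persistent object, written as a whole to
the ZODB whenever it changes. Its size is proportional to the number
of entries, so adding n entries in separate transactions writes about
1 + 2 + ... + n = n(n+1)/2 entries in total. Therefore a ZODB storage
with undo support, which keeps the old revisions until the next pack,
grows quadratically with respect to the number of entries (between packs).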
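
The growth can be estimated with a rough back-of-the-envelope model.
The sketch below is not ZODB code: the 4 bytes per entry and the bucket
size of 120 are assumptions chosen for illustration, and the bucketed
variant ignores bucket splits and interior tree nodes. It only compares
"rewrite everything on each insert" with "rewrite one bounded bucket on
each insert".

```python
def bytes_written_whole_set(n, entry_size=4):
    """Total bytes written when the whole set is rewritten on each insert."""
    return sum(k * entry_size for k in range(1, n + 1))


def bytes_written_bucketed(n, entry_size=4, bucket_size=120):
    """Total bytes written when only one bounded bucket is rewritten."""
    return sum(min(k, bucket_size) * entry_size for k in range(1, n + 1))


# Doubling the number of entries roughly quadruples the first total
# but only roughly doubles the second.
ratio_whole = bytes_written_whole_set(2000) / bytes_written_whole_set(1000)
ratio_bucketed = bytes_written_bucketed(2000) / bytes_written_bucketed(1000)
```

Under these assumptions the whole-set total grows with the square of
the entry count, while the bucketed total grows linearly once the set
is larger than one bucket.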

The standard "path" index indexes objects by their physical path.
Therefore, the index entry for (at least) one of the top-level
path segments has a size on the order of the number of all indexed
objects.

Once you have lots of indexed objects, you will observe
significant ZODB growth between packs.


The fix would be easy: "PathIndex" should use "IITreeSet" rather
than "IISet" to store the document id lists (as do other indexes).
(There are more bugs in "PathIndex": e.g., it does not remove the
old index information when a later "index_object" call brings in
new data for the same object. A code review would be appropriate.)
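
For the stale-entry problem, one possible shape of a fix is to remember
what was indexed for each document and drop it before indexing again.
The sketch below is a toy model in plain Python, not the Zope class:
"PathIndexModel" and its attributes are hypothetical, and built-in sets
stand in for the persistent sets.

```python
class PathIndexModel:
    """Toy model of a path index that cleans up on re-indexing."""

    def __init__(self):
        self.index = {}    # (pathsegment, level) -> set of document ids
        self.unindex = {}  # document id -> path that was last indexed

    @staticmethod
    def _keys(path):
        segments = [s for s in path.split("/") if s]
        return [(segment, level) for level, segment in enumerate(segments)]

    def index_object(self, docid, path):
        if docid in self.unindex:
            # Drop the stale entries before adding the new ones.
            self.unindex_object(docid)
        for key in self._keys(path):
            self.index.setdefault(key, set()).add(docid)
        self.unindex[docid] = path

    def unindex_object(self, docid):
        path = self.unindex.pop(docid)
        for key in self._keys(path):
            ids = self.index[key]
            ids.discard(docid)
            if not ids:
                del self.index[key]


model = PathIndexModel()
model.index_object(1, "/a/b")
model.index_object(1, "/a/c")  # re-index the same document under a new path
```

After the second call, the model no longer has an entry for ("b", 1);
only ("a", 0) and ("c", 1) refer to document 1.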


A quick workaround: delete the "path" index unless you really need it.


I will file a collector report.



Dieter