[ZODB-Dev] Re: What makes the ZODB slow?

Fri Jun 23 13:04:29 EDT 2006

On Fri, 2006-06-23 at 15:11 +0200, Florent Guillaume wrote:
> > I often daydream of a ZODB that will one day have such great performance
> > that it won't be necessary to adopt a hybrid backend. I know there is a
> > huge difference between objects and records in an RDBMS, but in an
> > attempt to understand more, I want to know what makes the ZODB so much
> > slower than a relational database when writing a lot? Is it possible to
> > speed it up in any way? 
> > 
> > Other questions that come to mind:
> > 
> > What overhead does undo add to performance?
> > Can state be serialised more economically to reduce disk IO?
> > Is the ZODB really slow, or is it just Zope and Plone or grand object
> > frameworks built on top of it that make it appear slow? (In all my
> > benchmarks this is shown to be mostly true)
> 
> The ZODB is actually very fast. It has one drawback, which is that 
> concurrent writes are resolved only for classes designed for that (namely 
> BTrees), otherwise it's left up to the application to deal with it when 
> it receives a ConflictError.
> 
> So you're probably observing slowness in the frameworks on top of it.

This is not really the fundamental explanation I was fishing for, and I
don't think that you are entirely right.

I don't think one can call the ZODB fast (I hope to some day). It might
be fast in its handling of hierarchical data or in reading lots of
objects, but I wouldn't call it fast overall. Just compare the speed at
which new objects are created in the ZODB with the speed at which
records are created in an RDBMS. In a test that commits instances of a
Persistent subclass with only 2 string attributes, 300 objects per
second are created on average. Writing the exact same strings to a
two-column table in an RDBMS yields more than 3000 records per second,
including indexing of the data. In the ZODB I still have to index the
data, which will add further overhead. Adding more columns to the SQL
table and writing more data to it doesn't hurt performance either.
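
To make the relational side of that comparison concrete, here is a
minimal sketch of such a write test. It rests on assumptions that are
not in the test described above: it uses SQLite from the Python standard
library as a stand-in RDBMS, the table and index names are made up, and
the function name insert_rate and its commit_every parameter are
hypothetical. How often one commits is likely to matter a great deal, so
the sketch makes it a parameter.

```python
import sqlite3
import time


def insert_rate(n, commit_every, path=":memory:"):
    """Insert n two-string rows; return (rows_per_second, row_count).

    commit_every=1 commits after each row, which is closest to
    "one transaction per object"; larger values batch the commits.
    """
    conn = sqlite3.connect(path)
    # Hypothetical two-column table with an index on the first column.
    conn.execute("CREATE TABLE IF NOT EXISTS pair (a TEXT, b TEXT)")
    conn.execute("CREATE INDEX IF NOT EXISTS pair_a ON pair (a)")
    conn.execute("DELETE FROM pair")
    conn.commit()

    start = time.perf_counter()
    for i in range(n):
        conn.execute(
            "INSERT INTO pair (a, b) VALUES (?, ?)",
            ("key-%d" % i, "value-%d" % i),
        )
        if (i + 1) % commit_every == 0:
            conn.commit()
    conn.commit()  # flush any rows left over from the last partial batch
    elapsed = max(time.perf_counter() - start, 1e-9)

    count = conn.execute("SELECT COUNT(*) FROM pair").fetchone()[0]
    conn.close()
    return n / elapsed, count


if __name__ == "__main__":
    for every in (1, 100):
        rate, count = insert_rate(1000, every)
        print("commit every %d rows: %.0f rows/s (%d rows)" % (every, rate, count))
```

With the default path the database lives in memory, so no disk IO is
measured; passing a file path would be needed for a number comparable
to a FileStorage commit. The rates will vary by machine and are not
meant to reproduce the figures quoted above.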

The above test most probably doesn't compare apples with apples, but
maybe in pointing out why not, more fundamental differences become
clear. Maybe the fundamental difference is that pickles of objects have
a bigger footprint and lead to more disk I/O, or that most of the ZODB
is implemented in Python. I don't know, and I'm still curious.
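
The footprint question can at least be checked roughly with the
standard pickle module. The sketch below is only an approximation: it
pickles a plain dictionary holding two string attributes, on the
assumption that this resembles the state of a simple persistent object,
and it ignores whatever else a real storage record carries (class
reference, object id, transaction metadata). The function name
footprint is hypothetical.

```python
import pickle


def footprint(a, b):
    """Return (raw_bytes, pickled_bytes) for two string attributes.

    raw_bytes is the UTF-8 size of the two strings alone;
    pickled_bytes is the size of a pickle of an attribute dictionary
    {"a": a, "b": b}, used here as a stand-in for object state.
    """
    raw = len(a.encode("utf-8")) + len(b.encode("utf-8"))
    state = {"a": a, "b": b}
    pickled = len(pickle.dumps(state))
    return raw, pickled


if __name__ == "__main__":
    raw, pickled = footprint("hello", "world")
    print("raw: %d bytes, pickled state: %d bytes" % (raw, pickled))
```

The pickle is always somewhat larger than the raw strings because it
also stores the attribute names and framing opcodes, but for two short
strings the absolute difference is small, so this alone may not account
for a tenfold difference in write rate.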

-- 
Roché Compaan
Upfront Systems                   http://www.upfrontsystems.co.za