[Zope] ZODB performance: reads to writes

Wed, 28 Jun 2000 10:57:11 -0400

Ty Sarna wrote:
> 
> In article <000d01bfddfb$4546f070$3e48a4d8@digicool.com>,
> Evan Simpson <evan@digicool.com> wrote:
> > ----- Original Message -----
> > From: Jimmie Houchin <jhouchin@texoma.net>
> >
> > > Will an app as described above still suffer from problems with high
> > > writes?
> >
> > Possibly, but only if there are hidden hotspots.  For example, in your
> > [...]
> > 2. Implement the application-level conflict handling you read about, so that
> > Folders and Catalogs can decide that two writes don't conflict after all,
> > and merge them into a single update.
> 
> Unfortunately, this doesn't deal with cases where the conflicting state
> is contained in many objects (see note by PJE in the ZODB Wiki).

Yes it does. (See my response to PJE's note.)
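
For readers who haven't seen the application-level conflict handling
being discussed, here is a minimal sketch of the idea, not the actual
Folder or Catalog code. The method name _p_resolveConflict and its
three-state signature follow the ZODB conflict-resolution hook; the
Counter class, the commit() function and the ConflictError class are
hypothetical stand-ins, and nothing here imports ZODB.

```python
class ConflictError(Exception):
    """Stand-in for the error raised on an unresolved write conflict."""


class Counter:
    """Toy object whose stored state is a dict like {'count': n}."""

    def _p_resolveConflict(self, old_state, saved_state, new_state):
        # old_state:   the state both transactions started from
        # saved_state: the state the other transaction already committed
        # new_state:   the state this transaction is trying to commit
        saved_delta = saved_state['count'] - old_state['count']
        new_delta = new_state['count'] - old_state['count']
        merged = dict(old_state)
        # Two increments don't really conflict: apply both deltas.
        merged['count'] = old_state['count'] + saved_delta + new_delta
        return merged


def commit(obj, old_state, saved_state, new_state):
    """Hypothetical commit step: return the state that gets written."""
    if saved_state == old_state:
        return new_state  # nobody else wrote in the meantime
    resolve = getattr(obj, '_p_resolveConflict', None)
    if resolve is None:
        raise ConflictError('concurrent write and no resolver')
    return resolve(old_state, saved_state, new_state)


merged = commit(Counter(), {'count': 10}, {'count': 12}, {'count': 11})
```

Here one transaction added 2 and the other added 1 to the same starting
state, and the resolver merges them into a single update rather than
failing one of the writers.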

> Also, there is a whole other area of difficulty for high-write-volume
> ZODBs, which is the amount of IO that needs to be done.  First, by
> nature ZODB can't rewrite a single attribute of an object, it has to
> rewrite the entire thing.

Each object (that subclasses Persistent) is analogous to a database 
record. When you modify a part of the object (that isn't its own
persistent object) then you write the entire record. This seems
pretty reasonable to me. Part of ZODB database design, where it
matters, is to balance the size of database objects. If objects are
too big, then the amount of data written on a change is larger.
If objects are too small, then you may incur too much persistence
overhead. Most apps don't need this level of tuning.
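
To make the size trade-off concrete, here is a rough sketch that uses
pickle as a stand-in for a stored record. The data layout (1000 entries,
either in one record or split into 10 records of 100) and the helper
names are made up for illustration; real ZODB records also carry object
ids and class information that this ignores.

```python
import pickle


def record_size(state):
    """Bytes needed to write one record, approximated by its pickle."""
    return len(pickle.dumps(state, protocol=2))


# Layout A: one big persistent object holding all 1000 entries.
big = {i: 'value-%d' % i for i in range(1000)}

# Layout B: the same entries split across 10 smaller objects.
buckets = [
    {i: 'value-%d' % i for i in range(b * 100, (b + 1) * 100)}
    for b in range(10)
]


def bytes_written_big():
    # Changing any entry rewrites the whole record.
    return record_size(big)


def bytes_written_split(key):
    # Changing an entry rewrites only the bucket that holds it.
    return record_size(buckets[key // 100])
```

Changing one entry in layout B writes roughly a tenth as much as in
layout A, at the price of ten objects to load, cache and track instead
of one. That per-object cost is the persistence overhead and does not
show up in the byte counts.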

> Indexing is also a bear from an IO perspective.  First, BTrees currently
> keep a count at each level, so every change to a btree changes a node at
> each level of the BTree.  For a ZCatalog, there are a lot of btrees
> (something like 2n+4 for n indexes, I think -- don't quote me on that,
> it's been a while), and each one changes (last I looked, every index was
> updated even if the value indexed in a particular one hadn't changed.
> This may have been improved since).  Not only is this bad from a hotspot
> point of view (always a conflict on the root node of the tree), but you
> end up doing a *lot* of IO.  During my experiments that led to
> BerkeleyStorage, I was watching the Data.fs grow by 47K per transaction
> for adding indexed objects of ~1K in size.  Watching this with
> tranalyzer, this turns out to be 1K of object, and 46K of updated btree
> pages :). 

This is a significant problem. The current BTree implementation,
which predates Principia, was designed for very different applications
than it's being used for now. We are working on a new BTree 
implementation that does away with these counts. This should have
a huge impact. We are also looking at getting rid of other hot spots
in the current ZCatalog (e.g. internal id assignment).
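
A toy model of why the counts hurt, assuming a fixed-shape tree with no
splits. This is not the BTree implementation, old or new; the Node class
and dirty_nodes() are hypothetical and only track which nodes would have
to be rewritten for one insert.

```python
class Node:
    """Node covering keys in [lo, hi); leaves hold keys, others children."""

    def __init__(self, lo, hi, depth, fanout, keep_counts):
        self.lo, self.hi = lo, hi
        self.keep_counts = keep_counts
        self.count = 0      # subtree size, maintained only if keep_counts
        self.keys = []      # used by leaves only
        self.children = []
        if depth > 0:
            step = (hi - lo) // fanout
            self.children = [
                Node(lo + i * step, lo + (i + 1) * step,
                     depth - 1, fanout, keep_counts)
                for i in range(fanout)
            ]

    def insert(self, key, dirty):
        if not self.children:
            self.keys.append(key)
            dirty.append(self)       # the leaf always changes
            return
        if self.keep_counts:
            self.count += 1
            dirty.append(self)       # the count changed, so rewrite this node
        step = (self.hi - self.lo) // len(self.children)
        self.children[(key - self.lo) // step].insert(key, dirty)


def dirty_nodes(keep_counts, key):
    """Number of nodes one insert would rewrite in a three-level tree."""
    tree = Node(0, 1000, depth=2, fanout=10, keep_counts=keep_counts)
    dirty = []
    tree.insert(key, dirty)
    return len(dirty)
```

With counts, every insert rewrites one node per level and always touches
the root, so any two concurrent inserts conflict there. Without counts,
an insert that doesn't split a node touches only its leaf.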

> Note that BerkeleyStorage only prevents the file from growing
> that much -- it still has to do all that IO (in fact, it has to do ~2-3
> times that much IO, due to the nature of BerkeleyDB.  A relational
> storage would have similar issues.  For amount of IO done, FileStorage
> is about as efficient as you can possibly be -- it's just that it trades
> that off against space reclamation).
> 
> Also, with any kind of Berkeley or Relational storage, there is a second
> hidden IO and storage penalty: you're storing a btree inside a btree. In
> other words, the lower-level DB uses btrees to store your objects,
> including interior nodes of the higher-level ZODB btree. Every interior
> node of the ZODB Btree needs a leaf node (and supporting interior nodes)
> in the DB's btrees. So you get taxed twice, on both I/O and storage
> space used.

I don't agree with the conclusion of this analysis. The indexes
used in the underlying storage are indexing totally different information.
They are effectively using indexes to provide persistent memory management.
They aren't indexing the application keys.
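
A small sketch of the distinction, under the assumption that a storage
is essentially a map from object id to record. ToyStorage and the sample
data are hypothetical; a plain dict stands in for whatever index a real
storage uses to find records.

```python
import pickle


class ToyStorage:
    """Append-only log plus an index from oid to record position."""

    def __init__(self):
        self._log = bytearray()
        self._pos = {}               # oid -> (offset, length)

    def store(self, oid, state):
        data = pickle.dumps(state)
        self._pos[oid] = (len(self._log), len(data))
        self._log += data

    def load(self, oid):
        offset, length = self._pos[oid]
        return pickle.loads(bytes(self._log[offset:offset + length]))


storage = ToyStorage()
storage.store(1, {'name': 'alice', 'size': 10})
storage.store(2, {'name': 'bob', 'size': 20})

# The application-level index maps application keys to oids. It is
# itself just another record as far as the storage is concerned.
storage.store(0, {'alice': 1, 'bob': 2})

by_name = storage.load(0)
bob = storage.load(by_name['bob'])
```

The storage's own index only ever sees oids (0, 1 and 2 here). The names
'alice' and 'bob' are opaque bytes inside record 0, which is the sense in
which the two levels of index hold different information.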

OTOH, I have some sympathy with a related issue.  You and Phillip
have argued that the ZODB should provide indexes, rather than leaving
indexes to application-level code, so as to avoid maintaining undo
information for indexes. After all, indexes can, in theory, be
recomputed from data records after an undo. While I think that this
idea has some merit, I don't think it offers enough benefit to make
it a high priority.

Jim

--
Jim Fulton           mailto:jim@digicool.com   Python Powered!        
Technical Director   (888) 344-4332            http://www.python.org  
Digital Creations    http://www.digicool.com   http://www.zope.org    

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email
address may not be added to any commercial mail list with out my
permission.  Violation of my privacy with advertising or SPAM will
result in a suit for a MINIMUM of $500 damages/incident, $1500 for
repeats.