[ZODB-Dev] ZODB 3.10.0b5 released (was Storage API change: Checking for reading out-of-date data)

Jim Fulton jim at zope.com
Fri Sep 3 11:21:38 EDT 2010


This change is released in ZODB 3.10.0b5,

   http://pypi.python.org/pypi/ZODB3/3.10.0b5

Jim

On Mon, Aug 30, 2010 at 5:36 PM, Jim Fulton <jim at zope.com> wrote:
> ZODB uses multi-version concurrency control to ensure that data read
> are consistent.  It doesn't check or require that data read be up
> to date.  For read-only transactions, this is appropriate.
>
> Even for write transactions, not checking whether reads are up to date
> isn't typically a problem, since the important data read is also
> updated and we check for write conflicts.
>
> The approach used by ZODB is a common one and represents a generally
> good tradeoff between consistency and performance.
>
> The approach, however, can run into problems when data from one object
> are read and used to update a different object.  I've mistakenly
> tended to view this situation as an edge case.  However, BTrees,
> perhaps the most heavily used data structure in ZODB applications,
> follow this data access pattern. In particular, internal nodes are
> read to determine which subnodes data should be written to. An
> out-of-date internal node can lead to data in BTrees being misplaced.  This
> doesn't happen very often, and when it does happen, it's been pretty
> mysterious.
>
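The access pattern described above can be sketched with a deliberately simplified model. This is not how BTrees or ZODB storages are implemented. ToyStore, run_scenario, and the "node"/"leaf1"/"leaf2" names are all hypothetical, and the only assumption is that commits check written objects for conflicts but not objects that were merely read.

```python
class ConflictError(Exception):
    """Stand-in for a conflict error; defined locally for this toy model."""


class ToyStore:
    """Hypothetical committed state: oid -> (serial, value). Not ZODB."""

    def __init__(self):
        self.data = {}
        self.next_serial = 1

    def snapshot(self):
        # MVCC-style read: a consistent copy of the state at one point in time.
        return dict(self.data)

    def commit(self, snap, writes):
        # Only *written* objects are compared against the snapshot;
        # objects that were merely read are not checked at all.
        for oid in writes:
            committed = self.data.get(oid, (0, None))[0]
            seen = snap.get(oid, (0, None))[0]
            if committed != seen:
                raise ConflictError(oid)
        serial = self.next_serial
        self.next_serial += 1
        for oid, value in writes.items():
            self.data[oid] = (serial, value)


def run_scenario():
    store = ToyStore()
    # "node" plays the role of an internal node: it names the leaf
    # that new keys should be written to.
    store.commit({}, {"node": "leaf1", "leaf1": (), "leaf2": ()})

    t1 = store.snapshot()  # transaction 1 starts
    t2 = store.snapshot()  # transaction 2 starts

    # Transaction 2 redirects new keys to leaf2, touching only the node.
    store.commit(t2, {"node": "leaf2"})

    # Transaction 1 consults its (now stale) copy of the node, then
    # writes to the leaf it names.  The leaf itself is unchanged, so
    # the write-conflict check passes.
    target = t1["node"][1]
    store.commit(t1, {target: t1[target][1] + ("k",)})

    return store.data["node"][1], target


current_leaf, written_leaf = run_scenario()
```

In this toy run the second commit succeeds even though the key lands in a leaf that the current node no longer points to, which is the kind of misplacement described above.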
> This is a fairly serious problem.  It's serious enough that I'm going
> to add some APIs in ZODB 3.10 to deal with it.  One of these is:
>
>  class ReadVerifyingStorage(IStorage):
>
>      def checkCurrentSerialInTransaction(oid, serial):
>          """Check whether the given serial number is current.
>
>          The method is called during the first phase of 2-phase commit
>          to verify that data read in a transaction is current.
>
>          The storage should raise a ConflictError if the serial is not
>          current, although it may raise the exception later, in a call
>          to store or in a call to tpc_vote.
>
>          If no exception is raised, then the serial must remain current
>          through the end of the transaction.
>          """
>
> The tricky thing about this is the last paragraph.  If the method
> doesn't raise an error, then there can't be updates to the object
> until after the transaction commits.  For most current
> implementations, this implies that the storage lock is held when this
> is called.  For ZEO, some special care will be necessary because the
> storage lock isn't acquired until the very end of the first phase of
> 2-phase commit.
>
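A minimal sketch of how a storage might satisfy this contract, assuming a single in-memory storage that holds its lock from the start of the commit to the end. ToyReadVerifyingStorage and its simplified method signatures are hypothetical (real IStorage methods take additional arguments such as the transaction), and ConflictError is defined locally so the example stays self-contained.

```python
import threading


class ConflictError(Exception):
    """Local stand-in for a conflict error, to keep the sketch self-contained."""


class ToyReadVerifyingStorage:
    """Hypothetical in-memory storage; one committing transaction at a time."""

    def __init__(self):
        self._serials = {}   # oid -> serial of the current committed revision
        self._data = {}      # oid -> current committed data
        self._next = 1
        self._lock = threading.Lock()
        self._pending = None

    def tpc_begin(self):
        # Holding the lock for the whole commit means a serial that is
        # current when checked stays current until the commit finishes.
        self._lock.acquire()
        self._pending = {}

    def checkCurrentSerialInTransaction(self, oid, serial):
        # Verify that data *read* in this transaction is still current.
        if self._serials.get(oid) != serial:
            raise ConflictError("stale read of %r" % (oid,))

    def store(self, oid, serial, data):
        # Ordinary write-conflict check; serial is None for a new object.
        if self._serials.get(oid) != serial:
            raise ConflictError("write conflict on %r" % (oid,))
        self._pending[oid] = data

    def tpc_finish(self):
        serial = self._next
        self._next += 1
        for oid, data in self._pending.items():
            self._data[oid] = data
            self._serials[oid] = serial
        self._pending = None
        self._lock.release()
        return serial

    def tpc_abort(self):
        self._pending = None
        self._lock.release()
```

Because the lock is taken in tpc_begin, no other commit can change an object between the check and tpc_finish. A storage that acquires its lock later would need some other way to keep the checked serial current, or would have to defer the check.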
> I'm particularly concerned about the impact on RelStorage.
>
> This API will be used whenever a BTree is modified, so it will be used
> fairly often. It won't be used for all reads, although future
> versions of ZODB might provide an option to check all reads, or all
> reads in transactions that write data.
>
> Jim
>
> --
> Jim Fulton
>



-- 
Jim Fulton

