[ZODB-Dev] ZODB 3.10.0b5 released (was Storage API change: Checking for reading out-of-date data)
Jim Fulton
jim at zope.com
Fri Sep 3 11:21:38 EDT 2010
This change is released in ZODB 3.10.0b5:
http://pypi.python.org/pypi/ZODB3/3.10.0b5
Jim
On Mon, Aug 30, 2010 at 5:36 PM, Jim Fulton <jim at zope.com> wrote:
> ZODB uses multi-version concurrency control to ensure that data read
> are consistent. It doesn't check or require that data read be up
> to date. For read-only transactions, this is appropriate.
>
> Even for write transactions, not checking whether reads are up to date
> isn't typically a problem, since the important data read is also
> updated and we check for write conflicts.
>
> The approach used by ZODB is a common one and represents a generally
> good tradeoff between consistency and performance.
>
> The approach, however, can run into problems when data from one object
> are read and used to update a different object. I've mistakenly
> tended to view this situation as an edge case. However, BTrees,
> perhaps the most heavily used data structure in ZODB applications,
> follow this data access pattern. In particular, internal nodes are
> read to determine which subnodes data should be written to. An out of
> date internal node can lead to data in BTrees being misplaced. This
> doesn't happen very often, and when it does happen, it's been pretty
> mysterious.
>
> This is a fairly serious problem. It's serious enough that I'm going
> to add some APIs in ZODB 3.10 to deal with it. One of these is:
>
>     class ReadVerifyingStorage(IStorage):
>
>         def checkCurrentSerialInTransaction(oid, serial):
>             """Check whether the given serial number is current.
>
>             The method is called during the first phase of 2-phase commit
>             to verify that data read in a transaction is current.
>
>             The storage should raise a ConflictError if the serial is not
>             current, although it may raise the exception later, in a call
>             to store or in a call to tpc_vote.
>
>             If no exception is raised, then the serial must remain current
>             through the end of the transaction.
>             """
>
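A rough sketch of how a storage might honor the contract in the docstring above, assuming a single in-memory storage whose commit lock is held for the whole of two-phase commit. `ToyStorage`, the `commit` helper, and the locally defined `ConflictError` are hypothetical stand-ins so the example runs without ZODB, and the `tpc_begin`, `store`, `tpc_finish`, and `tpc_abort` signatures are simplified. Only the method name `checkCurrentSerialInTransaction(oid, serial)` comes from the message.

```python
import threading


class ConflictError(Exception):
    """Stand-in for ZODB's ConflictError so the sketch runs on its own."""


class ToyStorage:
    """Hypothetical in-memory storage; tracks only oid -> current serial."""

    def __init__(self, serials):
        self._serials = dict(serials)
        self._lock = threading.Lock()
        self._pending = []

    def tpc_begin(self):
        # Holding the lock from here to tpc_finish/tpc_abort is what lets a
        # successful check stay valid through the end of the transaction.
        self._lock.acquire()
        self._pending = []

    def checkCurrentSerialInTransaction(self, oid, serial):
        # The object was only read: verify the serial that was read is current.
        if self._serials.get(oid) != serial:
            raise ConflictError("stale read of %r" % (oid,))

    def store(self, oid, serial):
        # The object is being written: the usual write-conflict check.
        if self._serials.get(oid) != serial:
            raise ConflictError("write conflict on %r" % (oid,))
        self._pending.append(oid)

    def tpc_finish(self):
        # Apply buffered writes by bumping serials, then release the lock.
        for oid in self._pending:
            self._serials[oid] = self._serials.get(oid, 0) + 1
        self._pending = []
        self._lock.release()

    def tpc_abort(self):
        self._pending = []
        self._lock.release()


def commit(storage, reads, writes):
    """Run one simplified commit; reads and writes map oid -> serial seen."""
    storage.tpc_begin()
    try:
        for oid, serial in reads.items():
            storage.checkCurrentSerialInTransaction(oid, serial)
        for oid, serial in writes.items():
            storage.store(oid, serial)
    except ConflictError:
        storage.tpc_abort()
        return "conflict"
    storage.tpc_finish()
    return "committed"


storage = ToyStorage({"node": 2, "left": 1})

# The transaction read "node" at serial 1, but serial 2 is now current.
stale = commit(storage, reads={"node": 1}, writes={"left": 1})

# Retrying with the current serial for "node" succeeds.
fresh = commit(storage, reads={"node": 2}, writes={"left": 1})
```

In this sketch the check raises immediately. The docstring also allows a storage to defer the error to `store` or `tpc_vote`, which this example does not attempt to model.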
> The tricky thing about this is the last paragraph. If the method
> doesn't raise an error, then there can't be updates to the object
> until after the transaction commits. For most current
> implementations, this implies that the storage lock is held when this
> is called. For ZEO, some special care will be necessary because the
> storage lock isn't acquired until the very end of the first phase of
> 2-phase commit.
>
> I'm particularly concerned about the impact on RelStorage.
>
> This API will be used whenever a BTree is modified, so it will be used
> fairly often. It won't be used for all reads, although future
> versions of ZODB might provide an option to check all reads, or all
> reads in transactions that write data.
>
> Jim
>
> --
> Jim Fulton
>
--
Jim Fulton