[ZODB-Dev] Commit or lock object across transactions

Wed Jul 16 14:43:04 EDT 2003

On Wed, Jul 16, 2003 at 11:42:43AM -0400, Jeremy Hylton wrote:
> > The problem with this is that there might be legitimate transactions in
> > process on other clients.
> 
> A client that has a transaction in progress will not see the changes
> committed by another client until the transaction finishes.  Changes
> only become visible at transaction boundaries.

Generating a unique, sequential ID over a global space is not a trivial
matter, and I'm not sure all of it is the ZODB's fault in this case.
There are a number of potential constraints that should be considered.
The ones I work with in my app are:

    - The ID should be globally unique.
    - The transaction using an allocated ID may be cancelled after it
      is allocated.
    - The ID should be presentable to the end-user in the UI before the
      transaction is committed.

  - Scenario 1: Locking primitive. A locking primitive emphasizes the
    first constraint, but it doesn't help the other ones. Orphaned IDs
    would need to be collected and reused (hopefully by the very next
    transaction).  Granted, it's a nice feature, and if there is a way
    to implement it in the ZODB, it would be great.

    The last constraint is okay here, because the ID is guaranteed to not
    change when commit() happens.

    A substantial con here is that the ZODB doesn't provide such a
    primitive, so you'd need to implement it, or use a secondary
    database for IDs.

  - Scenario 2: Lazy consistency. Assuming no locking primitive is
    available, the ZODB style of consistency management is lazy -- if
    conflicts do happen (i.e., if two identical IDs are allocated at the
    same time), we can try and resolve the conflict (by assigning
    another ID to the conflicting transaction). Orphaned IDs are less of
    a problem in this scenario (they only appear if you *undo* a
    transaction that was committed, since the ID is guaranteed with the
    commit).

    The last constraint is a problem with the lazy model, though, since the
    ID *may* change when the transaction is being committed -- the user that
    wrote down 5563 needs to be notified that when it was committed, a
    conflict was found and it changed to 5564 -- he scratches out the 3 and
    writes a 4 down.
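
To make Scenario 2 concrete, here is a rough sketch of the lazy approach in
plain Python. It does not use the ZODB at all: CounterStore, ConflictError,
tentative_id() and commit_id() are hypothetical names made up for this
illustration, and the version check only stands in for whatever conflict
detection the real storage does.

```python
class ConflictError(Exception):
    """Stand-in for the error a real store raises on a stale write."""


class CounterStore:
    """Toy versioned store holding one integer: the last committed ID."""

    def __init__(self, start=0):
        self.value = start
        self.version = 0

    def read(self):
        # Snapshot a transaction sees when it starts.
        return self.value, self.version

    def commit(self, new_value, seen_version):
        # Reject the write if someone else committed in the meantime.
        if seen_version != self.version:
            raise ConflictError("counter changed since it was read")
        self.value = new_value
        self.version += 1


def tentative_id(store):
    """Return the ID we *expect* to get, plus the version it is based on."""
    last, version = store.read()
    return last + 1, version


def commit_id(store, candidate, version):
    """Commit the ID, picking a new one on conflict.

    The return value is the ID actually committed, which may differ
    from the candidate that was shown to the user.
    """
    while True:
        try:
            store.commit(candidate, version)
            return candidate
        except ConflictError:
            last, version = store.read()
            candidate = last + 1
```

With a store starting at 5562, two users who both call tentative_id() before
either commits are both shown 5563. The first commit_id() returns 5563; the
second hits the conflict and returns 5564, which is exactly the case where
the user has to be told that the number changed.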

What I recommend, after looking at the options, is analyzing the
constraints and seeing what your priority is. If a skipped ID every once
in a while is okay, you don't have to worry about orphaned IDs, which
simplifies things somewhat. If it's okay to present an ID to the user
and have it change (presenting a special dialog or page that notifies
him), or if the user doesn't need to see the ID before committing, that's
good too.

Someone (I think Casey) once suggested to me using the BTrees' Length to
define IDs, because it had built-in conflict-resolution (which would
allocate another value if a conflict happened). [However, because the
Length only stores the current value, it won't help us with orphaned IDs
-- you'd need to search through all allocated IDs to check for them.]
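
For reference, the arithmetic behind that conflict resolution can be
sketched in a few lines of plain Python. This is not the BTrees code
itself; resolve_length_conflict() is a hypothetical name, and I am assuming
the rule is the usual "apply both deltas" merge that Length's
_p_resolveConflict performs.

```python
def resolve_length_conflict(old, committed, new):
    """Merge two concurrent counter updates by applying both deltas.

    old:       value both transactions started from
    committed: value stored by the transaction that committed first
    new:       value the conflicting transaction is trying to store
    """
    return committed + new - old


# Two transactions both start from 10 and each add 1 locally.
merged = resolve_length_conflict(old=10, committed=11, new=11)
```

Here merged is 12 even though each transaction computed 11 locally, so a
value read from the counter before commit can't be assumed to be the final
one. Only the running total is kept, which is why it says nothing about
which earlier values ended up unused.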

Toby once suggested obtaining an ID in a "short" transaction (get_ID(),
commit()) and then using it in your "long" transaction. This
reduces the chance of getting a conflict, but reintroduces the risk of
orphaned IDs (since the latter transaction may be cancelled - and what
then? Undo the transaction where the ID was obtained?)
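
A minimal simulation of that short/long split, again in plain Python rather
than against the ZODB: IdCounter and run_long_transaction() are
hypothetical names, get_id() plays the part of the short transaction that
commits immediately, and a RuntimeError stands in for the long transaction
being cancelled.

```python
class IdCounter:
    """Toy stand-in for a persistent counter; not a ZODB API."""

    def __init__(self, start=0):
        self.last = start

    def get_id(self):
        # The "short" transaction: bump the counter and commit at once,
        # so no other client is ever handed the same value.
        self.last += 1
        return self.last


def run_long_transaction(counter, work):
    """Allocate an ID up front, then do the real work with it.

    Returns the ID on success, or None if the work was cancelled.
    """
    new_id = counter.get_id()
    try:
        work(new_id)
    except RuntimeError:
        # The long transaction aborts, but the counter bump is already
        # committed, so new_id is now orphaned.
        return None
    return new_id


def save(new_id):
    pass  # pretend the long transaction succeeded


def cancel(new_id):
    raise RuntimeError("user cancelled")
```

Starting from 5562, a successful run gets 5563, a cancelled run burns 5564,
and the next successful run gets 5565. The ID never changes under the user,
but the gap is there to stay unless something goes back and reclaims it.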

See also ChrisM's post on counters, which discussed Length's conflict
resolution:
http://zope.nipltd.com/public/lists/zope-archive.nsf/AGByKey/558F6496C424804A

Anyway, that's half a chapter in a book <wink>.
--
Christian Reis, Senior Engineer, Async Open Source, Brazil.
http://async.com.br/~kiko/ | [+55 16] 261 2331 | NMFL