[Zope-dev] Request for comments: Devilstick persistence/storage

Thu Nov 13 10:05:25 EST 2008

I would like to request comments on our idea of how to use different 
storages for our new model-driven approach, named Devilstick. You 
don't need to know Devilstick or its ideas in depth to give valuable 
input. Rather, it would help us to get input from people who know Zope's 
persistence layer in depth.

Devilstick is a model-driven framework to describe and manage data inside 
and outside the ZODB. Some more information is at http://devilstickproject.net

At the Blackforest sprint in August we researched how to achieve the goal 
of supporting storages other than ZODB. After first thinking about a 
layer of our own, we got the idea of using the usual persistence and 
transaction API of Zope. IIRC it was a result of a conversation between 
Florian Friesdorf and Roger Ineichen and probably others.

Today is the last day of the Bolzano sprint. We researched in depth how 
persistence currently works with ZODB and discussed how to use this 
framework for Devilstick.

Our outcome is a document describing what we found and how we want to use 
all this. It follows here. 

for the devilstick-team
Jens Klein 

=========================================================================
-----------------------------------
Introduction to Devilstick Storages
-----------------------------------

This document describes planned behaviour that is not implemented yet.

One of Devilstick's strengths is that it easily supports storages other 
than ZODB. 

The storage layer relies entirely on Zope's persistence implementation. 
At some entry point we enter the model-driven world of Devilstick: we hit 
a 'Cage'. The Cage itself is not a data-access-object (DAO), but it is the 
bridge to the other storage layer. Inside Devilstick, DAOs are still 
persistent objects. They may still live in the ZODB, but they can live 
completely outside it if needed: in SQL databases, in LDAP, on the 
filesystem, or fetched over a web service. 

For more about DAOs and their API please read API.txt. 

Excursus: Zope's Persistence Framework
--------------------------------------

Classic Zope objects are derived from 'persistent.Persistent'. Those 
objects track their own modifications. Once a modification is detected, 
the object registers with its data-manager, which joins the current 
transaction. All this happens fully transparently in Zope. 

The data-manager is the key to the storage layer. Zope is designed to use
different data-managers. Data-managers are described well in 
'transaction.interfaces.IDataManager'. They take care of storing all data 
in a 2-phase commit. There is usually one data-manager for all modified 
objects of one database.

The transaction-manager collects all data-managers with modifications 
(they are called resources inside the transaction-manager). Once the 
transaction is committed, the 2-phase commit starts: first 'tpc_begin' is 
called on each data-manager, then 'commit' is called on each, then 
'tpc_vote', and finally 'tpc_finish'.
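
The call order described above can be sketched with a toy model. This is 
not the real 'transaction' package: 'ToyDataManager' and 'ToyTransaction' 
are hypothetical stand-ins that only record which method is called when. 

```python
class ToyDataManager:
    """Hypothetical stand-in for an IDataManager; it only records calls."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def tpc_begin(self, txn):
        self.log.append((self.name, "tpc_begin"))

    def commit(self, txn):
        self.log.append((self.name, "commit"))

    def tpc_vote(self, txn):
        self.log.append((self.name, "tpc_vote"))

    def tpc_finish(self, txn):
        self.log.append((self.name, "tpc_finish"))

    def tpc_abort(self, txn):
        self.log.append((self.name, "tpc_abort"))


class ToyTransaction:
    """Hypothetical transaction that drives its joined resources."""

    def __init__(self):
        self.resources = []

    def join(self, data_manager):
        # A data-manager joins once, no matter how many objects changed.
        if data_manager not in self.resources:
            self.resources.append(data_manager)

    def commit(self):
        try:
            # Each phase runs on every resource before the next phase starts.
            for phase in ("tpc_begin", "commit", "tpc_vote"):
                for data_manager in self.resources:
                    getattr(data_manager, phase)(self)
        except Exception:
            # Any failure before the final phase aborts all resources.
            for data_manager in self.resources:
                data_manager.tpc_abort(self)
            raise
        for data_manager in self.resources:
            data_manager.tpc_finish(self)


log = []
txn = ToyTransaction()
txn.join(ToyDataManager("zodb", log))
txn.join(ToyDataManager("sql", log))
txn.commit()
phases = [phase for _, phase in log]
```

The point of the sketch is that no data-manager reaches 'tpc_finish' 
before every data-manager has voted, which is what lets a ZODB storage 
and a second storage commit together.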

After creation, a persistent object has an attribute called '_p_jar' 
set to 'None'. '_p_jar' gets a data-manager set, almost magically, after 
the object is added to a container. The data-manager is copied over 
from the container's '_p_jar' attribute. Container and new object are 
marked as modified and the data-manager joins the transaction. On commit 
both are written to the database. 
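
A minimal sketch of this hand-over, using toy classes only. 'ToyJar', 
'ToyPersistent' and 'ToyContainer' are hypothetical names and do not 
reproduce the real 'persistent' or ZODB implementation; they just mimic 
the behaviour described in the paragraph above. 

```python
class ToyJar:
    """Hypothetical data-manager that remembers which objects changed."""

    def __init__(self):
        self.registered = []

    def register(self, obj):
        if not any(obj is known for known in self.registered):
            self.registered.append(obj)


class ToyPersistent:
    """Hypothetical persistent base class with modification tracking."""

    _p_jar = None
    _p_changed = False

    def __setattr__(self, name, value):
        object.__setattr__(self, name, value)
        # Bookkeeping attributes (_p_*) do not count as modifications.
        if not name.startswith("_p_") and self._p_jar is not None:
            object.__setattr__(self, "_p_changed", True)
            self._p_jar.register(self)


class ToyContainer(ToyPersistent):
    def __init__(self):
        object.__setattr__(self, "_items", {})

    def __setitem__(self, key, obj):
        # The new object inherits the container's data-manager.
        if obj._p_jar is None:
            obj._p_jar = self._p_jar
        self._items[key] = obj
        # Container and new object are both marked as modified.
        for changed in (self, obj):
            changed._p_changed = True
            if changed._p_jar is not None:
                changed._p_jar.register(changed)


jar = ToyJar()
container = ToyContainer()
container._p_jar = jar

item = ToyPersistent()
jar_before = item._p_jar
container["data"] = item
jar_after = item._p_jar
```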

Devilstick persistence
----------------------

To provide other storages we already have a powerful framework: the 
persistence API and the transaction API. Devilstick uses both. To use a 
different storage, only a new data-manager is needed. For several use 
cases, however, it is fine to stay in the ZODB.

Such an alternative data-manager might work differently inside than the 
current ZODB one. Since we deal with SQL or LDAP, we want to update a 
database with one query for several objects involved. So on commit we may 
need to look at the modified objects and build one SQL query from a bunch 
of modifications. Frameworks like SQLAlchemy may help us here for SQL, and 
others are probably available for different use cases.
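
As a rough illustration of the batching idea, here is a sketch that uses 
the standard library's 'sqlite3' instead of SQLAlchemy. The class 
'BatchingSQLDataManager', the table 'molecule' and its columns are 
assumptions made up for this example, and the 'register' signature is 
simplified; a real data-manager would inspect the modified DAOs itself. 

```python
import sqlite3


class BatchingSQLDataManager:
    """Hypothetical data-manager that flushes all changes in one batch."""

    def __init__(self, connection):
        self.connection = connection
        self.modified = {}  # primary key -> pending title

    def register(self, key, title):
        # Later changes to the same object overwrite earlier ones,
        # so each object is written at most once per transaction.
        self.modified[key] = title

    def commit(self, txn=None):
        # One parameterized statement covers every modified object.
        rows = [(title, key) for key, title in self.modified.items()]
        self.connection.executemany(
            "UPDATE molecule SET title = ? WHERE id = ?", rows
        )
        self.modified.clear()

    def tpc_finish(self, txn=None):
        # Only the final phase makes the changes durable.
        self.connection.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE molecule (id TEXT PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO molecule VALUES (?, ?)",
    [("m1", "old"), ("m2", "old"), ("m3", "old")],
)

dm = BatchingSQLDataManager(conn)
dm.register("m1", "first")
dm.register("m2", "second")
dm.register("m1", "first, revised")
dm.commit()
dm.tpc_finish()

rows = conn.execute("SELECT id, title FROM molecule ORDER BY id").fetchall()
```

Collecting the changes per primary key before writing is one possible 
design; it keeps the number of statements independent of how often an 
object was touched during the transaction.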

Entry-Points: Cages
-------------------

We need one point where the data-manager is switched to a different 
storage. A model is assigned there, and the world of generic DAOs is 
entered. This entry point is called a 'Cage'. A cage is still persistent 
in the ZODB and uses Zope's default data-manager. A cage has the root 
container DAO (which is a generic molecule DAO) set as an attribute. Here 
is some example code showing how it looks: 

    >>> cage = Cage()
    >>> cage._p_jar is None
    True

    >>> somezodbcontainer._p_jar
    <Connection at ... >

    >>> somezodbcontainer['data'] = cage
    >>> cage._p_jar    
    <Connection at ... >

    >>> cage._root is None
    True

    >>> cage.model = 'examplemodel'
    >>> cage._root
    <Molecule at ...>

    >>> cage._root._p_jar
    <MyStoragesDataManager at ...>

The cage also bridges the container API of the root molecule. This 
simplifies usage of the API and avoids introducing an extra access 
step on the cage, which makes it more intuitive.

    >>> cage._root.keys()
    ['m1', 'm2', 'm3']

    >>> cage.keys()
    ['m1', 'm2', 'm3']

    >>> cage['m1'] is cage._root['m1']
    True
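
One way the bridging shown above could be implemented is plain 
delegation. The sketch below is an assumption, not the planned Devilstick 
code: 'ToyMolecule' and 'ToyCage' are hypothetical, and the root is set 
directly instead of being created from a model. 

```python
class ToyMolecule:
    """Hypothetical container DAO backed by a plain dict."""

    def __init__(self, children=None):
        self._children = dict(children or {})

    def keys(self):
        return list(self._children)

    def __getitem__(self, key):
        return self._children[key]


class ToyCage:
    """Hypothetical cage that forwards container access to its root."""

    def __init__(self):
        self._root = None

    def keys(self):
        return self._root.keys()

    def __getitem__(self, key):
        return self._root[key]


cage = ToyCage()
cage._root = ToyMolecule(
    {"m1": ToyMolecule(), "m2": ToyMolecule(), "m3": ToyMolecule()}
)
cage_keys = cage.keys()
same_object = cage["m1"] is cage._root["m1"]
```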