[ZODB-Dev] what appears inside zodb storage?
Guido van Rossum
guido@python.org
Wed, 18 Dec 2002 21:28:29 -0500
> I wonder whether someone can verify my current understanding of what appears
> inside a zodb storage like file storage. (If other storages have contents
> that differ substantially, this would be good to know in some detail.)
>
> I understand a storage contains mainly a collection of pickled objects,
> where each object has an oid, and an index that maps oids to objects.
> In addition to this, there are transacted updates to the objects.
>
> Maybe an updated object is updated by writing a new version entirely, and
> making the map cause the oid to refer to the new version while leaving the
> old one alone (without deleting it), so packing is needed to make storage
> smaller.
Yes.
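
To make the "write a new revision, repoint the index, pack later" idea concrete, here is a minimal toy sketch in plain Python. It is not FileStorage's actual on-disk format or API: the class ToyAppendOnlyStorage and its methods are hypothetical names invented for illustration, and the pack step here simply drops superseded revisions, whereas a real pack works relative to a pack time.

```python
import pickle


class ToyAppendOnlyStorage:
    """Toy model only; NOT FileStorage's real format or API (assumption)."""

    def __init__(self):
        self.log = []     # append-only list of (oid, pickle_bytes) records
        self.index = {}   # oid -> position of the current revision in self.log

    def store(self, oid, obj):
        # Never overwrite in place: append a new revision and repoint the index.
        self.log.append((oid, pickle.dumps(obj)))
        self.index[oid] = len(self.log) - 1

    def load(self, oid):
        # The index always leads to the most recent revision.
        _, data = self.log[self.index[oid]]
        return pickle.loads(data)

    def pack(self):
        # Keep only the records the index still points at, in log order.
        new_log, new_index = [], {}
        for oid, pos in sorted(self.index.items(), key=lambda kv: kv[1]):
            new_index[oid] = len(new_log)
            new_log.append(self.log[pos])
        self.log, self.index = new_log, new_index


storage = ToyAppendOnlyStorage()
storage.store(1, {"name": "a"})
storage.store(1, {"name": "b"})   # the old revision of oid 1 stays in the log
storage.store(2, [1, 2, 3])
size_before = len(storage.log)    # stale revision still present
storage.pack()
size_after = len(storage.log)     # stale revision gone
```

The stale revision of oid 1 occupies space until pack() runs, which is why an unpacked storage only ever grows.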
> So my theory is that a file storage contains pickled objects and a map of
> oids to those objects, and maybe old stale versions of objects, and a
> chained linked list of transactions that allow earlier views of the world
> to be taken instead of the last one.
Pretty much. The Berkeley storage uses different data structures but
stores essentially the same conceptual information. (It also has a way
of packing automatically, i.e. garbage collecting revisions of objects
older than a given delay.)
> When zope btrees are used and these are stored persistently (are they
> always stored persistently?) where are the btrees stored?
Each "node" in a BTree is a separate persistent object. If a BTree
consists of 10 nodes and only 3 of those are modified by a particular
transaction, only the pickles for those 3 nodes are written as part of
the transaction record.
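
The 10-nodes, 3-modified case can be sketched as follows. This is a toy model under stated assumptions, not the BTrees package: Node and commit are hypothetical stand-ins, and the "changed" flag only imitates the way a persistent object is marked as modified.

```python
import pickle


class Node:
    """Hypothetical stand-in for one persistent BTree node (assumption)."""

    def __init__(self, oid, items):
        self.oid = oid
        self.items = dict(items)
        self.changed = False

    def set(self, key, value):
        self.items[key] = value
        self.changed = True   # mark this node as modified


def commit(nodes):
    """Return {oid: pickle} containing only the nodes changed since last commit."""
    record = {}
    for node in nodes:
        if node.changed:
            record[node.oid] = pickle.dumps(node.items)
            node.changed = False
    return record


# Ten nodes, each holding three keys; oid N holds keys N*10 .. N*10+2.
nodes = [Node(oid, {oid * 10 + i: None for i in range(3)}) for oid in range(10)]

# One transaction touches keys that live in three different nodes.
nodes[0].set(1, "x")
nodes[4].set(41, "y")
nodes[7].set(72, "z")

record = commit(nodes)
written = sorted(record)        # oids whose pickles went into the record
second = commit(nodes)          # nothing changed since, so nothing is written
```

Splitting the tree into separately stored nodes is what keeps a small update from rewriting the whole structure.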
> Maybe I should be reading the code to verify this model, but I was hoping
> someone on this list could correct me so when I describe this to Chandler
> folks I can get it right.
The comments at the top of FileStorage.py may shed some light.
> My motivation for asking today is a desire for versions in Chandler that
> support synchronization and replication. A simple transaction model
> (which zodb might have, but I don't know) need only have a way to indicate
> which version of an object should apply for a given transaction. It need
> not make it easy to consult specific versions of an object.
In general the latest revision of an object is always used, modulo
(transactional) undo. There's also a concept of "ZODB versions", which
is really long-term locking of selected objects; don't use it.
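
A rough sketch of "latest revision wins, modulo undo", again as a toy model rather than ZODB's API. ToyHistory and its methods are hypothetical names. The sketch assumes that an undone transaction did not create any objects and that no later transaction changed the same objects; it does none of the conflict checking a real storage would need.

```python
class ToyHistory:
    """Toy transaction log; NOT ZODB's undo implementation (assumption)."""

    def __init__(self):
        self.transactions = []   # each entry: {oid: state}, oldest first

    def commit(self, changes):
        self.transactions.append(dict(changes))
        return len(self.transactions) - 1   # toy transaction id

    def load(self, oid, before=None):
        # Latest revision wins; `before` restricts the search to earlier
        # transactions, giving an earlier view of the object.
        end = len(self.transactions) if before is None else before
        for txn in reversed(self.transactions[:end]):
            if oid in txn:
                return txn[oid]
        raise KeyError(oid)

    def undo(self, tid):
        # Undo is itself a new transaction that restores the prior states;
        # nothing already written is removed.
        restored = {oid: self.load(oid, before=tid)
                    for oid in self.transactions[tid]}
        return self.commit(restored)


history = ToyHistory()
history.commit({"a": 1})
t1 = history.commit({"a": 2})
latest = history.load("a")        # state written by t1
history.undo(t1)
after_undo = history.load("a")    # state from before t1
```

Because the undo is recorded as one more transaction, the log keeps growing and the undone revision remains available until a pack removes it.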
--Guido van Rossum (home page: http://www.python.org/~guido/)