[ZODB-Dev] Re: [Dev] ZODB is not a Storage Technology (Re: other formats )
Mike C. Fletcher
mcfletch@rogers.com
Sat, 09 Nov 2002 03:42:35 -0500
Okay, here's a quick overview of the guts, presented as an outline. I've
assumed you'll be reading the summaries with the source-code open in
another window to see what's being described, so I've not gone into any
details as to how anything is done.
The objects likely best to concentrate on for understanding the
low-level guts are the FileStorage, the Connection, and the
_defaultTransaction. I've given you quick summaries of what you'll find
in most of the files in the ZODB4 CVS packages (ZODB, Transaction and
Persistence). The zLOG project is just logging facilities, nothing
really close to the core of the ZODB. The indentation is primarily
showing usage patterns (for instance, fsindex is really only used by
FileStorage AFAIK), though I've also used it to group items which can be
considered sub-categories of the superior item.
I'll work on details tomorrow if I can get some more time. Questions, or
directions in which you'd like more coverage, are quite welcome.
BTW: I've copied the ZODB-dev list so that others can correct anything
I've messed up, or add anything that they consider critical to
understanding the system.
Enjoy,
Mike
ZODB:
Storage (BaseStorage sub-classes):
"""Storages are responsible for maintaining object state records.
They can also maintain undo (transaction) and version records.
"""
FileStorage:
"""Default ZODB storage
The FileStorage is a linear aggregate of all transactions,
and transactions are aggregates of all changed objects.
Transactions are added at the end of the file, with
later changes to a particular object conceptually overwriting
the earlier changes.
Versions (personal views of the dbase) are just transactions
which are declared to have version information. The versions
form linked lists (they point to the last transaction in the
version).
Storages which have undo support (such as FileStorage) have
a pack method which basically copies all objects forward until
there is a single current set, then discards anything not in
the current set.
"""
fsIndex:
"""Index from persistent OID -> file position
The fsIndex provides an optimised index to individual objects
within the data file of the FileStorage. The index can
be rebuilt merely by scanning through the entire data file.
"""
TmpStore:
"""Storage for transaction save-points"""
DBMStorage:
"""Simple storage based on GDBM/AnyDBM"""
MappingStorage:
"""A demonstration of a volatile in-memory storage"""
utility mechanisms:
TimeStamp:
"""TimeStamp C extension type"""
Serialize:
"""Pickle-like storage (cPickle plus some custom code)"""
referencesf:
"""finds object refs in pickle strings"""
file_lock:
"""(small) wrapper to do cross-platform locking of files"""
fsdump, fsrecover:
"""Debugging/utility code"""
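To make the FileStorage and fsIndex summaries above a bit more concrete, here is a toy model in Python. It is only a sketch of the idea (append-only records, an oid -> position index that can be rebuilt by scanning, and a pack that copies current records forward). The class name ToyLogStorage, its methods, and the record layout are all invented for illustration and are not the real FileStorage file format or API. Transactions, undo, and versions are left out entirely.

```python
import io
import struct

# Hypothetical record layout for this sketch only (not the real FileStorage
# format): 8-byte oid, 4-byte big-endian data length, then the data bytes.
HEADER = struct.Struct(">8sI")


class ToyLogStorage:
    """Append-only log of object records plus an oid -> file-position index."""

    def __init__(self, file=None):
        self.file = file if file is not None else io.BytesIO()
        self.index = {}
        self.rebuild_index()

    def store(self, oid, data):
        """Append a new record; older records for the same oid become stale."""
        self.file.seek(0, io.SEEK_END)
        pos = self.file.tell()
        self.file.write(HEADER.pack(oid, len(data)) + data)
        self.index[oid] = pos

    def load(self, oid):
        """Read the current record for oid via the index."""
        self.file.seek(self.index[oid])
        _, length = HEADER.unpack(self.file.read(HEADER.size))
        return self.file.read(length)

    def rebuild_index(self):
        """Recreate the index by scanning the whole file from the start."""
        self.index = {}
        self.file.seek(0)
        while True:
            pos = self.file.tell()
            header = self.file.read(HEADER.size)
            if len(header) < HEADER.size:
                break
            oid, length = HEADER.unpack(header)
            self.file.seek(length, io.SEEK_CUR)  # skip over the data bytes
            self.index[oid] = pos  # a later record replaces an earlier one

    def pack(self):
        """Copy only current records into a fresh file, dropping stale ones."""
        packed = ToyLogStorage()
        for oid in sorted(self.index, key=self.index.get):
            packed.store(oid, self.load(oid))
        self.file, self.index = packed.file, packed.index
```

Storing the same oid twice leaves two records in the file, but the index only points at the later one, which is what "conceptually overwriting" amounts to in this model.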
Connection:
"""Object-space in which application objects live
Uses an in-memory object-cache (see below)
Provides object-access (get root dict, get object by oid)
though normal access is via getting root and then
drilling down through the object references.
Other than this, almost the entire class is support
for the transaction and persistence mechanisms.
"""
ExportImport:
"""Mix-in providing XML import/export"""
DB:
"""Manages multiple Connections to a storage
Provides a pool of connections
Provides mechanisms for applying functions
to all object caches in all connections
Tracks object modifications for versions? (not
sure about this, I've never used versions)
Provides most of the primitives on which Connection and
Transaction build the transaction mechanism. (tpc_*)
"""
Transaction:
_defaultTransaction:
"""The default transaction machinery
Combined with the connection object, this is most
of the transaction-driving code in the system. It
is fairly tightly coupled to the Persistent module
(e.g. it assumes _p_jar and the like on all registered
objects).
"""
Transaction:
"""Data-storage for the current transaction"""
Manager:
"""Entry point for transaction APIs"""
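The tpc_* primitives mentioned under DB, which the transaction machinery above drives, follow the usual two-phase commit shape. The following Python sketch shows that shape only. ToyResource and commit() are hypothetical names, and the method signatures are simplified for illustration; check the source for the real ones.

```python
class ToyResource:
    """Stand-in for a storage taking part in a two-phase commit."""

    def __init__(self, name, fail_vote=False):
        self.name = name
        self.fail_vote = fail_vote
        self.committed = {}
        self._pending = None

    def tpc_begin(self):
        self._pending = {}

    def store(self, key, value):
        self._pending[key] = value

    def tpc_vote(self):
        # Phase one: last chance for this resource to refuse the commit.
        if self.fail_vote:
            raise RuntimeError(self.name + " cannot commit")

    def tpc_finish(self):
        # Phase two: make the pending changes permanent.
        self.committed.update(self._pending)
        self._pending = None

    def tpc_abort(self):
        # Throw the pending changes away.
        self._pending = None


def commit(changes, resources):
    """Apply changes to every resource, or to none of them."""
    started = []
    try:
        for resource in resources:
            resource.tpc_begin()
            started.append(resource)
            for key, value in changes.items():
                resource.store(key, value)
        for resource in resources:
            resource.tpc_vote()
    except Exception:
        for resource in started:
            resource.tpc_abort()
        return False
    for resource in resources:
        resource.tpc_finish()
    return True
```

The point of the vote step is that nothing becomes permanent until every participant has agreed, so a failure in one resource leaves the others untouched.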
Persistence:
_persistent:
"""Python 2.2.2 implementation of IPersistent
Basically, this is a pure-Python version of the cPersistence
code that really gets used (I'm not sure if there's code
anywhere to fall back to using this version if the cPersistence
code isn't compiled).
This is quite useful for figuring out what's going on,
but (having used it for a few months), it seemed too slow
to be of use in a real-world system (too much time spent in
__getattribute__).
"""
cPersistence:
"""Provides optimised IPersistent implementation"""
Cache:
"""Provides an in-memory object cache to reduce reloads from disk
Basically this is a high-level cache; it has a target size
and a few methods implementing garbage collection. The
DB calls the connection's GC methods, then the connection calls
its cache's GC methods.
"""
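The delegation chain described for the Cache (DB -> Connection -> cache) can be sketched in Python as below. All of the class and method names are hypothetical, and the least-recently-used eviction policy is an assumption of the sketch rather than a description of what the real cache does. In particular, this toy simply drops entries.

```python
from collections import OrderedDict


class ToyCache:
    """Object cache with a target size; least recently used entries go first."""

    def __init__(self, target_size):
        self.target_size = target_size
        self._data = OrderedDict()

    def get(self, oid, default=None):
        if oid not in self._data:
            return default
        self._data.move_to_end(oid)  # mark as most recently used
        return self._data[oid]

    def set(self, oid, obj):
        self._data[oid] = obj
        self._data.move_to_end(oid)

    def incremental_gc(self):
        """Evict least recently used entries until back at the target size."""
        evicted = []
        while len(self._data) > self.target_size:
            oid, _ = self._data.popitem(last=False)
            evicted.append(oid)
        return evicted

    def __len__(self):
        return len(self._data)


class ToyConnection:
    """Owns one cache and forwards GC requests to it."""

    def __init__(self, target_size):
        self.cache = ToyCache(target_size)

    def cache_gc(self):
        return self.cache.incremental_gc()


class ToyDB:
    """Fans a GC request out to every connection it manages."""

    def __init__(self, connections):
        self.connections = list(connections)

    def cache_gc(self):
        return [conn.cache_gc() for conn in self.connections]
```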
particular data-types:
PersistentDict, PersistentList:
"""Dictionary and List types which track their changes
Basically allow you to use them as lists/dicts without
needing to spend code tracking changes yourself. These
items, however, re-store the entire list/dict on each
save, so see BTrees for large dicts.
"""
BTrees:
"""BTree implementation using individually persistent nodes
Allows large dictionaries to be stored so that only a small
sub-set of the dictionary needs to be re-stored on modifications
"""
Function, Module, Package:
"""References to these types w/ importing
Never used these myself (I think they're new),
they appear to store name-references, or actual
code objects in the case of functions.
"""
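The cost difference between PersistentDict and BTrees noted above can be illustrated with a small Python model. It assumes nothing about the real implementations: ToyJar, ToyPersistentDict and ToyBucketedDict are invented names, and plain hash buckets stand in for individually persistent nodes purely to show how much gets re-stored after a single change. A real BTree is an ordered tree, which this is not.

```python
class ToyJar:
    """Records which objects would be written out at commit time."""

    def __init__(self):
        self.changed = []

    def register(self, obj):
        # Compare by identity so each changed object is listed once.
        if not any(existing is obj for existing in self.changed):
            self.changed.append(obj)


class ToyPersistentDict:
    """Mapping that flags itself as changed on every mutation."""

    def __init__(self, jar):
        self._jar = jar
        self._data = {}

    def __setitem__(self, key, value):
        self._data[key] = value
        self._jar.register(self)  # the whole dict will be re-stored

    def __getitem__(self, key):
        return self._data[key]

    def stored_items(self):
        return len(self._data)


class ToyBucketedDict:
    """Mapping split into separately stored pieces."""

    def __init__(self, jar, bucket_count=16):
        self._buckets = [ToyPersistentDict(jar) for _ in range(bucket_count)]

    def _bucket(self, key):
        return self._buckets[hash(key) % len(self._buckets)]

    def __setitem__(self, key, value):
        self._bucket(key)[key] = value  # only this one bucket is flagged

    def __getitem__(self, key):
        return self._bucket(key)[key]


def items_rewritten(jar):
    """Count the items that the changed objects would write out."""
    return sum(obj.stored_items() for obj in jar.changed)
```

With a single flat dict, changing one key means every item is written again; with the mapping split into pieces, only the piece holding that key is.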
John Anderson wrote:
> I'd be interested in an overview of the guts. Start with a big
> picture, then move into some details and describe what's in which
> files. I'd like to eventually learn the code base so I can decide how
> to improve it.
>
> John
>
> Mike C. Fletcher wrote:
>
>> At what level would you like the description (I've been using ZODB
>> for years now, and have just released a calendaring application on
>> it). I assume you understand the basics, so are you looking for
>> analysis of where/how it starts to fail/how to update it, or what the
>> actual machinery inside is doing for any given action?
>>
>> I'll push some time around and try to get a description posted this
>> weekend if you can tell me which area you need.
>>
>> Enjoy,
>> Mike
>>
...