[ZODB-Dev] High Write Applications
Phillip J. Eby
pje at telecommunity.com
Sat Aug 2 17:49:21 EDT 2003
At 03:32 PM 8/2/03 +0100, Chris Withers wrote:
>Phillip J. Eby wrote:
>>The root cause is that RDBMS transaction logs record high-level details,
>>not blob snapshots. For example, when an RDBMS logs that you changed row
>>#2967's "foo" column to value "bar", it doesn't usually also log the
>>entire data page the row was contained in, plus copies of all the b-tree
>>index pages that changed as a consequence of the change.
>>Thus, ZODB's disk usage per write transaction generally exceeds RDBMS
>>disk usage for the same transaction by at *least* an order of magnitude,
>>even before catalog indexes come into play.
>
>oh :-(
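To make the size gap concrete, here is a toy sketch (not ZODB or RDBMS code; the object and column names are made up) comparing a snapshot-style log record, which stores the whole pickled state on every change, against a logical record that stores only the change itself:

```python
import pickle

# Hypothetical persistent object state: a mapping with many fields,
# standing in for a full object state ZODB would pickle.
state = {"field%d" % i: "value%d" % i for i in range(100)}
state["foo"] = "old"

# Snapshot-style logging (ZODB FileStorage): the entire pickled state
# is appended to the log, however small the change.
snapshot_record = pickle.dumps(state)

# Logical logging (RDBMS-style): only the change is recorded,
# e.g. (row id, column, new value).
state["foo"] = "bar"
logical_record = pickle.dumps(("row-2967", "foo", "bar"))

# The snapshot record dwarfs the logical one for the same change.
print(len(snapshot_record), len(logical_record))
```

On this toy state the snapshot record is dozens of times larger than the logical record, which is the "order of magnitude" being described.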
>
>>To do this, ZODB would have to be able to understand and log
>>*differences*, rather than just snapshotting object states.
>
>How would it go about doing this?
If I could answer that question, I'd have suggested Jim implement it years
ago. :)
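For illustration only, here is one toy shape such difference-logging could take (this is not ZODB code and not a proposal; the names are invented): each log entry names the object, attribute, and new value, and recovery replays the entries in order.

```python
# A toy difference log: entries are (oid, attribute, new value).
log = []

def record_change(oid, attr, value):
    # Log only the change, not a snapshot of the whole object.
    log.append((oid, attr, value))

def replay(entries, objects):
    # Recovery: reapply each logged difference in order to rebuild
    # the current state of every object.
    for oid, attr, value in entries:
        objects.setdefault(oid, {})[attr] = value
    return objects

record_change("obj-2967", "foo", "bar")
record_change("obj-2967", "foo", "baz")
print(replay(log, {}))  # {'obj-2967': {'foo': 'baz'}}
```

The hard part, which this sketch ignores, is computing such differences automatically from arbitrary pickled object states.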
>>And, it would need to be able to manage periodic checkpointing, so that
>>recovering a database wouldn't require rerunning all the transactions
>>that had ever been done on it.
>
>Hmmm, I'm no expert in this kind of thing. How does periodic checkpointing
>work?
It just means that there needs to be a consistent snapshot of the entire
database available on disk. Last I looked, ZODB's FileStorage did this by
using a "quick-load" index file with pointers into the transaction log for
the current version of each object. BerkeleyDB does checkpoints when asked
to; this amounts to ensuring that all dirty DB pages in memory are written
to disk and a filesystem sync() is performed.
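As a minimal sketch of that idea (not BerkeleyDB's actual API; the function and file names here are made up), a checkpoint flushes buffered state, forces it to disk, and durably records the log position so recovery can skip everything before it:

```python
import os

def checkpoint(db_file, log_position, marker_path):
    """Flush dirty state to disk and record a recovery starting point."""
    # Push buffered writes to the OS, then force them to stable storage.
    db_file.flush()
    os.fsync(db_file.fileno())
    # Durably note the log position; recovery replays from here onward
    # instead of rerunning every transaction ever committed.
    with open(marker_path, "w") as marker:
        marker.write(str(log_position))
        marker.flush()
        os.fsync(marker.fileno())
```

A real checkpointer must also order the page writes against the log (write-ahead logging), but the disk-usage point stands: after a checkpoint, log records older than the marker are no longer needed for recovery.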
>>By the way, if you've looked at systems like Prevayler,
>
>I haven't ;-) Where can I read more?
Google for "Prevayler".