[Zodb-checkins] CVS: ZODB3/bsddb3Storage/bsddb3Storage - Full.py:1.47
Barry Warsaw
barry@wooz.org
Fri, 8 Nov 2002 14:35:52 -0500
Update of /cvs-repository/ZODB3/bsddb3Storage/bsddb3Storage
In directory cvs.zope.org:/tmp/cvs-serv7162
Modified Files:
Full.py
Log Message:
A new algorithm for packing which seems much more straightforward.
Here's how it works:
- On every store(), we write an entry to an objrev table containing the
tuple of information (newserial, oid, oldserial). We don't write
this entry if the store is the first revision of an object on a new
version.
We do basically the same thing on restore() and transactionalUndo().
- On an abortVersion(), we write two entries to the objrev table: one
that has (newserial, oid, oldserial) -- which points to the old
serial in the version, and one that has (newserial, oid, nvserial) --
which points to the non-version revision of the version revision.
- On commitVersion(), we do the same as abortVersion() except that we
don't write the non-version data if we're committing to a different
version.
- Now, when we pack, all we need to do is cruise from the beginning of
the objrev table until we find an entry with a newserial > packtime.
For each entry before that point, if the oldserial is ZERO, it's an
object creation event which we don't need to worry about because
there's no previous revision. Otherwise, we can delete the
oid+oldserial revision because we know it's not current. We do this,
updating pickle refcounts and then collecting any objects that are
left unreferenced.
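The pack pass described above can be sketched roughly as follows. This
is an illustration only, not the code in Full.py: it is written for
Python 3, it uses an in-memory list and dict in place of the
BerkeleyDB tables, and the names pack_objrevs and revisions are
hypothetical. It also assumes that entries already processed are
dropped from the objrev table, and it leaves out the pickle refcount
updates and the collection of unreferenced objects.

```python
ZERO = b'\0' * 8  # oldserial value marking an object creation event


def pack_objrevs(objrevs, revisions, packtime):
    """Walk objrev entries in newserial order up to packtime.

    objrevs   -- list of (newserial, oid, oldserial), sorted by newserial
    revisions -- dict mapping oid+serial (a revision id) to pickle data
    packtime  -- 8-byte serial; entries with newserial > packtime are kept

    Returns (deleted revision ids, objrev entries still to be processed).
    """
    deleted = []
    stop = len(objrevs)
    for i, (newserial, oid, oldserial) in enumerate(objrevs):
        if newserial > packtime:
            # First entry past the pack time: stop here.
            stop = i
            break
        if oldserial == ZERO:
            # Object creation: there is no previous revision to delete.
            continue
        revid = oid + oldserial
        if revisions.pop(revid, None) is not None:
            # The old revision is known not to be current, so drop it.
            deleted.append(revid)
    return deleted, objrevs[stop:]
```

Because serials are fixed-width 8-byte strings, plain byte comparison
orders them the same way as the timestamps they encode, which is what
lets the loop stop at the first entry past packtime.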
The cute thing is that autopacking will use the same algorithm. The
main difference between autopack and classic pack is that the latter
does a mark and sweep garbage collection phase after the normal objrev
collection phase. Also, this algorithm means autopack needs only
three pieces of information:
- How often the thread should run (e.g. once per hour)
- How far in the past it should pack (e.g. pack to 4 hours ago). We
don't need a start time for the autopack window, because we'll
always just start at the beginning of the objrev table.
- How often autopack should also do a classic pack (e.g. do a classic
pack once per day).
Autopack isn't implemented in this checkin, but I believe it will be
nearly trivial to add. That comes next.
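Since autopack isn't implemented yet, the following is only a guess at
how the three settings above might fit together, written for Python 3.
The names AutopackConfig, frequency, packtime, classicpack and
plan_run are all hypothetical and do not come from the checkin.

```python
from dataclasses import dataclass


@dataclass
class AutopackConfig:
    frequency: float    # seconds between autopack runs (e.g. 3600)
    packtime: float     # how far in the past to pack to, in seconds
    classicpack: float  # seconds between classic (mark and sweep) packs


def plan_run(config, now, last_classic):
    """Decide what one autopack run should do.

    Returns (pack_to, do_classic): the timestamp to pack up to, and
    whether this run should also do the classic garbage collection
    phase.  No start time is needed because packing always begins at
    the start of the objrev table.
    """
    pack_to = now - config.packtime
    do_classic = (now - last_classic) >= config.classicpack
    return pack_to, do_classic
```

With the example values from the list (run hourly, pack to 4 hours
ago, classic pack daily), a run only adds the mark and sweep phase
once a full day has passed since the previous classic pack.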
=== ZODB3/bsddb3Storage/bsddb3Storage/Full.py 1.46 => 1.47 === (881/981 lines abridged)
--- ZODB3/bsddb3Storage/bsddb3Storage/Full.py:1.46 Tue Nov 5 18:07:31 2002
+++ ZODB3/bsddb3Storage/bsddb3Storage/Full.py Fri Nov 8 14:35:51 2002
@@ -24,7 +24,7 @@
# This uses the Dunn/Kuchling PyBSDDB v3 extension module available from
# http://pybsddb.sourceforge.net. It is compatible with release 3.4 of
-# PyBSDDB3.
+# PyBSDDB3. The only recommended version of BerkeleyDB is 4.0.14.
from bsddb3 import db
from ZODB import POSException
@@ -41,21 +41,15 @@
# functionality.
from BerkeleyBase import BerkeleyBase
-# Flags for transaction status in the transaction metadata table. You can
-# only undo back to the last pack, and any transactions before the pack time
-# get marked with the PROTECTED_TRANSACTION flag. An attempt to undo past a
-# PROTECTED_TRANSACTION will raise an POSException.UndoError. By default,
-# transactions are marked with the UNDOABLE_TRANSACTION status flag.
-UNDOABLE_TRANSACTION = 'Y'
-PROTECTED_TRANSACTION = 'N'
-
ABORT = 'A'
COMMIT = 'C'
PRESENT = 'X'
ZERO = '\0'*8
+
+# Special flag for uncreated objects (i.e. Does Not Exist)
DNE = '\377'*8
# DEBUGGING
-#DNE = 'nonexist' # does not exist
+#DNE = 'nonexist'
try:
# Python 2.2
@@ -91,7 +85,8 @@
#
# - Object ids (oid) are 8-bytes
# - Objects have revisions, with each revision being identified by a
- # unique serial number.
+ # unique serial number. We sometimes refer to 16-byte strings of
+ # oid+serial as a revision id.
# - Transaction ids (tid) are 8-bytes
# - Version ids (vid) are 8-bytes
# - Data pickles are of arbitrary length
@@ -138,16 +133,9 @@
# prevrevid is the tid pointing to the previous state of the
# object. This is used for undo.
#
[-=- -=- -=- 881 lines omitted -=- -=- -=-]
- return tid, status, user, desc, ext
+ packtime = self._last_packtime()
+ if tid <= packtime:
+ packedp = True
+ else:
+ packedp = False
+ userlen, desclen = unpack('>II', data[:8])
+ user = data[8:8+userlen]
+ desc = data[8+userlen:8+userlen+desclen]
+ ext = data[8+userlen+desclen:]
+ return tid, packedp, user, desc, ext
finally:
if c:
c.close()
@@ -1678,14 +1741,14 @@
if self._closed:
raise IOError, 'iterator is closed'
# Let IndexErrors percolate up.
- tid, status, user, desc, ext = self._storage._nexttxn(
+ tid, packedp, user, desc, ext = self._storage._nexttxn(
self._tid, self._first)
self._first = False
# Did we reach the specified end?
if self._stop is not None and tid > self._stop:
raise IndexError
self._tid = tid
- return _RecordsIterator(self._storage, tid, status, user, desc, ext)
+ return _RecordsIterator(self._storage, tid, packedp, user, desc, ext)
def close(self):
self._closed = True
@@ -1715,14 +1778,14 @@
description = None
_extension = None
- def __init__(self, storage, tid, status, user, desc, ext):
+ def __init__(self, storage, tid, packedp, user, desc, ext):
self._storage = storage
self.tid = tid
# Impedence matching
- if status == UNDOABLE_TRANSACTION:
- self.status = ' '
- else:
+ if packedp:
self.status = 'p'
+ else:
+ self.status = ' '
self.user = user
self.description = desc
self._extension = ext
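
The _nexttxn() hunk above decodes a transaction metadata record as two
big-endian 32-bit lengths followed by the user, description and
extension data, and the iterator now derives the 'p' status from the
last pack time instead of a stored flag. A minimal Python 3 sketch of
that layout follows. The function names are hypothetical, and the
encoder is only inferred as the inverse of the decoding shown in the
diff; the code that actually writes the record is in the omitted
lines.

```python
from struct import pack, unpack


def encode_txn_meta(user, desc, ext):
    # Assumed inverse of the decoder: two '>I' lengths, then the data.
    return pack('>II', len(user), len(desc)) + user + desc + ext


def decode_txn_meta(data):
    # Mirrors the slicing shown in the diff above.
    userlen, desclen = unpack('>II', data[:8])
    user = data[8:8 + userlen]
    desc = data[8 + userlen:8 + userlen + desclen]
    ext = data[8 + userlen + desclen:]
    return user, desc, ext


def status_flag(tid, packtime):
    # A transaction at or before the last pack time is reported as
    # packed ('p'); anything later gets the blank status.
    return 'p' if tid <= packtime else ' '
```

The extension data needs no length field of its own because it is
simply whatever follows the user and description bytes.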