[Zope-Checkins] CVS: Zope/lib/python/Products/Transience - HowTransienceWorks.stx:1.1 Transience.py:1.25 TransienceInterfaces.py:1.14 TransientObject.py:1.7

Chris McDonough chrism@zope.com
Thu, 20 Jun 2002 21:51:44 -0400


Update of /cvs-repository/Zope/lib/python/Products/Transience
In directory cvs.zope.org:/tmp/cvs-serv20260

Modified Files:
	Transience.py TransienceInterfaces.py TransientObject.py 
Added Files:
	HowTransienceWorks.stx 
Log Message:
New TransientObjectContainer implementation.

Changes:

 - More stable under high usage, especially in the face of
   situations under which there are many ZODB conflict
   errors. The previous implementation had stability problems
   when many conflict errors were encountered; especially
   conflicts that were generated as a result of a simultaneous
   change to a subobject of the TOC (such as in the case of a Zope
   application which makes heavy use of both frames and
   sessions).
   
 - More conflict-resistant.  Instead of ignoring the likelihood
   that multiple threads will attempt to perform the same actions
   simultaneously in methods of the TOC (which often causes
   conflicts), the new implementation attempts to avoid conflicts
   by employing a chance-based housekeeping model.  In this model,
   one thread is "elected" by chance to do the kinds of tasks that
   cause the most conflicts.

 - Now uses a "timeslice" based model instead of a "ring" based
   model.  This also helps cut down on conflicts and makes
   the code slightly less obfuscated (not much, though! ;-)

 - Quite a few more comments in the code.

 - Changes to the sessioning stresstest (which exposed the
   bug that made me reimplement the TOC in the first place).

 - Updates to unit tests.

 - A "HowTransienceWorks.stx" document which attempts to
   explain how the code works.  It's not stellar, but
   it's a start.

 - Changes to the TransientObject class that the TOC
   hands out (typically as a "session data object"), in order
   to make invalidation less Rube-Goldberg-ish.

The structure of the TOC object has changed enough that, in order to
maintain backwards compatibility, "old" instances are upgraded in
place when they are run with this code.  "Upgraded" instances
remain backwards-compatible, however, so folks can hopefully
move back and forth between Zope versions without much hassle.



=== Added File Zope/lib/python/Products/Transience/HowTransienceWorks.stx ===
How Transience Works

  The object responsible for managing the expiration of "transient"
  objects is the TransientObjectContainer, the class definition for
  which is located in
  Products.Transience.Transience.TransientObjectContainer.  An
  instance of this class is found in the default Zope installation at
  /temp_folder/session_data.

  The TransientObjectContainer (TOC) holds Transient Objects (TOs).

  A TO is obtained from its container via a call to
  TOC.new_or_existing(key), where "key" is usually the "browser id"
  associated with a visitor (See Products.Session.BrowserIdManager).

  If the TOC has a "current" TO corresponding to "key", it is
  returned.

  If the TOC does not have a "current" TO corresponding to "key" (due
  to the expiration of the TO or because it never existed in the first
  place), a "new" TO is manufactured and returned.

Timeslices

  Transience defines the notion of a "timeslice".  A "timeslice" is an
  integer that represents some "slice" of time, defined by a "period".
  For example, if a period is 20 seconds long, three ordered time
  slices might be expressed as 0, 20, and 40.  The next timeslice
  would be 60, and so on.  For an absolute time to "belong" to a
  timeslice, it would need to be equal to or greater than one
  timeslice integer, but less than the subsequent timeslice integer.
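
  The mapping from an absolute time to its timeslice can be sketched
  as follows.  This is only an illustration of the rule described
  above; the function name "timeslice_for" and its "period" default
  are assumptions, not names taken from Transience.py.

```python
def timeslice_for(when, period=20):
    """Return the timeslice integer that the absolute time `when` belongs to.

    `timeslice_for` is a hypothetical helper, not part of Transience.py.
    """
    # Round down to the nearest multiple of the period.
    return int(when) // period * period

# With a 20-second period, times from 20 up to (but not including) 40
# all belong to timeslice 20.
slices = [timeslice_for(t) for t in (0, 19.9, 20, 39, 40)]
```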

Data Structures Maintained by a Transient Object Container

  The TOC maintains five important kinds of data structures:

  - a "_data" structure, which is an IOBTree mapping a "timeslice"
    integer to a "bucket" (see next bullet for definition of bucket).

  - One or more "buckets", which are OOBTree objects which map a "key"
    (usually browser id) to a TransientObject.  Buckets are stored
    inside of the "_data" structure.  There is a concept of a
    "current" bucket, which is the bucket that is contained within the
    _data structure under a key equal to the "current" timeslice.

  - An "index" which is an OOBTree mapping transient object "key" to
    "timeslice", letting us quickly figure out which element in the _data
    mapping contains the transient object related to the key.  It is
    stored as the attribute "_index" of the TOC.  When calling code
    wants to obtain a Transient Object, its key is looked up in the
    index, which returns a timeslice.  We ask the _data structure for the
    bucket it has stored under that timeslice.  Then the bucket is asked
    for the object stored under the key.  This returns the Transient Object.

  - A "last timeslice" integer, which is equal to the "last" timeslice
    under which TOC housekeeping operations were performed.

  - A "next to deindex" integer, which is a timeslice
    representing the next bucket which requires "deindexing"
    (the removal of all the keys of the bucket from the index).
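
  The three-step lookup described in the "index" bullet above (key to
  timeslice, timeslice to bucket, bucket to object) might look roughly
  like this.  Plain dicts stand in for the IOBTree and OOBTree
  objects, and the names "data", "index" and "lookup" are assumptions
  made for illustration only.

```python
# Plain dicts stand in for the BTrees used by the real container.
data = {0: {}, 20: {"browser-id-1": "session data"}, 40: {}}  # timeslice -> bucket
index = {"browser-id-1": 20}                                  # key -> timeslice

def lookup(key, default=None):
    """Resolve key -> timeslice -> bucket -> transient object (hypothetical)."""
    ts = index.get(key)
    if ts is None:
        # Not in the index: the object expired or never existed.
        return default
    bucket = data[ts]
    return bucket.get(key, default)
```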

  When a Transient Object is created via new_or_existing, it is added
  to the "current" bucket.  As time goes by, the bucket to which the
  TO was added ceases to be the "current" bucket.  If the transient
  object is "accessed" (it is called up out of the TOC via the TOC's
  'get' method), it is again moved to the "current" bucket defined by
  the current time's timeslice.

  During the course of normal operations, a TransientObject will move
  from an "old" bucket to the "current" bucket many times, as long as
  it continues to be accessed.  It is possible for a TransientObject
  to *never* expire, as long as it is called up out of its TOC often
  enough.

  If a TransientObject is not accessed in the period of time defined by
  the TOC's "timeout", it is deindexed and eventually garbage collected.
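
  A minimal, non-persistent sketch of how an access moves an object
  into the "current" bucket is shown here.  "TinyContainer" is a
  hypothetical stand-in rather than the real TransientObjectContainer:
  it takes the current time as an explicit "now" argument, uses dicts
  instead of BTrees, and leaves out timeout handling, housekeeping and
  conflict resolution entirely.

```python
class TinyContainer:
    """Hypothetical stand-in for a TOC; not the class in Transience.py."""

    def __init__(self, period=20):
        self.period = period
        self.data = {}    # timeslice -> bucket (dict of key -> object)
        self.index = {}   # key -> timeslice

    def _current_slice(self, now):
        return int(now) // self.period * self.period

    def new_or_existing(self, key, now):
        found = self.get(key, now)
        if found is not None:
            return found
        # No current object for this key: create one in the current bucket.
        ts = self._current_slice(now)
        obj = {}
        self.data.setdefault(ts, {})[key] = obj
        self.index[key] = ts
        return obj

    def get(self, key, now):
        old_ts = self.index.get(key)
        if old_ts is None:
            return None
        obj = self.data[old_ts][key]
        new_ts = self._current_slice(now)
        if new_ts != old_ts:
            # Accessing the object moves it into the current bucket.
            del self.data[old_ts][key]
            self.data.setdefault(new_ts, {})[key] = obj
            self.index[key] = new_ts
        return obj

toc = TinyContainer()
session = toc.new_or_existing("abc", now=5)   # lands in the bucket for timeslice 0
same = toc.get("abc", now=45)                 # moved to the bucket for timeslice 40
```

  Because every access re-files the object under the current
  timeslice, an object that keeps being accessed never ends up in a
  bucket old enough to be deindexed.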

How the TransientObjectContainer Determines if a TransientObject is "Current"

  A TO is current if it has an entry in the "index".  When a TO has an
  entry in the index, it implies that the TO resides in a bucket that
  is no "older" than the TOC timeout period, based on the bucket's
  timeslice.

Housekeeping: Finalization, Notification, Garbage Collection, and
Bucket Replenishing

  The TOC performs "deindexing", "notification", "garbage
  collection", and "bucket replenishing".  It performs these tasks
  "in-band".  This means that the TOC does not maintain a separate
  thread that wakes up every so often to do these housekeeping tasks.
  Instead, during the course of normal operations, the TOC
  opportunistically performs them.

  Deindexing is defined as the act of making an "expired" TO
  inaccessible (by deleting it from the "index").  After a TO is
  deindexed, it may not be used by application code any longer,
  although it may "stick around" in a bucket for a while until the
  bucket is eventually garbage collected.

  Notification is defined as optionally calling a function at TOC
  finalization time.  The optional function call is user-defined, but
  it is managed by the "notifyDestruct" method of the TOC.

  Garbage collection is defined as deleting "expired" buckets in the
  _data structure (the _data structure maps a timeslice to a bucket).

  Bucket replenishing is defined as the action of (opportunistically)
  creating more buckets to insert into the _data structure,
  replacing ones that are deleted during garbage collection.  The act
  of deleting a bucket does not necessarily imply that a new bucket
  will be immediately created thereafter.  We create new buckets in
  batches to reduce the possibility of conflicts.
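
  One in-band housekeeping pass could be sketched as below.  This is
  an illustration under stated assumptions, not the algorithm in
  Transience.py: the function name "housekeep", the "batch" size, the
  exact expiry cutoff, and the extra one-period grace before a bucket
  is garbage collected are all invented for the example.  Notification
  and the chance-based election of a housekeeping thread are left out.

```python
def housekeep(data, index, now_slice, period=20, timeout=60, batch=3):
    """One housekeeping pass over plain-dict stand-ins (hypothetical)."""
    deindex_before = now_slice - timeout       # assumed expiry cutoff
    collect_before = deindex_before - period   # assumed extra grace period

    for ts in sorted(data):
        if ts >= deindex_before:
            break
        # Deindexing: expired keys become unreachable through the index,
        # but the objects stay in their bucket for now.
        for key in data[ts]:
            if index.get(key) == ts:
                del index[key]
        # Garbage collection: drop buckets that are older still.
        if ts < collect_before:
            del data[ts]

    # Replenishing: add a batch of empty buckets for upcoming timeslices.
    for i in range(batch):
        data.setdefault(now_slice + i * period, {})

data = {0: {"a": 1}, 20: {"b": 2}, 100: {"c": 3}}
index = {"a": 0, "b": 20, "c": 100}
housekeep(data, index, now_slice=100)
```

  After this pass "a" and "b" are both deindexed, the bucket for
  timeslice 0 is gone, and the bucket for timeslice 20 still holds
  "b" until a later pass collects it.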

Goals

 - A low number of ZODB conflict errors (which reduce performance).

 - Stability.

To Do

  - Testing under ZEO.



=== Zope/lib/python/Products/Transience/Transience.py 1.24 => 1.25 === (976/1076 lines abridged)
 ##############################################################################
 """
-Transient Object Container class.
+Transient Object Container Class ('timeslice'-based design).
 
 $Id$
 """
@@ -20,33 +20,37 @@
 
 import Globals
 from Globals import HTMLFile
-from TransienceInterfaces import ItemWithId,\
+from TransienceInterfaces import Transient, DictionaryLike, ItemWithId,\
+     TTWDictionary, ImmutablyValuedMappingOfPickleableObjects,\
      StringKeyedHomogeneousItemContainer, TransientItemContainer
-from TransientObject import TransientObject
 from OFS.SimpleItem import SimpleItem
 from Persistence import Persistent
+from Acquisition import Implicit
 from AccessControl import ClassSecurityInfo, getSecurityManager
 from AccessControl.SecurityManagement import newSecurityManager
 from AccessControl.User import nobody
-from BTrees import OOBTree
+from BTrees.OOBTree import OOBTree, OOBucket, OOSet
+from BTrees.IOBTree import IOBTree
 from BTrees.Length import Length
 from zLOG import LOG, WARNING, BLATHER
-import os, math, time, sys, random
+import os.path
+import os
+import math, sys, random
+import time
+from types import InstanceType
+from TransientObject import TransientObject
+import thread
+import ThreadLock
+import Queue
 
-DEBUG = os.environ.get('Z_TOC_DEBUG', '')
+_marker = []
 
-def DLOG(*args):
-    tmp = []
-    for arg in args:
-        tmp.append(str(arg))
-    LOG('Transience DEBUG', BLATHER, ' '.join(tmp))
+DEBUG = os.environ.get('Z_TOC_DEBUG', '')
 
 class MaxTransientObjectsExceeded(Exception): pass
 

[-=- -=- -=- 976 lines omitted -=- -=- -=-]

+    """
+    A persistent object representing a typically increasing integer that
+    has conflict resolution uses the greatest integer out of the three
+    available states
+    """
+    def __init__(self, v):
+        self.value = v
+
+    def set(self, v):
+        self.value = v
+        
+    def __getstate__(self):
+        return self.value
+
+    def __setstate__(self, v):
+        self.value = v
+
+    def __call__(self):
+        return self.value
+
+    def _p_resolveConflict(self, old, state1, state2):
+        DEBUG and TLOG('Resolving conflict in Increaser')
+        if old <= state1 <= state2: return state2
+        if old <= state2 <= state1: return state1
+        return old
+
+    def _p_independent(self):
+        return 1
 
 class Ring(Persistent):
-    """ ring of buckets """
+    """ ring of buckets.  This class is only kept for backwards-compatibility
+    purposes (Zope 2.5X). """
     def __init__(self, l, index):
         if not len(l):
             raise "ring must have at least one element"
+        DEBUG and TLOG('initial _ring buckets: %s' % map(oid, l))
         self._data = l
         self._index = index
 
@@ -497,9 +898,4 @@
     def _p_independent(self):
         return 1
 
-    # this should really have a _p_resolveConflict, but
-    # I've not had time to come up with a reasonable one that
-    # works in every circumstance.
-
 Globals.InitializeClass(TransientObjectContainer)
-


=== Zope/lib/python/Products/Transience/TransienceInterfaces.py 1.13 => 1.14 ===
     def get(k, default='marker'):
         """
-        Return value associated with key k.  If k does not exist and default
-        is not marker, return default, else raise KeyError.
+        Return value associated with key k.  Return None or default if k
+        does not exist.
         """
 
     def has_key(k):


=== Zope/lib/python/Products/Transience/TransientObject.py 1.6 => 1.7 ===
 
     def invalidate(self):
+        if hasattr(self, '_invalid'):
+            # we dont want to invalidate twice
+            return
+        trans_ob_container = getattr(self, 'aq_parent', None)
+        if trans_ob_container is not None:
+            if trans_ob_container.has_key(self.token):
+                del trans_ob_container[self.token]
         self._invalid = None
 
     def isValid(self):