[Zodb-checkins] SVN: ZODB/trunk/ Improve thread-death error reporting in tests.

Tue Aug 10 21:56:43 EDT 2004

Log message for revision 26990:
  Improve thread-death error reporting in tests.
  
  Reworked the way some of the ZEO tests use threads, so that unittest is
  more likely to notice the real cause of a failure (which usually occurs in
  a thread), and less likely to latch on to spurious problems resulting from
  the real failure.
  
  Detail:  The TestThread class got the (unittest) test case as an argument,
  and remembered it.  If the run() method saw the thread die with an
  exception, it called testcase.fail() with the formatted exception.  But
  this can't work as apparently intended:  unittest's fail() raises an
  AssertionError then, and the piece of unittest running the test expects
  to catch that exception and append the message to the result object's
  list of failures.  But the piece of unittest running the test isn't on
  *this* thread's call stack!  It's on the main thread's call stack.
  So unittest simply raised an exception that wasn't noticed by anything,
  and the thread died then, leaving a traceback somewhere in the middle
  of the output.  unittest didn't know anything about that, and sometimes
  thought the test passed despite the thread failure.
  
  Alas, when a thread dies like this it's also got a decent chance of
  leaving things in a state where the test's final checking code can't
  succeed, but the checking code doesn't know the thread died either,
  and spurious unittest failures and errors could get reported then.
  
  Reworked things so that only the main thread ever tries to tell
  unittest that a test failed.  A TestThread remembers an exception-
  death now, but doesn't report it until the main thread tries to join
  it.  This way unittest knows the test failed; records and reports
  the true cause of failure; and, because tests generally join all their
  threads before doing their final checking, unittest stops the test
  before final checking if a thread death occurred, and so there's
  less chance of seeing reports of spurious errors and failures.
  


Changed:
  U   ZODB/trunk/NEWS.txt
  U   ZODB/trunk/src/ZEO/tests/CommitLockTests.py
  U   ZODB/trunk/src/ZEO/tests/InvalidationTests.py
  U   ZODB/trunk/src/ZEO/tests/TestThread.py


-=-
Modified: ZODB/trunk/NEWS.txt
===================================================================

--- ZODB/trunk/NEWS.txt	2004-08-10 23:44:33 UTC (rev 26989)
+++ ZODB/trunk/NEWS.txt	2004-08-11 01:56:42 UTC (rev 26990)
@@ -52,11 +52,16 @@
 
 Collector #1397: testTimeStamp fails on FreeBSD
 
-The *BSD distributions are unique in that their mktime() implementation
-usually ignores the input tm_isdst value.  Test checkFullTimeStamp()
-was sensitive to this platform quirk.
+    The *BSD distributions are unique in that their mktime()
+    implementation usually ignores the input tm_isdst value.  Test
+    checkFullTimeStamp() was sensitive to this platform quirk.
 
+Reworked the way some of the ZEO tests use threads, so that unittest is
+more likely to notice the real cause of a failure (which usually occurs in
+a thread), and less likely to latch on to spurious problems resulting from
+the real failure.
 
+
 What's new in ZODB3 3.3 beta 1
 ==============================
 Release date: 07-Jun-2004

Modified: ZODB/trunk/src/ZEO/tests/CommitLockTests.py
===================================================================
--- ZODB/trunk/src/ZEO/tests/CommitLockTests.py	2004-08-10 23:44:33 UTC (rev 26989)
+++ ZODB/trunk/src/ZEO/tests/CommitLockTests.py	2004-08-11 01:56:42 UTC (rev 26990)
@@ -35,12 +35,12 @@
     # run the entire test in a thread so that the blocking call for
     # tpc_vote() doesn't hang the test suite.
 
-    def __init__(self, testcase, storage, trans, method="tpc_finish"):
+    def __init__(self, storage, trans, method="tpc_finish"):
         self.storage = storage
         self.trans = trans
         self.method = method
         self.ready = threading.Event()
-        TestThread.__init__(self, testcase)
+        TestThread.__init__(self)
 
     def testrun(self):
         try:
@@ -115,7 +115,7 @@
             txn = transaction.Transaction()
             tid = self._get_timestamp()
 
-            t = WorkerThread(self, storage, txn)
+            t = WorkerThread(storage, txn)
             self._threads.append(t)
             t.start()
             t.ready.wait()

Modified: ZODB/trunk/src/ZEO/tests/InvalidationTests.py
===================================================================
--- ZODB/trunk/src/ZEO/tests/InvalidationTests.py	2004-08-10 23:44:33 UTC (rev 26989)
+++ ZODB/trunk/src/ZEO/tests/InvalidationTests.py	2004-08-11 01:56:42 UTC (rev 26990)
@@ -49,6 +49,7 @@
     # - self.stop attribute (an event)
     # - self._testrun() method
 
+    # TestThread.run() invokes testrun().
     def testrun(self):
         try:
             self._testrun()
@@ -64,8 +65,7 @@
     # to 'tree'.  If sleep is given, sleep
     # that long after each append.  At the end, instance var .added_keys
     # is a list of the ints the thread believes it added successfully.
-    def __init__(self, testcase, db, threadnum, startnum,
-                 step=2, sleep=None):
+    def __init__(self, db, threadnum, startnum, step=2, sleep=None):
         self.db = db
         self.threadnum = threadnum
         self.startnum = startnum
@@ -132,9 +132,9 @@
     # to 'tree' until Event stop is set.  If sleep is given, sleep
     # that long after each append.  At the end, instance var .added_keys
     # is a list of the ints the thread believes it added successfully.
-    def __init__(self, testcase, db, stop, threadnum, commitdict,
+    def __init__(self, db, stop, threadnum, commitdict,
                  startnum, step=2, sleep=None):
-        TestThread.__init__(self, testcase)
+        TestThread.__init__(self)
         self.db = db
         self.stop = stop
         self.threadnum = threadnum
@@ -180,9 +180,9 @@
     # more than 25 objects so that it can test code that runs vote
     # in a separate thread when it modifies more than 25 objects.
 
-    def __init__(self, testcase, db, stop, threadnum, commitdict, startnum,
+    def __init__(self, db, stop, threadnum, commitdict, startnum,
                  step=2, sleep=None):
-        TestThread.__init__(self, testcase)
+        TestThread.__init__(self)
         self.db = db
         self.stop = stop
         self.threadnum = threadnum
@@ -192,15 +192,6 @@
         self.added_keys = []
         self.commitdict = commitdict
 
-    def testrun(self):
-        try:
-            self._testrun()
-        except:
-            # Report the failure here to all the other threads, so
-            # that they stop quickly.
-            self.stop.set()
-            raise
-
     def _testrun(self):
         cn = self.db.open()
         while not self.stop.isSet():
@@ -260,9 +251,9 @@
 
 class VersionStressThread(FailableThread):
 
-    def __init__(self, testcase, db, stop, threadnum, commitdict, startnum,
+    def __init__(self, db, stop, threadnum, commitdict, startnum,
                  step=2, sleep=None):
-        TestThread.__init__(self, testcase)
+        TestThread.__init__(self)
         self.db = db
         self.stop = stop
         self.threadnum = threadnum
@@ -272,15 +263,6 @@
         self.added_keys = []
         self.commitdict = commitdict
 
-    def testrun(self):
-        try:
-            self._testrun()
-        except:
-            # Report the failure here to all the other threads, so
-            # that they stop quickly.
-            self.stop.set()
-            raise
-
     def _testrun(self):
         commit = 0
         key = self.startnum
@@ -416,7 +398,14 @@
                 break
             # Some thread still hasn't managed to commit anything.
         stop.set()
+        # Give all the threads some time to stop before trying to clean up.
+        # cleanup() will cause the test to fail if some thread ended with
+        # an uncaught exception, and unittest will call the base class
+        # tearDown then immediately, but if other threads are still
+        # running that can lead to a cascade of spurious exceptions.
         for t in threads:
+            t.join(10)
+        for t in threads:
             t.cleanup()
 
     def checkConcurrentUpdates2Storages_emulated(self):
@@ -432,8 +421,8 @@
         time.sleep(0.1)
 
         # Run two threads that update the BTree
-        t1 = StressTask(self, db1, 1, 1,)
-        t2 = StressTask(self, db2, 2, 2,)
+        t1 = StressTask(db1, 1, 1,)
+        t2 = StressTask(db2, 2, 2,)
         _runTasks(100, t1, t2)
 
         cn.sync()
@@ -458,8 +447,8 @@
 
         # Run two threads that update the BTree
         cd = {}
-        t1 = self.StressThread(self, db1, stop, 1, cd, 1)
-        t2 = self.StressThread(self, db2, stop, 2, cd, 2)
+        t1 = self.StressThread(db1, stop, 1, cd, 1)
+        t2 = self.StressThread(db2, stop, 2, cd, 2)
         self.go(stop, cd, t1, t2)
 
         while db1.lastTransaction() != db2.lastTransaction():
@@ -487,8 +476,8 @@
 
         # Run two threads that update the BTree
         cd = {}
-        t1 = self.StressThread(self, db1, stop, 1, cd, 1, sleep=0.01)
-        t2 = self.StressThread(self, db1, stop, 2, cd, 2, sleep=0.01)
+        t1 = self.StressThread(db1, stop, 1, cd, 1, sleep=0.01)
+        t2 = self.StressThread(db1, stop, 2, cd, 2, sleep=0.01)
         self.go(stop, cd, t1, t2)
 
         cn = db1.open()
@@ -516,9 +505,9 @@
         # at the same time.
 
         cd = {}
-        t1 = self.StressThread(self, db1, stop, 1, cd, 1, 3)
-        t2 = self.StressThread(self, db2, stop, 2, cd, 2, 3, 0.01)
-        t3 = self.StressThread(self, db2, stop, 3, cd, 3, 3, 0.01)
+        t1 = self.StressThread(db1, stop, 1, cd, 1, 3)
+        t2 = self.StressThread(db2, stop, 2, cd, 2, 3, 0.01)
+        t3 = self.StressThread(db2, stop, 3, cd, 3, 3, 0.01)
         self.go(stop, cd, t1, t2, t3)
 
         while db1.lastTransaction() != db2.lastTransaction():
@@ -552,9 +541,9 @@
         # at the same time.
 
         cd = {}
-        t1 = VersionStressThread(self, db1, stop, 1, cd, 1, 3)
-        t2 = VersionStressThread(self, db2, stop, 2, cd, 2, 3, 0.01)
-        t3 = VersionStressThread(self, db2, stop, 3, cd, 3, 3, 0.01)
+        t1 = VersionStressThread(db1, stop, 1, cd, 1, 3)
+        t2 = VersionStressThread(db2, stop, 2, cd, 2, 3, 0.01)
+        t3 = VersionStressThread(db2, stop, 3, cd, 3, 3, 0.01)
         self.go(stop, cd, t1, t2, t3)
 
         while db1.lastTransaction() != db2.lastTransaction():
@@ -591,9 +580,9 @@
         # at the same time.
 
         cd = {}
-        t1 = LargeUpdatesThread(self, db1, stop, 1, cd, 1, 3, 0.02)
-        t2 = LargeUpdatesThread(self, db2, stop, 2, cd, 2, 3, 0.01)
-        t3 = LargeUpdatesThread(self, db2, stop, 3, cd, 3, 3, 0.01)
+        t1 = LargeUpdatesThread(db1, stop, 1, cd, 1, 3, 0.02)
+        t2 = LargeUpdatesThread(db2, stop, 2, cd, 2, 3, 0.01)
+        t3 = LargeUpdatesThread(db2, stop, 3, cd, 3, 3, 0.01)
         self.go(stop, cd, t1, t2, t3)
 
         while db1.lastTransaction() != db2.lastTransaction():

Modified: ZODB/trunk/src/ZEO/tests/TestThread.py
===================================================================
--- ZODB/trunk/src/ZEO/tests/TestThread.py	2004-08-10 23:44:33 UTC (rev 26989)
+++ ZODB/trunk/src/ZEO/tests/TestThread.py	2004-08-11 01:56:42 UTC (rev 26990)
@@ -13,30 +13,44 @@
 ##############################################################################
 """A Thread base class for use with unittest."""
 
-from cStringIO import StringIO
 import threading
-import traceback
+import sys
 
 class TestThread(threading.Thread):
-    __super_init = threading.Thread.__init__
-    __super_run = threading.Thread.run
+    """Base class for defining threads that run from unittest.
 
-    def __init__(self, testcase, group=None, target=None, name=None,
-                 args=(), kwargs={}, verbose=None):
-        self.__super_init(group, target, name, args, kwargs, verbose)
+    The subclass should define a testrun() method instead of a run()
+    method.
+
+    Call cleanup() when the test is done with the thread, instead of join().
+    If the thread exits with an uncaught exception, it's captured and
+    re-raised when cleanup() is called.  cleanup() should be called by
+    the main thread!  Trying to tell unittest that a test failed from
+    another thread creates a nightmare of timing-depending cascading
+    failures and missed errors (tracebacks that show up on the screen,
+    but don't cause unittest to believe the test failed).
+
+    cleanup() also joins the thread.  If the thread ended without raising
+    an uncaught exception, and the join doesn't succeed in the timeout
+    period, then the test is made to fail with a "Thread still alive"
+    message.
+    """
+
+    def __init__(self):
+        threading.Thread.__init__(self)
+        # In case this thread hangs, don't stop Python from exiting.
         self.setDaemon(1)
-        self._testcase = testcase
+        self._exc_info = None
 
     def run(self):
         try:
             self.testrun()
-        except Exception:
-            s = StringIO()
-            traceback.print_exc(file=s)
-            self._testcase.fail("Exception in thread %s:\n%s\n" %
-                                (self, s.getvalue()))
+        except:
+            self._exc_info = sys.exc_info()
 
     def cleanup(self, timeout=15):
         self.join(timeout)
+        if self._exc_info:
+            raise self._exc_info[0], self._exc_info[1], self._exc_info[2]
         if self.isAlive():
             self._testcase.fail("Thread did not finish: %s" % self)