[ZODB-Dev] B-Tree Concurrency Issue (_OOBTree.pyd segfaults)
Tim Peters
tim at zope.com
Fri Apr 15 17:33:50 EDT 2005
Below is a self-contained program that reproduces both symptoms you
described, although I had to use 3 threads to make them frequent. Typical
output:
filling 10000 ... 100 200 300 400 500 600 ...
... 9600 9700 9800 9900 10000
iteration 0
Exception in thread Thread-3:Traceback (most recent call last):
  File "C:\python23\lib\threading.py", line 442, in __bootstrap
    self.run()
  File "crash.py", line 48, in run
    self.work(self.tree)
  File "crash.py", line 52, in work
    tree._check()
AssertionError: Bucket length < 1
iteration 1
Exception in thread Thread-6:Traceback (most recent call last):
  File "C:\python23\lib\threading.py", line 442, in __bootstrap
    self.run()
  File "crash.py", line 48, in run
    self.work(self.tree)
  File "crash.py", line 52, in work
    tree._check()
AssertionError: Bucket length < 1
iteration 2
...
AssertionError: Bucket length < 1
iteration 3
...
AssertionError: Bucket length < 1
iteration 4
....
AssertionError: Bucket length < 1
iteration 5
Exception in thread Thread-18:Traceback (most recent call last):
  File "C:\python23\lib\threading.py", line 442, in __bootstrap
    self.run()
  File "crash.py", line 48, in run
    self.work(self.tree)
  File "crash.py", line 52, in work
    tree._check()
Exception in thread Thread-17:AssertionError: Bucket length < 1
Traceback (most recent call last):
  File "C:\python23\lib\threading.py", line 442, in __bootstrap
    self.run()
  File "crash.py", line 48, in run
    self.work(self.tree)
  File "crash.py", line 56, in work
    for x in tree.keys():
RuntimeError: the bucket being iterated changed size
iteration 6
...
AssertionError: Bucket length < 1
...
These problems go away if I introduce a module-level mutex:
    mut = threading.Lock()
and change TestThread.run() to serialize the threads, like so:
    def run(self):
        while time.time() < self.deadline:
            mut.acquire()
            self.work(self.tree)
            mut.release()
So, empirically, adding that mutex meets the "[you must] perform whatever
locking is required" caveat in this case.
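One thing the acquire/release pair above does not cover: if work() ever raised while the mutex was held, release() would be skipped and every other thread would block in acquire(). A minimal sketch of a more defensive variant follows, using try/finally. The names SerializedThread and run_serialized are hypothetical, and a plain dict stands in for the OOBTree so the sketch runs without ZODB; it illustrates the locking pattern only and says nothing about why the unlocked BTree access fails.

```python
import threading
import time

mut = threading.Lock()  # module-level mutex shared by all worker threads


class SerializedThread(threading.Thread):
    # Hypothetical stand-in for TestThread: same loop, but the mutex is
    # released in a finally clause, so an exception in work() cannot
    # leave it held.
    def __init__(self, tree, deadline):
        threading.Thread.__init__(self)
        self.tree = tree
        self.deadline = deadline
        self.calls = 0  # completed work() calls, for illustration only

    def run(self):
        while time.time() < self.deadline:
            mut.acquire()
            try:
                self.work(self.tree)
                self.calls += 1
            finally:
                mut.release()


class Crawl(SerializedThread):
    def work(self, tree):
        # Assumption: `tree` is a plain dict here, not an OOBTree.
        for x in list(tree.keys()):
            tree[x]


def run_serialized(tree, seconds, nthreads=3):
    # Start nthreads Crawl workers, wait for the deadline to pass, and
    # return the total number of completed work() calls.
    deadline = time.time() + seconds
    threads = [Crawl(tree, deadline) for i in range(nthreads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum([t.calls for t in threads])
```

Calling run_serialized({1: "a", 2: "b"}, 0.2) runs three workers for about a fifth of a second, and the mutex is free again afterwards. Holding the lock around the whole work() call is deliberately coarse; it matches the serialization described above rather than trying to find a finer-grained scheme.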
I don't know why it fails without this. Sorry, but it can't be a priority
for me to investigate this either, as it's not an intended use case (I asked
Jim, and he agrees).
If someone else would like to dig into it, great, the patch and bug trackers
are open 24 hours a day.
Test program (for ZODB 3.2; I used a pre-3.2.7 development version):
"""
import threading
import random
import time

import ZODB
from ZODB.FileStorage import FileStorage
from BTrees.OOBTree import OOBTree
from Persistence import Persistent

class P(Persistent):
    def __init__(self, value):
        self.value = value

STORAGE = 'CR2.fs'
N = 10000    # number of BTree keys
RUNTIME = 5  # seconds to run each iteration

st = FileStorage(STORAGE)
db = ZODB.DB(st)
cn = db.open()
rt = cn.root()

print "filling", N, "...",
n = 0
tree = rt['tree'] = OOBTree()
while n < N:
    t = range(15)
    random.shuffle(t)
    if not tree.has_key(t):
        tree[t] = P(t)
        n += 1
        if n % 100 == 0:
            print n,
            get_transaction().commit()
get_transaction().commit()
print
db.close()

class TestThread(threading.Thread):
    def __init__(self, tree, deadline):
        threading.Thread.__init__(self)
        self.tree = tree
        self.deadline = deadline

    def run(self):
        while time.time() < self.deadline:
            self.work(self.tree)

class Check(TestThread):
    def work(self, tree):
        tree._check()

class Crawl(TestThread):
    def work(self, tree):
        for x in tree.keys():
            tree[x].value

n = 0
while True:
    print "iteration", n
    st = FileStorage(STORAGE)
    db = ZODB.DB(st)
    cn = db.open()
    tree = cn.root()['tree']
    deadline = time.time() + RUNTIME
    threads = [Check(tree, deadline), Crawl(tree, deadline),
               Check(tree, deadline)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    get_transaction().abort()
    db.close()
    n += 1
"""