[ZODB-Dev] ZODB Deadlock issue python frame traces.
Jeremy Hylton
jeremy@digicool.com
Thu, 10 May 2001 16:40:25 -0400 (EDT)
John,
I'm not sure I understand what the problem is just from the stack
trace. Apparently, you've got a deadlock problem. Which thread is
blocked? From the stack trace, it looks like one thread is trying to
acquire the FileStorage "synchronized method" lock in a load() call.
The other thread is attempting to acquire the DB object's method
lock.
The two locks use different implementations. The FileStorage lock
is a reentrant lock implemented by the ThreadLock module from
ExtensionClass. The DB lock is *either* a lock from the Python
thread module or a no-op lock from bpthread. One thing that
confuses me about the stack trace is that it shows line 101 of
bpthread, but my copy of bpthread only has 99 lines! If bpthread
shows up in the trace at all, however, that suggests there is a
serious problem with the Python install. bpthread only defines its
own allocate_lock class if it can't import allocate_lock from
thread. What's gone wrong there?
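For reference, the fallback described above follows roughly this
pattern. This is only a sketch of the idea, not the actual bpthread
source: it imports from _thread (the current name of the old thread
module), and the body of the no-op class is an assumption.

```python
try:
    # Normal case: use the real lock from the thread module
    # (spelled `thread` in the Python of this era, `_thread` today).
    from _thread import allocate_lock
except ImportError:
    # Fallback for builds without thread support: a lock that does nothing.
    # Hypothetical body, written for illustration only.
    class allocate_lock:
        def acquire(self, *args):
            return 1  # always "succeeds"

        def release(self):
            pass
```

If the no-op class is the one in use on a threaded build, the DB lock
would provide no mutual exclusion at all.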
Quite apart from my confusion about the bpthread module: Which thread
is blocked? Is it the FileStorage lock that is already held or is it
the DB lock? In either case, are there other threads in the system?
The FileStorage lock should be reentrant, so a thread can only block
on it if some other thread already holds it. But you haven't shown any other
thread executing in FileStorage! And FileStorage is pretty careful
about using try-finally to guarantee that the lock is released. (But
maybe there's a bug there.)
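To illustrate the reentrancy point, here is a small sketch. It uses
threading.RLock as a stand-in for the ThreadLock lock (an assumption;
the two are different implementations), and reentrancy_demo is a
hypothetical name. The release calls sit in a finally clause, the
same pattern FileStorage relies on.

```python
import threading


def reentrancy_demo():
    """Return (owner_reacquired, other_thread_acquired)."""
    lock = threading.RLock()  # stand-in for the ThreadLock reentrant lock
    results = []

    def other():
        # A different thread cannot take a lock the main thread holds.
        got = lock.acquire(blocking=False)
        results.append(got)
        if got:
            lock.release()

    lock.acquire()
    try:
        # The owning thread can acquire again without blocking.
        again = lock.acquire(blocking=False)
        try:
            t = threading.Thread(target=other)
            t.start()
            t.join()
        finally:
            if again:
                lock.release()
    finally:
        lock.release()
    return again, results[0]
```

The owner's second acquire succeeds and the other thread's attempt
fails, so a thread stuck on this kind of lock implies a second holder.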
Or are you showing the lock owners and not the place where the
deadlock occurs?
Jeremy
PS I often find it helpful to debug locking problems by printing out a
message whenever a lock is acquired or released. Of course, the
introduction of extra I/O can make the program behave differently,
but, still, it has helped more often than not. I don't seem to have
any code on my work machine, but I recall writing a little wrapper
using Python 2.1's sys._getframe() function that made it very easy.
The wrapper can store the real lock as an attribute and implement
acquire() and release() by delegating to that attribute;
sys._getframe() can then be used to print the location of the call.
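A minimal sketch of such a wrapper might look like the following.
This is not the original code, which is not available; the class name
TracedLock, the name and out parameters, and the message format are
all made up for illustration.

```python
import sys
import threading


class TracedLock:
    """Wrap a lock and log the caller's location on acquire/release."""

    def __init__(self, lock, name="lock", out=None):
        self._lock = lock  # the real lock
        self._name = name  # hypothetical label used in messages
        self._out = out    # None means write to sys.stderr

    def _report(self, action):
        # Frame 0 is _report, 1 is acquire/release, 2 is their caller.
        frame = sys._getframe(2)
        out = sys.stderr if self._out is None else self._out
        out.write("%s: %s %s at %s:%d in %s()\n" % (
            threading.current_thread().name, action, self._name,
            frame.f_code.co_filename, frame.f_lineno,
            frame.f_code.co_name))

    def acquire(self, *args, **kwargs):
        self._report("acquiring")
        got = self._lock.acquire(*args, **kwargs)
        self._report("acquired" if got else "failed to acquire")
        return got

    def release(self):
        self._lock.release()
        self._report("released")
```

Something like TracedLock(threading.Lock(), name="db") could then be
dropped in wherever the plain lock is created. A thread whose last
message is "acquiring" with no matching "acquired" is the one that is
blocked.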