On Wed, Aug 29, 2012 at 06:30:50AM -0400, Jim Fulton wrote:
On Wed, Aug 29, 2012 at 2:29 AM, Marius Gedminas <marius@gedmin.as> wrote:
On Tue, Aug 28, 2012 at 06:31:05PM +0200, Vincent Pelletier wrote:
On Tue, 28 Aug 2012 16:31:20 +0200, Martijn Pieters <mj@zopatista.com> wrote :
Anything else different? Did you make any performance comparisons between RelStorage and NEO?
I believe the main difference compared to all other ZODB storage implementations is the finer-grained locking scheme: in all storage implementations I know, there is a database-level lock during the entire second phase of 2PC, whereas in NEO transactions are serialised only when they alter a common set of objects.
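To make the contrast concrete, here is a minimal sketch of object-level locking, where two transactions only block each other when their sets of modified objects overlap. It is a toy illustration, not NEO's actual implementation: the `ObjectLockTable` class, its method names, and the transaction identifiers are all hypothetical.

```python
import threading


class ObjectLockTable:
    """Toy per-object lock table (hypothetical, not NEO's real code).

    A transaction may lock a set of object ids only if none of them is
    currently held by a different transaction.
    """

    def __init__(self):
        self._guard = threading.Lock()  # protects the owner map itself
        self._owner = {}                # oid -> id of the transaction holding it

    def try_lock(self, txn, oids):
        """Lock all oids for txn, or lock none and return False."""
        with self._guard:
            # Refuse if any requested oid belongs to another transaction.
            if any(self._owner.get(oid, txn) != txn for oid in oids):
                return False
            for oid in oids:
                self._owner[oid] = txn
            return True

    def unlock(self, txn):
        """Release every oid held by txn."""
        with self._guard:
            for oid in [o for o, t in self._owner.items() if t == txn]:
                del self._owner[oid]
```

With a single database-level lock, the second of two concurrent commits always waits. In this sketch it waits only when `try_lock` finds an overlapping object id.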
This could be a compelling point. I've seen deadlocks in an app that tried to use both ZEO and PostgreSQL via the Storm ORM. (The thread holding the ZEO commit lock was blocked waiting for the PostgreSQL commit to finish, while the PostgreSQL server was waiting for some other transaction to either commit or abort -- and that other transaction couldn't proceed because it was waiting for the ZEO lock.)
This sounds like an application/transaction configuration problem.
*shrug* Here's the code to reproduce it: http://pastie.org/4617132
To avoid this sort of deadlock, you need to always commit in a consistent order. You also need to configure ZEO (or NEO) to time out transactions that take too long to finish the second phase.
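The two pieces of advice above can be sketched with plain `threading` locks standing in for the ZEO commit lock and the PostgreSQL row lock. This is only an illustration under those assumptions: `Resource`, its `order` attribute, and `commit_all` are hypothetical names, and nothing here uses the real ZEO, NEO, or Storm APIs.

```python
import threading


class Resource:
    """Hypothetical stand-in for a resource manager that holds a commit lock."""

    def __init__(self, name, order):
        self.name = name
        self.order = order            # position in the agreed global commit order
        self.lock = threading.Lock()


def commit_all(resources, timeout=1.0):
    """Take every commit lock in one fixed order, giving up after a timeout.

    Returns True if all locks were obtained, False if one timed out.
    Locks are always released before returning.
    """
    acquired = []
    try:
        # Sorting by the same key in every thread gives the consistent order.
        for res in sorted(resources, key=lambda r: r.order):
            if not res.lock.acquire(timeout=timeout):
                return False          # took too long: abort instead of hanging
            acquired.append(res)
        # ... the actual commit work would happen here ...
        return True
    finally:
        for res in reversed(acquired):
            res.lock.release()
```

Because every caller sorts by the same key, two threads can never each hold one lock while waiting for the other. The timeout is a second line of defence for anything the ordering cannot control, such as a lock held inside an external database server.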
The deadlock happens in tpc_begin() in both threads, which is the first phase, AFAIU.

AFAICS Thread #2 first performs tpc_begin() for ClientStorage and takes the ZEO commit lock. Then it enters tpc_begin() for Storm's StoreDataManager and blocks waiting for a response from PostgreSQL -- which is delayed because the PostgreSQL server is waiting to see if the other thread, Thread #1, will commit or abort _its_ transaction, which is conflicting with the one from Thread #2. Meanwhile Thread #1 is blocked in ZODB's tpc_begin(), trying to acquire the ZEO commit lock held by Thread #2.

I'm too fried right now to understand who's at fault here. Workarounds probably exist (use RelStorage instead of ZEO? Configure Storm to use a lower PostgreSQL transaction isolation level?). Maybe this problem would go away if Storm always went into tpc_begin() before ZEO.

I've pinged the people in #storm on FreeNode about this, but haven't filed any bugs yet.

Marius Gedminas
--
Q: Wanting both frequent updates and stability/support is just wishing for a pony!
A: Well, we're riding our ponies to the tune of several billion page views per month. Where's your pony? Oh, you didn't get one?
-- http://meta.wikimedia.org/wiki/Wikimedia_Ubuntu_migration_FAQ