[ZODB-Dev] transaction timeouts raise no exceptions and cause strange errors on clients

Jim Fulton jim at zope.com
Thu Nov 4 09:43:30 EDT 2010


On Thu, Nov 4, 2010 at 7:49 AM, Chris Withers <chris at simplistix.co.uk> wrote:
> Hi All,
>
> I have a ZODB 3.9.6 ZEO storage server configured with a transaction
> time of 20 seconds.
>
> On one zeo client I saw the following:
>
> ------
> 2010-11-04T10:40:50 INFO ZEO.ClientStorage unpackedstorage Disconnected
> from storage: "('xx', 6000)"
> ------
> 2010-11-04T10:40:50 INFO ZEO.ClientStorage unpackedstorage Testing
> connection <ManagedClientConnection ('xx', 6000)>
> ------
> 2010-11-04T10:40:50 INFO ZEO.zrpc.Connection(C) (xx:6000) received
> handshake 'Z309'
> ------
> 2010-11-04T10:40:50 ERROR ZODB.Connection Couldn't load state for 0x15ef67
> Traceback (most recent call last):
>   File "ZODB3-3.9.6-py2.6-linux-i686.egg/ZODB/Connection.py", line 847,
> in setstate
>     self._setstate(obj)
>   File "ZODB3-3.9.6-py2.6-linux-i686.egg/ZODB/Connection.py", line 897,
> in _setstate
>     p, serial = self._storage.load(obj._p_oid, '')
>   File "ZODB3-3.9.6-py2.6-linux-i686.egg/ZODB/Connection.py", line
> 1282, in load
>     return self._storage.load(oid, '')
>   File "ZODB3-3.9.6-py2.6-linux-i686.egg/ZEO/ClientStorage.py", line
> 813, in load
>     data, tid = self._server.loadEx(oid)
>   File "ZODB3-3.9.6-py2.6-linux-i686.egg/ZEO/ClientStorage.py", line
> 88, in __getattr__
>     raise ClientDisconnected()
> ClientDisconnected
> ------
> 2010-11-04T10:40:51 INFO ZEO.ClientStorage unpackedstorage Server
> authentication protocol None
> ------
> 2010-11-04T10:40:51 INFO ZEO.ClientStorage unpackedstorage Connected to
> storage: ('xx', 6000)
> ------
> 2010-11-04T10:40:51 INFO ZEO.ClientStorage unpackedstorage No
> verification necessary (last_inval_tid up-to-date)
> ------
> 2010-11-04T10:40:51 ERROR Zope.SiteErrorLog 1288867251.890.173783627248
> https://xx
> Traceback (innermost last):
> ...
>   Module ZODB.Connection, line 847, in setstate
>   Module ZODB.Connection, line 897, in _setstate
>   Module ZODB.Connection, line 1282, in load
>   Module ZEO.ClientStorage, line 813, in load
>   Module ZEO.ClientStorage, line 88, in __getattr__
> ClientDisconnected
> ------
> 2010-11-04T10:40:54 CRITICAL txn.-1271567456 A storage error occurred
> during the second phase of the two-phase commit.  Resources may be in an
> inconsistent state.
> ------
> 2010-11-04T10:40:55 ERROR Zope.SiteErrorLog 1288867255.090.422928174987
> https://xx
> Traceback (innermost last):
>   Module ZPublisher.Publish, line 135, in publish
>   Module Zope2.App.startup, line 291, in commit
>   Module transaction._manager, line 89, in commit
>   Module transaction._transaction, line 329, in commit
>   Module transaction._transaction, line 431, in _commitResources
>   Module ZODB.Connection, line 794, in tpc_finish
>   Module ZEO.ClientStorage, line 1148, in tpc_finish
> ClientDisconnected: Calling tpc_finish() on a disconnected transaction
>
> This was all pretty alarming, especially as I could see no errors logged
> anywhere else.
>
> Eevntually, in the storage server logs, I found:
>
> 2010-11-04T10:40:50 INFO ZEO.StorageServer (7008/xx) Transaction timeout
> after 20 seconds
> ------
> 2010-11-04T10:40:50 INFO ZEO.StorageServer (7008/xx) disconnected during
> transaction <transaction._transaction.Transaction object at 0x9b4106c>
> ------
> 2010-11-04T10:40:50 INFO ZEO.StorageServer (7008) new connection ('xx',
> 34996): <ManagedServerConnection ('xx', 34996)>
> ------
> 2010-11-04T10:40:50 INFO ZEO.zrpc.Connection(S) (xx) received handshake
> 'Z309'
> ------
> 2010-11-04T10:40:55 WARNING ZEO.StorageServer (7008/xx) no current
> transaction: tpc_abort()
> ------
>
> I'm curious:
>
> - Why is the timeout not logged at ERROR or even CRITICAL?

Because it was considered to be neither.  It's also not a server error.
I can definitely accept the argument that it should be logged with more
feeling.


>
> - Is the "inconsistent state" warning genuine here?

Yes.  You don't really know whether the transaction committed or not.
It's much worse if multiple storages are involved.

>  I restarted the
>   client to be on the safe side, but it'd be good to know for sure.

Restarting wouldn't necessarily make any difference, although it
is probably a good precaution.

>
> - If that warning is genuine, is there any way the timeout could not
>   leave the client in an inconsistent state?

Absolutely.  At a minimum, the client doesn't know whether the transaction
committed or not. If there were multiple databases, it might have
committed in one,
but not the other.

This is why I tend to view transaction timeouts as a last resort and
set the limit
much higher, 5 minutes.

Jim

-- 
Jim Fulton


More information about the ZODB-Dev mailing list