[ZODB-Dev] transaction timeouts raise no exceptions and cause strange errors on clients
Jim Fulton
jim at zope.com
Thu Nov 4 09:43:30 EDT 2010
On Thu, Nov 4, 2010 at 7:49 AM, Chris Withers <chris at simplistix.co.uk> wrote:
> Hi All,
>
> I have a ZODB 3.9.6 ZEO storage server configured with a transaction
> time of 20 seconds.
>
> On one zeo client I saw the following:
>
> ------
> 2010-11-04T10:40:50 INFO ZEO.ClientStorage unpackedstorage Disconnected
> from storage: "('xx', 6000)"
> ------
> 2010-11-04T10:40:50 INFO ZEO.ClientStorage unpackedstorage Testing
> connection <ManagedClientConnection ('xx', 6000)>
> ------
> 2010-11-04T10:40:50 INFO ZEO.zrpc.Connection(C) (xx:6000) received
> handshake 'Z309'
> ------
> 2010-11-04T10:40:50 ERROR ZODB.Connection Couldn't load state for 0x15ef67
> Traceback (most recent call last):
> File "ZODB3-3.9.6-py2.6-linux-i686.egg/ZODB/Connection.py", line 847,
> in setstate
> self._setstate(obj)
> File "ZODB3-3.9.6-py2.6-linux-i686.egg/ZODB/Connection.py", line 897,
> in _setstate
> p, serial = self._storage.load(obj._p_oid, '')
> File "ZODB3-3.9.6-py2.6-linux-i686.egg/ZODB/Connection.py", line
> 1282, in load
> return self._storage.load(oid, '')
> File "ZODB3-3.9.6-py2.6-linux-i686.egg/ZEO/ClientStorage.py", line
> 813, in load
> data, tid = self._server.loadEx(oid)
> File "ZODB3-3.9.6-py2.6-linux-i686.egg/ZEO/ClientStorage.py", line
> 88, in __getattr__
> raise ClientDisconnected()
> ClientDisconnected
> ------
> 2010-11-04T10:40:51 INFO ZEO.ClientStorage unpackedstorage Server
> authentication protocol None
> ------
> 2010-11-04T10:40:51 INFO ZEO.ClientStorage unpackedstorage Connected to
> storage: ('xx', 6000)
> ------
> 2010-11-04T10:40:51 INFO ZEO.ClientStorage unpackedstorage No
> verification necessary (last_inval_tid up-to-date)
> ------
> 2010-11-04T10:40:51 ERROR Zope.SiteErrorLog 1288867251.890.173783627248
> https://xx
> Traceback (innermost last):
> ...
> Module ZODB.Connection, line 847, in setstate
> Module ZODB.Connection, line 897, in _setstate
> Module ZODB.Connection, line 1282, in load
> Module ZEO.ClientStorage, line 813, in load
> Module ZEO.ClientStorage, line 88, in __getattr__
> ClientDisconnected
> ------
> 2010-11-04T10:40:54 CRITICAL txn.-1271567456 A storage error occurred
> during the second phase of the two-phase commit. Resources may be in an
> inconsistent state.
> ------
> 2010-11-04T10:40:55 ERROR Zope.SiteErrorLog 1288867255.090.422928174987
> https://xx
> Traceback (innermost last):
> Module ZPublisher.Publish, line 135, in publish
> Module Zope2.App.startup, line 291, in commit
> Module transaction._manager, line 89, in commit
> Module transaction._transaction, line 329, in commit
> Module transaction._transaction, line 431, in _commitResources
> Module ZODB.Connection, line 794, in tpc_finish
> Module ZEO.ClientStorage, line 1148, in tpc_finish
> ClientDisconnected: Calling tpc_finish() on a disconnected transaction
>
> This was all pretty alarming, especially as I could see no errors logged
> anywhere else.
>
> Eevntually, in the storage server logs, I found:
>
> 2010-11-04T10:40:50 INFO ZEO.StorageServer (7008/xx) Transaction timeout
> after 20 seconds
> ------
> 2010-11-04T10:40:50 INFO ZEO.StorageServer (7008/xx) disconnected during
> transaction <transaction._transaction.Transaction object at 0x9b4106c>
> ------
> 2010-11-04T10:40:50 INFO ZEO.StorageServer (7008) new connection ('xx',
> 34996): <ManagedServerConnection ('xx', 34996)>
> ------
> 2010-11-04T10:40:50 INFO ZEO.zrpc.Connection(S) (xx) received handshake
> 'Z309'
> ------
> 2010-11-04T10:40:55 WARNING ZEO.StorageServer (7008/xx) no current
> transaction: tpc_abort()
> ------
>
> I'm curious:
>
> - Why is the timeout not logged at ERROR or even CRITICAL?
Because it was considered to be neither. It's also not a server error.
I can definitely accept the argument that it should be logged with more
feeling.
>
> - Is the "inconsistent state" warning genuine here?
Yes. You don't really know whether the transaction committed or not.
It's much worse if multiple storages are involved.
> I restarted the
> client to be on the safe side, but it'd be good to know for sure.
Restarting wouldn't necessarily make any difference, although it
is probably a good precaution.
>
> - If that warning is genuine, is there any way the timeout could not
> leave the client in an inconsistent state?
Absolutely. At a minimum, the client doesn't know whether the transaction
committed or not. If there were multiple databases, it might have
committed in one,
but not the other.
This is why I tend to view transaction timeouts as a last resort and
set the limit
much higher, 5 minutes.
Jim
--
Jim Fulton
More information about the ZODB-Dev
mailing list