[Zope-dev] ZEO TempStorage: Odd behavior on ZEO restart

Sebastian Tiedtke st at mdvtts.com
Sun Jul 11 01:41:16 EDT 2010


Hi all,

We're working on a large scale Plone deployment and recently came across
some odd behavior with tempstorage at our ZEO server. A limited number, yet
highly frequented, of pages on our Plone site query backend web services
that provide location based services such as geocoding and routing (a to b).
Since those calls to the services are rather computation intense we decided
to store intermediate and resulting data in sessions vs. re-querying the
service for no reason.

The system setup consists of a central ZEO server that provides the storage
for content as well as a temp storage for sessioning. There are multiple
other servers that run multiple Plone instances with ZEO clients. The ZEO
server setup is straight forward but I'm happy to provide the configuration
files. Generally the concept works great.

However, whenever we restart the ZEO server and keep the client instances
running we get odd behavior with the client instances' sessioning. Every
client appears to reconnect and, if required, revalidates its cache.

2010-07-11T14:46:03 INFO ZEO.ClientStorage (6132) Disconnected from storage:
"('ITIS.mentzdv.de', 8100)"
2010-07-11T14:46:03 INFO ZEO.ClientStorage (6132) Disconnected from storage:
"('ITIS.mentzdv.de', 8100)"
2010-07-11T14:46:09 INFO ZEO.ClientStorage (6132) Testing connection
<ManagedClientConnection ('127.0.0.1', 8100)>
2010-07-11T14:46:09 INFO ZEO.zrpc.Connection(C) (127.0.0.1:8100) received
handshake 'Z303'
2010-07-11T14:46:09 INFO ZEO.ClientStorage (6132) Server authentication
protocol None
2010-07-11T14:46:09 INFO ZEO.ClientStorage (6132) Connected to storage: ('
ITIS.mentzdv.de', 8100)
2010-07-11T14:46:09 INFO ZEO.ClientStorage (6132) Testing connection
<ManagedClientConnection ('127.0.0.1', 8100)>
2010-07-11T14:46:09 INFO ZEO.ClientStorage (6132) last inval tid:
'\x03\x87n\x9d|\x1a\xc2\xaa' 2010-07-11 04:45:29.087000

2010-07-11T14:46:09 INFO ZEO.ClientStorage (6132) last transaction:
'\x00\x00\x00\x00\x00\x00\x00\x00' 1900-01-01 00:00:00.000000
2010-07-11T14:46:09 INFO ZEO.zrpc.Connection(C) (127.0.0.1:8100) received
handshake 'Z303'
2010-07-11T14:46:10 INFO ZEO.ClientStorage (6132) Verifying cache
2010-07-11T14:46:10 INFO ZEO.ClientStorage (6132) Server authentication
protocol None
2010-07-11T14:46:10 INFO ZEO.ClientStorage (6132) Connected to storage: ('
ITIS.mentzdv.de', 8100)
2010-07-11T14:46:10 INFO ZEO.ClientStorage (6132) No verification necessary
(last_inval_tid up-to-date)
2010-07-11T14:46:10 INFO ZEO.ClientStorage (6132) endVerify finishing
2010-07-11T14:46:10 INFO ZEO.ClientStorage (6132) endVerify finished


The problem is that whenever we hit a session bound page we get following
error and stack trace:

2010-07-11T14:58:12 ERROR ZODB.Connection Couldn't load state for 0x82
Traceback (most recent call last):
  File
"g:\mentzdv\plone\instances\zeo04\zope2\lib\python\ZODB\Connection.py", line
761, in setstate
    self._setstate(obj)
  File
"g:\mentzdv\plone\instances\zeo04\zope2\lib\python\ZODB\Connection.py", line
801, in _setstate
    p, serial = self._storage.load(obj._p_oid, self._version)
  File
"G:\mentzdv\plone\instances\zeo04\zope2\lib\python\ZEO\ClientStorage.py",
line 727, in load
    return self.loadEx(oid, version)[:2]
  File
"G:\mentzdv\plone\instances\zeo04\zope2\lib\python\ZEO\ClientStorage.py",
line 750, in loadEx
    data, tid, ver = self._server.loadEx(oid, version)
  File
"G:\mentzdv\plone\instances\zeo04\zope2\lib\python\ZEO\ServerStub.py", line
196, in loadEx
    return self.rpc.call("loadEx", oid, version)
  File
"G:\mentzdv\plone\instances\zeo04\zope2\lib\python\ZEO\zrpc\connection.py",
line 650, in call
    raise inst # error raised by server
KeyError: '\x00\x00\x00\x00\x00\x00\x00\x82'


In the ZMI under the Control-Panel/Database Tab the temporary database seems
to be fine. The /temp_folder returns a similar stack trace though:
Site Error

An error was encountered while publishing this resource.

*KeyError*
Sorry, a site error occurred.

Traceback (innermost last):

   - Module ZPublisher.Publish, line 202, in publish_module_standard
   - Module ZPublisher.Publish, line 150, in publish
   - Module plone.app.linkintegrity.monkey, line 21, in
   zpublisher_exception_hook_wrapper
   - Module Zope2.App.startup, line 221, in zpublisher_exception_hook
   - Module ZPublisher.Publish, line 119, in publish
   - Module ZPublisher.mapply, line 88, in mapply
   - Module ZPublisher.Publish, line 42, in call_object
   - Module App.special_dtml, line 65, in __call__
   - Module DocumentTemplate.DT_String, line 476, in __call__
   - Module Products.Transience.Transience, line 926, in housekeep
   - Module Products.Transience.Transience, line 929, in _housekeep
   - Module Products.Transience.Transience, line 569, in _finalize
   - Module Products.Transience.Transience, line 593, in _do_finalize_work
   - Module ZODB.Connection, line 761, in setstate
   - Module ZODB.Connection, line 801, in _setstate
   - Module ZEO.ClientStorage, line 727, in load
   - Module ZEO.ClientStorage, line 750, in loadEx
   - Module ZEO.ServerStub, line 196, in loadEx
   - Module ZEO.zrpc.connection, line 650, in call

KeyError: '\x00\x00\x00\x00\x00\x00\x00\x82' (Also, the following error
occurred while attempting to render the standard error message, please see
the event log for full details: You are not allowed to access 'title_or_id'
in this context)

We know for a fact that one client instance needs to create the actual
temporary container at the ZEO server upon its restart. Restarting a single
client instance will make make this instance work but unfortunately won't
recover all other client instances. All of those problems disappear as soon
as we restart all instances; until the next ZEO server restart of course.
The problem doesn't appear to happen if connections drops happen but the ZEO
server keeps running.

Is that behavior intentional? Any kind of help is very much appreciated.

We're currently using following versions (yes, we're also running on windows
but I find it hard to believe that the OS is causing those issues).

Zope Version
 (Zope 2.10.9-final, python 2.4.4, win32)
 Python Version
 2.4.4 (#71, Oct 18 2006, 08:34:43) [MSC v.1310 32 bit (Intel)]
 System Platform
 win32
 SOFTWARE_HOME
 G:\mentzdv\plone\instances\zeo04\zope2\lib\python
 ZOPE_HOME
 G:\mentzdv\plone\instances\zeo04\zope2
 INSTANCE_HOME
 G:\mentzdv\plone\instances\zeo04\parts\instance1
 CLIENT_HOME
 G:\mentzdv\plone\instances\zeo04\var\instance1
 Network Services
 ZServer.HTTPServer.zhttp_server (Port: 8101)
 Process Id
 6132 (5328)
 Running For
 1 hour 3 min 34 sec

Thanks,
Sebastian
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.zope.org/pipermail/zope-dev/attachments/20100711/079bbf0d/attachment-0001.html 


More information about the Zope-Dev mailing list