On Mon, Nov 28, 2005 at 08:19:23PM +0100, Dieter Maurer wrote:
Paul Winkler wrote at 2005-11-27 21:17 -0500:
... I've seen the same symptoms a number of times recently with Zope 2.7.x. In our case, it seems to be related to ZEO. Zope seems to have lost its connection to ZEO but somehow doesn't realize it. My theory is that the symptom starts when all worker threads are waiting for objects that aren't in the ZEO client cache, so they're all blocked on ZEO requests.
Do you have a firewall between Zope and ZEO?
Yes, we do, and it is under control of another part of the company :-(
Usually, the OS can inform both ends of a connection when the connection is torn down. However, some firewalls tear a connection down in a way that the endpoints do not get informed.
I suspected as much... thanks.
We had to implement a keep-alive mechanism to prevent our firewall from behaving in this nasty way.
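The thread does not say how that keep-alive was implemented. As a rough sketch of one common approach, assuming OS-level TCP keepalive probes are enough to stop the firewall from treating the connection as idle: the helper name enable_keepalive and the timing values below are hypothetical, and the three tuning options use the Linux names, so they are skipped on platforms that lack them.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP keepalive probes for an existing socket (illustrative helper)."""
    # Ask the kernel to send keepalive probes on this connection.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Tuning knobs are platform-specific; apply only those this platform exposes.
    tuning = (
        ("TCP_KEEPIDLE", idle),       # seconds of idleness before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between probes
        ("TCP_KEEPCNT", count),       # unanswered probes before the connection is dropped
    )
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock
```

With the idle time set below the firewall's idle timeout, the probes keep the connection looking active, and a peer that has silently vanished eventually surfaces as a socket error instead of an indefinite wait. Whether this can be applied to the sockets that ZEO opens depends on how the client creates them, which is not covered here.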
OK. Can you give a high-level summary of what you did? I thought of using a heartbeat to detect loss of connection, but I'm not sure what I could do on failure short of restarting Zope.

--
Paul Winkler
http://www.slinkp.com