[Zope-dev] TCP CLOSE_WAIT leaks

Paul Winkler pw_lists at slinkp.com
Wed Mar 29 13:11:51 EST 2006


On Thu, Mar 30, 2006 at 02:32:58AM +1000, Alan Milligan wrote:
> I managed to get a DeadlockDebugger trace on this thing, it made very
> interesting reading:
(snip)
>   File "/opt/zope2.8/lib/python/ZEO/ClientStorage.py", line 781, in loadEx
>     return data, tid, ver
> 
> *every* thread was block-waiting on zeo (from a wide range of different
> Zope/Plone types)!  It looks to me like Apache has timed out, clearing
> down its end; Zope, however, is still having to wait for zeo, which is
> completely borked.
> 
> I've consequently ditched zeo and everything is again well-behaved.

Is your zeo server on a separate box? Is there a firewall between them?

The *only* time I've ever had problems like that was in the following
scenario:

* firewall between zope and zeo

* minimal traffic at times (it was a secondary system, most of its
  usage was when our primary data center was down for maintenance)

* firewall was of an evil type that tears down "unused" connections
  without either end being able to know it happened 

In this scenario, after a suitably long period of no traffic between Zope
and Zeo, the firewall would disconnect them, but both ends would still
think they were connected, and we would get a problem like yours.

Dieter Maurer observed the same thing and gave me the hint that this
might be the problem.  Implementing his suggested "keepalive" product 
was less trouble than arguing with the firewall administrators.
http://aspn.activestate.com/ASPN/Mail/Message/zope-list/2918584

I used something very close to that; I believe I just saved it as
Products/ZeoKeepalive/__init__.py.
(I've changed jobs, so I'm going by memory.)
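
For illustration only, the general keepalive idea can be sketched roughly
as below. This is not Dieter's code and not what was actually deployed.
The KeepAlive class, the ping callable and the 300-second default interval
are all assumptions made up for this sketch. The point is just that
something generates a little traffic more often than the firewall's idle
timeout, so the connection never looks "unused".

```python
import threading


class KeepAlive(threading.Thread):
    """Call `ping` every `interval` seconds until stop() is called.

    `ping` is a hypothetical callable supplied by the caller; it should
    perform some cheap round trip over the connection to keep it busy.
    """

    def __init__(self, ping, interval=300.0):
        threading.Thread.__init__(self)
        self.daemon = True          # do not block interpreter shutdown
        self.ping = ping
        self.interval = interval    # assumed to be below the firewall timeout
        self.errors = 0             # number of pings that raised
        self._stop_event = threading.Event()

    def run(self):
        # Event.wait() returns False on timeout and True once stop() is called.
        while not self._stop_event.wait(self.interval):
            try:
                self.ping()
            except Exception:
                # A failed ping must not kill the keepalive thread.
                self.errors += 1

    def stop(self):
        self._stop_event.set()
```

In a product's __init__.py one would create a single KeepAlive at import
time, pass it a ping that touches the ZEO connection, and call start().
What that ping should do depends on the setup and is left open here.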

-- 

Paul Winkler
http://www.slinkp.com

