[Zope-dev] RE: [Zope] Re: Zope hanging (poss. threads-related)
Marcus Collins
mcollins@sunesi.com
Fri, 14 Apr 2000 20:21:44 +0200
> -----Original Message-----
> From: Tony Rossignol [mailto:tonyr@ep.newtimes.com]
> Sent: 14 April 2000 19:19
> To: Marcus Collins; zope-dev@zope.org
> Subject: Re: [Zope-dev] RE: [Zope] Re: Zope hanging (poss.
> threads-related)
> Thank you for starting this. I'll try to gather up information
> I've been trying to collect here and post it in the next few days.
Thanks! Maybe you could also look at extending the DebugLogger output
(http://www.zope.org/Members/tseaver/Projects/HighlyAvailableZope/DebugLogger)
and posting the results of any hanging there?
> RE: could be zombie related -
>
> Where might I find more info on this? Could this zombie issue be
> present in FCGI as well?
Amos Latteier remarked in
http://lists.zope.org/pipermail/zope-dev/2000-April/004194.html that:
"The ZServer zombie stuff is to get rid of zombie client
connections, not zombie publishing threads. These are quite
different beasts."
I'm not yet grokking the whole picture, so I can't really answer that.
Note that there is an outstanding issue in the Collector at
http://classic.zope.org:8080/Collector/954/view that might be related. As
you previously noted, there is no zombie_timeout in the FCGI server.
> We have noticed once restarts start they get worse when under a
> load. I've been suspecting that the longer pages take to load the
> more people are just stopping the page load, and this would/could
> create a zombie.
You'll sometimes see something like the following on the console or in
your logs:
2000-04-14T14:00:26 ERROR(200) ZServer uncaptured python exception, closing
channel <PCGIChannel at 87567b0> (socket.error:(32, 'Broken pipe')
[/usr/local/Zope-2.1.6-src/ZServer/medusa/asynchat.py|initiate_send|211]
[/usr/local/Zope-2.1.6-src/ZServer/medusa/asyncore.py|send|237])
This occurs (I surmise) when the client closes the connection before ZServer
has sent its response. Since ZServer then closes the channel, I presume that
means no zombie is left behind, but I'd like to know more about this.
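To make the surmise above concrete, here is a minimal sketch of how error 32
('Broken pipe') arises when one side of a stream socket sends after the other
side has already closed. It uses only the Python standard library and a local
socket pair, not Medusa or ZServer, and the function name
send_after_peer_close is made up for illustration. On Linux the reported
errno should be EPIPE; other platforms may report a connection reset instead.

```python
import errno
import socket


def send_after_peer_close(payload=b"HTTP/1.0 200 OK\r\n\r\n"):
    """Send on a stream socket whose peer has already closed.

    Returns the errno of the resulting OSError, or None if the send
    unexpectedly succeeded. (Hypothetical helper, not part of ZServer.)
    """
    server_side, client_side = socket.socketpair()
    try:
        # The "browser" gives up and closes its end first.
        client_side.close()
        try:
            # The "server" still tries to write its response.
            server_side.sendall(payload)
        except OSError as exc:
            return exc.errno
        return None
    finally:
        server_side.close()


if __name__ == "__main__":
    err = send_after_peer_close()
    if err is None:
        print("send succeeded")
    else:
        print(err, errno.errorcode.get(err, "unknown"))
```

This only shows where the error comes from at the socket level. It says
nothing about what ZServer does with the publishing thread afterwards.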
> The problem is I don't know how to identify or research what
> is going on under the hood. Could the point at which someone
> terminates a connection be an issue here? I mean, a request that is
> queued waiting for a thread might act differently than a request that
> is being processed by Zope and waiting on a DB query, or one
> terminated once Zope has already started passing results back
> through the pipe. Our restarts are so hard to tie down I'm
> guessing it's a very subtle issue or a combination of just the
> wrong factors.
We have quite a number of the above errors occurring, and from what I've
seen internally they seem to correlate with people terminating the
connection (or with browser timeouts). I also feel at a loss here, focussing
on a very small part of ZServer and possibly missing the big picture.
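One way to check that correlation is to count the 'Broken pipe' entries per
hour and compare the counts with the times of the restarts. The following is
a rough sketch only. It assumes every log entry starts with a timestamp in
the same form as the excerpt above and that an entry may continue over
several lines; the function name broken_pipes_per_hour is hypothetical.

```python
import re
from collections import Counter

# Assumed entry header: a timestamp such as 2000-04-14T14:00:26 at the
# start of a line. Group 1 keeps the date and hour only.
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}):\d{2}:\d{2}\s")


def broken_pipes_per_hour(lines):
    """Count log entries mentioning 'Broken pipe', keyed by date and hour."""
    counts = Counter()
    current_hour = None   # hour of the entry currently being read
    counted = False       # True once the current entry has been counted
    for line in lines:
        match = TIMESTAMP.match(line)
        if match:
            # A new entry begins; continuation lines belong to it.
            current_hour = match.group(1)
            counted = False
        if current_hour is not None and not counted and "Broken pipe" in line:
            counts[current_hour] += 1
            counted = True
    return counts


if __name__ == "__main__":
    import sys
    # Usage: python count_broken_pipes.py < logfile
    for hour, n in sorted(broken_pipes_per_hour(sys.stdin).items()):
        print(hour, n)
```

If the hours with the most broken pipes do not line up with the restarts,
that would suggest the terminated connections are a symptom of slow pages
rather than the cause of the hangs.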
> Just some more food for thought.
Thanks.
-- Marcus