RE: [Zope-dev] RE: [Zope] Re: Zope hanging (poss. threads-related)
-----Original Message-----
From: Tony Rossignol [mailto:tonyr@ep.newtimes.com]
Sent: 14 April 2000 19:19
To: Marcus Collins; zope-dev@zope.org
Subject: Re: [Zope-dev] RE: [Zope] Re: Zope hanging (poss. threads-related)
Thank you for starting this. I'll try to gather up information I've been trying to collect here and post it in the next few days.
Thanks! Maybe you could also look at extending the DebugLogger output (http://www.zope.org/Members/tseaver/Projects/HighlyAvailableZope/DebugLogger) and posting the results of any hanging there?
RE: could be zombie related -
Where might I find more info on this? Could this zombie issue be present in FCGI as well?
Amos Lattier remarked in http://lists.zope.org/pipermail/zope-dev/2000-April/004194.html that: "The ZServer zombie stuff is to get rid of zombie client connections, not zombie publishing threads. These are quite different beasts." I'm not yet grokking the whole picture, so I can't really answer to that. Note that there is an outstanding issue in the Collector at http://classic.zope.org:8080/Collector/954/view that might be related. As you previously noted, there is no zombie_timeout in the FCGI server.
We have noticed that once restarts start, they get worse under load. I've been suspecting that the longer pages take to load, the more people just stop the page load, and this would/could create a zombie.
You'll sometimes note on the console or in your logs something like the following:

2000-04-14T14:00:26 ERROR(200) ZServer uncaptured python exception, closing channel <PCGIChannel at 87567b0> (socket.error:(32, 'Broken pipe') [/usr/local/Zope-2.1.6-src/ZServer/medusa/asynchat.py|initiate_send|211] [/usr/local/Zope-2.1.6-src/ZServer/medusa/asyncore.py|send|237])

This occurs (I surmise) when the client closes the channel before ZServer has sent its response. I presume the fact that it closes the channel would mean no zombie, but I'd like to know more about this.
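For anyone wanting to reproduce that kind of error outside Zope, here is a minimal sketch in plain Python. It is not ZServer or medusa code: the function name `send_after_client_hangup` is made up for illustration, and a local socket pair stands in for the client connection. It only shows that writing to a peer that has already hung up raises a connection error (errno 32, 'Broken pipe', on most Unix systems).

```python
import socket


def send_after_client_hangup(payload=b"response"):
    """Write to a peer that has already closed; return the exception raised, if any.

    Hypothetical helper for illustration only, not part of ZServer.
    """
    server_side, client_side = socket.socketpair()
    try:
        # The "client" gives up before the server has sent anything.
        client_side.close()
        try:
            server_side.sendall(payload)
        except ConnectionError as exc:
            # Typically BrokenPipeError (errno 32) on Unix systems.
            return exc
        return None
    finally:
        server_side.close()


if __name__ == "__main__":
    print(repr(send_after_client_hangup()))
```

The exact exception subclass can differ between platforms, which is why the sketch catches the broader `ConnectionError`.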
The problem is I don't know how to identify or research what is going on under the hood. Could the point at which someone terminates a connection be an issue here? I mean, a request that is queued waiting for a thread might behave differently from a request that is being processed by Zope and waiting on a DB query, or from one that is terminated once Zope has already started passing results back through the pipe. Our restarts are so hard to pin down that I'm guessing it's a very subtle issue or a combination of just the wrong factors.
We have quite a number of the above errors occurring, and they seem to correlate to people terminating the connection (or browser timeout), from what I've seen internally. I also feel at a loss here, focussing on a very small part of ZServer and possibly missing the big picture.
Just some more food for thought.
Thanks. -- Marcus
Marcus Collins wrote:
-----Original Message-----
From: Tony Rossignol [mailto:tonyr@ep.newtimes.com]
Sent: 14 April 2000 19:19
To: Marcus Collins; zope-dev@zope.org
Subject: Re: [Zope-dev] RE: [Zope] Re: Zope hanging (poss. threads-related)
Thank you for starting this. I'll try to gather up information I've been trying to collect here and post it in the next few days.
Thanks! Maybe you could also look at extending the DebugLogger output (http://www.zope.org/Members/tseaver/Projects/HighlyAvailableZope/DebugLogger) and posting the results of any hanging there?
RE: could be zombie related -
Where might I find more info on this? Could this zombie issue be present in FCGI as well?
Amos Lattier remarked in http://lists.zope.org/pipermail/zope-dev/2000-April/004194.html that:
"The ZServer zombie stuff is to get rid of zombie client connections, not zombie publishing threads. These are quite different beasts."
I'm not yet grokking the whole picture, so I can't really answer to that. Note that there is an outstanding issue in the Collector at http://classic.zope.org:8080/Collector/954/view that might be related. As you previously noted, there is no zombie_timeout in the FCGI server.
What the zombie timeout means is that after a publishing thread finishes answering a request, the socket may not go away. This may happen for a number of reasons: the client 'hung' and is not 'putting down the phone after the conversation is over' (so to speak), or network troubles may prevent the connection from closing properly. This means that there is a 'zombie' connection lying around. This zombie will probably end up going away on its own, but if not, ZServer will kill it after a period of time. The only resource lying around during the life of a zombie is a tiny little unused open socket. The Mack truck of a Zope thread that served the request for the zombie socket does not 'hang' for that entire period of time, but goes on, once it has completed the request, to serve other requests.

Amos is correct in that these problems are almost always at the application level, and not at the ZServer level. The fact that Pavlos can prevent hanging by inserting a print statement in the asyncore loop is suspicious, but we do not have enough information yet to point fingers anywhere.

-Michel
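The zombie timeout idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not ZServer's actual implementation: the `Channel` class, the `reap_zombies` function, and the `last_activity` attribute are hypothetical names. Each connection remembers when it was last active, and a periodic sweep closes any connection that has been idle for longer than the timeout. No publishing thread is involved; only the idle sockets are touched.

```python
class Channel:
    """Hypothetical stand-in for one client connection held by the server."""

    def __init__(self, name, last_activity):
        self.name = name
        self.last_activity = last_activity  # time of last read/write, in seconds
        self.closed = False

    def close(self):
        self.closed = True


def reap_zombies(channels, now, zombie_timeout):
    """Close channels idle for longer than zombie_timeout; return the survivors."""
    alive = []
    for channel in channels:
        if now - channel.last_activity > zombie_timeout:
            channel.close()  # the zombie socket is released here
        else:
            alive.append(channel)
    return alive


if __name__ == "__main__":
    channels = [Channel("idle-client", 0), Channel("busy-client", 90)]
    survivors = reap_zombies(channels, now=100, zombie_timeout=30)
    print([c.name for c in survivors])
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the sweep easy to test. A real server would call something like this from its event loop at intervals.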
participants (2): Marcus Collins, Michel Pelletier