[Zope] Re: Zope hanging (poss. threads-related)

Marcus Collins mcollins@sunesi.com
Thu, 13 Apr 2000 19:46:03 +0200


I'm convinced there's some deep, dark, timing problem, and that it's
thread-related...

This morning, I had a thread hang, again on Zope 2.1.3 running with four
threads. I was watching for unfinished requests at the time, and caught this
twenty minutes after it occurred.

I tried viewing a frameset page on the site. No problem. Then I tried
launching /manage, and suddenly *all* the threads hung. This is the first
time all the threads have hung. It's also the first time I've managed to
catch an unfinished request and try viewing framesets within half an hour,
which leads me to believe that, previously, ZServer was doing a good job of
cleaning up the hung zombie threads (recall the zombie_timeout of 30
minutes). 

I'm actually wondering whether reducing that zombie_timeout (and
maintenance_interval in medusa/http_server.py) would go some way towards
alleviating this problem as a temporary measure. Is there any reason not to
try this?
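
For anyone who hasn't looked at that code, here is roughly how I understand
the two settings to interact. This is a simplified standalone model, not the
medusa source: the Channel class, the plain list of channels and the function
signatures are made up for illustration, and the 500 default for the interval
is an assumption. Only the names zombie_timeout and maintenance_interval come
from http_server.py.

```python
import time

ZOMBIE_TIMEOUT = 30 * 60      # seconds; the 30 minutes mentioned above
MAINTENANCE_INTERVAL = 500    # assumed value: sweep on every N-th new channel


class Channel:
    """Hypothetical stand-in for one server channel (one per connection)."""

    def __init__(self, number, now=None):
        self.number = number
        self.creation_time = time.time() if now is None else now
        self.closed = False

    def close(self):
        self.closed = True


def kill_zombies(channels, now, timeout=ZOMBIE_TIMEOUT):
    """Close every open channel older than `timeout`; return the count closed."""
    killed = 0
    for channel in channels:
        if not channel.closed and (now - channel.creation_time) > timeout:
            channel.close()
            killed += 1
    return killed


def open_channel(channels, number, now,
                 interval=MAINTENANCE_INTERVAL, timeout=ZOMBIE_TIMEOUT):
    """Register a new channel; every `interval`-th channel triggers a sweep."""
    channels.append(Channel(number, now))
    if number % interval == 0:
        return kill_zombies(channels, now, timeout)
    return 0
```

In this model a sweep only happens when a new channel arrives, so on a quiet
site a zombie can outlive the timeout by a wide margin. Lowering both values
would shorten that window, at the cost of sweeping more often.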

I've added quite a lot of DebugLogger stuff to ZServer/PCGIServer.py, and
modified the log analyser accordingly, in the hopes of nailing this sucker:

def send_response(self):
    # create an output pipe by passing request to ZPublisher,
    # and requesting a callback of self.log with the module
    # name and PATH_INFO as an argument.
    self.done=1        
    # MC 2000-04-13 additional logging
    DebugLogger.log('X', id(self), 'send_response: before PCGIResponse')
    response=PCGIResponse(stdout=PCGIPipe(self), stderr=StringIO())
    # MC 2000-04-13 additional logging
    DebugLogger.log('X', id(self), 'send_response: before HTTPRequest')
    request=HTTPRequest(self.data, self.env, response)
    # MC 2000-04-13 additional logging
    DebugLogger.log('X', id(self), 'send_response: before handle')
    handle(self.server.module, request, response)

Everything before the call to handle() works every time. Sometimes, however,
we don't get from handle() to the next stage. This is on Zope 2.1.6, which
I've been running with up to 100 threads, although I unfortunately can't
exercise that many! In fact, it has proven quite a mission to get Zope to
hang, perhaps because the additional logging increases the latency in
serving requests.
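
The check the analyser needs for this is simple enough to sketch. What
follows is a minimal illustration, not the actual analyser: it assumes each
log line is laid out as "code id text" separated by whitespace, and find_stuck
is a made-up name. It reports the ids whose last logged event is the
'before handle' message, i.e. requests that went into handle() and were never
heard from again.

```python
def find_stuck(lines, marker='send_response: before handle'):
    """Return the ids whose most recent log line carries `marker`.

    Assumed line layout (an illustration, not the real log format):
        <code> <id> <free text>
    """
    last_event = {}
    for line in lines:
        parts = line.split(None, 2)
        if len(parts) < 2:
            continue  # blank or malformed line: ignore it
        request_id = parts[1]
        # A line with no free text still counts as a later event for this id.
        text = parts[2].strip() if len(parts) > 2 else ''
        last_event[request_id] = text
    return sorted(rid for rid, text in last_event.items() if text == marker)
```

Fed a log in which request 101 stops at 'before handle' while request 102
logs something afterwards, this returns only '101'.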

I've added this logging to the Zope 2.1.3 serving the live site, and will
report my findings as soon as something untoward occurs. Maybe others who
are experiencing hangs would also be able to do some extra logging and
report the results [now there, I can see, a Wiki would be really useful!].

In the meantime, any suggestions as to where to go next will be keenly acted
on!

Thanks all!

-- Marcus