After running properly for about an hour or two, my Zope instance stops responding to HTTP requests and begins consuming all available CPU time. How should I go about troubleshooting this problem?

Specs:

Digital AlphaServer 2000 5/250
FreeBSD/alpha 5.3-RELEASE
Python 2.3.4
Zope 2.7.4
Plone 2.0.5

I have already looked at the following:

1. I recently imported a number of objects from a Zope instance running on a completely different architecture (i386 to alpha). i386 is big-endian, alpha is little-endian. Could this be a problem when exporting and importing objects?

2. I examined the Zope web access and event log files, but I can find no common event prior to when the Zope instance stops responding. There are no error messages in the Z2.log file, neither are there any errors in the event.log file (in fact, the last line is "Zope Ready to handle requests"). Running Zope in debug mode does not result in additional logging beyond this.

3. I traced the Zope process. There are a number of calls to poll(2) and the SIGPROF signal handler (which resets itself, apparently, as I see follow-up calls to gettimeofday(2) and sigprocmask(2)). These calls repeat.

4. I ran lsof on the Zope process. Oddly enough, there are still a few established TCP connections between Zope and the Apache web server in front of Zope. There are several other TCP connections in the state "no PCB, CANTSENDMORE, CANTRCVMORE".

What else should I check?

Best wishes,
Matthew

--
"The challenge of a moral life is to do nothing that requires forgiveness." - Roger Ebert in his review of _The Woodsman_
On Wed, Mar 09, 2005 at 04:22:39PM -0500, Matthew X. Economou wrote:
After running properly for about an hour or two, my Zope instance stops responding to HTTP requests and begins consuming all available CPU time. How should I go about troubleshooting this problem? (snip) What else should I check?
Two more suggestions:

* Try enabling the trace (aka debug) log in etc/zope.conf. (This won't help until after you've restarted Zope. I generally turn on trace logging for all my Zopes; it has helped pinpoint problems on a couple of occasions.) There's a script for analyzing it, which comes with Zope as bin/requestprofiler.py.

* Google for "debug spinning zope". This too has helped me find a method that was hanging (although in my case it wasn't CPU, it was waiting forever on a connection that was blocked by an incorrect firewall config). If a method never returns, all it takes is for all your Zope worker threads to hit that method and bye-bye Zope.

--
Paul Winkler
http://www.slinkp.com
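To illustrate what the trace log is good for, here is a minimal sketch that lists requests which started but never finished. It is not requestprofiler.py. It assumes each trace line begins with a one-letter code ("B" when a request begins, "E" when it ends) followed by a request id. Check that against your own log before relying on it. The function name and the sample lines are made up for illustration.

```python
def find_unfinished_requests(lines):
    """Return {request_id: begin_line} for requests with a B line but no E line.

    Assumed line layout (verify against your own trace log):
        <code> <request id> <timestamp> [details...]
    """
    pending = {}
    for line in lines:
        parts = line.split(None, 2)
        if len(parts) < 2:
            continue  # blank or malformed line, skip it
        code, request_id = parts[0], parts[1]
        if code == "B":
            pending[request_id] = line.rstrip("\n")
        elif code == "E":
            pending.pop(request_id, None)
    return pending


# Hypothetical sample data, not taken from a real log.
sample = [
    "B 1001 2005-03-09T16:00:01 GET /plone/front-page\n",
    "B 1002 2005-03-09T16:00:02 GET /plone/some_method\n",
    "E 1001 2005-03-09T16:00:03\n",
]

for request_id, begin_line in find_unfinished_requests(sample).items():
    print("never finished:", begin_line)
```

If the same URL shows up as unfinished on every worker thread right before the hang, that method is the first place to look.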
Matthew X. Economou wrote at 2005-3-9 16:22 -0500:
After running properly for about an hour or two, my Zope instance stops responding to HTTP requests and begins consuming all available CPU time. How should I go about troubleshooting this problem?
Install Florent's "DeadlockDebugger" (which is also good to analyse long running requests). -- Dieter
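The idea behind this kind of tool is to ask the running process what every thread is doing at the moment it appears stuck. The sketch below shows that idea only. It is not DeadlockDebugger's code, and it uses sys._current_frames(), which exists in Python 2.5 and later, so it would not run on the Python 2.3.4 mentioned in the original post. The function name is hypothetical.

```python
import sys
import threading
import traceback


def dump_all_thread_stacks():
    """Return a text report with the current stack of every Python thread."""
    # Map thread ids to readable names where the threading module knows them.
    names = dict((t.ident, t.name) for t in threading.enumerate())
    chunks = []
    for thread_id, frame in sys._current_frames().items():
        chunks.append("Thread %s (%s):" % (thread_id, names.get(thread_id, "unknown")))
        chunks.append("".join(traceback.format_stack(frame)))
    return "\n".join(chunks)


if __name__ == "__main__":
    print(dump_all_thread_stacks())
```

In a server you would call something like this from a signal handler or a special URL, so that a spinning or blocked worker thread shows up with the exact line it is executing.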
participants (3): Dieter Maurer, Matthew X. Economou, Paul Winkler