Hello,

I have been working with Zope for a long time and have been continually worried by a few problems, none of which, unfortunately, has been fixed over the years:

- ZServer cannot 'recover' a busy thread.
- The log shows nothing when Zope blocks: an entry is written only when the request completes, and the log is not flushed after every request. I don't think Big-M will help.
- When all threads are busy there is no way to access Zope, even to restart it.

I tried to fix this several years ago with socket timeouts (I am not a ZServer expert): no luck.

So if you find a way to avoid the lock-up, please email me. The only solution I know is monitoring and restarting.

All I can say is that keeping the Zope system very basic seems to help: the one I mention below uses Zope plus the Psyco db adapter product and nothing else. It runs behind Apache, and so far, after 4 months, I have had no blocking.

The annoying thing is that when you tell other IT people "Zope is very nice, powerful and free, but there may be a stability problem", you later see them go for another solution.

I work for the European Space Agency, and I can tell you that at my site (Italy) there is only one Zope-powered extranet site, made by me, even though I have promoted Zope for years (since Zope 1.0). Obviously I am not a good 'promoter'.

gilles

----- Original Message -----
From: "Paul Winkler" <pw_lists@slinkp.com>
To: <zope-dev@zope.org>
Sent: Thursday, February 06, 2003 9:24 PM
Subject: [Zope-dev] What makes Zope twirl?
Or: when zope goes into a nonresponsive state, what can you do to diagnose the cause?
The event that prompts this question:
Our production system (2 zeo clients) went down today. Platform: Linux 2.4, Zope 2.5.1 from source (wo_pcgi), Python 2.1.3 from source, running behind Apache for one site, and a custom java proxy for another site (don't ask). ZServer is not exposed to anything except the servers running Apache and the Java proxy.
All the zope processes were still running, CPU usage was low (almost nil for python)*, and there was plenty of free physical memory & swap. Yet Zope was not responding to requests. A look at the access logs revealed that zope had not logged anything since the time we noticed the outage. Nothing unusual before that except AltaVista crawling our site (a measly 2 requests / second).
A restart seemed to fix everything, though one of the zeo servers went down again (same symptoms) about 20 minutes after starting. Restarted it again and both servers have been fine for hours now.
This seems to be rare; I haven't seen it before on this particular server, but I saw a similar wedge on our dev machine about 3 weeks ago.
I've looked at ALL the logs (access log, zeo log, zope stdout / stderr log) and found nothing at all unusual, just the aforementioned AltaVista crawl and a couple of RAM Cache errors from non-pickleable objects that I need to dis-associate from the cache. But none of this is new.
Is it time for "Big M"? Would that give me anything useful?
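One generic way to see what a wedged Python process is doing is to dump the stack of every thread. The sketch below is only an illustration and not something Zope provides: it relies on `sys._current_frames()`, which exists in Python 2.5 and later (so not in the Python 2.1.3 mentioned above), and the helper name `dump_thread_stacks` is made up for this example. How to trigger it inside a hung server (a signal handler, a debug port) is left open.

```python
import sys
import threading
import traceback


def dump_thread_stacks():
    """Return a text report with the current Python stack of every thread."""
    # Map thread identifiers to readable names where the threading module knows them.
    names = {t.ident: t.name for t in threading.enumerate()}
    lines = []
    for ident, frame in sys._current_frames().items():
        lines.append("Thread %s (%s)" % (ident, names.get(ident, "unknown")))
        # format_stack returns one string per frame, innermost call last.
        for entry in traceback.format_stack(frame):
            lines.append(entry.rstrip("\n"))
        lines.append("")
    return "\n".join(lines)
```

The report shows, at the Python level, where each thread is currently waiting, which is the information an access log written only on request completion cannot give.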
* This does not sound like other zope "spins" I have heard of, in which python eats 99% CPU indefinitely due to (probably) an application error. See for example: http://www.zopezen.org/Members/zopista/News_Item.2003-01-28.1025
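The monitoring/restarting approach mentioned in the reply at the top can be sketched roughly as follows. This is a minimal illustration in modern Python, not the script anyone in this thread actually runs: `is_responsive` and `make_watchdog` are hypothetical names, and the `restart` callable stands in for whatever restart procedure the site uses. The probe treats a server that accepts the connection but never answers (the symptom described above) as down.

```python
import http.client


def is_responsive(host, port, path="/", timeout=10.0):
    """Return True if the server answers an HTTP GET within `timeout` seconds."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        response.read()
        # Any status below 500 is taken to mean the server is alive.
        return response.status < 500
    except (OSError, http.client.HTTPException):
        # Refused connections, socket timeouts and malformed replies end up here.
        return False
    finally:
        conn.close()


def make_watchdog(check, restart, max_failures=3):
    """Build a tick() function that restarts after repeated failed checks."""
    failures = 0

    def tick():
        nonlocal failures
        if check():
            failures = 0
            return False
        failures += 1
        if failures >= max_failures:
            restart()      # hypothetical: e.g. run the site's restart script
            failures = 0
            return True
        return False

    return tick
```

`tick()` would be called periodically, for instance from cron or from a loop with a sleep. Requiring several consecutive failures avoids restarting on a single slow request.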
--
Paul Winkler http://www.slinkp.com Look! Up in the sky! It's CAPTAIN STETOSCOPE! (random hero from isometric.spaceninja.com)