Zope woes continue - server going down regularly.
This morning : using Zope 2.01
------------------------------
Alternatively, the server just hangs and doesn't respond (but ps -aux shows the process is running).
Is it spinning? (consuming 100% CPU resources?) or hung? (consuming none..)
Sorry, you're right, it was 'spinning' - 100% CPU resources. Does that suggest anything in particular ?
Ok, I feel like the Cold-Fusion-Man today: I have a window open on my desktop showing the 'top' output of my server. Whenever the CPU usage goes over 90%, I know that the PCGI process is no longer responding and I restart Zope. Restarting the server does not help any more than restarting Zope itself. Repeat ad nauseam every 30-45 minutes if I'm lucky.

This afternoon : using Zope 2.10
--------------------------------

I installed and set up Zope 2.10 (using PCGI behind Apache) and transferred my site over to it. It *seemed* to be holding up OK. CPU never went up to 90%. But I noticed that several times I'd get the fd=3 error.

Then within 2 hours of use, I get this error when external users try to access the site :

Zope Error
Zope has encountered an error while publishing this resource.
Error Type: NameError
Error Value: name_param

Even from the administration interface, I can't access anything. Everything returns :

Zope Error
Zope has encountered an error while publishing this resource.
Error Type: NameError
Error Value: type

And now CPU is at 0% - probably because nobody's accessing our site any longer :(

So, again I had to restart Zope and then the site returned to normal. In a few hours, I'm going to have to (a) leave the computer and (b) sleep... and I'll be unable to do this manual monitoring/restarting of the server. So, I'm quite desperate to find a solution to this soon.

To this end, does anybody have any ideas :
1) where to get better debugging information to help identify the source of the problem.
2) better yet, any idea what the problem(s) could be ?

Configuration :
---------------
- FreeBSD 3.2
- Zope 2.1 running behind Apache
- MySQL database using ZMySQLDA
- 40,000 dynamic pageviews per day, though it'll be about zero now, after 24 hours of downtime.

z2.py :
-------
## General configuration options
IP_ADDRESS='xxx.xxx.xxx.xxx'   (I'd rather not say whilst my server's buggered)
DNS_IP=''
UID='nobody'
LOG_FILE='Z2.log'
HTTP_PORT=8080
HTTP_ENV={}
FTP_PORT=8021
PCGI_FILE='Zope.cgi'
MONITOR_PORT=8099
MODULE='Zope'
NUMBER_OF_THREADS=16
LOCALE_ID=None
FCGI_PORT=None

Zope.cgi is unchanged since installation.

chas
Hello,

If there is no other solution, you can do what I do: use a 'watch-dog' script which restarts Zope in case it stops responding :-) (stupid solution, but it works)

Gilles

----- Original Message -----
From: "chas" <panda@skinnyhippo.com>
To: <michel@digicool.com>
Cc: <zope@zope.org>
Sent: Tuesday, December 07, 1999 10:17 AM
Subject: [Zope] Zope woes continue - server going down regularly.
If there is no other solution, you can do what I do: use a 'watch-dog' script which restarts Zope in case it stops responding :-) (stupid solution, but it works)

Thanks Gilles - if it's not asking too much, could you please also send me a sample script? Otherwise, I'm looking at a cronjob to stop then start Zope regularly (thanks for the idea, M) but that's even more stupid.

Looking at the larger picture, it's very worrying that people are having to employ such scripts (it really does echo the earlier "CF vs Zope" thread we had on the list, where CF was ridiculed for such instability).

Thanks for your help - support from the list is all that's keeping me sane at the moment,

chas
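For readers looking for the kind of sample script requested here, a watch-dog could be sketched roughly as follows. This is not Gilles's script (it was not posted to the list); it is a minimal illustration written for present-day Python 3. CHECK_URL, RESTART_CMD, TIMEOUT and INTERVAL are placeholder values, and the restart command in particular is hypothetical and must be replaced by whatever stops and starts Zope on your machine.

```python
import subprocess
import time
import urllib.error
import urllib.request

# Placeholder values (assumptions): adjust to your own installation.
CHECK_URL = "http://localhost:8080/"          # HTTP_PORT=8080 as in the posted z2.py
RESTART_CMD = ["/path/to/restart-zope.sh"]    # hypothetical script, not shipped with Zope
TIMEOUT = 30     # seconds to wait for an answer
INTERVAL = 60    # seconds between checks


def is_responding(url, timeout=TIMEOUT, opener=urllib.request.urlopen):
    """Return True if the server answers without a server-side error."""
    try:
        with opener(url, timeout=timeout) as response:
            response.read(1)
        return True
    except urllib.error.HTTPError as err:
        # 5xx means the process answers but is broken (as with the
        # NameError pages in this thread); other codes count as alive.
        return err.code < 500
    except OSError:
        # Refused connections, timeouts and URLError all end up here.
        return False


def watch(url=CHECK_URL, restart_cmd=RESTART_CMD, interval=INTERVAL,
          max_checks=None, check=is_responding,
          restart=subprocess.call, sleep=time.sleep):
    """Poll the server and run the restart command after each failed check.

    Runs forever when max_checks is None; returns the number of restarts.
    """
    restarts = 0
    checks = 0
    while max_checks is None or checks < max_checks:
        if not check(url):
            restart(restart_cmd)
            restarts += 1
        checks += 1
        sleep(interval)
    return restarts

# To run it unattended: call watch() from a small script started at boot.
```

A watch-dog only hides the symptom. It keeps the site up overnight, but the spinning process still needs the debugging steps discussed elsewhere in this thread.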
chas wrote:
This morning : using Zope 2.01
------------------------------
Alternatively, the server just hangs and doesn't respond (but ps -aux shows the process is running).
Is it spinning? (consuming 100% CPU resources?) or hung? (consuming none..)
Sorry, you're right, it was 'spinning' - 100% CPU resources. Does that suggest anything in particular ?
Ok, I feel like the Cold-Fusion-Man today, I have a window open on my desktop showing the 'top' output of my server. Whenever the CPU usage goes up to over 90%, I know that the PCGI process is no longer responding and I restart Zope. Restarting the server does not help any more than restarting Zope itself. Repeat ad nauseam every 30-45 minutes if I'm lucky.

This afternoon : using Zope 2.10
--------------------------------
OK, let's focus on 2.1.
I installed and set up Zope 2.10 (using PCGI behind Apache) and transferred my site over to it. It *seemed* to be holding up OK. CPU never went up to 90%. But I noticed that several times I'd get the fd=3 error.
Then within 2 hours of use, I get this error when external users try to access the site :
Zope Error Zope has encountered an error while publishing this resource. Error Type: NameError Error Value: name_param
Could you provide a traceback?
Even from the administration interface, I can't access anything. Everything returns :
Zope Error Zope has encountered an error while publishing this resource. Error Type: NameError Error Value: type
Again, a traceback would be helpful.
And now CPU is at 0% - probably because nobody's any longer accessing our site :(
So, again I had to restart Zope and then the site returned to normal. In a few hours, I'm going to have to (a) leave the computer and (b) sleep... and I'll be unable to do this manual monitoring/restarting of the server. So, I'm quite desperate to seek a solution to this soon.
To this end, does anybody have any ideas : 1) where to get better debugging information to help identify the source of the problem.
- Obviously something changed. You were running fine for 2 months, then started having problems. You can work forward by analysing changes, or you could work backward by debugging.

- Tracebacks are helpful. If you aren't in debug mode, then you'll need to view the document source to see them.

- The Zope event log can be very helpful. See doc/LOGGING.txt.

- Look at the "debug" screen at Control_Panel/manage_debug. This can be used to spot memory leaks and stuck database connections.

- If the above doesn't show anything, then a more detailed log may be created. Add this stanza to your z2.py start script:

    # turn on debug logging
    from ZServer import DebugLogger
    logfile=os.path.join(INSTANCE_HOME,'var/debug.log')
    DebugLogger.log=DebugLogger.DebugLogger(logfile).log

  It should be inserted after Zope is imported, e.g. after this line:

    exec "import "+MODULE in {}

  You'll want to watch the debug log file since it gets large quickly. You might also want to read the docstrings in ZServer/DebugLogger.py for more information about the debug log format. In particular, we want to look for requests that don't complete or for apparent leaking requests.
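Looking for requests that don't complete can be done by eye, but a small script helps once the debug log grows. The sketch below is written for present-day Python 3 and rests on an assumption that should be checked against the docstrings in ZServer/DebugLogger.py: that each log line has the form "code request-id time data", with code B written when a request begins and code E when its output has been sent. The sample lines and paths are invented for illustration.

```python
def unfinished_requests(lines):
    """Return {request_id: data} for requests that began but never ended.

    Assumes lines of the form '<code> <request id> <time> <data>',
    where code 'B' marks the beginning of a request and 'E' its end.
    """
    open_requests = {}
    for line in lines:
        parts = line.split(None, 3)
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        code, request_id = parts[0], parts[1]
        data = parts[3].strip() if len(parts) > 3 else ""
        if code == "B":
            open_requests[request_id] = data
        elif code == "E":
            open_requests.pop(request_id, None)
    return open_requests


# Invented sample data: request 1001 completes, request 1002 never does.
sample = [
    "B 1001 1999-12-07T10:00:01 GET /index_html",
    "I 1001 1999-12-07T10:00:01 0",
    "A 1001 1999-12-07T10:00:02 200 5120",
    "E 1001 1999-12-07T10:00:02",
    "B 1002 1999-12-07T10:00:05 GET /search",
    "I 1002 1999-12-07T10:00:05 0",
]

for request_id, data in unfinished_requests(sample).items():
    print(request_id, data)
```

On a real log, pass the open file object instead of the sample list. Requests still in flight when the log is read will also show up, so only entries that stay unfinished across several runs are suspicious.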
2) better yet, any idea what the problem(s) could be ?
Not off hand.

Jim

--
Jim Fulton           mailto:jim@digicool.com
Technical Director   (888) 344-4332            Python Powered!
Digital Creations    http://www.digicool.com   http://www.python.org

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list without my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats.
On Tue, 7 Dec 1999, chas wrote:
Sorry, you're right, it was 'spinning' - 100% CPU resources. Does that suggest anything in particular ?
Ok, I feel like the Cold-Fusion-Man today, I have a window open on my desktop showing the 'top' output of my server. Whenever the CPU usage goes up to over 90%, I know that the PCGI process is no longer responding and I restart Zope. Restarting the server does not help any more than restarting Zope itself. Repeat ad nauseam every 30-45 minutes if I'm lucky.

My guess is you have some code that is going into an infinite loop or deadlocking. And since you are the only one reporting it, a further guess would be that it is code that is specific to your site, or a 3rd party product being used in an unusual fashion, or a combination of products. I believe DTML has checks in it to stop this sort of thing, so it would likely be some Python code somewhere that is blocking.

For a test, start up your server in single thread mode. When the problem next occurs, the last page accessed should be the cause of your problem. If the standard log doesn't do the trick (medusa might log incoming requests before they are passed off to Zope handlers), you can probably put your own logging information in zhttp_channel.work in ZServer/HTTPServer.py unless you work out how to turn on the built in debug logging.

--
 ___
   //     Zen (alias Stuart Bishop)     Work: zen@cs.rmit.edu.au
  // E N  Senior Systems Alchemist      Play: zen@shangri-la.dropbear.id.au
 //__     Computer Science, RMIT        WWW: http://www.cs.rmit.edu.au/~zen
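For the single thread test, one way to do it is probably the thread setting already visible in the z2.py configuration posted at the top of the thread. This assumes that NUMBER_OF_THREADS is what controls the number of worker threads in that start script; the fragment is a configuration change, not a standalone program.

```python
# In z2.py, "General configuration options" section.
# Assumption: this variable sets the number of worker threads.
# The posted configuration had NUMBER_OF_THREADS=16.
NUMBER_OF_THREADS = 1
```

With a single worker thread, one stuck request blocks everything behind it, so expect the site to hang completely rather than slow down. That is the point of the test, but it is not a setting to leave on under normal load.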
participants (4): chas, Gilles Lavaux, Jim Fulton, Stuart 'Zen' Bishop