And one other debugging thing to look for: when it's hung, check to see if the process is consuming much CPU.
--Paul
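On Linux, one way to make this CPU check repeatable is to sample the process's accumulated CPU time twice and compare. The sketch below is only an illustration and not part of Zope: it assumes a Linux /proc filesystem, and the helper names parse_cpu_ticks and cpu_seconds_used are invented for this example. A hung process that keeps burning CPU is probably spinning in a loop; one that uses almost none is more likely blocked waiting on something.

```python
import os
import time


def parse_cpu_ticks(stat_line):
    """Return utime + stime (in clock ticks) from a /proc/<pid>/stat line."""
    # The command name sits in parentheses and may contain spaces,
    # so cut at the last ')' before splitting the remaining fields.
    rest = stat_line.rpartition(")")[2].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    return int(rest[11]) + int(rest[12])


def cpu_seconds_used(pid, interval=2.0):
    """CPU seconds the process consumed over `interval` wall-clock seconds."""
    path = "/proc/%d/stat" % pid
    ticks_per_second = os.sysconf("SC_CLK_TCK")
    with open(path) as f:
        before = parse_cpu_ticks(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_cpu_ticks(f.read())
    return (after - before) / ticks_per_second
```

Calling cpu_seconds_used(pid) with the process id of the hung server returns a value near the sampling interval for a busy loop and a value near zero for a blocked process.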
-----Original Message-----
From: Amos Latteier [mailto:amos@aracnet.com]
Sent: Thursday, August 12, 1999 5:23 PM
To: Jonas Juselius; jason@brahms.siteprotect.com; Robin Becker
Cc: zope@zope.org
Subject: Re: [Zope] ZServer lockups
At 11:23 AM 8/11/99 +0300, Jonas Juselius wrote:
I'm experiencing the same kind of problems, although I'm using the Zope server (no Apache)... Netscape seems to lock up waiting for data indefinitely. I can usually get around this problem by re-clicking on a URL a couple of times, which resolves the problem for the moment. But after ZServer has gone into this state, it locks up again and again until ZServer is restarted... I'm using Zope-2.0.0b3-linux2-x86.
jonas
On Tue, Aug 10, 1999 at 12:33:56PM -0500, jason@brahms.siteprotect.com wrote:
I'm experiencing occasional lockups of ZServer (currently being used in conjunction with PCGI and Apache). When this happens, I get no response at all from the http interface (Netscape eventually times out and says the document contains no data).
A number of folks have posted about Zope lock up issues. It is very important to me to resolve these issues.
If any of you can reproduce a Zope lock up, please, please, please, submit this issue to the Collector, ASAP.
Tracking down problems like these can be difficult. Lots of things can go wrong that could cause Zope to stop responding. For example, serious problems in any of these areas could halt Zope:
* ZServer
* The Zope process itself
* The object database
When ZServer occasionally locks up for me, it is usually because I did something that causes the publishing process to hang. This happens when you are coding external methods or Python products and make certain kinds of mistakes that put Zope in a loop.
If ZServer itself breaks, you will usually get something written to the info log, or else ZServer will exit with a traceback. If you suspect that ZServer is the problem, try running it with the -D switch so that the info logging goes to STDOUT and you can read the traceback in case it exits.
If you think that your object database is messed up, you can test things by using the scripts in the utilities directory. You can also use the monitor to poke around at objects and check whether their state is weird.
In general, if you've got an object database problem, you should still be able to connect to the ZServer monitor and ftp servers, which do not immediately rely on the object database.
If Zope is completely unresponsive, then there's not a lot you can do except kill the process. If you can kill it in the foreground with a control-c then you will get a traceback which may give you some clue as to what was wrong.
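In a present-day Python process, a similar traceback can be collected for every thread without killing the process. The sketch below is an assumption-laden illustration, not something the 1999-era ZServer provided: it relies on sys._current_frames(), which was added to Python long after this thread, and the function name dump_thread_stacks is made up here. How it gets triggered inside a hung server (a monitor command, a signal handler) is left open.

```python
import sys
import threading
import traceback


def dump_thread_stacks():
    """Return a text report with the current stack of every live thread."""
    # Map thread identifiers to readable names where the threading
    # module knows about the thread.
    names = {t.ident: t.name for t in threading.enumerate()}
    parts = []
    for ident, frame in sys._current_frames().items():
        parts.append("Thread %s (%s):\n" % (names.get(ident, "?"), ident))
        parts.extend(traceback.format_stack(frame))
    return "".join(parts)
```

If the report shows a publishing thread sitting on the same line in two successive dumps, that line is a good place to start looking.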
So to reiterate: If you can reproduce a lockup, submit it to the Collector. If you are developing external methods or Products, there is a good chance that your code is causing the problem. If not, poke around and see what you can find, and post your findings to the Zope list. With enough information we should be able to solve the problem.
Good luck!
-Amos
_______________________________________________ Zope maillist - Zope@zope.org http://www.zope.org/mailman/listinfo/zope
(To receive general Zope announcements, see: http://www.zope.org/mailman/listinfo/zope-announce
For developer-specific issues, zope-dev@zope.org - http://www.zope.org/mailman/listinfo/zope-dev )
In article <613145F79272D211914B0020AFF64019262879@gandalf.digicool.com>, Paul Everitt <Paul@digicool.com> writes
And one other debugging thing to look for: when it's hung, check to see if the process is consuming much CPU.
... ok I don't know what's causing my lockups. I have the 1.9 HTTPResponse etc. and I can reliably, after around 2000 hits from my torture tester, get the win32 ZServer to lock as far as HTTP is concerned. I can use the monitor to see that things are alive in the medusa loop.
My CPU hog thread is responsive (the job queue is empty) and I'm able to add a job and have the server thread go. And the system monitor reports no serious work going on unless I start it. I am unable to get in via Netscape.
The interesting thing is that the torture script is also hung, so it may be a strangeness related to win9x, i.e. too many requests hanging on the port or somesuch. Killing the torturer doesn't help though.
I suspect that HTTP publication is locked somehow. When it happens again, is there anything I can look at to test various/threads etc?
-- Robin Becker
Can you telnet into the HTTP port and get connected, but then get no response from a GET?
--Paul
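The telnet check can also be scripted, which helps when the lockup has to be caught repeatedly. This is a minimal sketch using only the Python standard library; the function name probe_http and its return labels are invented for illustration. It separates "nothing accepts the connection" from "connected, but the GET is never answered", which is the distinction the question is after.

```python
import socket


def probe_http(host, port, timeout=10.0):
    """Classify how an HTTP port reacts to a minimal GET request.

    Returns "no-connection", "silent", "closed", or "responded".
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Connection refused, host unreachable, or connect timed out.
        return "no-connection"
    with sock:
        try:
            sock.sendall(b"GET / HTTP/1.0\r\n\r\n")
            data = sock.recv(1024)
        except socket.timeout:
            # Connected, but no bytes came back within the timeout.
            return "silent"
        except OSError:
            return "closed"
        # An empty read means the peer closed without answering.
        return "responded" if data else "closed"
```

A result of "silent" would match the symptom described in this thread: the listening socket still accepts connections, but the request is never published.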
In article <37B40AF3.85BFA549@digicool.com>, Paul Everitt <paul@digicool.com> writes
Can you telnet into the HTTP port and get connected, but then get no response from a GET?
--Paul
Short answer is yes, I can. A GET + <CR>+<CR> gives me a bad request from medusa; if I try GET /<CR><CR> nothing comes back. Netscape is still locked; even after restarting Netscape it still hangs on http://localhost. Killed the torturer, Netscape still hangs. My netstat -a gives:

C:\Python\devel>netstat -a

Active Connections

  Proto  Local Address    Foreign Address   State
  TCP    jessikat:ftp     0.0.0.0:9391      LISTENING
  TCP    jessikat:1090    0.0.0.0:42174     LISTENING
  TCP    jessikat:80      0.0.0.0:45293     LISTENING
  TCP    jessikat:8021    0.0.0.0:30949     LISTENING
  TCP    jessikat:1782    0.0.0.0:35017     LISTENING
  TCP    jessikat:1783    0.0.0.0:52424     LISTENING
  TCP    jessikat:1788    0.0.0.0:203       LISTENING
  TCP    jessikat:80      jessikat:1782     ESTABLISHED
  TCP    jessikat:80      jessikat:1783     ESTABLISHED
  TCP    jessikat:8099    0.0.0.0:10471     LISTENING
  TCP    jessikat:1782    jessikat:80       ESTABLISHED
  TCP    jessikat:1783    jessikat:80       ESTABLISHED
  TCP    jessikat:1788    jessikat:80       CLOSE_WAIT
  TCP    jessikat:1790    jessikat:80       TIME_WAIT
  TCP    jessikat:1791    jessikat:80       TIME_WAIT
  TCP    jessikat:19999   0.0.0.0:43196     LISTENING
  TCP    jessikat:19999   127.9.9.9:1090    ESTABLISHED
  TCP    jessikat:1090    127.9.9.9:19999   ESTABLISHED

C:\Python\devel>
-- Robin Becker
Hi there -
I've reproduced the same problem here on my setup, also on Win NT. Here's what I did, and what happened:
1. I set up a Zope installation (Zope 2.0.0b4 w/ patches from the latest CVS), with an index_html
2. I used Socrates (postcard-shareware testing program) to pound Zope with a pretty reasonably low request rate.
3. I tried accessing index_html through Netscape while Socrates was still testing index_html.
4. Both Netscape and Socrates got no response; the Zope machine had a pop-up dialog on it saying that 'An application error occurred, Exception: Access violation (0xc0000005), Address 0x00f7a88f)' with python.exe (I ran z2.py from the command line with -D, hoping to get a traceback or something)... effectively it looks like the procedure above can kill Zope/ZServer on Win32.
One thing that I noticed was it seemed like it was fine if I accessed pages other than the one that was being checked by Socrates - I was able to click around the management screens a bit before having it lock up - when I clicked on the top level folder in the management screen (<pure speculation>probably requiring an access of the same object or objects which were also being accessed by another thread???</pure speculation>) that was when I got no response.
Have you found any more clues about this?
--Brian
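For anyone without Socrates or a ready-made torture tester, the load part of this reproduction can be approximated with a few lines of Python. This is a hedged sketch, not the tool used in the thread: the function name hammer and its defaults are invented here, it sends requests one after another rather than concurrently, and it bypasses any configured proxy so that a local server is hit directly.

```python
import http.client
import urllib.request


def hammer(url, requests=2000, timeout=10.0):
    """Fetch `url` repeatedly; return the index of the first failure, or None."""
    # An empty ProxyHandler makes the opener ignore proxy settings
    # from the environment and connect to the server directly.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    for i in range(requests):
        try:
            with opener.open(url, timeout=timeout) as response:
                response.read()
        except (OSError, http.client.HTTPException) as exc:
            # URLError, HTTPError and socket timeouts are all OSErrors.
            print("request %d failed: %r" % (i, exc))
            return i
    return None
```

Running hammer against the index_html URL while clicking around in a browser would roughly follow steps 2 and 3 above; the returned index shows how many requests succeeded before the first hang or error.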
Just to add to my last message - I tried running z2.py with a single thread (-t 1), however I was still able to produce a similar lockup. (I guess this makes sense, since we're talking about something with reading documents, not concurrent writes, probably this doesn't involve anything with transactions or the ZODB? I'm just inferring this here, it seems to make sense...!)
--Brian
Last time I locked up I tried poking with Netscape and no joy, but with IE5 after a pause things suddenly unlocked and ZServer came back to life. Weird, but maybe this is a hidden win32 'feature' that Bill hasn't told us about.
-- Robin Becker
As a data point: I have been noticing freezes with ZServer running against Internet Explorer 5. I notice that clicking on another link (which almost always works) and then clicking the frozen link will occasionally "break" the freeze. This is running on localhost, with the latest beta of Zope. Once the freeze is "broken", performance is normal (fast). I have not yet been able to determine what the particular problem is, nor whether Netscape has similar interactions.
Enjoy yourself,
Mike
participants (4)
- Brian Hooper
- Mike Fletcher
- Paul Everitt
- Robin Becker