OK, did a ./configure --without-py_malloc with python 2.1.1 and it didn't take care of the problem. Still getting:

2001-11-07T20:30:11 ERROR(200) zdaemon zdaemon: Wed Nov 7 15:30:11 2001: Aiieee! 9400 exited with error code: 11

Yes, it did compile (make clean, configure, make) and the Makefile has the argument --without-py_malloc. The start script is pointed to the newly compiled python and I'm still getting the same error.

Aack!

-chris
------------------------------
Chris Kratz
chris.kratz@vistashare.com

----- Original Message -----
From: "Chris Kratz" <chris.kratz@vistashare.com>
To: "Chris Kratz" <chris.kratz@vistashare.com>
Sent: Wednesday, November 07, 2001 2:49 PM
Subject: Re: [Zope] Zope hiccuping
After posting this, I ran into some messages posted on the mailing list that seem to say that py_malloc is the culprit for this particular problem. I'm in the process of trying this solution. My apologies for the spam. On the other hand if there are any other thoughts, I would appreciate hearing them as well.
-Chris
------------------------------
Chris Kratz
chris.kratz@vistashare.com

----- Original Message -----
From: "Chris Kratz" <chris.kratz@vistashare.com>
To: <zope@zope.org>
Sent: Wednesday, November 07, 2001 2:33 PM
Subject: [Zope] Zope hiccuping
We have been noticing that periodically we get an error from IE that says "Cannot find server or DNS error". It is not easily reproducible (except by just clicking links on the server), and an F5 refresh in the browser [almost] always loads the page correctly. I turned on logging today with the -M startup option and observed the following entries when it happened:
B 145697132 2001-11-07T17:15:38 GET /OutcomeTracker/Dev_News
I 145697132 2001-11-07T17:15:38 0
A 145697132 2001-11-07T17:15:39 200 32155
E 145697132 2001-11-07T17:15:39
B 145934020 2001-11-07T17:15:40 GET /OutcomeTracker/PeopleOrganizations/index_html
I 145934020 2001-11-07T17:15:40 0
B 135053764 2001-11-07T17:15:56 POST /OutcomeTracker/Activities/index_html
I 135053764 2001-11-07T17:15:56 2831
B 146464388 2001-11-07T17:15:57 GET /OutcomeTracker/PeopleOrganizations/index_html
I 146464388 2001-11-07T17:15:57 0
A 146464388 2001-11-07T17:16:03 200 31200
E 146464388 2001-11-07T17:16:03
Notice how the GET /OutcomeTracker/PeopleOrganizations/index_html never gets the A or E lines, but only has a B and I line. The subsequent refresh finished the request. Interestingly, the two incomplete requests are not logged to z2.log. We can see the request before and the request after, but that's it.

The other strangeness is that in the postgres log, we see a "pq_recvbuf: unexpected EOF on client connection". This seemed to point to zope threads dying. Since I'm not getting anything in the logs (*see below), I started running tests with one eye on the currently running processes. And sure enough, whenever I got that error at the browser (cannot find server...), *all* of the zope threads (except the main starter thread) die quietly and come back with new PIDs. It really looks as if it reruns the entire startup sequence. With Z_DEBUG_MODE on I can watch it go through the startup sequence again whenever this happens. But there are no tracebacks. It's just as if somebody clicked restart in the middle of a process.
The one glimmer of hope is in the stupid log file:
2001-11-07T19:30:23 ERROR(200) zdaemon zdaemon: Wed Nov 7 14:30:23 2001: Aiieee! 1925 exited with error code: 11 ...restarting...
Here are the questions:

1. It appears that something is causing those threads to crash (or end), but nothing is getting put in the log file. Is there any way to get the tracebacks I assume are happening or to find out what is going on?
2. Alternatively, is there a way to run zope in single threaded mode? Z_DEBUG_MODE appears to only apply to the main thread because it goes ahead and spawns additional threads. If I use -t 0 I get two processes, but no response from a web browser request. If I use -t 1, I get three processes owned by nobody and the original one by root.
3. Any further ideas on how to debug this thing? Where do I find what error code 11 is?

Thanks for your time and help,
-Chris
------------------------------ Chris Kratz chris.kratz@vistashare.com
_______________________________________________ Zope maillist - Zope@zope.org http://lists.zope.org/mailman/listinfo/zope ** No cross posts or HTML encoding! ** (Related lists - http://lists.zope.org/mailman/listinfo/zope-announce http://lists.zope.org/mailman/listinfo/zope-dev )
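On the question of what "error code 11" means: a minimal sketch, assuming zdaemon is printing the raw wait status of the child process (an assumption, not something the thread confirms). Under that assumption, a status of 11 decodes as "killed by signal 11", which on Linux is SIGSEGV, and that would be consistent with the segfault mentioned later in the thread. The function name describe_wait_status is made up for illustration, signal numbers are platform-dependent, and the sketch is written for a current Python 3 rather than the python 2.1.1 discussed here.

```python
import os
import signal


def describe_wait_status(status: int) -> str:
    """Interpret a raw POSIX wait status (the kind os.waitpid() returns).

    Hypothetical helper: it assumes the number in the log is a raw wait
    status, not an ordinary exit code.
    """
    if os.WIFSIGNALED(status):
        signum = os.WTERMSIG(status)
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown signal"
        return f"killed by signal {signum} ({name})"
    if os.WIFEXITED(status):
        return f"exited normally with code {os.WEXITSTATUS(status)}"
    return f"unrecognized status {status}"


if __name__ == "__main__":
    # 11 is the value reported in the zdaemon log line quoted above.
    print(describe_wait_status(11))
```

A process killed by a signal never gets the chance to write a Python traceback, which would also explain why nothing shows up in the logs.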
If it happens regularly (or better, if you can make it happen) it would be helpful to collect big M log data for a series of failures. Then attempt to find a pattern.

You might find the requestprofiler.py script in "utilities" useful when analyzing big M log data.

----- Original Message -----
From: "Chris Kratz" <chris.kratz@vistashare.com>
To: <zope@zope.org>
Sent: Wednesday, November 07, 2001 3:46 PM
Subject: Re: [Zope] Zope hiccuping
OK, did a ./configure --without-py_malloc with python 2.1.1 and it didn't take care of the problem. Still getting:
2001-11-07T20:30:11 ERROR(200) zdaemon zdaemon: Wed Nov 7 15:30:11 2001: Aiieee! 9400 exited with error code: 11
Yes, it did compile (make clean, configure, make) and the Makefile has the argument --without-py_malloc The start script is pointed to the newly compiled python and I'm still getting the same error.
Aack!
-chris ------------------------------ Chris Kratz chris.kratz@vistashare.com
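For anyone wanting to hunt for a pattern in big M log data by hand, here is a rough sketch that lists requests which got a B line but never an E line. It is not requestprofiler.py and does not try to reproduce it. The field layout (code, request id, timestamp, optional detail) is inferred only from the sample lines posted in this thread, and the names find_unfinished and SAMPLE are made up for illustration.

```python
def find_unfinished(lines):
    """Return {request_id: (timestamp, detail)} for requests that began
    (B line) but never ended (E line).

    Assumes each line looks like: CODE ID TIMESTAMP [DETAIL], as in the
    sample posted in this thread.
    """
    started = {}
    for line in lines:
        parts = line.strip().split(None, 3)
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        code, req_id, stamp = parts[:3]
        detail = parts[3] if len(parts) > 3 else ""
        if code == "B":
            started[req_id] = (stamp, detail)
        elif code == "E":
            # Request finished; forget it (ids may be reused later).
            started.pop(req_id, None)
    return started


# Sample entries copied from the original post.
SAMPLE = """\
B 145697132 2001-11-07T17:15:38 GET /OutcomeTracker/Dev_News
I 145697132 2001-11-07T17:15:38 0
A 145697132 2001-11-07T17:15:39 200 32155
E 145697132 2001-11-07T17:15:39
B 145934020 2001-11-07T17:15:40 GET /OutcomeTracker/PeopleOrganizations/index_html
I 145934020 2001-11-07T17:15:40 0
B 135053764 2001-11-07T17:15:56 POST /OutcomeTracker/Activities/index_html
I 135053764 2001-11-07T17:15:56 2831
B 146464388 2001-11-07T17:15:57 GET /OutcomeTracker/PeopleOrganizations/index_html
I 146464388 2001-11-07T17:15:57 0
A 146464388 2001-11-07T17:16:03 200 31200
E 146464388 2001-11-07T17:16:03
""".splitlines()

if __name__ == "__main__":
    for req_id, (stamp, detail) in find_unfinished(SAMPLE).items():
        print(req_id, stamp, detail)
```

Run over a longer log, the unfinished entries give a list of URLs and times to compare across failures.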
Hmmm, I included the big M logging in my first email with two crashes. Perhaps there is a way to get additional information? The big M logging only shows that two requests came in and never left the server. The stupid log file says that we errored out with error code 11. It's very easy to find these by cross referencing the postgres log file ('pq_recvbuf: unexpected EOF on client connection') with the big M log file and the stupid log file.

I think what brought this on was our switching to psycopg from PoPy. We have switched back to PoPy and the problem has cleared up though we have our original problem of server hangs (db threads in UPDATE WAITING mode).

It's frustrating not to have a db adaptor that doesn't have problems with postgres. PoPy works great except for the server stalling issue. I am still optimistically hoping that Psycopg will eventually be our da of choice. But, there were enough issues that we went back to PoPy. Better the beast you know...

Any other thoughts?

-Chris
------------------------------
Chris Kratz
chris.kratz@vistashare.com

----- Original Message -----
From: "Chris McDonough" <chrism@zope.com>
To: "Chris Kratz" <chris.kratz@vistashare.com>; <zope@zope.org>
Sent: Wednesday, November 07, 2001 4:47 PM
Subject: Re: [Zope] Zope hiccuping
If it happens regularly (or better, if you can make it happen) it would be helpful to collect big M log data for a series of failures. Then attempt to find a pattern.
You might find the requestprofiler.py script in "utilities" useful when analyzing big M log data.
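The cross-referencing described above can be partly automated. As a rough sketch, the following pulls the crash events out of the stupid log file so their times can be lined up against the other logs. The regular expression is built only from the two zdaemon lines quoted in this thread, so other variants of the message may not match. The names CRASH_RE and crash_events are made up for illustration. Note also that in both quoted lines the leading timestamp is five hours ahead of the one embedded in the message, so the time zones need to be normalized before comparing against the postgres or big M logs.

```python
import re
from datetime import datetime

# Pattern derived from the zdaemon lines quoted in this thread (an assumption
# about the general format, not a documented one).
CRASH_RE = re.compile(
    r"^(?P<stamp>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}) ERROR\(\d+\) zdaemon .*?"
    r"Aiieee! (?P<pid>\d+) exited with error code: (?P<code>\d+)"
)


def crash_events(lines):
    """Return a list of (timestamp, pid, code) tuples, one per crash line."""
    events = []
    for line in lines:
        match = CRASH_RE.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group("stamp"), "%Y-%m-%dT%H:%M:%S")
            events.append((stamp, int(match.group("pid")), int(match.group("code"))))
    return events


if __name__ == "__main__":
    sample = [
        "2001-11-07T19:30:23 ERROR(200) zdaemon zdaemon: Wed Nov 7 14:30:23 2001: "
        "Aiieee! 1925 exited with error code: 11",
        "2001-11-07T20:30:11 ERROR(200) zdaemon zdaemon: Wed Nov 7 15:30:11 2001: "
        "Aiieee! 9400 exited with error code: 11",
    ]
    for stamp, pid, code in crash_events(sample):
        print(stamp.isoformat(), pid, code)
```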
Chris, indeed, psycopg is the problem, and you can cure this (as we did) by upgrading to latest test release. You can find it here: http://initd.org/software/psycopg/psycopg-1.0pre2.tar.gz

There was a nasty segfault bug in psycopg caused by running under python2.1. Since upgrading, our sites have been rock solid.

HTH,
Adam

On Wed, 2001-11-07 at 17:34, Chris Kratz wrote:
I think what brought this on was our switching to psycopg from PoPy.

--
"There's never enough time to do | M. Adam Kendall
all the nothing you want." | mak@kha0s.org
-Bill Watterson (Calvin and Hobbes) | http://www.zopelabs.com
Thanks, I'll check it out.

------------------------------
Chris Kratz
chris.kratz@vistashare.com

----- Original Message -----
From: "M. Adam Kendall" <mak@kha0s.org>
To: "Chris Kratz" <chris.kratz@vistashare.com>
Cc: "Chris McDonough" <chrism@zope.com>; "Zope List" <zope@zope.org>
Sent: Wednesday, November 07, 2001 5:45 PM
Subject: Re: [Zope] Zope hiccuping
Chris, indeed, psycopg is the problem, and you can cure this (as we did) by upgrading to latest test release. You can find it here: http://initd.org/software/psycopg/psycopg-1.0pre2.tar.gz There was a nasty segfault bug in psycopg caused by running under python2.1. Since upgrading, our sites have been rock solid. HTH, Adam
On Wed, 2001-11-07 at 17:34, Chris Kratz wrote:
I think what brought this on was our switching to psycopg from PoPy. -- "There's never enough time to do | M. Adam Kendall all the nothing you want." | mak@kha0s.org -Bill Watterson (Calvin and Hobbes) | http://www.zopelabs.com
did) by upgrading to latest test release. You can find it here: http://initd.org/software/psycopg/psycopg-1.0pre2.tar.gz

Go for pre3. There was an autocommit bug that was fixed in that version.

Paz

-----Original Message-----
From: zope-admin@zope.org [mailto:zope-admin@zope.org] On Behalf Of Chris Kratz
Sent: Wednesday, November 07, 2001 11:56 PM
To: M. Adam Kendall
Cc: Chris McDonough; Zope List
Subject: Re: [Zope] Zope hiccuping

Thanks, I'll check it out.

------------------------------
Chris Kratz
chris.kratz@vistashare.com
participants (4)
- Chris Kratz
- Chris McDonough
- M. Adam Kendall
- Paul Zwarts