I'm in the process of moving a site from a (much) earlier release of Zope, up to 2.5.whatever. In making this migration, I have encountered a problem with segmentation faults and spontaneous Zope restarts that sounds an awful lot like the problems some BSD users were encountering earlier.
The principal difference between my case and theirs is that they're on BSD, where the thread stack size is by default itty-bitty (64k), and I'm working on Linux, where the thread stack size is by default really huge (2MB). The solution previously recommended was to boost the thread stack size to 128k; Linux already provides far more than that. It's also a lot harder to boost the stack size under Linux; if there's a way to do it short of recompiling glibc, I don't know what it is yet, and I've spent a lot more time than I should trying to find one.
Now, the symptoms I'm seeing pretty obviously have more to do with the signal handler for segfaults than any particular manifestation of segfault, but early triage attempts pointed at this thread stack problem. So, because I can't find hard diagnostic details in the mailing list archives, I'm hoping that someone might be able to tell me what the best indicator would be, or how to go about demonstrating that it isn't this problem.
Thanks, --G.
-- Geoff Gerrietts <geoff at gerrietts net> "I have read your book and much like it." --Moses Hadas
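For readers on a current Python, a minimal sketch of how the stack settings discussed above can be inspected and adjusted from Python itself. This assumes a modern Python 3 on a Unix-like system: threading.stack_size() is a later addition that is not available in the Python 2.1.3 under discussion, and the helper names report_stack_settings and run_with_stack are made up for illustration.

```python
import resource
import threading


def report_stack_settings():
    """Return the process stack rlimit and the stack size used for new threads."""
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    # threading.stack_size() returns 0 when the platform default is in use.
    return {
        "rlimit_soft": soft,
        "rlimit_hard": hard,
        "thread_stack": threading.stack_size(),
    }


def run_with_stack(size_bytes, func):
    """Run func in a new thread created with the requested stack size."""
    # The new size applies only to threads created after this call;
    # the call returns the previously configured size.
    previous = threading.stack_size(size_bytes)
    result = []
    try:
        worker = threading.Thread(target=lambda: result.append(func()))
        worker.start()
        worker.join()
    finally:
        # Restore the earlier setting so other threads are unaffected.
        threading.stack_size(previous)
    return result[0]


if __name__ == "__main__":
    print(report_stack_settings())
    print(run_with_stack(256 * 1024, lambda: "ran with a 256 KiB stack"))
```

If a workload that crashes with the default setting survives when run through something like run_with_stack with a larger size, that would point toward stack exhaustion; if it crashes either way, the stack size is probably not the culprit.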
On Wed, 24 Apr 2002, Geoff Gerrietts wrote:
I'm in the process of moving a site from a (much) earlier release of Zope, up to 2.5.whatever. In making this migration, I have encountered a problem with segmentation faults and spontaneous Zope restarts that sounds an awful lot like the problems some BSD users were encountering earlier.
Install Python 2.1.3 and upgrade to Zope 2.5.1; the problem will most likely go away. Python 2.1.3 fixes a serious bug that crashes Zope, and 2.5.1 has a LOT of fixes.
Geoff Gerrietts wrote:
I'm in the process of moving a site from a (much) earlier release of Zope, up to 2.5.whatever. In making this migration, I have encountered a problem with segmentation faults and spontaneous Zope restarts that sounds an awful lot like the problems some BSD users were encountering earlier.
Please ensure you're using Python 2.1.3 and Zope 2.5.1 to ensure you're not just experiencing some common crash bugs which have recently been fixed...
cheers, Chris
Quoting Chris Withers (chrisw@nipltd.com):
Geoff Gerrietts wrote:
I'm in the process of moving a site from a (much) earlier release of Zope, up to 2.5.whatever. In making this migration, I have encountered a problem with segmentation faults and spontaneous Zope restarts that sounds an awful lot like the problems some BSD users were encountering earlier.
Please ensure you're using Python 2.1.3 and Zope 2.5.1 to ensure you're not just experiencing some common crash bugs which have recently been fixed...
I am using 2.1.3 and 2.5.1. The problem also shows up on Zope 2.4.4.
Our Data.fs is huge (180MB) and our site is pretty complicated, so it's conceivable that we may still be causing a thread stack problem (though I judge it unlikely). That's why I'm asking after diagnostic clues, not solutions. I would like to rule out thread stack problems as early as possible, and start focussing on other issues, if in fact I can.
Thanks, --G.
-- Geoff Gerrietts <geoff at gerrietts net> "A man can't be too careful in the choice of his enemies." --Oscar Wilde
I think there was an issue with the stacksize under BSD. Matt Kromer should know about it.
Andreas
----- Original Message -----
From: "Geoff Gerrietts" <geoff@gerrietts.net>
To: <zope@zope.org>
Sent: Thursday, April 25, 2002 14:58
Subject: Re: [Zope] Zope 2.5.0, thread stack issues
Quoting Chris Withers (chrisw@nipltd.com):
Geoff Gerrietts wrote:
I'm in the process of moving a site from a (much) earlier release of Zope, up to 2.5.whatever. In making this migration, I have encountered a problem with segmentation faults and spontaneous Zope restarts that sounds an awful lot like the problems some BSD users were encountering earlier.
Please ensure you're using Python 2.1.3 and Zope 2.5.1 to ensure you're not just experiencing some common crash bugs which have recently been fixed...
I am using 2.1.3 and 2.5.1. The problem also shows up on Zope 2.4.4.
Our Data.fs is huge (180MB) and our site is pretty complicated, so it's conceivable that we may still be causing a thread stack problem (though I judge it unlikely). That's why I'm asking after diagnostic clues, not solutions. I would like to rule out thread stack problems as early as possible, and start focussing on other issues, if in fact I can.
Thanks, --G.
-- Geoff Gerrietts <geoff at gerrietts net> "A man can't be too careful in the choice of his enemies." --Oscar Wilde
_______________________________________________ Zope maillist - Zope@zope.org http://lists.zope.org/mailman/listinfo/zope ** No cross posts or HTML encoding! ** (Related lists - http://lists.zope.org/mailman/listinfo/zope-announce http://lists.zope.org/mailman/listinfo/zope-dev )
if you have not hacked your python 2.1.3 yet to allocate a bigger thread stack upon thread creation then i would say that's where your problem is.
jens
On Thursday, April 25, 2002, at 02:58, Geoff Gerrietts wrote:
Quoting Chris Withers (chrisw@nipltd.com):
Geoff Gerrietts wrote:
I'm in the process of moving a site from a (much) earlier release of Zope, up to 2.5.whatever. In making this migration, I have encountered a problem with segmentation faults and spontaneous Zope restarts that sounds an awful lot like the problems some BSD users were encountering earlier.
Please ensure you're using Python 2.1.3 and Zope 2.5.1 to ensure you're not just experiencing some common crash bugs which have recently been fixed...
I am using 2.1.3 and 2.5.1. The problem also shows up on Zope 2.4.4.
Our Data.fs is huge (180MB) and our site is pretty complicated, so it's conceivable that we may still be causing a thread stack problem (though I judge it unlikely). That's why I'm asking after diagnostic clues, not solutions. I would like to rule out thread stack problems as early as possible, and start focussing on other issues, if in fact I can.
Thanks, --G.
On Thu, 25 Apr 2002, Geoff Gerrietts wrote:
Quoting Chris Withers (chrisw@nipltd.com):
Geoff Gerrietts wrote:
I'm in the process of moving a site from a (much) earlier release of Zope, up to 2.5.whatever. In making this migration, I have encountered a problem with segmentation faults and spontaneous Zope restarts that sounds an awful lot like the problems some BSD users were encountering earlier.
Please ensure you're using Python 2.1.3 and Zope 2.5.1 to ensure you're not just experiencing some common crash bugs which have recently been fixed...
I am using 2.1.3 and 2.5.1. The problem also shows up on Zope 2.4.4.
Our Data.fs is huge (180MB) and our site is pretty complicated, so
You can rule out the Data.fs size also; I have lots that are in the 500M to 1G range with no problems.
I think we could do with some more information. If you start zope with STUPID_LOG_FILE="somepath/somefile" and then follow that file do you see any errors before the crash?
What OS and version? How much memory does the system have? How fast is the cpu and what kind?
Quoting Jens Vagelpohl (jens@zope.com):
if you have not hacked your python 2.1.3 yet to allocate a bigger thread stack upon thread creation then i would say that's where your problem is.
Again, I'm on Linux, not FreeBSD, and under Linux the default thread stack size is 2MB, not 64kB. I don't believe there's a way to raise that, short of recompiling glibc; pthread_attr_setstacksize doesn't work for values larger than 2MB. It's possible that there's a kernel limit that needs to be retuned; I find very little documentation on this. That is, if I need to hack my python to allot more memory for thread stacks, something is REALLY foobar.
Quoting kosh@aesaeion.com (kosh@aesaeion.com):
You can rule out the Data.fs size also; I have lots that are in the 500M to 1G range with no problems.
I meant to point out the size as an indicator of complexity, though I can see that it's not a good indicator. :)
I think we could do with some more information. If you start zope with STUPID_LOG_FILE="somepath/somefile" and then follow that file do you see any errors before the crash?
I wasn't getting any under RH6.2, but (curiously enough), moving to RH7.2 has changed the symptoms. Now it hangs, then throws a SystemExit after a long while. This is looking more familiar (but no less solve-able). The system exit shows up in the controlling console, not in STUPID_LOG_FILE. The message looks like:
error! exceptions.SystemExit Terminated
Zope continues to run after this, but appears to have lost its brains.
What OS and version?
Was RedHat Linux 6.2. Moved to RedHat Linux 7.2.
How much memory does the system have?
512MB RAM
How fast is the cpu and what kind?
Intel Pentium III clocking at 497.438 MHz (according to /proc/cpuinfo).
Thanks, --G.
-- Geoff Gerrietts <geoff at gerrietts net> "I don't think it's immoral to want to make money." --Guido van Rossum
On Thu, 25 Apr 2002 13:19:07 -0600 (MDT) kosh@aesaeion.com wrote:
On Thu, 25 Apr 2002, Geoff Gerrietts wrote:
Quoting Chris Withers (chrisw@nipltd.com):
Geoff Gerrietts wrote:
I'm in the process of moving a site from a (much) earlier release of Zope, up to 2.5.whatever. In making this migration, I have encountered a problem with segmentation faults and spontaneous Zope restarts that sounds an awful lot like the problems some BSD users were encountering earlier.
Please ensure you're using Python 2.1.3 and Zope 2.5.1 to ensure you're not just experiencing some common crash bugs which have recently been fixed...
I am using 2.1.3 and 2.5.1. The problem also shows up on Zope 2.4.4.
Our Data.fs is huge (180MB) and our site is pretty complicated, so
You can rule out the Data.fs size also; I have lots that are in the 500M to 1G range with no problems.
I think we could do with some more information. If you start zope with STUPID_LOG_FILE="somepath/somefile" and then follow that file do you see any errors before the crash?
What OS and version? How much memory does the system have? How fast is the cpu and what kind?
I have the same problem with zope-2.5.0 and with zope-2.5.1b2 on AIX4 with python-2.1.3. I have started Zope with STUPID_LOG_FILE, but the only message that I get when Zope crashes is:
2002-04-26T07:35:38 ERROR(200) zdaemon zdaemon: Fri Apr 26 09:35:38 2002: Aiieee! 21886 exited with error code: 139
I don't know what is wrong. The install ran without any error messages.
jan
"the fish is mute, the fish doesn't think, the fish knows everything"
Jan Idzikowski writes:
On Thu, 25 Apr 2002 13:19:07 -0600 (MDT) ... i have the same problem with zope-2.5.0 and with zope-2.5.1b2 on AIX4 with python-2.1.3 and i have started zope with STUPID_LOG_FILE but the only message that i get when zope crashed is:
2002-04-26T07:35:38 ERROR(200) zdaemon zdaemon: Fri Apr 26 09:35:38 2002: Aiieee! 21886 exited with error code: 139
This is a SIGNAL 11, Segmentation Violation. Probably some form of memory corruption.
Dieter
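A note on how 139 maps to signal 11. The number decodes to signal 11 under either of two common conventions: a raw wait() status (signal in the low 7 bits, core-dump flag in bit 7) or the shell convention of 128 plus the signal number. The sketch below uses modern Python; the helper names are made up for illustration, and which convention zdaemon actually used for its "error code" is an assumption not confirmed in this thread.

```python
import signal


def describe_exit_code(code):
    """Decode a reported exit code under two common conventions.

    Returns (raw_signal, core_dumped, shell_signal).
    """
    raw_signal = code & 0x7F            # raw wait() status: low 7 bits hold the signal
    core_dumped = bool(code & 0x80)     # raw wait() status: bit 7 flags a core dump
    shell_signal = code - 128 if code > 128 else 0  # shell convention: 128 + signal
    return raw_signal, core_dumped, shell_signal


def signal_name(signum):
    """Map a signal number to its symbolic name on this platform."""
    try:
        return signal.Signals(signum).name
    except ValueError:
        return "unknown"


if __name__ == "__main__":
    raw, core, shell = describe_exit_code(139)
    print(raw, core, shell, signal_name(raw))
```

Both readings give 11 for the code in the log line, so the conclusion of a segmentation violation does not depend on which convention was in use.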
Jan Idzikowski <idzikowski@atsolute.com> wrote:
2002-04-26T07:35:38 ERROR(200) zdaemon zdaemon: Fri Apr 26 09:35:38 2002: Aiieee! 21886 exited with error code: 139
Segmentation fault. Have you ruled out memory problems? memtest86?
Florent
-- Florent Guillaume, Nuxeo (Paris, France) +33 1 40 33 79 87 http://nuxeo.com mailto:fg@nuxeo.com
On Mon, 29 Apr 2002 17:52:40 +0000 (UTC) Florent Guillaume <fg@nuxeo.com> wrote:
Jan Idzikowski <idzikowski@atsolute.com> wrote:
2002-04-26T07:35:38 ERROR(200) zdaemon zdaemon: Fri Apr 26 09:35:38 2002: Aiieee! 21886 exited with error code: 139
Segmentation fault. Have you ruled out memory problems? memtest86?
Florent
No, this wasn't the memory; it was the wrong compiler. I have to compile Python with the IBM cc_r (thread-safe) compiler and the option -qmaxmem=4000 or greater. Now Zope is running and I'm happy. I think I will write a new install howto for Zope on AIX.
thanks all
jan idzikowski@atsolute.com
Quoting kosh@aesaeion.com (kosh@aesaeion.com):
You can rule out the Data.fs size also; I have lots that are in the 500M to 1G range with no problems.
I think we could do with some more information. If you start zope with STUPID_LOG_FILE="somepath/somefile" and then follow that file do you see any errors before the crash?
What OS and version? How much memory does the system have? How fast is the cpu and what kind?
Under the latest incarnation, I get a system hang so brutal that ps quits responding and the system hasta be hard-booted. Yum. :)
I've impressed our sysadmin. ;) I'll let everyone know what I find out....
--G.
-- Geoff Gerrietts <geoff at gerrietts net> "There is no fate that cannot be surmounted by scorn." --Albert Camus
Quoting Geoff Gerrietts (geoff@gerrietts.net):
Under the latest incarnation, I get a system hang so brutal that ps quits responding and the system hasta be hard-booted. Yum. :)
I've impressed our sysadmin. ;) I'll let everyone know what I find out....
In keeping with my promise, an update.
The application in question relies on ILU for communication with the backend services (including authentication). The problem I was experiencing was a consequence of a bug in ILU, solved by our original team but not documented in any way. In other words, it was, as many suspected, a problem in a third-party extension. The 2MB stack space reserved by Linux is in fact plenty for any use case I've come across.
That it did not appear in independent testing of the third-party extension was an artifact of the test design. When used for authentication, the application writers used the buggy feature in such a way as to exercise the bug, and this feature was used elsewhere only in a relatively innocuous fashion. To the uninitiated (me), it wasn't even obvious that there was a difference.
As a side note, debugging threaded applications under Linux is somewhere between totally useless and extremely annoying. The core that gets dumped is almost never from the thread that raised the signal (if in fact it ever is). If you can force things into a single thread and make them break, you'll be much happier for the effort.
Thanks, everyone, for all your insight and assistance.
--G.
-- Geoff Gerrietts <geoff at gerrietts net> www.gerrietts.net/geoff/ "Don't get suckered in by the comments--they can be terribly misleading. Debug only code." --Dave Storer
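On the side note about finding the thread that actually faulted: on a current Python the standard faulthandler module can write a Python-level traceback for every thread when a fatal signal such as SIGSEGV arrives. This is a sketch only. faulthandler arrived in Python 3.3 and is not an option for the Python 2.1.3 setup in this thread, and the helper name enable_crash_tracebacks and the log path are hypothetical.

```python
import faulthandler
import sys


def enable_crash_tracebacks(log_path=None):
    """Dump tracebacks of all threads on SIGSEGV, SIGFPE, SIGABRT, SIGBUS, SIGILL.

    Returns the stream in use; it must stay open for as long as the
    handler is enabled, because faulthandler writes to its file descriptor.
    """
    stream = open(log_path, "w") if log_path else sys.stderr
    faulthandler.enable(file=stream, all_threads=True)
    return stream


if __name__ == "__main__":
    # Hypothetical log file name, chosen for illustration.
    crash_log = enable_crash_tracebacks("crash_tracebacks.log")
    # ... start the threaded application here ...
    faulthandler.disable()
    crash_log.close()
```

This reports Python frames only, so a bug inside a C extension still needs a native debugger, but it does show which thread was executing what at the moment of the crash.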
Hello Geoff, your problem sounds like mine, but I get it on AIX4 with Zope version 2.5. I have also tried it with the newest 2.5.... but with the same problems. I have opened a bug at http://collector.zope.org/Zope/363/view but have had no response.
jan
"the fish is mute, the fish doesn't think, the fish knows everything"
Participants (8): Andreas Jung, Chris Withers, Dieter Maurer, Florent Guillaume, Geoff Gerrietts, Jan Idzikowski, Jens Vagelpohl, kosh@aesaeion.com