Our development Zope 2.5.0 crashes frequently
Hi all, any thoughts? Has anyone had this experience? (And would you like to?)

Our Zope development system is used by a small team of developers, up to 6 people. It has been crashing and restarting automatically 6 or 7 times a day, much to the annoyance of our programmers.

I use both STUPID_LOG_FILE and the "-M" detailed log file to record everything I can, and have grabbed requestprofiler.py, which is a very useful tool (thanks!). (As an aside, is there any other logging I can do, short of coding something?)

All our developers work through the Zope "manage" interface, as one would, and go directly through port 8080 (Z HTTP). One suggestion I have made is to access it via the FastCGI interface instead, just to see what impact that has.

The detailed log shows that, for no apparent reason, requests such as GET /manage, /manage_main, or /some-product/subdir/sub/something/manage_workspace trigger the failure, which results in:

zdaemon... Aiieee! 13215 exited with error code: 138 (or 139)

I asked you all last week what this meant, and Jens & Dieter kindly told me that descriptions are to be found in the OS's lib/errno.h file. Now, we use Solaris and/or OpenBSD, and this file helpfully says this number (in a range) is used by Xenix, and is thus not documented. [Footnote: I keep all emails, but somehow I managed to delete this single one from Jens - damn!]

Thanks in advance, if you can shed light on this....

-------------------------------
Graham King
Host Services Team Leader
OzEmail Internet
+61 2 9433 2747
graham.king@team.ozemail.com.au
www.ozemail.com.au
-------------------------------
Graham King writes:
zdaemon... Aiieee! 13215 exited with error code: 138 (or 139)

I had asked you all last week, about what this above meant, and Jens & Dieter kindly told me descriptions are to be found in the OS's lib/errno.h file. Now, we use Solaris and/or openBSD, and this file helpfully says this number (in a range) is used by Xenix, and is thus not documented. [Footnote: I keep all emails, but somehow I managed to delete this single one from Jens - damn!]

You are looking in the wrong place! "errno.h" is irrelevant for you.

Exit code 138 means "signal 10 and core dump"; exit code 139 means "signal 11 (SIGSEGV) and core dump". You can try "kill -l" in bash or csh to find out what signal 10 means on your system. Under Linux, it is SIGUSR1, but this should not cause a core dump.

Dieter
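
The decoding above can be sketched in a few lines of Python. This is only an illustration, and it assumes that the number zdaemon prints is the raw wait() status in the traditional Unix layout (low 7 bits hold the signal number, bit 0x80 is the core-dump flag). The helper names decode_wait_status and signal_name are made up for this example and are not part of Zope or zdaemon. Signal names differ between platforms, so the lookup has to run on the machine that produced the crash; as far as I know, signal 10 is SIGBUS on Solaris and the BSDs.

```python
import signal


def decode_wait_status(status):
    """Split a raw wait() status for a process killed by a signal.

    Assumes the traditional Unix layout: the low 7 bits carry the
    signal number and bit 0x80 is set when a core file was written.
    Returns (signal_number, core_dumped).
    """
    signum = status & 0x7F
    core_dumped = bool(status & 0x80)
    return signum, core_dumped


def signal_name(signum):
    """Return this platform's name for signum, or 'unknown'."""
    try:
        return signal.Signals(signum).name
    except ValueError:
        return "unknown"


# The two statuses reported in the zdaemon log line.
for status in (138, 139):
    signum, core = decode_wait_status(status)
    print(status, "-> signal", signum, signal_name(signum), "core dumped:", core)
```

Both statuses have the core-dump bit set, so if core files are enabled on the host there should be one to examine with a debugger.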