[ZCM] [ZC] 2075/10 Comment "zdrun.py is crashing with SEGV"

Collector: Zope Bugs, Features, and Patches ... zope-coders-admin at zope.org
Sun Apr 30 08:08:37 EDT 2006


Issue #2075 Update (Comment) "zdrun.py is crashing with SEGV"
 Status Pending, Zope/bug medium
To followup, visit:
  http://www.zope.org/Collectors/Zope/2075

==============================================================
= Comment - Entry #10 by ctheune on Apr 30, 2006 8:08 am

Hi. Just a short info: I had a similar problem with segfaults and hangs.

Is it possible that you are starting Zope as root and dropping privileges to some other user? In that case I found that Zope cannot reliably segfault and create a core dump.

If you start it as a normal user, you might still get your segfaults, but Zope itself won't hang; it will restart cleanly.

Then you will also receive core dumps and can check which 3rd party module is responsible. (In my case it turned out to be mysql)
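
One precondition for getting a core dump at all is that the process's core-file size limit is non-zero. The following is a minimal sketch, assuming a Unix system and Python's standard `resource` module; the helper name `enable_core_dumps` is hypothetical and not part of Zope or zdrun.py. It only raises the soft limit to the hard limit. It does not address the root case described above: on Linux, changing the effective UID normally clears the process's "dumpable" flag (see prctl(2), PR_SET_DUMPABLE), which would be consistent with dumps going missing after privileges are dropped.

```python
import resource


def enable_core_dumps():
    """Raise the soft core-file size limit to the hard limit.

    Hypothetical helper for illustration only. Returns the (soft, hard)
    pair in effect afterwards. If the hard limit is 0, core dumps stay
    disabled and only a privileged process can raise it.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    if soft != hard:
        # Raising the soft limit up to the hard limit never needs privileges.
        resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)
```

Calling `enable_core_dumps()` early in the process that actually runs Zope (not just in the login shell) shows whether the limit is inherited as expected; a returned hard limit of 0 means no core file will be written regardless of the soft setting.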

________________________________________
= Comment - Entry #9 by oddjob on Apr 26, 2006 5:39 pm

OK (with the entire toolchain, gcc/glib/python etc., rebuilt from the most recent source; bear in mind this had previously been running with 6+ months of uptime and no changes ...)

1. I think I need some advice on how to get a core dump. I don't seem to be able to get any symbols from a running instance, and it's not generating a core dump file when it dies, even though ulimit -c is set high.

2. Here's where it goes in the trace;

B 1121480076 2006-04-26T22:09:15 GET /VirtualHostBase/http/linux.co.uk:80/plone/linux/VirtualHostRoot/
I 1121480076 2006-04-26T22:09:15 0
A 1121480076 2006-04-26T22:09:15 200 35536
E 1121480076 2006-04-26T22:09:15

B 1121424908 2006-04-26T22:09:36 GET /VirtualHostBase/http/linux.co.uk:80/plone/linux/VirtualHostRoot/ldp/linuxfocus/Portugues/July2002/article239.shtml
I 1121424908 2006-04-26T22:09:36 0

---- DIES HERE ---
---- NEXT ENTRY LOOKS TRUNCATED TOO! ----
B 1090202764 2006-04-26T22:31:56 GET /VirtualHostBase/http/linux.co.uk:80/plone/linux/VirtualHostRoot/ldp/HOWTO/Scripting-GUI-TclTk/advanced.html
I 1090202764 2006-04-26T22:31:56 0

B 1090370700 2006-04-26T22:31:56 GET /VirtualHostBase/http/linux.co.uk:80:80/plone/linux/VirtualHostRoot/Pages/projects/plone/IndexesInAnSQLDatabase
I 1090370700 2006-04-26T22:31:56 0
A 1090202764 2006-04-26T22:32:09 200 7782
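
The trace entries above can be scanned mechanically for requests that began but never finished, which are the candidates for the crashing or hanging URL. This is a minimal sketch, assuming the usual reading of the trace codes (B = request begins, I = input received, A = response output begins, E = request ends) and assuming request IDs are not reused within the span being scanned; the function name `unfinished_requests` is hypothetical.

```python
def unfinished_requests(lines):
    """Return {request_id: request_text} for B entries with no matching E.

    `lines` is any iterable of trace-log lines in the form shown above,
    e.g. an open file. Lines that are not trace entries (blank lines,
    hand-written annotations) are ignored.
    """
    open_requests = {}
    for line in lines:
        # Split into: code, request id, timestamp, and the remainder.
        parts = line.split(None, 3)
        if len(parts) < 3:
            continue
        code, request_id = parts[0], parts[1]
        if code == "B":
            # Remember the method and path so the URL can be reported.
            open_requests[request_id] = parts[3].strip() if len(parts) > 3 else ""
        elif code == "E":
            open_requests.pop(request_id, None)
    return open_requests
```

Run over a trace log that spans a crash, the remaining entries are the requests that were in flight when the process died. Requests still legitimately in progress at the end of the file will also appear, so the result is a shortlist rather than a verdict.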

________________________________________
= Comment - Entry #8 by tseaver on Apr 26, 2006 10:21 am

While you are waiting, please enable the 'trace log' in your
zope.conf;  that should provide more information about the URLs
which hang / crash.
________________________________________
= Comment - Entry #7 by oddjob on Apr 26, 2006 10:05 am

Actions::
 * Rebuilt entire toolchain with more recent versions
 * Rebuilt python
 * Turned on core dumps
 * Lifted firewall

Next action :: 
 * Waiting for it to crash so I can acquire a core dump
________________________________________
= Comment - Entry #6 by tseaver on Apr 26, 2006 8:43 am

We need more information about the code which is actually executing
when the crash happens.  Can you provide a URL which reliably
triggers the crash when run against Zope on localhost:8080?

If not, then this is likely a problem related to a third-party
add-on (Plone is such an add-on from Zope's POV).  If you can
reliably trigger a crash against a default Plone install, then
report the bug to the Plone collector.

If a Plone add-on is responsible, then you need to report the
bug to the author of that add-on.
________________________________________
= Comment - Entry #5 by oddjob on Apr 26, 2006 5:47 am

Just to clarify, I've got 2 servers being hit: one on Plone 2.1.0 / Zope 2.7.6, and one on Plone 2.1.2 whose Zope was initially 2.8.4 but is now 2.8.6. Both roll over at random intervals (0.5 to 8h) if I lift the firewall.
________________________________________
= Comment - Entry #4 by oddjob on Apr 26, 2006 5:45 am

Mmm, from what I can see the problem is being exploited as a DoS attack (intentionally or otherwise), and I've cured the problem for now by firewalling off China and Russia.

If Zope is subject to a DoS attack that will hang it (badly! The child threads don't die, it just stops taking new requests!), you might want to reconsider the importance. Just my 2c.
________________________________________
= Comment - Entry #3 by ajung on Apr 26, 2006 12:26 am

This does not seem to be a general problem with Zope. I recommend
asking on the Zope mailing list first. Remark: an strace listing is not helpful. gdb would be the tool of choice to track down this problem.


________________________________________
= Edit - Entry #2 by ajung on Apr 26, 2006 12:25 am

 Changes: submitter email, importance (critical => medium)
________________________________________
= Request - Entry #1 by oddjob on Apr 25, 2006 8:56 pm

This has been happening for a couple of days at a time. It seems to be traffic related, with no apparent pattern in the log files. Firewalling off certain Internet address ranges seems to help.

 * site drives a public web site
 * if it matters, the top level application is "Plone"
 * system is running other applications, nothing else seems to be affected
 * zope runs behind apache2

I attached strace to it and this is what happens when it crashes;

 accept(3, {sa_family=AF_INET, sin_port=htons(39771), sin_addr=inet_addr("127.0.0.1")}, [16]) = 21
 fcntl64(21, F_GETFL)                    = 0x2 (flags O_RDWR)
 fcntl64(21, F_SETFL, O_RDWR|O_NONBLOCK) = 0
 fcntl64(21, F_GETFL)                    = 0x802 (flags O_RDWR|O_NONBLOCK)
 fcntl64(21, F_SETFL, O_RDWR|O_NONBLOCK) = 0
 getpeername(21, {sa_family=AF_INET, sin_port=htons(39771), sin_addr=inet_addr("127.0.0.1")}, [16]) = 0
 gettimeofday({1145987756, 502686}, NULL) = 0
 fcntl64(21, F_SETFD, FD_CLOEXEC)        = 0
 select(22, [3 8 13 17 19 20 21], [], [], {30, 0}) = 1 (in [21], left {30, 0})
 recv(21, "GET /VirtualHostBase/http/flashl"..., 4096, 0) = 638
 gettimeofday({1145987756, 503343}, NULL) = 0
 kill(15272, SIGRTMIN)                   = 0
 kill(15272, SIGRTMIN)                   = 0
 select(22, [3 8 13 17 19 20 21], [], [], {30, 0}) = ? ERESTARTNOHAND (To be restarted)
 --- SIGSEGV (Segmentation fault) @ 0 (0) ---

==============================================================


