[Zope-dev] recipe for trapping SIGSEGV and SIGILL signals on solaris
Joseph Wayne Norton
norton@alum.mit.edu
Tue, 11 Dec 2001 16:20:10 +0900
Hello.
We are seeing Zope restarts on the Solaris 5.6 platform with Zope
2.4.3 and Python 2.1.1. I put together a script based on some
information from an old posting to the Apache mailing list. The
following shell/perl script lets one get a core file from a dying
Zope child process and also lets Zope restart without any side
effects.
The script:
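#!/bin/sh
# Attach truss to every running z2.py process owned by user zfs; on
# SIGSEGV or SIGILL, dump core with gcore and then kill the process.
PATH=$PATH:/usr/local/bin
export PATH
cd /tmp    # gcore writes core.$PID into the current directory
# awk picks out the PID even when ps pads the column with leading blanks;
# the [z] keeps grep from matching its own command line.
for PID in `ps -u zfs -f -o pid,comm,args | grep '[z]2\.py' | awk '{print $1}'`
do
    export PID
    # -t\!all traces no system calls; -S leaves the process stopped on these signals
    truss -f -l -t\!all -S SIGSEGV,SIGILL -p $PID 2>&1 \
    | perl -pe 'system("gcore $ENV{PID} && sleep 5 && kill -9 $ENV{PID}"), exit($ENV{PID}) if /(SIGSEGV|SIGILL)/;' &
done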
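For readers less used to perl one-liners, the filter at the end of the pipeline can be read roughly as the loop below. This is only a sketch, assuming a POSIX shell such as ksh or bash (the old Solaris /bin/sh may not accept `read -r`). The function name `watch_for_crash` and the `DRY_RUN` switch are hypothetical additions for illustration; only the gcore/sleep/kill sequence comes from the script above.

```shell
# Read trace lines on stdin and react to the first one that mentions
# SIGSEGV or SIGILL, the same condition the perl filter tests.
# DRY_RUN=1 (hypothetical switch) prints the commands instead of running them.
watch_for_crash() {
    pid=$1
    while IFS= read -r line
    do
        case $line in
        *SIGSEGV*|*SIGILL*)
            if [ "${DRY_RUN:-0}" = 1 ]
            then
                echo "gcore $pid && sleep 5 && kill -9 $pid"
            else
                # Dump core first, give it time to finish, then kill.
                gcore "$pid" && sleep 5 && kill -9 "$pid"
            fi
            return 0
            ;;
        esac
    done
    # Input ended without either signal being seen.
    return 1
}
```

With DRY_RUN=1 set, feeding it a few lines of text on stdin shows which command would be run for a given pid without touching any process.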
Step 1: Modify the script to match your environment.
Step 2: Run the script.
Step 3: Wait for a core file to be dumped in /tmp.
Step 4: Analyze it with gdb, where $PID is the pid of the dumped process:
bash# gdb /path/to/bin/python /tmp/core.$PID
#0 0xef5b9810 in _lwp_sema_wait ()
(gdb) where
#0 0xef5b9810 in _lwp_sema_wait ()
#1 0xef647ea0 in _park ()
#2 0xef647b84 in _swtch ()
#3 0xef6468a4 in cond_wait ()
#4 0xef6467c8 in _ti_pthread_cond_wait ()
#5 0x50220 in PyThread_acquire_lock (lock=0xd9d878, waitflag=1)
    at Python/thread_pthread.h:313
#6 0x51f18 in lock_PyThread_acquire_lock (self=0xda39b8, args=0x0)
    at ./Modules/threadmodule.c:67
#7 0x35db4 in fast_cfunction (func=0xda39b8, pp_stack=0xed40f828, na=0)
    at Python/ceval.c:2994
#8 0x33ca0 in eval_code2 (co=0x267848, globals=0x51ec4, locals=0x0, args=0x0,
    argcount=0, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0)
    at Python/ceval.c:1951
:
:
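When comparing several core files, it can help to boil each backtrace down to frame numbers and function names. The helper below is a minimal sketch, assuming gdb's `where` output has been saved to a file or is piped in, and that a POSIX awk is available. The name `frame_functions` is hypothetical.

```shell
# Print "#N function" for every frame line of a gdb backtrace on stdin.
# Frame lines look like "#5 0x50220 in PyThread_acquire_lock (...)";
# frames printed without an address ("#1 main () at foo.c:3") are handled too.
frame_functions() {
    awk '$1 ~ /^#[0-9]+$/ {
        if ($3 == "in")
            print $1, $4    # "#N ADDRESS in FUNCTION ..."
        else
            print $1, $2    # "#N FUNCTION ..."
    }'
}
```

Continuation lines such as the "at Python/ceval.c:1951" locations and the "(gdb)" prompt are skipped, so two backtraces can be compared with diff.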
It seems that we are running into trouble with the thread library on
Solaris (unless the truss command has introduced a side effect).
Is anyone else facing similar troubles? Or maybe I should post
this to a Python mailing list.
- joe