Well, I just moved my Zope to a Solaris box rebuilt with gcc. I'm now able to CONSISTENTLY get this thing to crash. What I do is point a couple of wgets at the box to recursively snarf up pages. This doesn't seem to cause any problems (yet). Then, with my browser, I try to load some pages. Usually, within four or five attempts at loading pages, BAMM: Zope starts eating up CPU cycles and within two minutes it crashes. This is what the debugger tells me:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 7 (LWP 3)]
Program received signal SIGSEGV, Segmentation fault.
0xef51d2b0 in ExtensionClass_FindInstanceAttribute (inst=0x1c6ff50,
    oname=0x3e6690, name=0x3e66a4 "_View_Permission")
    at ./../Components/ExtensionClass/ExtensionClass.c:1603
1603        if (! name) return NULL;
(gdb) where
#0  0xef51d2b0 in ExtensionClass_FindInstanceAttribute (inst=0x1c6ff50,
    oname=0x3e6690, name=0x3e66a4 "_View_Permission")
    at ./../Components/ExtensionClass/ExtensionClass.c:1603
Cannot access memory at address 0xeee05fac.
(gdb) info threads
* 14 Thread 7 (LWP 3)  0xef51d2b0 in ExtensionClass_FindInstanceAttribute (
         inst=0x1c6ff50, oname=0x3e6690, name=0x3e66a4 "_View_Permission")
         at ./../Components/ExtensionClass/ExtensionClass.c:1603
  13 Thread 6           0xef5b9788 in _lwp_sema_wait ()
  12 Thread 5 (LWP 0)   0xef737ae4 in _swtch ()
  11 Thread 4 (LWP 0)   0xef737ae4 in _swtch ()
  10 Thread 3           0xef737c14 in _swtch ()
   9 Thread 2 (LWP 2)   0xef5b98d0 in __signotifywait ()
   8 Thread 1 (LWP 1)   0xef5b7400 in poll ()
   7 LWP 7              0xef5b699c in door_restart ()
   6 LWP 6              0xef5b9788 in _lwp_sema_wait ()
   5 LWP 5              0xef5b9788 in _lwp_sema_wait ()
   4 LWP 4              0xef5b9788 in _lwp_sema_wait ()
   3 LWP 3              0xef51d2b0 in ExtensionClass_FindInstanceAttribute (
         inst=0x1c6ff50, oname=0x3e6690, name=0x3e66a4 "_View_Permission")
         at ./../Components/ExtensionClass/ExtensionClass.c:1603
   2 LWP 2              0xef5b98d0 in __signotifywait ()
   1 LWP 1              0xef5b7400 in poll ()
(gdb)

Anybody think this is related to the Linux problem?

-Jon

"Dr. Ross Lazarus" <rossl@med.usyd.edu.au> writes:
Made no difference here either.
In desperation, I moved the server behind Apache using FastCGI. The same problem persists: random crashes with "aaiiieee 11" and "256" in the debug log, often with core dumps.
May be time to move away from the Lintel box (stock Red Hat 6.1, 2.16 source install, PII 350, 256 MB RAM) - I don't see this on a Sun box I'm also running.
Does this happen with binary installs? Is it just Red Hat 6.1 (we didn't see this on 5.2!)? One problem is that the zmonitor usually works - the server restarts itself, and users may only notice a long delay. I see it because I'm watching the logs anxiously.
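For what it's worth, the "11" and "256" in the debug log look like raw wait(2) statuses: 11 would mean the child was killed by signal 11 (SIGSEGV, matching the core dumps), and 256 is the 16-bit encoding of a normal exit with code 1 (1 << 8). A minimal sketch of decoding them, assuming the log is printing the raw status word from waitpid():

```python
# Decode raw wait(2) statuses like the "11" and "256" seen in the debug log.
# Assumption: the supervisor logs the unmodified status word from waitpid().
import os

def describe(status):
    if os.WIFSIGNALED(status):
        # Low bits carry the terminating signal number.
        return "killed by signal %d" % os.WTERMSIG(status)
    if os.WIFEXITED(status):
        # High byte carries the exit code for a normal exit.
        return "exited with code %d" % os.WEXITSTATUS(status)
    return "stopped or unknown"

print(describe(11))    # killed by signal 11 (SIGSEGV)
print(describe(256))   # exited with code 1 (256 == 1 << 8)
```

So both log entries are consistent with the same story: the process segfaults, dumps core, and the restart wrapper sees a nonzero status.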
jon prettyman <jprettyma-@acm.org> wrote:
original article: http://www.egroups.com/group/zope/?start=27180
Setting DEBUG to 1 had no effect on my server. It crashed within 15 minutes of setting it.
-Jon
Pavlos Christoforou <pavlos@gaaros.com> writes:
On Fri, 24 Mar 2000, Michel Pelletier wrote:
As soon as I'm able to collect more info I'll forward it to you. Is there anywhere else I should be posting this information?
The list. Just keep cc'ing me.
Some good news at last ...
When I set DEBUG in asyncore.py to 1 so I could view the lists going into select(), ZServer stabilised and hasn't crashed since. Smells like a race condition: somehow the extra time it takes to print the list contents stabilises things.
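That diagnosis fits the classic pattern: a print in the event loop changes thread timing (and serialises on stdout), which can mask an unsynchronised access without fixing it. A minimal sketch of the general idea, not ZServer's actual code: several threads doing an unprotected read-modify-write on shared state. With the lock the result is deterministic; remove it and the race window reappears, and anything that perturbs timing (like a DEBUG print) can make the bug seem to vanish.

```python
# Sketch of a timing-sensitive race: a shared counter updated via an
# explicit read-modify-write. The lock makes the result deterministic;
# removing it reintroduces a window where updates can be lost, which
# incidental delays (e.g. a debug print) can hide rather than fix.
import threading

lock = threading.Lock()
counter = 0

def worker(iterations):
    global counter
    for _ in range(iterations):
        with lock:              # remove this lock to reintroduce the race
            tmp = counter       # read
            counter = tmp + 1   # modify-write; unprotected, this can clobber
                                # another thread's update

threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 4000 with the lock held
```

The moral for the ZServer case: if the DEBUG print stabilises things, the fix is to find and lock (or serialise) the shared structure the threads are fighting over, not to leave the print in.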
--
Dr Ross Lazarus
Associate Professor and Sub-Dean for Information Technology
Faculty of Medicine, Room 126A, A27, University of Sydney
Camperdown, NSW 2006, Australia
Tel: (+61 2) 93514429  Mobile: +61414872482  Fax: (+61 2) 93516646
Email: rossl@med.usyd.edu.au