Hmmmm.... what do *you* do when Zope is 'stuck'?
Hi Folks,

OK... this is the weirdest thing I've run into yet. Any ideas on how to diagnose this would be most appreciated. It's a little bit like quantum mechanics: when you try to measure it... it changes. ;-)

I have a Data.fs that has somehow gotten into a state where Zope just hangs there. I can attach to the python process with the debugger while it's running and dump the stack trace. It looks like this:

#0 0x281af6c0 in _thread_sys_poll () from /usr/lib/libc_r.so.4
#1 0x281a806f in _thread_kern_sched_state_unlock () from /usr/lib/libc_r.so.4
#2 0x281a775a in _thread_kern_sched () from /usr/lib/libc_r.so.4
#3 0x281a7b7e in _thread_kern_sched_state () from /usr/lib/libc_r.so.4
#4 0x281944c1 in wait4 () from /usr/lib/libc_r.so.4
#5 0x281865c7 in __waitpid () from /usr/lib/libc_r.so.4
#6 0x2816bba2 in waitpid () from /usr/lib/libc_r.so.4
#7 0x80890cd in pcre_exec ()
#8 0x8055069 in PyEval_CallObjectWithKeywords ()
#9 0x8054f73 in PyEval_CallObjectWithKeywords ()
#10 0x8054028 in PyEval_EvalCode ()
#11 0x8053f10 in PyEval_EvalCode ()
#12 0x805210d in PyEval_EvalCode ()
#13 0x80629a3 in PyRun_File ()
#14 0x806295c in PyRun_File ()
#15 0x806293c in PyRun_File ()
#16 0x8062046 in PyRun_SimpleFile ()
#17 0x8061cba in PyRun_AnyFile ()
#18 0x8050935 in Py_Main ()
#19 0x8050420 in main ()
#20 0x80503a5 in _start ()

The problem is somehow related to timing and threads. If I start Zope with only one thread... no hang. If I start Zope with 'PROFILE_PUBLISHER' defined... no hang. I've added some debugger hooks at various places (e.g., ZPublisher, ZServer, etc.) in an attempt to 'step' through the problem, but when I do... it works! Ahhhh!!!

I'm about to add some 'print' statements, but I fear that they will disturb the timing enough to destroy the electron interference patterns... er... I mean mask the problem. Has anyone got any ideas on how to track something like this down?
The only other clue I have is that, at first, whenever I attached to the process it would be in the 'PoPymodule', so I naturally suspected PoPy. However, I've *removed* the ZPoPyDA Product from the Products folder (on the filesystem), so presumably PoPymodule.so should no longer be loaded or accessed. Anyway... PoPymodule is no longer showing up in the stack trace, but Zope is still hanging...

All this started showing up, by the way, not long after we started experimenting with ZPoPyDA. But now that ZPoPyDA is gone... is there any way my Data.fs could have been affected in a way that would cause this behavior?

Thanks for any insight...

-steve
OK... more details... this is definitely a PoPy issue...

I reinstalled ZPoPyDA, ran Zope in single-threaded mode ( ./start -t 1 ) and then deleted my ZPoPyDA database adaptor. Then I created a new one, connected to postgres, and tested it. When I then shut down Zope and restarted it multi-threaded, all was working again. Soo... this leaves me with a real warm fuzzy feeling ;-). I have been moving databases around recently, so I probably futzed something in the process... weird symptoms though...

Anyway... sorry for bothering... still interested in ideas for 'debugging' the problem, should it ever arise again.

-steve
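One low-disturbance way to look at a timing-sensitive hang like the one described above is to have the process report the Python-level stack of every thread, instead of stepping through it in a debugger. The following is only a minimal sketch under stated assumptions: it needs a modern Python 3 interpreter (the Zope of this thread ran on a much older Python without these facilities), and `dump_thread_stacks` is a hypothetical helper name, not part of Zope or PoPy.

```python
import sys
import threading
import traceback


def dump_thread_stacks():
    """Return a formatted Python-level stack trace for every live thread.

    Hypothetical helper for illustration; not a Zope or PoPy API.
    """
    # Map thread identifiers to names so the report is readable.
    names = {t.ident: t.name for t in threading.enumerate()}
    chunks = []
    # sys._current_frames() maps each thread id to its topmost frame.
    for ident, frame in sys._current_frames().items():
        chunks.append("Thread %s (%s):\n" % (ident, names.get(ident, "unknown")))
        chunks.extend(traceback.format_stack(frame))
    return "".join(chunks)


if __name__ == "__main__":
    # Simulate a worker that is blocked, then show where every thread is.
    blocker = threading.Event()
    worker = threading.Thread(target=blocker.wait, name="stuck-worker", daemon=True)
    worker.start()
    print(dump_thread_stacks())
    blocker.set()
```

This only shows Python frames, so a thread blocked inside a C extension would still need a native debugger for the C side. The standard library's faulthandler module can produce a similar dump when the process receives a signal, which avoids adding code paths that run during normal operation.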
On Sat, 2 Dec 2000, Steve Spicklemire wrote:
> OK... more details... this is definitely a PoPy issue...
>
> I reinstalled ZPoPyDA, ran Zope in single-threaded mode ( ./start -t 1 ) and then deleted my ZPoPyDA database adaptor. Then I created a new one, connected to postgres, and tested it. When I then shut down Zope and restarted it multi-threaded, all was working again. Soo... this leaves me with a real warm fuzzy feeling ;-). I have been moving databases around recently, so I probably futzed something in the process... weird symptoms though...
>
> Anyway... sorry for bothering... still interested in ideas for 'debugging' the problem, should it ever arise again.
Though I do not really see an analogy ;), I want to share this:

I have found that my Zope would not start up when the servers mentioned in a database connection object were unavailable (a ZMySQLDA in my case). I "fixed" this by adding my db servers to the local /etc/hosts file with an address of 127.0.0.0. Zope would start up fine then. The db connections all appeared broken of course, but at least they were not waiting "indefinitely" for a response from the servers.

This of course does not explain the quantum phenomena you saw ;)

Regards,
Stefan
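The underlying issue in the workaround above is a connection attempt with no upper bound on how long it may wait. As a rough illustration only, here is how a bounded reachability check might look in modern Python 3. This is not how ZMySQLDA connects; `can_connect` is a hypothetical helper, and port 3306 is simply the default MySQL port used for the demo.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) opens within `timeout` seconds.

    Hypothetical helper for illustration; not part of Zope or ZMySQLDA.
    """
    try:
        # create_connection applies the timeout to the connect attempt,
        # so an unreachable server cannot block the caller indefinitely.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Assumption: a MySQL server, if any, listens on the default port 3306.
    print(can_connect("127.0.0.1", 3306, timeout=1.0))
```

A startup script could run a check like this before opening database connections and skip or defer the ones whose servers do not answer in time.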
Thanks... I'm pretty sure now that what I saw was pthreads/PoPy related. I didn't realize there was a new/updated PoPy/ZPoPyDA on the zope site with lots of thread-related (sem_init, sem_wait, ...) changes, so I built it on my FreeBSD boxen and haven't seen the same behavior since.

-steve

P.S. here are the changes I made to build on FreeBSD:

diff -c -r1.1.1.1 -r1.2
*** PoPymodule.h 2000/12/03 14:06:38 1.1.1.1
--- PoPymodule.h 2000/12/03 14:09:57 1.2
***************
*** 38,43 ****
--- 38,44 ----
  #include <catalog/pg_type.h>
  #include <libpq-fe.h>
  #include <libpq/libpq-fs.h>
+ #include <sys/types.h>
  #include <regex.h>
  #include <string.h>
  #include <stdlib.h>

and

mercury.spvi.com> diff -c foop/pythonmods/PoPy/Makefile foo/pythonmods/PoPy/Makefile
*** old/Makefile Sun Dec 3 18:06:58 2000
--- new/Makefile Sun Dec 3 09:43:00 2000
***************
*** 88,94 ****
  TARGET= python
  # Add more -I and -D options here
! CFLAGS= $(OPT) -I$(INCLUDEPY) -I$(LIBPL) $(DEFS) -I/usr/local/pgsql/include/ -Wall \
  -DVERSION=\"1.4.1\"
  # These two variables can be set in Setup to merge extensions.
--- 88,94 ----
  TARGET= python
  # Add more -I and -D options here
! CFLAGS= $(OPT) -pthread -I$(INCLUDEPY) -I$(LIBPL) $(DEFS) -I/usr/local/pgsql/include -I/usr/ports/databases/postgresql7/work/postgresql-7.0.2/src/include -Wall \
  -DVERSION=\"1.4.1\"
  # These two variables can be set in Setup to merge extensions.
***************
*** 114,120 ****
  LINKCC= $(PURIFY) $(CC)
  SGI_ABI=
  OPT= -fomit-frame-pointer -O6
! LDFLAGS= -L/usr/local/pgsql/lib/
  LDLAST=
  DEFS= -DHAVE_CONFIG_H=1 -DHAVE_LIBCRYPT=1
  LIBS= -lcrypt -lc_r
--- 114,120 ----
  LINKCC= $(PURIFY) $(CC)
  SGI_ABI=
  OPT= -fomit-frame-pointer -O6
! LDFLAGS= -L/usr/local/pgsql/lib
  LDLAST=
  DEFS= -DHAVE_CONFIG_H=1 -DHAVE_LIBCRYPT=1
  LIBS= -lcrypt -lc_r
***************
*** 123,129 ****
  RANLIB= ranlib
  MACHDEP= freebsd3
  SO= .so
! LDSHARED= gcc -shared
  CCSHARED= -fpic
  LINKFORSHARED= -Xlinker -export-dynamic
--- 123,129 ----
  RANLIB= ranlib
  MACHDEP= freebsd3
  SO= .so
! LDSHARED= gcc -shared -fpic -pthread
  CCSHARED= -fpic
  LINKFORSHARED= -Xlinker -export-dynamic
***************
*** 209,217 ****
  cp -ra test debian mx PoPy-1.4.1/
  tar czf PoPy-1.4.1.tar.gz PoPy-1.4.1/
-
-
  # Rules appended by makedepend
  PoPymodule.o: $(srcdir)/PoPymodule.c; $(CC) $(CCSHARED) $(CFLAGS) -c $(srcdir)/PoPymodule.c
! PoPymodule$(SO): PoPymodule.o; $(LDSHARED) PoPymodule.o -L/usr/local/pgsql/lib/ -lpq -o PoPymodule$(SO)
--- 209,215 ----
  cp -ra test debian mx PoPy-1.4.1/
  tar czf PoPy-1.4.1.tar.gz PoPy-1.4.1/
  # Rules appended by makedepend
  PoPymodule.o: $(srcdir)/PoPymodule.c; $(CC) $(CCSHARED) $(CFLAGS) -c $(srcdir)/PoPymodule.c
! PoPymodule$(SO): PoPymodule.o; $(LDSHARED) PoPymodule.o -L/usr/local/pgsql/lib -lpq $(LIBS) -o PoPymodule$(SO)
participants (2)
- Stefan H. Holek
- Steve Spicklemire