On Mon, Jan 15, 2007 at 01:22:39PM +0100, Nico Grubert wrote:
Dear list members,
I am running Zope 2.9.6 on a 64-bit Suse Linux 10.1 machine (9 Gbyte of RAM) with Python 2.4.3 installed. From time to time, Zope hangs and I cannot access it anymore. I tried to use the "Zope DeadlockDebugger", but when Zope hangs I cannot call the URL "http://myzopesite:8080/manage_debug_threads?secret_password" to let Zope DeadlockDebugger show any useful information. Zope does not respond. When Zope hangs, the python process eats all the memory and the machine starts to swap.
The "top" command in the shell tells me:
-------------------------------------------------------------------------
Tasks:  91 total,   2 running,  89 sleeping,   0 stopped,   0 zombie
Cpu(s): 10.4%us,  0.4%sy,  0.0%ni, 89.0%id,  0.2%wa,  0.0%hi,  0.0%si
Mem:   9041256k total,  9025124k used,    16132k free,    10604k buffers
Swap:  4208988k total,  4208988k used,        0k free,     9472k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
29992 wwwrun    16   0 12.0g 8.3g 2860 S   19 95.9   3:52.66 python
-------------------------------------------------------------------------
There are several add-ons for Zope installed, like:
- "Psycopg" Postgres Database Adapter
- "mxODBCDA" ODBC Database Adapter
- LDAPUserfolder
- "Silva" Content Management System
- PIL
Furthermore, I see a lot of Conflict Errors in the "event.log", for example:
-------------------------------------------------------------------------
ZPublisher.Conflict ConflictError at /VirtualHostBase/http/193.134.202.20:80/mysite/VirtualHostRoot/: database conflict error (oid 0x0435, class BTrees._OOBTree.OOBTree, serial this txn started with 0x036aeb14ab1c4b88 2007-01-15 12:04:40.104030, serial currently committed 0x036aeb1d0cc5d099 2007-01-15 12:13:02.993605) (80 conflicts (0 unresolved) since startup at Mon Jan 15 11:44:55 2007)
-------------------------------------------------------------------------
Conflict errors of this kind occur almost every minute. They might have something to do with the hangs, but I am not sure.
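To see whether the conflicts cluster on one object, the log can be tallied by oid and class. The following is only a rough sketch in modern Python 3, meant to be run outside Zope against a copy of the log. The regular expression is modelled on the single sample entry quoted above and is an assumption about the format; other Zope versions may word the message differently. The function names are made up for illustration.

```python
import re
from collections import Counter

# Assumption: every conflict entry contains a fragment shaped like the
# sample above, i.e. "database conflict error (oid 0x..., class some.Class,".
CONFLICT_RE = re.compile(
    r"database conflict error \(oid (?P<oid>0x[0-9a-fA-F]+), "
    r"class (?P<cls>[\w.]+)"
)


def tally_conflicts(lines):
    """Count conflict errors per (oid, class) pair in an iterable of lines."""
    counts = Counter()
    for line in lines:
        match = CONFLICT_RE.search(line)
        if match:
            counts[(match.group("oid"), match.group("cls"))] += 1
    return counts


def print_top_conflicts(log_path, top=10):
    """Print the most frequently conflicting objects found in log_path."""
    with open(log_path) as log:
        counts = tally_conflicts(log)
    for (oid, cls), n in counts.most_common(top):
        print(f"{n:6d}  oid={oid}  class={cls}")
```

Calling print_top_conflicts() with the path to event.log lists the busiest objects first. If one oid dominates, that object is being written by many requests at once, which is worth looking at separately from the memory problem.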
How can I figure out what exactly causes Zope to hang? If you need more information, please let me know.
Thanks in advance...
Regards, Nico
It sounds to me like you are simply experiencing "swap death". Zope isn't really hung - it's just using so much memory that you're "forever" waiting on disk I/O to handle all those pages. The system should recover eventually if there's no more load; but unfortunately the typical case is that once you start swapping badly, user requests keep piling up in the queue, so things only continue to get worse.

I'm impressed at your stats though. I've never had a box with 8 GB of RAM, much less got Zope to use up all of it :-)

Have you tried the "debug spinning zope" recipe?
http://www.zopelabs.com/cookbook/1073504990
That might give you a clue how you got into this state. Unfortunately, since the whole system is swapping like crazy, working at the shell is probably no fun either :-)

P.S. How's your SU700 treating you? I sold mine years ago :)

-- 
Paul Winkler
http://www.slinkp.com
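Since the shell becomes unusable once swapping starts, one option is to log the process's resident memory from the start, so the moment it begins to grow can be matched against the access log afterwards. This is a minimal sketch in modern Python 3, not something taken from the recipe above. It assumes a Linux /proc filesystem, and the function names and the 30-second interval are invented for illustration.

```python
import time


def parse_vm_rss_kb(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:   8703412 kB".
            return int(line.split()[1])
    return None


def read_rss_kb(pid):
    """Read the current resident set size of a process (Linux only)."""
    with open(f"/proc/{pid}/status") as fh:
        return parse_vm_rss_kb(fh.read())


def watch(pid, limit_kb, interval=30):
    """Print a timestamped RSS reading every `interval` seconds.

    Returns the reading once it exceeds limit_kb, so a wrapper script
    can react (for example by alerting someone) before swap fills up.
    """
    while True:
        rss = read_rss_kb(pid)
        print(time.strftime("%Y-%m-%d %H:%M:%S"), pid, rss, "kB")
        if rss is not None and rss > limit_kb:
            return rss
        time.sleep(interval)
```

With the pid from "top" (29992 in the output above), watch(29992, 6 * 1024 * 1024) would stop at roughly 6 GB resident; the threshold is arbitrary and only needs to sit below the point where the machine starts to swap.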