[Zope-dev] Re: Zope 2.4 crashes -- possible fix identified, other solutions also suggested

Anthony Baxter <anthony@interlink.com.au>
Tue, 18 Dec 2001 12:25:42 +1100


>>> Jeremy Hylton wrote
> Do you have any more idea about what shutting the garbage collector
> off achieves?  In practice, the garbage collector's most common effect
> is to turn latent bugs into manifest bugs; a bug has trashed part of
> memory and the garbage collector just happens to find it first.  If
> you turn GC off in these cases, you run a little longer, but you're
> running with corrupted memory.

Sorry I haven't been keeping up with the zope-* lists of late - this is
what I've found as well. Something, and I strongly suspect it's inside
the Zope C code, is playing jumpy-jumpy-stomp-stomp on bits of memory.
The garbage collector is hitting this corrupted data and dying. I've
posted before about the structure I've found that's corrupted (it's 
_always_ the same structure) but I've not yet been able to track down
what it is. For us, the "fix" has been to run more ZEO clients behind a
load balancer, so that when one crashes (about every 10-12 hours) things
keep working, and the zopecontrol script restarts it.
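
For anyone wanting to try the same workaround, the restart half of it can
be sketched roughly as below. This is only an illustration, not the actual
zopecontrol script; the function name `supervise` and its parameters are
made up for the example, and a real setup would also log each crash and
let the load balancer take the dead client out of rotation.

```python
import subprocess
import time


def supervise(cmd, max_restarts=3, delay=0.0):
    """Run cmd, restarting it each time it exits with a non-zero status.

    Hypothetical helper for illustration only. Returns the list of exit
    codes observed, oldest first.
    """
    exit_codes = []
    while True:
        # Blocks until the child process exits. On POSIX, a child killed
        # by a signal (e.g. a segfault) yields a negative return code.
        code = subprocess.call(cmd)
        exit_codes.append(code)
        # Stop on a clean exit, or once the restart budget is used up.
        if code == 0 or len(exit_codes) > max_restarts:
            return exit_codes
        time.sleep(delay)
```

With `max_restarts=3` the command is run at most four times: the initial
start plus three restarts. Note that this only papers over the crash; the
memory corruption is still there.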

Anthony