[Zope-dev] Very severe memory leak

Leonardo Rochael Almeida leo at hiper.com.br
Mon Aug 25 19:18:56 EDT 2003


On Sat, 2003-08-23 at 22:18, Shane Hathaway wrote:
> On 08/22/2003 05:38 PM, Leonardo Rochael Almeida wrote:
> > In time, the DateTime refcount eventually dwarfs the second place by an
> > order of magnitude. I think this is related to the fact that DateTime
> > instances are stored as metadata, even though the date indexes have been
> > converted to DateTime indexes. The question is, why aren't those
> > instances being released? What is holding on to them?
> 
> When you flush the cache, those DateTimes should disappear.  If they 
> don't, the leak is keeping them.

They are disappearing. Too bad they return immediately after, as they're
coming from a very heavy catalog query result.
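
One way to approach the "what is holding on to them?" question is to ask
the garbage collector for an object's referrers. The following is only a
minimal sketch, written for a current Python 3 interpreter rather than the
Python 2.1/2.2 discussed in this thread. Stamp and cache are hypothetical
stand-ins for a DateTime instance and whatever container keeps it alive.

```python
import gc
import types


class Stamp:
    """Hypothetical stand-in for a DateTime instance."""


def holders_of(obj):
    """Return the objects that refer to obj, skipping frame objects."""
    return [ref for ref in gc.get_referrers(obj)
            if not isinstance(ref, types.FrameType)]


cache = {"result": Stamp()}        # hypothetical long-lived container
holders = holders_of(cache["result"])
found_cache = any(holder is cache for holder in holders)
```

The returned list usually needs filtering by hand, since namespaces and
temporary containers that merely mention the object show up in it too.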

> LeakFinder is an early attempt to share some of the techniques I use for 
> finding leaks.  Unfortunately, those techniques aren't very useful until 
> you've already searched for weeks, which means LeakFinder isn't very 
> good for emergencies.

Well, I have been searching for close to a week now :-)

BTW, removing the third parameter in the PickleCache call makes the
RefCounts tab work again. I'll see if I can get some tracebacks out of
object creations now.

> Here are the things you should look at first:
> 
> 1) The ZODB cache size.  The meaning of this number changed dramatically 
> in 2.6.  Before 2.6 it was a very vague number.  In 2.6 it's a target 
> number of objects that Zope actually tries to maintain.  Before 2.6 it 
> might have made sense to set the ZODB cache size to some arbitrarily 
> high number like 100,000; in 2.6 you want to start at about 2000 and 
> adjust from there.  There are tools in 2.6 for helping you adjust the 
> number.

We're keeping it at 15,000. Most threads stay between 12k and 13k. One of
them sometimes overshoots to 100k or so. Hmm, funny... when I look at
them now, they're all under 5k...

> 2) The number of ImplicitAcquisitionWrappers present in the system.  I 
> have found it to be a reliable indicator of whether you have a leak or 
> not.  Expect this number to stay under 400 or so.  If it grows 
> gradually, there's a leak.  Watch the refcounts screen.

It keeps under 200. At least when I'm looking :-)
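
Watching whether a per-class count grows gradually, as suggested for
ImplicitAcquisitionWrappers above, can be approximated outside the
RefCounts tab by comparing two snapshots of live objects. This is a rough
sketch for a current Python 3 interpreter. It counts only objects tracked
by the gc module, and the Wrapper class is a hypothetical stand-in, not
the real Acquisition wrapper type.

```python
import gc
from collections import Counter


def count_by_type():
    """Count gc-tracked live objects, keyed by type name."""
    return Counter(type(obj).__name__ for obj in gc.get_objects())


def growth(before, after, minimum=1):
    """Return (type name, increase) pairs, largest increase first."""
    deltas = ((name, after[name] - before.get(name, 0)) for name in after)
    return sorted((item for item in deltas if item[1] >= minimum),
                  key=lambda item: item[1], reverse=True)


class Wrapper:
    """Hypothetical stand-in for a class whose instances pile up."""
    def __init__(self):
        self.payload = []


before = count_by_type()
leaked = [Wrapper() for _ in range(500)]   # simulate a slow leak
after = count_by_type()
report = dict(growth(before, after, minimum=100))
```

Taking the snapshots a few minutes apart under steady load, rather than
back to back as here, is what would show gradual growth.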

> 3) Is Python compiled with cyclic garbage collection enabled?  2.4 and 
> above absolutely require cyclic garbage collection.

Yes, I mentioned in my message that I checked it. I also checked that the
gc didn't find any uncollectables, so it's definitely *not* a cycle leak.
It's a please-release-my-reference-and-lemme-die-in-peace leak.
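
The distinction drawn here can be illustrated with the gc module. This is
a small sketch for a current Python 3 interpreter, not the check that was
actually run on the server. Node and registry are hypothetical names. A
reference cycle is something the collector finds and frees, while an
object that is still referenced from a live container is invisible to it.

```python
import gc


class Node:
    """Hypothetical object, used only to build the two cases below."""
    def __init__(self):
        self.other = None


def make_cycle():
    a, b = Node(), Node()
    a.other, b.other = b, a      # a <-> b, unreachable once we return


registry = []                    # hypothetical long-lived container

gc.disable()                     # keep automatic passes out of the demo
gc.collect()                     # start from a clean slate
make_cycle()
cycle_found = gc.collect()       # unreachable objects found by this pass
registry.append(Node())          # still reachable, so not garbage
held_found = gc.collect()        # the collector has nothing to report
uncollectable = len(gc.garbage)  # objects the collector could not free
gc.enable()
```

An empty gc.garbage together with steadily growing memory points to the
second case: something still holds a reference.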

> 4) Don't use 2.6.1.  Use 2.6.2, which has fixes for known leaks.  It is 
> actually already tagged in CVS as "Zope-2_6_2", and it's what zope.org 
> is now running.  Various unrelated things prevented a formal release 
> this week.

Are you guys sure about this? The client is very much against running
things from CVS because they just don't want to go thru another upgrade
process when the "real" 2.6.2 shows up.

> If all else fails, grep all Python modules for "sys._getframe()" and 
> "sys.exc_info()".  These are the primary causes of memory leaks in 
> Python 2.1 and below.  If you're brave, you can just run Zope under 
> Python 2.2, which fixes those particular leaks AFAIK.

Do I need to be particularly brave to run 2.6.2 (as tagged in the CVS)
under Python 2.2? Is it still an "unsupported" combination? I know it's
2.7's job to be 2.2 compliant, but I've seen reports of more and more
people running Zope 2.6.x under Python 2.2.
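
The grep suggested above can also be done with a short script that
reports file and line number for each hit. This is a sketch for a current
Python 3 interpreter. The find_suspects name and the throwaway sample file
are made up for illustration. On a real installation the root would be
the Zope software home and any Products directories.

```python
import os
import re
import tempfile

# Matches calls to sys._getframe( and sys.exc_info(
PATTERN = re.compile(r"sys\.(_getframe|exc_info)\s*\(")


def find_suspects(root):
    """Yield (path, line number, stripped line) for each match under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(".py"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                for lineno, line in enumerate(handle, 1):
                    if PATTERN.search(line):
                        yield path, lineno, line.strip()


# Demonstration on a temporary directory holding one sample module.
with tempfile.TemporaryDirectory() as root:
    sample = os.path.join(root, "sample.py")
    with open(sample, "w", encoding="utf-8") as handle:
        handle.write("import sys\n\ndef where():\n    return sys._getframe()\n")
    hits = [(os.path.basename(path), lineno, text)
            for path, lineno, text in find_suspects(root)]
```

A hit is only a place to look, not a confirmed leak. What matters is
whether the frame or traceback ends up stored somewhere that outlives the
call.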

> Finally, there's always hope. :-)  The latest thing I've been doing is 
> running Zope in a debug build of Python.  A debug build makes a magical 
> "sys._getobjects()" available,

Where do you discover these things? :-)

> allowing you to inspect all live objects 
> through a remote console.

Interesting, which one do you use? Does Boa have a remote console? Or is
it just the plain monitor port?
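
For reference, a sketch of what inspecting live objects might look like.
It is written for a current Python 3 interpreter and is not one of the
helper functions mentioned above. It assumes sys._getobjects(0) returns
all live objects on builds that provide it, and falls back to
gc.get_objects(), which sees only gc-tracked objects, on ordinary builds.
Stamp is a hypothetical stand-in for DateTime.

```python
import gc
import sys


def live_objects():
    """Return live objects, preferring the debug-build hook if present."""
    getobjects = getattr(sys, "_getobjects", None)
    if getobjects is not None:
        return getobjects(0)     # assumption: 0 means "no limit"
    return gc.get_objects()      # ordinary build: gc-tracked objects only


def instances_of(cls):
    """Return the live objects whose exact type is cls."""
    return [obj for obj in live_objects() if type(obj) is cls]


class Stamp:
    """Hypothetical stand-in for DateTime."""


held = [Stamp() for _ in range(3)]
count = len(instances_of(Stamp))
```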

> Since debug builds aren't much slower than 
> standard builds, you can even run a debug build in production for short 
> periods of time.  I've been building a small library of functions for 
> working in this mode, and if you need them, I'll pass them along.  I'd 
> have to warn you that they are anything but intuitive in their purpose 
> and use, though.

I'll let you know if it gets to that :-)

For now we'll be stress testing to try and locate the URLs that cause
the most leakage.

Cheers, Leo

PS: to the people who offered other suggestions: Yes, we're already
running ZEO to restart faster. No, buying more memory now is not an
option. Yes, we're already automatically restarting the process when the
available memory in the machine gets to a critical state, but not thru
Autolance.

-- 
Ideas don't stay in some minds very long because they don't like
solitary confinement.



