[Zope] Investigating a reference leak...

Fri Jan 28 01:52:10 EST 2005

Dear Zopistas

I've been trying to track down what's either a memory or a reference leak in a Zope 2.7.3 (Python 2.3.4) system.

The symptoms are that after several days of normal running on the aggressively proxy-cached production site or a few hours on the
development site under simulated load, the number of OFS.Image.Image references reaches very high levels, such as 17761.  In the
cache at that time are just 4345 images.  There are only 1332 images on the whole site, so assuming (as I understand it) that each
of the 5 Zope threads has its own cache, there should be a maximum of (1332 * threads) = 6660 Images.  The site never creates new
Images, it just loads existing ones.
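
To make the arithmetic explicit, here is a small sketch using only the figures quoted above.  The one-cache-per-thread model is the assumption already stated, not something verified here.

```python
# Figures quoted in this message.
images_on_site = 1332
zope_threads = 5            # assumption: one object cache per Zope thread
images_in_cache = 4345
observed_refs = 17761

# Upper bound if every thread's cache held every Image exactly once.
expected_max = images_on_site * zope_threads
excess = observed_refs - expected_max

print("expected upper bound: %d" % expected_max)
print("references beyond the bound: %d" % excess)
print("references per cached image: %.2f" % (observed_refs / float(images_in_cache)))
```

Even under the most generous reading of the cache model, the observed count is well above the bound.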

In a bid to see where these references are being created, I installed LeakFinder, but that doesn't help me because Images are
persistent objects, and thus their __init__ methods are not called (LeakFinder appears to patch just the __init__ and __del__
methods for tracking).  Thus I've been on a quest to find a place to stick a patch that will show me, for a persistent object, from
where a reference to it is being created.
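
The __init__ limitation can be reproduced outside Zope with the standard pickle module.  This is only an analogy: Record is a made-up stand-in class, and the way the ZODB rebuilds objects is assumed to be similar in this one respect rather than identical.

```python
import pickle

calls = []

class Record(object):
    """Stand-in for a persistent class; not a Zope class."""

    def __init__(self, title):
        calls.append("__init__")
        self.title = title

    def __setstate__(self, state):
        calls.append("__setstate__")
        self.__dict__.update(state)

original = Record("logo")        # __init__ runs here, once
stored = pickle.dumps(original)

del calls[:]                     # forget the call made during construction
loaded = pickle.loads(stored)    # rebuilt from stored state

print(calls)
print(loaded.title)
```

A tracker that hooks only __init__ and __del__ never sees the object rebuilt by pickle.loads.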

I've tried __setstate__, and overridden it for OFS.Image.Image to print a traceback to stdout.  Then I run the development Zope site
with bin/runzope and watch.  This shows me Images being created in the expected places, but not in quantities that would explain
those huge refcounts.
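
For reference, the kind of patch described above looks roughly like this.  It is a sketch against a stand-in class: FakeImage, trace_setstate and render_page are names invented for illustration, and whether __setstate__ can be reassigned this way on the real OFS.Image.Image is an assumption.

```python
import traceback

def trace_setstate(cls, log):
    """Wrap cls.__setstate__ so each call appends the caller's stack to log."""
    original = cls.__setstate__

    def traced(self, state):
        # Drop the last entry (this wrapper) so the stack ends at the caller.
        log.append(traceback.extract_stack()[:-1])
        return original(self, state)

    cls.__setstate__ = traced
    return original              # returned so the patch can be undone

class FakeImage(object):
    """Stand-in for the persistent class being traced."""

    def __setstate__(self, state):
        self.__dict__.update(state)

stacks = []
trace_setstate(FakeImage, stacks)

def render_page():
    img = FakeImage()
    img.__setstate__({"title": "logo", "width": 10, "height": 20})
    return img

img = render_page()
for frame in stacks[0][-2:]:
    print(frame[2])              # index 2 of each stack entry is the function name
```

Collecting the stacks in a list rather than printing them makes it possible to count how many loads come from each call site.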

So, to my questions:

1) Is there a better place than __setstate__ to identify where references to Images are being created?  I don't fully understand the
way in which persistent objects are actually created and populated.
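
One other angle, offered as a sketch rather than a known answer: instead of catching references as they are created, ask the garbage collector which objects currently hold them.  FakeImage and describe_holders are invented names, and whether instances of Zope 2.7's classes are visible to the gc module at all is an assumption that would need checking.

```python
import gc
import types

class FakeImage(object):
    """Stand-in for the class whose instances appear to be leaking."""

    def __init__(self, name):
        self.name = name

def describe_holders(cls):
    """Return (instance count, {holder type name: count}) for objects gc can see."""
    holders = {}
    instances = [o for o in gc.get_objects() if isinstance(o, cls)]
    for inst in instances:
        for ref in gc.get_referrers(inst):
            # Skip our own bookkeeping list and any stack frames.
            if ref is instances or isinstance(ref, types.FrameType):
                continue
            kind = type(ref).__name__
            holders[kind] = holders.get(kind, 0) + 1
    return len(instances), holders

cache = {"logo": FakeImage("logo")}   # one holder: a dict
stray = [cache["logo"]]               # a second holder: a list

count, holders = describe_holders(FakeImage)
print(count)
print(holders)
```

This reports who is holding a reference now, which is a different question from where it was created, but the holder types may point at the responsible code.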

2) Our site is very dependent on ExternalMethods (Zope provides the visual layout, business logic is in ExternalMethods, all from
one module).  In some of those methods, the code loads Images in order to look up information about them - this is because we have
some particular standards for things like Image tags which the built-in tag() method can't support, so we have to build our own.
Thus there's code like:

	img = getattr(self.Images, 'image.gif')		# all images live in /Images, got by acquisition from self
	buildTag(img.title, img.width, img.height)	# use image attributes to build fuller tag
	del img						# avoid leaving references around (this is paranoid!)

Is there any reason that this is bad practice?  I see __setstate__ for images being called from this code and then again shortly
afterwards when the Image is served up to the browser.
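
As a check on the del in that snippet, here is a sketch in plain CPython, outside Zope.  FakeImage and build_tag are hypothetical stand-ins (build_tag is not the real buildTag).  It compares reference counts after a function that uses del with one that does not.

```python
import sys

class FakeImage(object):
    """Stand-in for a cached Image; attributes are made up."""
    title, width, height = "logo", 10, 20

shared = FakeImage()             # plays the part of the cached object

def build_tag(title, width, height):
    # Hypothetical helper, for illustration only.
    return '<img alt="%s" width="%d" height="%d" />' % (title, width, height)

def with_del():
    img = shared
    tag = build_tag(img.title, img.width, img.height)
    del img                      # unbinds the local name only
    return tag

def without_del():
    img = shared
    return build_tag(img.title, img.width, img.height)

before = sys.getrefcount(shared)
with_del()
after_with = sys.getrefcount(shared)
without_del()
after_without = sys.getrefcount(shared)

print(before == after_with == after_without)
```

The local name is released when the function returns in either case, so on this evidence the del is harmless but is also not what keeps the counts down.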

I'd be very grateful for any help on this!

Regards to all
Ben

Ben Last
Technical Director
SleepyDog