Sessions and RAM cache managers causing memory leak
Hi,
I have a website using CMF that uses sessions and a couple of RAM Cache Managers. Some interaction between them and the data I am storing seems to be causing a huge memory leak (at times 300-400M of memory is chewed up in 10 minutes).
My problem is how to track it down. I thought it would be a simple process of running httperf and including/excluding things until I narrowed it down. It hasn't been as simple as that; actually, it has become very confusing...
Zope Version: 2.6
Python: 2.1.3
CMF: 1.4.2
The session objects hold general data for the current user, including strings, lists, dictionaries and a datetime object.
There are 2 RAM Cache Managers, one which caches for 15 mins and one for 24 hours. Both cache 20-30 DTML Methods or Python Scripts that return either lists, datetime objects, dictionaries, or rendered dtml. Some of the dtml calls the catalog multiple times, I can't see why, but would this be a problem? (I also tried changing everything to strings to see if it was a problem with caching objects like datetime and dictionaries, but no change.)
I created a test dtml method that creates a simple session (with a single string stored), and also calls all the methods/scripts that are cached. I then went about adding/removing these and running httperf to narrow down the problem. This is what happened calling the test method by httperf for approx 15 mins, restarting zope before each test (initial memory usage was 22M before running httperf)...
Calling all methods/scripts being cached (no sessions created): 86M used at end of test.
Calling the sessions only, creating a session for each call (no caching): 60M used at end of test.
Calling cached methods/scripts and session creation together (this is everything): 471M used at end of test (usually more).
Calling 15min cache objects and session creation (exclude 24hour cache objects): 192M used at end of test.
Calling 24hour cache objects and session creation (exclude 15min cache objects): 159M used at end of test.
I don't understand what's going on here. There seems to be no problem between the 2 RAM caches when you run them together. No problem with the session machinery when you create sessions by itself. But put everything together and it all goes very wrong! I would have thought that if the caching on its own ends up at 86M and the sessions on their own end at 60M, then together I should expect around 140-150M.
Any ideas of where I should be looking, something I am overlooking, or any known problems with CMF, caching, or sessions that I should be aware of?
Thanks for your help on this, as I have no idea where to go next.
Richard.
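As a rough check on the expectation in the post above, the reported figures can be put side by side. This is only a sketch: it assumes the 22M starting footprint is shared between the runs and that the cache cost and the session cost are independent, neither of which the thread confirms.

```python
# Figures in MB, taken from the test runs reported in the original post.
BASELINE = 22        # memory after a restart, before httperf is started
CACHE_ONLY = 86      # cached methods/scripts, no sessions
SESSIONS_ONLY = 60   # sessions only, no caching
COMBINED = 471       # caching and session creation together

# Growth attributable to each feature on its own.
cache_growth = CACHE_ONLY - BASELINE        # 64
session_growth = SESSIONS_ONLY - BASELINE   # 38

# Assumption: the two costs are independent, so the baseline counts once.
expected_combined = BASELINE + cache_growth + session_growth
unexplained = COMBINED - expected_combined

print(expected_combined, unexplained)  # 124 347
```

Counting the baseline once gives an expectation slightly below the 140-150M estimate, which makes the 471M result stand out even more.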
On Mon, 2004-05-17 at 05:34, Richard Ettema wrote:
I created a test dtml method that creates a simple session (with a single string stored), and also calls all the methods/scripts that are cached. I then went about adding/removing these and running httperf to narrow down the problem. This is what happened calling the test method by httperf for approx 15 mins after restarting zope after each test (initial memory usage was 22M before running httperf)...
Calling all methods/scripts being cached (no sessions created): 86M used at end of test.
Calling the sessions only, creating a session for each call (no caching): 60M used at end of test.
Calling cached methods/scripts and session creation together (this is everything): 471M used at end of test (usually more).
Calling 15min cache objects and session creation (exclude 24hour cache objects): 192M used at end of test.
Calling 24hour cache objects and session creation (exclude 15min cache objects): 159M used at end of test.
I don't understand what's going on here. There seems to be no problem between the 2 RAM caches when you run them together. No problem with the session machinery when you create sessions by itself. But put everything together and it all goes very wrong! I would have thought that if the caching on its own ends up at 86M and the sessions on their own end at 60M, then together I should expect around 140-150M.
Any ideas of where I should be looking, something I am overlooking, or any known problems with CMF, caching, or sessions that I should be aware of?
Not really, unfortunately. How are you determining memory usage? Do you have any "cache keys" for your cache managers (except AUTHENTICATED_USER, which is the default)? There is a rendering potentially stored for each cache key when the cache key differs per request.
Maybe try storing your "transient object container" in the "main" ZODB instead of storing it in /temp_folder/session_data (which is backed entirely by RAM). I don't recommend doing this long-term but it might be helpful during diagnosis.
The only other thing I can suggest is to try to make a reproducible test case and post the steps (and if possible a .zexp) in the Zope.org collector.
- C
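The point about one rendering per cache key can be illustrated with a toy cache. This is a minimal sketch in modern Python, not the RAM Cache Manager's actual code: `TinyRenderCache`, `fake_render` and the key names are hypothetical stand-ins.

```python
class TinyRenderCache:
    """Hypothetical stand-in for a RAM cache: one rendering per distinct key."""

    def __init__(self):
        self.entries = {}
        self.misses = 0

    def render(self, name, keywords, renderer):
        # The cache key combines the method name with the keyword values.
        key = (name, tuple(sorted(keywords.items())))
        if key not in self.entries:
            self.misses += 1
            self.entries[key] = renderer()
        return self.entries[key]


def fake_render():
    # Stands in for an expensive DTML rendering of about 1 KB.
    return "x" * 1024


stable = TinyRenderCache()
per_request = TinyRenderCache()
for request_number in range(1000):
    # Same key value on every request: a single stored rendering.
    stable.render("menu", {"section": "news"}, fake_render)
    # Key value differs on every request: one stored rendering per request.
    per_request.render("menu", {"request": request_number}, fake_render)

stable_entries = len(stable.entries)
per_request_entries = len(per_request.entries)
print(stable_entries, per_request_entries)  # 1 1000
```

In this toy model the entry and miss counts grow together, so a key that varies per request would normally be visible in the cache statistics.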
Not really, unfortunately. How are you determining memory usage? Do you have any "cache keys" for your cache managers (except AUTHENTICATED_USER, which is the default)? There is a rendering potentially stored for each cache key when the cache key differs per request.
Some of the dtml methods being cached have dtml namespace cache keys set (no request vars are used) and appear by all accounts to be working as expected, with the expected number of entries for these dtml methods. Most create one entry; a few others create anywhere between 5 and 20 entries as the dtml keys change, as I would expect. The dtml vars are set by dtml-let statements before the method or script is called.
Maybe try storing your "transient object container" in the "main" ZODB instead of storing it in /temp_folder/session_data (which is backed entirely by RAM). I don't recommend doing this long-term but it might be helpful during diagnosis.
Do you mean to just move the "transient object container" to the zope root folder? What is bad about doing this?
The only other thing I can suggest is to try to make a reproducible test case and post the steps (and if possible a .zexp) in the Zope.org collector.
Richard.
On Mon, 2004-05-17 at 08:26, Richard Ettema wrote:
Not really, unfortunately. How are you determining memory usage? Do you have any "cache keys" for your cache managers (except AUTHENTICATED_USER, which is the default)? There is a rendering potentially stored for each cache key when the cache key differs per request.
Some of the dtml methods being cached have dtml namespace cache keys set (no request vars are used) and appear by all accounts to be working as expected, with the expected number of entries for these dtml methods. Most create one entry; a few others create anywhere between 5 and 20 entries as the dtml keys change, as I would expect. The dtml vars are set by dtml-let statements before the method or script is called.
Maybe try not using the various cache keys as a test?
Maybe try storing your "transient object container" in the "main" ZODB instead of storing it in /temp_folder/session_data (which is backed entirely by RAM). I don't recommend doing this long-term but it might be helpful during diagnosis.
Do you mean to just move the "transient object container" to the zope root folder? What is bad about doing this?
Just create another one (add a 'Transient Object Container' from the Add... list); don't move the existing one. Nothing is "bad" about doing this, it's just that sessions are very write-intensive, so having a TOC in the "main" ZODB will bloat the database quickly. - C
Some of the dtml methods being cached have dtml namespace cache keys set (no request vars are used) and appear by all accounts to be working as expected, with the expected number of entries for these dtml
That could be the problem. If you pass the wrong thing to the dtml namespace cache keys you will end up in a world of pain. I suggest adding some debugging statements inside ZCacheable_set or ZDocumentTemplate_afterRender to see what that keywords dictionary you're passing in actually looks like to the RCM. If the RCM ends up calling str() on any objects that get repr'd like <Foo instance at 6a3463> then you're going to see a lot of cache misses and spurious growth as your python process grows in size and relocates its objects in memory. This is the same problem facing PageTemplates that I wrote about last year in http://marc.theaimsgroup.com/?l=zope&m=105460381811223&w=2
Is this a possibility even when the "Entries" and "Misses" column counts are what I would expect for the possible cache key combinations from dtml? Entry counts range from 1 to approx 20, as I would expect for the possible cache keys being passed. None of the entries or misses column counts climb during testing. Does this problem occur in the background with no visible signs? Thanks for your help on this. Richard.
Richard Ettema wrote:
Some of the dtml methods being cached have dtml namespace cache keys set (no request vars are used) and appear by all accounts to be working as expected, with the expected number of entries for these dtml
That could be the problem. If you pass the wrong thing to the dtml namespace cache keys you will end up in a world of pain. I suggest adding some debugging statements inside ZCacheable_set or ZDocumentTemplate_afterRender to see what that keywords dictionary you're passing in actually looks like to the RCM. If the RCM ends up calling str() on any objects that get repr'd like <Foo instance at 6a3463> then you're going to see a lot of cache misses and spurious growth as your python process grows in size and relocates its objects in memory. This is the same problem facing PageTemplates that I wrote about last year in http://marc.theaimsgroup.com/?l=zope&m=105460381811223&w=2
I solved it by writing a cache manager that gave the user more control over how cache keys were handled. Many times a simple str() is just too simplistic. The good news is you're still using 2.6, so you could use my cache manager (MemoryCache). The bad news is it requires patches to the core caching api (backwards compatible with everything I've tested though) and python 2.2.x, not 2.1.x and not 2.3.x. And it doesn't work in 2.7 yet for various and sundry issues to do with PageTemplateFile and its management interfaces.
But first things first: add some debugging and find out if the behavior you're seeing is actually attributable to the same thing I mentioned above.
--
Jamie Heilman http://audible.transient.net/~jamie/
"You came all this way, without saying squat, and now you're trying to tell me a '56 Chevy can beat a '47 Buick in a dead quarter mile? I liked you better when you weren't saying squat kid." -Buddy
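The key instability described above can be reproduced outside Zope. The following is a minimal sketch in modern Python, not the RCM's code: `Foo`, `naive_key` and `stable_key` are hypothetical names, and the "stable" key shown is just one possible choice.

```python
class Foo:
    """Hypothetical object with no custom __str__ or __repr__."""

    def __init__(self, value):
        self.value = value


def naive_key(obj):
    # Default str() embeds the memory address, e.g. '<... Foo object at 0x...>'.
    return str(obj)


def stable_key(obj):
    # Assumption: the object's value is what should identify the cache entry.
    return "Foo(%r)" % (obj.value,)


# Two objects that carry the same data but live at different addresses.
a = Foo("news")
b = Foo("news")

naive_keys_match = naive_key(a) == naive_key(b)
stable_keys_match = stable_key(a) == stable_key(b)
print(naive_keys_match, stable_keys_match)  # False True
```

With the naive key, every freshly created or reloaded object would produce a new cache entry even though its content is unchanged, which is the kind of growth the debugging statements are meant to expose.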
Richard Ettema wrote:
There are 2 RAM Cache Managers, one which caches for 15 mins and one for 24 hours. Both cache 20-30 DTML Methods or Python Scripts that return either lists, datetime objects, dictionaries, or rendered dtml. Some of the dtml calls the catalog multiple times, I can't see why, but would this be a problem? (I also tried changing everything to strings to see if it was a problem with caching objects like datetime and dictionaries, but no change)
Richard,
How many items end up in the cache after each test? If you look at the statistics tabs on the caches you should be able to see how many items are being cached in each case.
-Matt
--
Matt Hamilton matth@netsight.co.uk
Netsight Internet Solutions, Ltd. Business Vision on the Internet
http://www.netsight.co.uk +44 (0)117 9090901
Web Design | Zope/Plone Development & Consulting | Co-location | Hosting
participants (4): Chris McDonough, Jamie Heilman, Matt Hamilton, Richard Ettema