Andy Yates wrote at 2005-1-5 15:30 -0600:
... In fact we can reproduce this behavior on a fresh install. On a fresh system create a python script that puts data in a session object.
    s = context.REQUEST.SESSION
    t = ' ' * 1024
    s['data'] = t
    print "foo"
    return printed
Then call this script with your favorite benchmarking program. We used apache bench (ab). This will cause python2.3 to consume all available memory and crash or lock up.
Can anybody else reproduce these results?
And I am not surprised at all:
You allow an unlimited number of sessions.
Your session timeout is 45 minutes.
"ab" does not honour the "set cookie" requests for the sessioning. Thus, each request creates a new session.
Yes, I understand that ab will not set the cookie and each request will look like a new connection to Zope. That is not the point. The point is that whatever memory is used by these sessions should be released once the session has expired. This does not seem to happen. Memory use just grows and grows until the server dies. We turned on the debug output in Transient.py and the buckets seem to get expired as expected but, like I said, memory use just grows without bound. For example, if I have ab make X requests, X new sessions are created and memory use goes up by 100MB. I also see about X objects in the transient object container. Now I wait for the sessions to expire based on the "Data object timeout value". Then I make a few calls to tickle the Transient system's garbage collection. The number of objects in the TOC goes down but memory use does not. Some have posted that Python does not release memory, it just reuses it. Well, when I make X more requests the memory is not reused. Now it is using 200MB.
This way, it should not be difficult to use a large amount of RAM. My Zope serves about 500 requests/s, which is 30,000 requests/min and 1,350,000 requests in 45 min. You must expect a number of sessions of this order.
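The arithmetic behind that estimate can be checked with a rough sketch. It assumes that every request creates a new session (as with ab, which ignores the session cookie) and counts only the 1024-character string stored by the test script; real per-session overhead would be higher. The function and constant names are made up for illustration.

```python
# Back-of-the-envelope estimate from the figures quoted in the thread.
REQUESTS_PER_SECOND = 500      # request rate quoted above
SESSION_TIMEOUT_MINUTES = 45   # session timeout quoted above
PAYLOAD_BYTES = 1024           # the test script stores ' ' * 1024 per session


def live_sessions(requests_per_second, timeout_minutes):
    """Sessions created within one timeout window, one new session per request."""
    return requests_per_second * 60 * timeout_minutes


def payload_total(sessions, bytes_per_session):
    """Bytes held by the stored strings alone, ignoring all object overhead."""
    return sessions * bytes_per_session


sessions = live_sessions(REQUESTS_PER_SECOND, SESSION_TIMEOUT_MINUTES)
total = payload_total(sessions, PAYLOAD_BYTES)

print(sessions)                  # 1350000
print(round(total / 2**30, 2))   # 1.29 (GiB of string payload)
```

So even without any leak, an unlimited session count under this load would hold more than a gigabyte of payload before the first session expires.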
... In a very basic sense this is all our production server does. When a user first comes to our site we query a mysql database for several hundred values. These values are then stored in the session object as a map. All other pages are built dynamically based on the values stored in the session object.
Is the session object system just not supposed to be used like this?
We use the session system in quite a similar way -- without problems. But, we have a maximal session count of 5,000.
We, too, observe some increase in memory requirements which we do not yet understand: memory seems to increase from about 300 MB to about 600 MB within a week. Therefore, we restart once every Sunday night. But the problem has not yet been pressing and we have not yet seriously tried to understand it.
Consider the problem pressed. ;-) Well, I guess I'll just restart Zope every day for now. Could it be possible that the BTrees code which is written in C is leaking? Memory leaked here would not show up as refcounts in Zope right? It is FULL of malloc and free calls. Isn't this a fairly new addition to Zope? Thanks again for your help and insight! Andy
[Andy Yates]
... For example, if I have ab make X requests, X new sessions are created and memory use goes up by 100MB. I also see about X objects in the transient object container. Now I wait for the sessions to expire based on the "Data object timeout value". Then I make a few calls to tickle the Transient system's garbage collection. The number of objects in the TOC goes down but memory use does not. Some have posted that Python does not release memory, it just reuses it. Well, when I make X more requests the memory is not reused. Now it is using 200MB.
I know as much about the gory details of Python's memory-management as anyone on Earth, so you can believe this: if anyone gives you a catchy one-line summary, don't believe them. Python has many distinct memory-management subsystems, and details are crucial. In addition, Python builds on top of the platform C library's malloc/free/realloc system, and that builds in turn on the platform operating system policies. If you pursue it, you'll discover that it's a long discussion just trying to figure out what "release memory" could possibly mean. That said, it's not generally expected that the virtual memory highwater mark (which I'm guessing you're really talking about when you say "now it is using 200MB") will grow rapidly and consistently.
... Could it be possible that the BTrees code which is written in C is leaking?
Yes -- and if you use an old enough version of ZODB, it's guaranteed to leak <wink>.
Memory leaked here would not show up as refcounts in Zope right?
For the most part, it should: BTrees are built out of Python objects, each with its own refcount (as is true of all Python objects). To Zope, BTree building blocks look like all other kinds of persistent Python objects. I haven't done this for a while, but when leaks in the BTree code were still suspected (they truly aren't any more), I used to run the ZODB test suite under a debug-build ZODB and a debug-build Python, for days at a time in a loop. There are special gimmicks you can use in a debug-build Python to detect leaks. This did find leaks, too. I stopped doing this after the last leak it pointed at took days to track down, and turned out to be in a BTree endcase soooooo rare that it may never have occurred in real life (but did occur once in one of the test cases for "pathological" BTrees). That doesn't mean there can't be more leaks, but it would be pretty amazing if there were still a leak on any common path thru the BTree code.
It is FULL of malloc and free calls. Isn't this a fairly new addition to Zope?
No. The BTree code is really part of ZODB, and has been there (as far as I know) forever; Jeremy Hylton and I did a lot of concentrated work to squash memory leaks in ZODB (including its BTree code) a couple years ago. The BTree code has been very stable (== has been changed very little) since then.
Andy Yates wrote at 2005-1-6 16:19 -0600:
...
Andy Yates wrote at 2005-1-5 15:30 -0600:
... In fact we can reproduce this behavior on a fresh install. On a fresh system create a python script that puts data in a session object.
    s = context.REQUEST.SESSION
    t = ' ' * 1024
    s['data'] = t
    print "foo"
    return printed
Then call this script with your favorite benchmarking program. We used apache bench (ab). This will cause python2.3 to consume all available memory and crash or lock up.
Can anybody else reproduce these results?
I will look into this... -- Dieter
Andy Yates wrote at 2005-1-6 16:19 -0600:
... Can anybody else reproduce these results?
You have been right. The leak is caused by a bug in "tempstorage.TemporaryStorage.TemporaryStorage._takeOutGarbage".
It performs the zero refcount check for a child of a deleted object *before* it has decremented the refcount. As a consequence, it does not release children of a deleted object.
I attach a script that reproduces the problem in an easier way. Calling it with "bin/zopectl run" will consume about 200 MB of RAM for about 5 min.
I also attach a patch. I will file a bug report.
-- Dieter
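The ordering problem described above can be illustrated with a toy reference-counting store. This is a minimal sketch, not the tempstorage code and not the attached patch: the class ToyStore, its attributes, and both collect methods are hypothetical names chosen for the example.

```python
class ToyStore:
    """Hypothetical reference-counting store, for illustration only."""

    def __init__(self):
        self.data = {}       # oid -> stored payload
        self.refcount = {}   # oid -> number of objects referring to it
        self.children = {}   # oid -> list of oids it refers to

    def add(self, oid, payload, children=()):
        self.data[oid] = payload
        self.refcount.setdefault(oid, 0)
        self.children[oid] = list(children)
        for child in children:
            self.refcount[child] = self.refcount.get(child, 0) + 1

    def _drop(self, oid):
        """Remove one object and return the oids it referred to."""
        del self.data[oid]
        del self.refcount[oid]
        return self.children.pop(oid)

    def collect_buggy(self, oid):
        for child in self._drop(oid):
            # Bug: the zero check runs while the child's count still
            # includes the parent that was just deleted, so it never fires.
            if self.refcount[child] == 0:
                self.collect_buggy(child)
            else:
                self.refcount[child] -= 1

    def collect_fixed(self, oid):
        for child in self._drop(oid):
            # Fix: decrement first, then test for zero.
            self.refcount[child] -= 1
            if self.refcount[child] == 0:
                self.collect_fixed(child)


def build():
    store = ToyStore()
    store.add("payload", " " * 1024)
    store.add("session", "session record", children=["payload"])
    return store


buggy = build()
buggy.collect_buggy("session")
fixed = build()
fixed.collect_fixed("session")

print(sorted(buggy.data))   # ['payload']  (child left behind with refcount 0)
print(sorted(fixed.data))   # []
```

In the buggy variant the child ends up with a refcount of zero but is never visited again, which matches the description of children of a deleted object not being released.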
Dieter Maurer wrote:
The leak is caused by a bug in "tempstorage.TemporaryStorage.TemporaryStorage._takeOutGarbage"
It performs the zero refcount check for a child of a deleted object *before* it has decremented the refcount. As a consequence, it does not release children of a deleted object.
I attach a script that reproduces the problem in an easier way. Calling it with "bin/zopectl run" will consume about 200 MB of RAM for about 5 min.
I ran the script on Linux against both Zope 2.7.0 and 2.7.3 and according to top only some 21 MB of RAM was used up. (I ran "bin/zopectl run tempstorebug.py" in the instance home.) We have a similar problem w/ Zope leaking memory and our application uses sessions a fair bit, so better ways to confirm that this patch fixes the problem would be appreciated. TIA, John
John Barham wrote at 2005-1-13 00:29 +0900:
... I ran the script on Linux against both Zope 2.7.0 and 2.7.3 and according to top only some 21 MB of RAM was used up. (I ran "bin/zopectl run tempstorebug.py" in the instance home.) We have a similar problem w/ Zope leaking memory and our application uses sessions a fair bit, so better ways to confirm that this patch fixes the problem would be appreciated.
Andy Yates reported that applying the patch removed his problem... If this is not enough to convince you, you might add an additional persistent object between the session and the string. I made the test with our own Transience implementation. I introduced such an intermediate persistent object (holding the timestamp for the session). Maybe, the standard "Transience" does not... -- Dieter
participants (4)
- Andy Yates
- Dieter Maurer
- John Barham
- Tim Peters