Memory and Large Zope Page Templates
Hi all,

I'm trying to understand some questions about Zope and memory management. I've configured Zope with minimal cache settings: a cache of 100 objects, 4 threads, and 20 MB of ZEO cache. The storage is a 4 GB DirectoryStorage. Monitoring the Zope process, all runs fine: it takes about 30 MB of RAM. I've seen that just 2 of the threads are working.

The problem begins when I execute a report. The report is a PageTemplateFile that calls a method to obtain a list of processed results and sends them to the user. It takes some time to process, but it works fine. What I don't understand is why, after this report, Zope is taking 120 MB of RAM!! Far too much for the cache I have. I've looked for a memory leak, showing refcounts with a method like this:

    def _get_refcounts(self):
        import sys
        import types
        d = {}
        # collect all classes reachable from the loaded modules
        for m in sys.modules.values():
            for sym in dir(m):
                o = getattr(m, sym)
                if type(o) is types.ClassType:
                    d[o] = sys.getrefcount(o)
        # sort by refcount, highest first
        pairs = map(lambda x: (x[1], x[0]), d.items())
        pairs.sort()
        pairs.reverse()
        return pairs

But there is nothing strange. I can see how the refcounts of some objects are heavily increased while the report is being processed, but when it finishes they are freed. The RAM of the Zope process doesn't decrease, though. Does anybody have an explanation? Is it recommended to run this kind of report in a new process? If so, are there any products to help with that?

Thanks in advance

Santi Camps
Santi Camps wrote:
Hi all,
I'm trying to understand some questions about Zope and memory management. I've configured Zope with minimal cache settings: a cache of 100 objects, 4 threads, and 20 MB of ZEO cache. The storage is a 4 GB DirectoryStorage. Monitoring the Zope process, all runs fine: it takes about 30 MB of RAM. I've seen that just 2 of the threads are working.
The problem begins when I execute a report. The report is a PageTemplateFile that calls a method to obtain a list of processed results and sends them to the user. It takes some time to process, but it works fine. What I don't understand is why, after this report, Zope is taking 120 MB of RAM!! Far too much for the cache I have. I've looked for a memory leak, showing refcounts with a method like this:

    def _get_refcounts(self):
        import sys
        import types
        d = {}
        # collect all classes reachable from the loaded modules
        for m in sys.modules.values():
            for sym in dir(m):
                o = getattr(m, sym)
                if type(o) is types.ClassType:
                    d[o] = sys.getrefcount(o)
        # sort by refcount, highest first
        pairs = map(lambda x: (x[1], x[0]), d.items())
        pairs.sort()
        pairs.reverse()
        return pairs

But there is nothing strange. I can see how the refcounts of some objects are heavily increased while the report is being processed, but when it finishes they are freed. The RAM of the Zope process doesn't decrease, though. Does anybody have an explanation? Is it recommended to run this kind of report in a new process? If so, are there any products to help with that?
Thanks in advance
Santi Camps
If you look at Zope-dev, there has been some discussion regarding leaks and memory management. IIRC, once the size of your process increases, it won't shrink... however, this does not mean that it will not reuse the same memory again. I mean, if you experience the following, it should be normal:

    initial size of Zope -> 30 MB
    you run the report: new size -> 150 MB
    after a while, you run the report again: new size -> 150 MB

If you run the report again and end up with a size of 270 MB (150 + 120), that is a problem.

HTH

Regards
Marco
Marco Bizzarri wrote:
If you look at Zope-dev, there has been some discussion regarding leaks and memory management.
IIRC, I think that once the size of your process increases, it won't shrink... however, this does not mean that it will not reuse the same memory again...
Do you know if this is Python behaviour, or something generic in process management?
I mean, if you experience the following, it should be normal:
    initial size of Zope -> 30 MB
    you run the report: new size -> 150 MB
    after a while, you run the report again: new size -> 150 MB
Yes, that's the situation when the report is executed by the same thread. But if several reports are executed at the same time in different threads, then the memory size increases again, to approximately 30 MB + 80 MB * simultaneous_threads.
If you run the report again and end up with a size of 270 MB (150 + 120), that is a problem.
HTH
Regards Marco
Thanks

Santi Camps
Santi Camps wrote:
IIRC, I think that once the size of your process increases, it won't shrink... however, this does not mean that it will not reuse the same memory again...
Do you know if this is Python behaviour, or something generic in process management?
I'm going to stick my neck out and disagree with Dieter; I remember this being an artefact of the Python memory manager...
Yes, that's the situation when the report is executed by the same thread. But if several reports are executed at the same time in different threads, then the memory size increases again, to approximately 30 MB + 80 MB * simultaneous_threads.
That sounds odd, although not really. Each thread keeps its own ZODB object cache. This cache has a target size which is vaguely tunable, but if a single transaction loads lots of objects (as your report likely does) they will all end up being loaded regardless of your cache setting. The trick is either not to load so many objects into memory (see the low-level ZODB methods for syncing, subtransactions, etc.) or to make the objects more lightweight (see ZCatalogs, and brains in particular).

cheers,

Chris

--
Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk
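[Editorially, Chris's point about lightweight objects can be sketched generically. This is illustrative Python only, not the real ZCatalog API: the Brain and Document classes below are hypothetical stand-ins for a catalog metadata record and a full persistent object.]

```python
# Illustrative stand-ins, not real Zope classes. Loading a full object
# pulls its whole state into the cache; a catalog "brain" carries only
# the few metadata columns the report actually needs.
class Document:                       # hypothetical full object
    def __init__(self, docid, title, body):
        self.docid = docid
        self.title = title
        self.body = body              # large payload

class Brain:                          # hypothetical metadata record
    __slots__ = ("docid", "title")
    def __init__(self, docid, title):
        self.docid = docid
        self.title = title

doc = Document(1, "report", "x" * 100000)
brain = Brain(1, "report")

# The brain never references the large body at all, so a report over
# thousands of hits stays small if it works from brains.
print(len(doc.body))            # 100000
print(hasattr(brain, "body"))   # False
```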
Chris Withers wrote:
Santi Camps wrote:
IIRC, I think that once the size of your process increases, it won't shrink... however, this does not mean that it will not reuse the same memory again...
Do you know if this is Python behaviour, or something generic in process management?
I'm going to stick my neck out and disagree with Dieter, I remember this being an artefact of the python memory manager...
Yes, that's the situation when the report is executed by the same thread. But if several reports are executed at the same time in different threads, then the memory size increases again, to approximately 30 MB + 80 MB * simultaneous_threads.
That sounds odd, although not really. Each thread keeps its own ZODB object cache. This cache has a target size which is vaguely tunable, but if a single transaction loads lots of objects (as your report likely does) they will all end up being loaded regardless of your cache setting. The trick is either not to load so many objects into memory (see the low-level ZODB methods for syncing, subtransactions, etc.) or to make the objects more lightweight (see ZCatalogs, and brains in particular).
cheers,
Chris
Thanks for your answer. Yes, I understand how the ZODB cache works with threads. The behaviour I didn't understand is why, once the transaction is finished, the ZODB cache decreases in number of objects but the RAM of the process does not. I'm already using brains. I've tried your suggestion of subtransactions, but there is no change (I think that's normal, since I'm not changing any persistent object). I'm not sure how a _p_jar.sync() could improve anything. I think the best solution will be to run the report in a new process, spawning a ZEO client to do the work and waiting for it.

Regards

Santi Camps
Santi Camps wrote:
Yes, I understand how the ZODB cache works with threads. The behaviour I didn't understand is why, once the transaction is finished, the ZODB cache decreases in number of objects but the RAM of the process does not.
This is the pythonic behaviour I was talking about. If you want it fixed, bug Guido ;-)
I'm already using brains. I've tried your suggestion of subtransactions, but there is no change (I think that's normal, since I'm not changing any persistent object). I'm not sure how a _p_jar.sync() could improve anything.
...you need to force the cache to garbage collect before there are too many objects in memory. The catalog does this using subtransactions. There's also a cache minimize call somewhere, but you're probably better off asking on zodb-dev@zope.org about where that is...
I think the best solution will be to run the report in a new process, spawning a ZEO client to do the work and waiting for it.
...yeah, put the ZEO client on a different machine :-)

cheers,

Chris

--
Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk
Chris Withers wrote:
Santi Camps wrote:
Yes, I understand how the ZODB cache works with threads. The behaviour I didn't understand is why, once the transaction is finished, the ZODB cache decreases in number of objects but the RAM of the process does not.
This is the pythonic behaviour I was talking about. If you want it fixed, bug Guido ;-)
I'm already using brains. I've tried your suggestion of subtransactions, but there is no change (I think that's normal, since I'm not changing any persistent object). I'm not sure how a _p_jar.sync() could improve anything.
...you need to force the cache to garbage collect before there are too many objects in memory. The catalog does this using subtransactions. There's also a cache minimize call somewhere, but you're probably better off asking on zodb-dev@zope.org about where that is...
Hi again,

I reply just to put the answer here, in case it is useful to someone else. Following Chris's instructions, I've looked at the ZCatalog code and found the way to free memory:

1) Use subtransactions when changing data, adding a get_transaction().commit(1) every X steps.

2) Explicitly call the garbage collector of the connection every X steps, forcing the connection to release objects from the cache and stay within the cache parameters defined in zope.conf. This can be done with self._p_jar.cacheGC() (where self is a persistent object, obviously).

In my case, a read-only report, the second method is enough to keep memory in place.

Thanks very much to everybody

Santi Camps
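[Editorially, the batching pattern Santi describes can be sketched as a standalone loop. This is illustrative only: the real calls in old Zope are get_transaction().commit(1) and self._p_jar.cacheGC(); here a hypothetical StubConnection replaces the ZODB connection so the sketch runs without Zope.]

```python
# StubConnection is a stand-in for a ZODB connection: in real Zope code
# you would call self._p_jar.cacheGC() (and, when writing, the
# subtransaction commit get_transaction().commit(1)) instead.
class StubConnection:
    def __init__(self, target_size):
        self.target_size = target_size
        self.cache = []
        self.gc_calls = 0

    def load(self, oid):
        self.cache.append(oid)      # loading an object grows the cache
        return oid

    def cacheGC(self):
        self.gc_calls += 1
        # shrink the cache back to its configured target size
        del self.cache[:-self.target_size]

def run_report(conn, oids, batch_size=100):
    results = []
    for i, oid in enumerate(oids, 1):
        results.append(conn.load(oid))
        if i % batch_size == 0:
            # every batch_size steps, release loaded objects so the
            # cache never grows far past its target
            conn.cacheGC()
    return results

conn = StubConnection(target_size=100)
run_report(conn, range(1000), batch_size=100)
print(len(conn.cache), conn.gc_calls)   # 100 10
```

Without the periodic cacheGC() call, all 1000 loaded objects would sit in the cache until the end of the transaction, which is the report-sized memory spike described above.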
Santi Camps wrote at 2005-1-14 16:10 +0100:
... But there is nothing strange. I can see how the refcounts of some objects are heavily increased while the report is being processed, but when it finishes they are freed. The RAM of the Zope process doesn't decrease, though.
Does anybody have an explanation? Is it recommended to run this kind of report in a new process? If so, are there any products to help with that?
The UNIX memory API allows the heap to grow and shrink only at one end. If there is a live block (however tiny) at that end, it prevents any unused memory before it from being given back to the OS. Usually, there is no way to compact the memory (as it is accessed directly and not via an indirection). Therefore, you rarely see the memory allocated to a process decrease.

--
Dieter
participants (4)
- Chris Withers
- Dieter Maurer
- Marco Bizzarri
- Santi Camps