Hi all, i have a little problem with my production server. The memory usage of the zope processes running on this server are growing up 100K a day upto 1MB a day. How can i track down the problem. Zope is 2.3.3 Python is 1.5.2 OS is RedHat Immunix 6.2 Hope somebody could help... Thanks, -- Andre Schubert EMail: Andre.Schubert@km3.de Tel: 03774 6625-78 km3 teledienst GmbH Fax: 03774 6625-79
Finding memory leaks is an exercise in "binary search". Isolate half of the Zope (and Product) code and find out if it leaks. If it doesn't, you know the problem is in the other half. ;-) Ad infinitum. I'm sorry this is the case, but it almost always boils down to this particular recipe of diagnostics. ----- Original Message ----- From: "Andre Schubert" <andre.schubert@km3.de> To: "zope" <zope-dev@zope.org> Sent: Tuesday, March 12, 2002 10:57 AM Subject: [Zope-dev] Memory Leak Problem
Hi all,
i have a little problem with my production server. The memory usage of the zope processes running on this server are growing up 100K a day upto 1MB a day. How can i track down the problem.
Zope is 2.3.3 Python is 1.5.2 OS is RedHat Immunix 6.2
Hope somebody could help...
Thanks,
-- Andre Schubert EMail: Andre.Schubert@km3.de Tel: 03774 6625-78 km3 teledienst GmbH Fax: 03774 6625-79
_______________________________________________ Zope-Dev maillist - Zope-Dev@zope.org http://lists.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - http://lists.zope.org/mailman/listinfo/zope-announce http://lists.zope.org/mailman/listinfo/zope )
Hi all,
i have a little problem with my production server. The memory usage of the zope processes running on this server are growing up 100K a day upto 1MB a day. How can i track down the problem. [snip]
Chris McDonough wrote:
Finding memory leaks is an exercise in "binary search". Isolate half of the Zope (and Product) code and find out if it leaks. If it doesn't, you know the problem is in the other half. ;-) Ad infinitum. I'm sorry this is the case, but it almost always boils down to this particular recipe of diagnostics.
A question and one remark. Wouldn't staring at the refcounts in the debug panel also give some info? At least for selfmade products? Something like: Acquisition.ImplicitAcquirerWrapper: 42442 OFS.DTMLMethod.DTMLMethod: 3989 ZPublisher.HTTPRequest.HTTPRequest: 3941 ZServer.HTTPResponse.ZServerHTTPResponse: 3934 ZPublisher.BaseRequest.RequestContainer: 3931 AccessControl.DTML.TemplateDict: 3930 doesn't surely look too good, heh? ;-) Yes this is from a production server, which now get's restarted as it occupies 130MB (up from 15MB in 10 days). The remark, Andre, Chris has written a very nice product, called AutoLance which helps by monitoring zope's memory usage and restarting if a configurable threshold is reached. http://www.zope.org/Members/mcdonc/Products/AutoLance We use it on a production server. cheers, oliver
A question and one remark. Wouldn't staring at the refcounts in the debug panel also give some info? At least for selfmade products?
Maybe... depends on your level of Zope Zen. It's hard to tell which classes should and should not have high refcounts. For example, in your sample output, I know that high numbers of references to ImplicitAcquirerWrappers, HTTPRequests, ZServerHTTPResponse, and RequestContainer objects likely indicate a leak of REQUEST objects somewhere in your code. OTOH, the high number of references to DTMLMethods might be totally normal. But I only know these things because I know Zope maybe better than most folks do, and I just... well... *know*. ;-) It's hard to explain the reasons I know these things to other folks (although I started on something like it at http://www.zope.org/Members/mcdonc/HowTos/PreventAndDebugMemLeaks (unfinished). That said, even if I took a day to finish my unfinished HowTo, the only bit of wisdom that I could consistently espouse given Control_Panel -> Debug output is that indeed you do have a leak. So you'd know you have a leak (which you knew in the first place). How do you fix it? I don't know. ;-) That's where the binary search comes in. Shane's LeakFinder can make the binary search a bit easier by recording states between leakfinding runs, but it's still mainly an exercise in binary search.
The remark, Andre, Chris has written a very nice product, called AutoLance which helps by monitoring zope's memory usage and restarting if a configurable threshold is reached.
If you can't solve the problem at least treat the symptoms. ;-) - C
Oliver Bleutgen schrieb:
Hi all,
i have a little problem with my production server. The memory usage of the zope processes running on this server are growing up 100K a day upto 1MB a day. How can i track down the problem. [snip]
Chris McDonough wrote:
Finding memory leaks is an exercise in "binary search". Isolate half of the Zope (and Product) code and find out if it leaks. If it doesn't, you know the problem is in the other half. ;-) Ad infinitum. I'm sorry this is the case, but it almost always boils down to this particular recipe of diagnostics.
A question and one remark. Wouldn't staring at the refcounts in the debug panel also give some info? At least for selfmade products?
Something like:
Acquisition.ImplicitAcquirerWrapper: 42442 OFS.DTMLMethod.DTMLMethod: 3989 ZPublisher.HTTPRequest.HTTPRequest: 3941 ZServer.HTTPResponse.ZServerHTTPResponse: 3934 ZPublisher.BaseRequest.RequestContainer: 3931 AccessControl.DTML.TemplateDict: 3930
The Toprefcounts are: Acquisition.ImplicitAcquirerWrapper: 2981 DocumentTemplate.DT_Util.Eval: 1379 OFS.Image.Image: 1194 ZPublisher.HTTPRequest.HTTPRequest: 916 ZServer.PCGIServer.PCGIResponse: 906 ZPublisher.BaseRequest.RequestContainer: 906 ExtensionClass: 361 DocumentTemplate.DT_In.InClass: 263 OFS.Folder.Folder: 250 OFS.DTMLDocument.DTMLDocument: 238 DocumentTemplate.DT_Var.Var: 207 OFS.DTMLMethod.DTMLMethod: 196 AccessControl.PermissionRole.PermissionRole: 184 App.special_dtml.DTMLFile: 182 Products.ZGadflyDA.gadfly.kjParser.ParseRule: 161 re.RegexObject: 153 Shared.DC.Scripts.Signature.FuncCode: 143 and so on..... My selfmade products are near the bottom of the list... Thanks
doesn't surely look too good, heh? ;-) Yes this is from a production server, which now get's restarted as it occupies 130MB (up from 15MB in 10 days).
The remark, Andre, Chris has written a very nice product, called AutoLance which helps by monitoring zope's memory usage and restarting if a configurable threshold is reached.
http://www.zope.org/Members/mcdonc/Products/AutoLance
We use it on a production server.
cheers, oliver
_______________________________________________ Zope-Dev maillist - Zope-Dev@zope.org http://lists.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - http://lists.zope.org/mailman/listinfo/zope-announce http://lists.zope.org/mailman/listinfo/zope )
-- Andre Schubert EMail: Andre.Schubert@km3.de Tel: 03774 6625-78 km3 teledienst GmbH Fax: 03774 6625-79
On Tue, 12 Mar 2002 18:38:16 +0100, Oliver Bleutgen <myzope@gmx.net> wrote:
Acquisition.ImplicitAcquirerWrapper: 42442
That class is used to glue together acquisition content chains. Being top of the list indicates that you have been leaking an acquisition context chain. That doesnt mean that the bug relates to acquisition; acquisition just means that the leak is more severe than otherwise (because it keeps the whole context chain around).
OFS.DTMLMethod.DTMLMethod: 3989
42442/3989 = 10.6. That indicates that you are leaking an acquisition chain of roughly 10 objects.
ZPublisher.HTTPRequest.HTTPRequest: 3941 ZServer.HTTPResponse.ZServerHTTPResponse: 3934 ZPublisher.BaseRequest.RequestContainer: 3931 AccessControl.DTML.TemplateDict: 3930
The numbers for those objects are all roughly 3900. These objects are the ones referenced by that acquisition chain. I recommend you look further down that list, until you find a number that is much less than 3900. All of the objects above that mark are pieces of the jigsaw puzzle that will tell you *what* is being leaked. Thats where I would start meditating to work out *how* its being leaked.
doesn't surely look too good, heh? ;-)
No way. Memory leaks are fun. ;-) Toby Dickenson tdickenson@geminidataloggers.com
Toby Dickenson wrote:
On Tue, 12 Mar 2002 18:38:16 +0100, Oliver Bleutgen <myzope@gmx.net> wrote:
Acquisition.ImplicitAcquirerWrapper: 42442
That class is used to glue together acquisition content chains. Being top of the list indicates that you have been leaking an acquisition context chain.
That doesnt mean that the bug relates to acquisition; acquisition just means that the leak is more severe than otherwise (because it keeps the whole context chain around).
OFS.DTMLMethod.DTMLMethod: 3989
42442/3989 = 10.6. That indicates that you are leaking an acquisition chain of roughly 10 objects.
ZPublisher.HTTPRequest.HTTPRequest: 3941 ZServer.HTTPResponse.ZServerHTTPResponse: 3934 ZPublisher.BaseRequest.RequestContainer: 3931 AccessControl.DTML.TemplateDict: 3930
The numbers for those objects are all roughly 3900. These objects are the ones referenced by that acquisition chain.
I recommend you look further down that list, until you find a number that is much less than 3900. All of the objects above that mark are pieces of the jigsaw puzzle that will tell you *what* is being leaked. Thats where I would start meditating to work out *how* its being leaked.
Ah, thanks for the nice explanation, Toby. I don't want to abuse your or others time too much, but since this might be interesting in general too, I'll give some more data. To be clear, I don't want to really trace down that leak, we live with it for nearly a year now, the server will be replaced soon, but I'm smelling some zen and my appetite is whet. - I did paste only the objects which have such a high count (next one was approx 200). - This server (2.3.3) has no special products on it, just 2 Folderish ZClasses, which are heavily used (they construct the whole site). So, can we rule out that there's a bug in the zclasses machinery which is the cause for this? I.e. should there be some zclass related object in the above list otherwise? Is it right that in Zope 2.3.x (and others?) usage of <dtml-var "REQUEST.set(...)"> can lead to such leakage? Is there a description somewhere what the basic causes of such leakages are? I.e. only bugs in python c-code/zope c-code? Is there some text about what to not do in python product development to avoid introducing leakage? For example, on #zope someone pointed out to me that storing acquisition wrappers in the ZODB will lead to leakage. cheers, oliver
On Wed, 13 Mar 2002 11:37:00 +0100, Oliver Bleutgen <myzope@gmx.net> wrote:
- I did paste only the objects which have such a high count (next one was approx 200).
Hmm. Ill have to ponder what that means.
- This server (2.3.3) has no special products on it, just 2 Folderish ZClasses, which are heavily used (they construct the whole site).
So, can we rule out that there's a bug in the zclasses machinery which is the cause for this?
No, thats not impossible
I.e. should there be some zclass related object in the above list otherwise?
Not necessarily. The class with the bug need not leave an instance of himself tied up in his immortal cyclic trash.
Is it right that in Zope 2.3.x (and others?) usage of <dtml-var "REQUEST.set(...)"> can lead to such leakage?
Yes, thats true in all versions.
Is there a description somewhere what the basic causes of such leakages are? I.e. only bugs in python c-code/zope c-code?
No, its possible for a bug in through-the-web edited dtml to cause this.
Is there some text about what to not do in python product development to avoid introducing leakage? For example, on #zope someone pointed out to me that storing acquisition wrappers in the ZODB will lead to leakage.
In old versions (roughly before 2.3 or 2.4; I forget exactly when) it was possible to create a cyles involving a persistent object having a reference to an acquisition wrapper. Today that raises an exception. Toby Dickenson tdickenson@geminidataloggers.com
One more question then I'll shut up ;-). Toby Dickenson wrote:
Is there a description somewhere what the basic causes of such leakages are? I.e. only bugs in python c-code/zope c-code?
No, its possible for a bug in through-the-web edited dtml to cause this.
Waah, this is the first time I hear that. That means that even if zope/python would be bug-free, it's possible to write dtml that causes a leak? Is there a (non-esoteric) example for that? cheers, oliver
On Wed, 13 Mar 2002 13:46:27 +0100, Oliver Bleutgen <myzope@gmx.net> wrote:
Waah, this is the first time I hear that. That means that even if zope/python would be bug-free, it's possible to write dtml that causes a leak? Is there a (non-esoteric) example for that?
(untested, but the principal is sound) <dtml-call "REQUEST.set('foo',x)"> where x is some object whose acquisition context includes the REQUEST object Toby Dickenson tdickenson@geminidataloggers.com
Didnt Shane do a leak finder product that can help debug this? http://www.zope.org/Members/hathawsh On March 12, 2002 08:42 am, Chris McDonough wrote:
Finding memory leaks is an exercise in "binary search". Isolate half of the Zope (and Product) code and find out if it leaks. If it doesn't, you know the problem is in the other half. ;-) Ad infinitum. I'm sorry this is the case, but it almost always boils down to this particular recipe of diagnostics. -- Andy McKay
Andy McKay schrieb:
Didnt Shane do a leak finder product that can help debug this? http://www.zope.org/Members/hathawsh
I will install the LeakFinder product on the server and look what happens....
On March 12, 2002 08:42 am, Chris McDonough wrote:
Finding memory leaks is an exercise in "binary search". Isolate half of the Zope (and Product) code and find out if it leaks. If it doesn't, you know the problem is in the other half. ;-) Ad infinitum. I'm sorry this is the case, but it almost always boils down to this particular recipe of diagnostics. -- Andy McKay
_______________________________________________ Zope-Dev maillist - Zope-Dev@zope.org http://lists.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - http://lists.zope.org/mailman/listinfo/zope-announce http://lists.zope.org/mailman/listinfo/zope )
-- Andre Schubert EMail: Andre.Schubert@km3.de Tel: 03774 6625-78 km3 teledienst GmbH Fax: 03774 6625-79
Andre Schubert schrieb:
Andy McKay schrieb:
Didnt Shane do a leak finder product that can help debug this? http://www.zope.org/Members/hathawsh
I will install the LeakFinder product on the server and look what happens....
If have installed LeakFinder this morning and restarted zope. My processes are grown up from 48MB to 78MB in about 4 hours, it never was so much ( the last time 10MB per week! ) In this time i got the following "top 10" refcounts from leakfinder. OFS.Image.Image 36319 DocumentTemplate.DT_Util.Eval 9802 DateTime.DateTime.DateTime 6539 OFS.DTMLDocument.DTMLDocument 5454 OFS.Folder.Folder 5386 OFS.DTMLMethod.DTMLMethod 4850 DocumentTemplate.DT_In.InClass 1725 Products.ZSQLMethods.SQL.SQL 996 Products.TransparentFolders.TransparentFolder.TransparentFolder 458 DocumentTemplate.DT_Let.Let 398 Hope this list helps ... Thanks
On March 12, 2002 08:42 am, Chris McDonough wrote:
Finding memory leaks is an exercise in "binary search". Isolate half of the Zope (and Product) code and find out if it leaks. If it doesn't, you know the problem is in the other half. ;-) Ad infinitum. I'm sorry this is the case, but it almost always boils down to this particular recipe of diagnostics. -- Andy McKay
_______________________________________________ Zope-Dev maillist - Zope-Dev@zope.org http://lists.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - http://lists.zope.org/mailman/listinfo/zope-announce http://lists.zope.org/mailman/listinfo/zope )
-- Andre Schubert EMail: Andre.Schubert@km3.de Tel: 03774 6625-78 km3 teledienst GmbH Fax: 03774 6625-79
_______________________________________________ Zope-Dev maillist - Zope-Dev@zope.org http://lists.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - http://lists.zope.org/mailman/listinfo/zope-announce http://lists.zope.org/mailman/listinfo/zope )
-- Andre Schubert EMail: Andre.Schubert@km3.de Tel: 03774 6625-78 km3 teledienst GmbH Fax: 03774 6625-79
Hi all, I have isolated the memory leak problem on a naked 2.3.3 Zope. I have on ZPsycopgDA and two SQL Methods, one SQL Method contains DTML-IFs and DTML-ELSEs and REQUEST.sets the other SQL Method contains the rendered query of the first, this means there is pure sql in it. Both methods cache there results 60 secs. Zope runs with 4 threads. I wrote a test-script wich opens 4 threads and each thread call the second SQL Methods "manage_test" via wget. After a restart my zope-processes are around 12MB. Then i start my script and before i receive any result all processes are grown up to 26MB. Even if after i start my script nothing happens, but before i receive the result the processes grow up, maybe i think this is the time where zope gets the results from the database and "puts" it into the Results-Class( i am right at this point ?). But is the problem in the ZSQLMethod( where the results were cached ) or the ZPsycopgDA ?? Regards, -- Andre Schubert EMail: Andre.Schubert@km3.de Tel: 03774 6625-78 km3 teledienst GmbH Fax: 03774 6625-79
On Thu, 14 Mar 2002 13:50:13 +0100, Andre Schubert <andre.schubert@km3.de> wrote:
After a restart my zope-processes are around 12MB.
Sounds about right.
Then i start my script and before i receive any result all processes are grown up to 26MB.
Thats not unreasonable for the working set of zope. I have no reason to suspect a leak. Im not sure what operating system you are using. If its linux then maybe you are misinterpreting the output of ps or top: The 26M is listed next to each process, but its the *same* 26M in each row of the table. In other words, your system is using 26M total for zope, not 26*5 (assuming you have 5 processes). If Zope grows by 12M for each request then thats a different scenario. 12M growth on the *first* request is quite normal. Toby Dickenson tdickenson@geminidataloggers.com
Toby Dickenson schrieb:
On Thu, 14 Mar 2002 13:50:13 +0100, Andre Schubert <andre.schubert@km3.de> wrote:
After a restart my zope-processes are around 12MB.
Sounds about right.
Then i start my script and before i receive any result all processes are grown up to 26MB.
Thats not unreasonable for the working set of zope. I have no reason to suspect a leak.
Im not sure what operating system you are using. If its linux then
RedHat Immunix 6.2
maybe you are misinterpreting the output of ps or top: The 26M is listed next to each process, but its the *same* 26M in each row of the table. In other words, your system is using 26M total for zope, not 26*5 (assuming you have 5 processes).
If Zope grows by 12M for each request then thats a different scenario.
12M growth on the *first* request is quite normal.
Yes, i believe, but every time i run that script the memory usage of zope( all processes ) grows up. If the script was ended and zope grows up about 1MB would i think that each request for the ZSQL Methods grows up zope about 250K and 250K * 4 = 1MB. My problem is that every time after running the script zope grows up. Sometimes zope grows up about 1 or 2MB and sometimes only 100K, but zope is growing up every time a run my test-script. On our production server the last time zope grows up to 100MB in 6 hours, and maybe thats not so good.
Toby Dickenson tdickenson@geminidataloggers.com
_______________________________________________ Zope-Dev maillist - Zope-Dev@zope.org http://lists.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - http://lists.zope.org/mailman/listinfo/zope-announce http://lists.zope.org/mailman/listinfo/zope )
Andre Schubert schrieb: If have changed the that it skips the ZSQLMethod and sends the query used in the ZSQLMethod direct to the manage_test function from the ZPsycopgDA with the result that zope grows up after running the script. If it is possible that the mem-leak is somewhere in the DA, then i post this message to the psycopg mailinglist. Thanks
Hi all,
I have isolated the memory leak problem on a naked 2.3.3 Zope. I have on ZPsycopgDA and two SQL Methods, one SQL Method contains DTML-IFs and DTML-ELSEs and REQUEST.sets the other SQL Method contains the rendered query of the first, this means there is pure sql in it. Both methods cache there results 60 secs. Zope runs with 4 threads.
I wrote a test-script wich opens 4 threads and each thread call the second SQL Methods "manage_test" via wget.
After a restart my zope-processes are around 12MB. Then i start my script and before i receive any result all processes are grown up to 26MB.
Even if after i start my script nothing happens, but before i receive the result the processes grow up, maybe i think this is the time where zope gets the results from the database and "puts" it into the Results-Class( i am right at this point ?).
But is the problem in the ZSQLMethod( where the results were cached ) or the ZPsycopgDA ??
Regards,
-- Andre Schubert EMail: Andre.Schubert@km3.de Tel: 03774 6625-78 km3 teledienst GmbH Fax: 03774 6625-79
_______________________________________________ Zope-Dev maillist - Zope-Dev@zope.org http://lists.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - http://lists.zope.org/mailman/listinfo/zope-announce http://lists.zope.org/mailman/listinfo/zope )
participants (5)
-
Andre Schubert -
Andy McKay -
Chris McDonough -
Oliver Bleutgen -
Toby Dickenson