All, (I am posting to both the general Zope and Plone lists, I hope that's all right. My questions are more directly Zope issues, but in case someone who only looks only at the Plone lists has a different insight to offer...) I'm the co-admin Alex Kirk was referring to in his recent post "Random Crashes/Freezes on FreeBSD 5.4-RELEASE" on the Zope list. The particular instance he was referring to was not a "random" crash but simply Zope running out of memory (with the standard 512M limit on FreeBSD, that we have not increased so far -- we have 2Gb of physical memory). I have had one truly unexplained unresponsive Zope on that system so far, but was not able to characterize it well enough to ask an intelligent question on the list about that one. I fully understand that Zope has to be restarted regularly and intend to use monit to do so, but I would like to better understand what is going on in any case, and hopefully be able to lower the required frequency of such restarts. I have searched the lists (posts like "[Zope-dev] Very severe memory leak" and "[ plone.user] Memory leaks even with latest-everything minimalistic site?") and get the general idea, but what I have read so far does not directly explain what I am observing. I have a test machine, but for starters, I have done some analysis of memory consumption of our live site to see what normal traffic produces. I have used a script to capture the size of the Zope process every minute (from ps) and merged these results with the access log. What I see does not make sense to me in terms of object cache, or growth of refcounts. My only (entirely unverified) hypothesis is that modules like PIL grab large chunks of memory once in a while. I provide specific examples below. First, the system: Plone 2.5.1, CMF-1.6.2, Zope (Zope 2.9.5-final, python 2.4.3, freebsd5), Five 1.3.7, Python 2.4.3 (#2, Jul 2 2006, 16:14:54) [GCC 3.4.2 [FreeBSD] 20040728], PIL 1.1.5 FreeBSD 5.4 My object cache size was initially at the default of 5000 objects per threads (4 threads default). I reduced this to 2000 just to see, but as I expected, the overall memory use behavior was the same. I have made some minor customizations to Plone and added contributed as well as my own products. My use of CMFReportTool might explain some of what I see, but definitely not all. Overall, my custom memory-use/access-log shows long periods with no increase in memory at all, but there are some big jumps at times (during a one minute tick) by 5, 10, even 25 Mb! As pretty much expected, Zope never gives much memory back to the OS; that part I understand, however I would like to get it to a state where it mostly plateaus and then only slowly creeps up. Some specific examples: With memory use at 417640 Kb, no requests at all in the previous minute, and a single request for an image_view_fullscreen in that whole minute, memory use jumped by 5064 Kb to 422704. The image was definitely not that large! (access log shows size returned was 1137). While on that topic, I realize that the ATImage has to be loaded in the object cache at some point, and that if it's a large image (in the order of mega bytes), I might see the memory use jump by that much. Now assume that the cache was already full when the image was loaded, that a lot of other later requests fill the cache with much smaller objects (so the image is not cached anymore) and then later the image is requested again, my guess is that the cache should already have the space it needs to load it, i.e. should I expect to see another memory jump by the size of the image, and if so, why (where did the space it originally occupy go that it can't be reused)? Can I easily check if large objects are being split in smaller chunks as they should by the way? Other example: A few of the large jumps I noticed have as one of the request during that minute a PDF listing of some documents. I use CMFReportTool to generate the PDF. There seems to be a possible correlation there, however, the resulting PDF document is about 150K, and these memory jumps can be as large as 10 Mb! I'll have a look at my code, CMFReportTool's and ReportLab's, but that seems a bit excessive to produce that document, so I wonder if something else is going on there. Of course, there is a catalog query required to get the documents to list, but I generate the report with the metadata only ( i.e. so nothing should be loaded in the object cache). I expect some increase in memory the first time a module is loaded (like ReportLab, although doesn't that happen when Zope is launched as part of product initialization?), but that would not explain continued increases on further requests. There are also many cases where the same listing was requested with no increase in memory at all (although that may be due to caching -- I'll have to look into that; I use cacheFu but have turned Squid off for now). Other example: There are however cases where I just can't see any reason for the sudden jumps. In one case there was a ~15M increase (from 477996 to 493016) during a one minute tick, with only a few very benign requests (GETs for the sento_form, contact-info form, and a couple of very small ATDocuments. There were no memory increases for the two minutes preceding and following that occurrence. Yet another: There was a 27Mb increase in one minute (the largest during that log), nowhere close to a Zope restart (the memory use was at 374376 prior to the jump). In that case there were a lot of requests in the preceding minute, but they all look benign to me (a few image thumbs, css, js, gifs; looks as if someone did a force reload of their browser). Of course the images had to be loaded to produce the thumbs, but I am sure that they total nowhere close to 27 Mb (they were all already pretty much thumb size to start with, except our "image of the day", which might have been a couple of Mb). Is the PIL process of generating the thumbs that memory hungry? I hope not. I have also noticed a few instances where there are no requests for a few minutes but still some growth of the Zope process (not by much though). I assume that these are due to Zope internal housekeeping? I do have an automated process that loads some content, but it runs at a fixed time in the night and I have not seen large memory jumps while that runs. If someone feels like having a look, for fun ;-), that memory-use/access-log file I have been looking at is a 6 Mb HTML file (I removed some info from the access entries so they are easier to scan through, and show the memory increases/decreases as a bar graph so it's easy to scroll down the file and see big changes). The log spans one Zope restart (I see a couple big and then more little memory jumps soon after Zope has started, which makes sense to me). Other tidbits. I have seen one post stating that large file imports may increase memory use by 2 to 3 times the size of the file. I have looked at some POSTs of large images and see about that. Are these increases cumulative, or should they plateau at some point? Also it wasn't clear in that post if that was supposed to be a WebDav issue only (we are not using WebDav so far). When I flush the cache, the DateTime refcount goes down dramatically, but the top counts (ValidationChain, EmptyValidator, AttributeStorage, StringField) do not go down significantly. Alex and I are volunteers doing this work for a non-profit, so any ideas or "yep, that's normal, live with it" would be greatly appreciated! (If the latter answer, I would still like to better understand where that memory is being used; I still don't understand how it could be the object cache, which is often cited as the main cause of memory creep). Cheers, Jean