[Zope-dev] zope2 webdav memory usage...

Fri Jan 11 13:08:39 EST 2008

On Jan 11, 2008, at 9:24 AM, Jim Fulton wrote:

>
>> While messing with Zope2's webdav implementation, I ran across this  
>> bit of memory-management code:
>>
>>  dflag = hasattr(ob, '_p_changed') and (ob._p_changed == None)
>>  ... stuff ...
>>  if dflag:
>>      ob._p_deactivate()
>>
>> I actually think this should be:
>>
>>  dflag = not getattr(ob, '_p_changed', None)
>>  ... stuff ...
>>  if dflag:
>>      ob._p_deactivate()
>>
>> .. because when _p_changed on a persistent object (as I read it in  
>> the persistence interface file) is None, the object is a ghost.   
>> The object will never be a ghost here because non-_p_ attributes  
>> are looked up on it before we check for _p_changed, and even if it  
>> was a ghost, deactivating a ghost (to turn it into a ghost) is just  
>> not useful.  I think the only time we don't want to deactivate it  
>> is if it has been changed (when _p_changed is True).  Or at least  
>> that seems to be the intent.
>>
>> Does this sound right?
>
> No.
>
> The important thing to note above is that the check is made *before*  
> ".. stuff ..". The idea is that we want to free any objects we  
> loaded while doing whatever we are doing.  We check if an object is  
> a ghost before doing something and then, if it was a ghost before,  
> we try to make it a ghost when we're done. We try to leave the  
> object as we found it.  With your change, we'd be ghostifying  
> objects that were presumably useful, because they were in the cache,  
> and we'd be failing to ghostify objects that we resurrected just for  
> the operation.

I see.  The earliest place I've seen this code is almost ten years  
old, it's amazing you remember.  The intent was misunderstood over the  
years, because code like this got into various places:

             for ob in obj.listDAVObjects():
                 if hasattr(ob,"meta_type"):
                     if ob.meta_type=="Broken Because Product is Gone":
                         continue
                 dflag=hasattr(ob, '_p_changed') and (ob._p_changed ==  
None)

the (ob._p_changed == None) will never be true here as we will have  
woken it up by looking for 'meta_type'.  This is what got me confused  
initially.

> With your change, we'd be ghostifying objects that were presumably  
> useful, because they were in the cache, and we'd be failing to  
> ghostify objects that we resurrected just for the operation.

The former is true, the latter no.  Objects that are resurrected just  
for the operation would be ghostified too.  Only objects that were  
_p_changed = True would not be.  But it doesn't matter, I just needed  
to understand the intent, and now I do.  I don't intend to check my  
competing example code in, I'll just fix the code around the places  
where the pattern was misunderstood.

Related.... but fuzzier... is it your expectation that the amount of  
memory used for a database walk routine that tries to do memory  
management via some combination of connection.cacheMinimize-or- 
cacheGC() every-n-iterations (no calls to individual objects'  
_p_deactivate) should be close to one which uses _p_deactivate  
aggressively against the objects being walked?  In my experience, no  
combination of the "aggregate" cache management calls seems to work  
nearly as well as aggressively _p_deactivate-ing walked objects  
directly while walking a large object tree (at least under ZODB 3.6).   
It seems like doing cacheMinimize, etc just doesn't really have much  
effect on real memory usage during the walk when it's the only thing  
used.  It's a difficult thing to test, as you need a truly huge  
database to finaly see the failure mode (which is that you run out of  
RAM ;-), but that's my experience anyway.

- C