[ZODB-Dev] ZEO client leaking memory?
Chris Withers
chrisw@nipltd.com
Tue, 09 Oct 2001 17:33:54 +0100
Toby Dickenson wrote:
>
> I've spent many weeks trying to understand how ZODB behaves in this type
> of situation. The whole system behaviour when you need to touch many
> objects in the database is one area where ZODB doesn't work well
> out-of-the-box without some tuning.
Hurm, where can I learn how to do this tuning?
> That would remove wasted disk space, but not wasted memory. If adding
> a document wastes that much disk space then I suspect a better
> solution is to improve the adding-a-document implementation.
Well, it's just BTrees changing, so maybe Jim could explain more about how
they behave.
In my test rigs, I found pack was the only thing which reduced the _RAM_
used; bizarre, I know.
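For reference, the pack I'm doing from the client is roughly this (the
server address is made up, and I'm assuming the pack gets forwarded through
ClientStorage to the ZEO server):

    import time
    import ZODB
    from ZEO.ClientStorage import ClientStorage

    storage = ClientStorage(('localhost', 8100))  # hypothetical address
    db = ZODB.DB(storage)
    db.pack(time.time())  # discard all but the current revision of each object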
> This is in Zope? You might want to make that a subtransaction commit
> to keep with Zope's assumptions that full transactions start and end on
> request boundaries:
Why? What's a REQUEST boundary when this is just a Python script opening a
ZEO connection and indexing a bucketload of documents?
In what way would subtransactions behave differently?
> This may also remove your need to pack the database during the work.
> Any wasted disk space is wasted in temporary files (for the
> subtransaction data), only the final copy of each object gets written
> to the database in the full commit.
Hmmm... now that is interesting... I may have to give it a go...
> That means to remove all objects from the cache associated with _p_jar
> that have not been touched in three seconds. Is that what you
> intended?
Yup.
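(The call in question is just:

    self._p_jar.cacheMinimize(3)  # ghost everything untouched for >= 3 seconds

sprinkled through the indexing loop.)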
> _p_jar.cacheMinimize() is a fairly heavy-handed way of controlling
> memory usage; adding a sprinkling of _p_jar.cacheGC() in code that
> moves many objects into memory is a better way to deal with runaway
> memory usage: cacheGC() will do "just enough" work when memory usage
> grows, and very little work when memory usage is acceptable.
Can you explain the differences?
> I'm guessing on the numbers here, but I suspect adding a:
>
> get_transaction().commit(1)
> self._p_jar.cacheGC()
>
> every 10 documents would be better.
Interesting...
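So if I follow you, my loop would become something like this (a sketch;
`documents`, `catalog` and the index_document() call are stand-ins for what
my script actually does, and the batch size of 10 is your guess):

    # `connection` is the script's ZEO connection (what _p_jar points at)
    i = 0
    for doc in documents:
        catalog.index_document(doc)  # hypothetical indexing call
        i = i + 1
        if i % 10 == 0:
            get_transaction().commit(1)  # subtransaction: state goes to a temp file
            connection.cacheGC()         # "just enough" cache trimming
    get_transaction().commit()           # full commit: one final copy per object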
> 1. What are your ZODB cache settings (size and time)
Dunno, whatever they are when you do:
import Zope
...and there's a custom_zodb.py lying around with a ClientStorage specified in
it...
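Though if I opened the database by hand instead of via import Zope, I guess
the "size and time" knobs would be the DB() arguments; parameter names here
are from memory, so treat this as a sketch (and the server address is made
up):

    import ZODB
    from ZEO.ClientStorage import ClientStorage

    storage = ClientStorage(('localhost', 8100))  # hypothetical address
    db = ZODB.DB(storage,
                 cache_size=400,             # target number of non-ghost objects
                 cache_deactivate_after=60)  # idle seconds before ghosting
    connection = db.open()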
> 2. How many ZODB objects make up a 'document'
the documents aren't stored in the ZODB, just indexed, using about 4-10 BTrees,
IIRC.
> 3. How much memory is used by a 'document'
How would I measure or work that out?
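The best I can think of is watching the process's memory high-water mark
grow over a batch of documents, something like this (rough numbers only;
ru_maxrss units vary by platform and may not be filled in on all of them,
and `batch` and the indexing call are stand-ins):

    import resource

    def rss():
        # high-water mark of the process's resident set size
        return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

    before = rss()
    for doc in batch:
        catalog.index_document(doc)  # hypothetical indexing call
    after = rss()
    print "roughly %d per document" % ((after - before) / len(batch))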
> 4. How fast are documents being added (number per minute)
As fast as they happen, it's just a for loop. Probably about one a second,
but this slows down a _lot_ when the machine runs out of memory ;-)
> 5. Have you checked the size of the ZODB caches during this problem?
How can I do that?
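Is it something like this? cacheSize() I'm fairly sure of; _cache is
formally private, so treat those lines as a guess:

    db = connection.db()  # or self._p_jar.db() from inside a persistent object
    print "non-ghost objects across all connections:", db.cacheSize()
    print "objects in this connection's cache:", len(connection._cache)
    print "cache target size:", connection._cache.cache_size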
> 6. Have you checked the reference count debugging page
Can't do that, there's no HTTP process in this ZEO client ;-)
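Although, thinking about it, the per-class counts that page shows may be
available straight from the DB object (assuming my ZODB version has
cacheDetail()):

    for klass, count in connection.db().cacheDetail():
        print klass, count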
> 7. Have you any mounted databases?
Nope...
cheers,
Chris