[ZODB-Dev] BTree memory bomb
Jeremy Hylton
jhylton at gmail.com
Tue Jan 18 09:41:01 EST 2005
On Tue, 18 Jan 2005 15:25:01 +0000, Simon Burton <simon at arrowtheory.com> wrote:
> I did a test (below) to see if BTree would unload its objects as it grew large. No luck; I
> killed the script once it had taken 80% of memory.
A BTree loads a bucket at a time. By default, ZODB will cache some
number of 1st class persistent objects in memory (400?). All the 2nd
class persistent objects reachable from a 1st class object are also
cached in memory. In particular, if the keys and values stored in a
BTree bucket are 2nd class persistent objects (that is, regular Python
objects), they will all be kept in memory along with the bucket that
contains them.
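
For reference, here's a sketch of where that cache limit is set
(assuming the stock ZODB.DB API; cache_size is per connection and
counts only 1st class persistent objects):

from ZODB.FileStorage import FileStorage
from ZODB.DB import DB

storage = FileStorage('Data.fs')
db = DB(storage, cache_size=400)   # 400 is the usual default
conn = db.open()
root = conn.root()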
> data = OOBTree()
>
> root[0] = data
> print "data:", len(data)
>
> f = open('/dev/zero')
>
> for i in xrange(10000):
>     for j in xrange(10000):
>         data[i*10000+j] = f.read(i*128)
>     get_transaction().commit()
>     print "data:", len(data)
>
In your example, the keys 10,000,000 through 10,009,999 each store a
128KB string as the value. If a single bucket holds 50 items (I'm
making that number up), it will use roughly 6.4MB of memory, and the
strings keep growing with i, so the last buckets need ten times that.
The program is going to make thousands of buckets that require this
much memory or more, so it's not going to be effective at limiting
memory consumption.
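
To make that arithmetic concrete (still using the made-up bucket size):

# Memory pinned by one fully loaded bucket near key 10,000,000.
items_per_bucket = 50            # made-up bucket size from above
value_size = 1000 * 128          # bytes per string when i == 1000
print items_per_bucket * value_size    # 6400000 bytes, ~6.4MB
print 50 * 9999 * 128                  # ~64MB for the last buckets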
If you want objects to be moved in and out of memory independently,
you need to make them inherit from Persistent. If the values were 1st
class persistent objects:
from persistent import Persistent

class PersistentData(Persistent):
    def __init__(self, data):
        self.data = data
then you'd see much less memory consumption, because a bucket could be
loaded without also loading all of its contents.
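
For example, here is the original loop with each value wrapped (a
sketch; same commit placement as before):

for i in xrange(10000):
    for j in xrange(10000):
        # Each value is now a 1st class object with its own record.
        data[i*10000+j] = PersistentData(f.read(i*128))
    get_transaction().commit()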
If the values stored in the bucket are small -- say a 3-tuple of ints
-- then it doesn't make much difference. The collection of all the
keys and values in a bucket doesn't use much memory anyway. In your
case, the values were very large, so it makes a big difference.
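
A quick contrast, reusing the PersistentData class above:

from BTrees.OOBTree import OOBTree

small = OOBTree()
small[1] = (1, 2, 3)        # tiny 2nd class value: wrapping buys nothing

big = OOBTree()
big[1] = PersistentData('x' * 128000)   # large value: wrap it so the
                                        # bucket can load without the data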
Jeremy