[ZODB-Dev] BTree memory bomb
Tim Peters
tim at zope.com
Tue Jan 18 20:52:22 EST 2005
[Simon Burton]
> Aha! yes, it was the len(BTree) that kept the thing in memory. Now when I
> run it (the original script without the len), the DB file quickly grows
> to 700Mb, but memory usage only gets to around 50Mb.
If that's good enough for you, it's good enough for me <wink>.
> I guess I should have mentioned, the application is for a web cache,
> which I foresee growing easily to the gigabyte range. I don't
> particularly need to know the len of it, and if I did I could store that
> in a counter. But, it was important to test using big and distinct
> values, not just 'abc', as this does not "memory bomb".
A few things:
- Multi-gigabyte .fs files are common. It's individual multi-gigabyte
transactions that are rare.
- Distinct values didn't matter. References to persistent objects
(what Jeremy called "first class": the type is a subclass of Persistent)
are shared, but "second class" persistent objects (all others, like
Python strings or integers) are stored in the database by value. So
in my variant of your program, a distinct 3-character "abc" string
was stored in the database in every BTree entry. ZODB stores a
general rooted object graph, but there's only one incoming arc on
each second-class persistent object.
- If you want a web cache like, say, Squid, use Squid.
- You'll eventually want to use ZEO, and storing large blobs of text
in ZODB is problematic for several reasons, one being that using ZEO
to transport large blobs of text across a network isn't particularly
efficient. I'm not sure you _do_ want to store large blobs of text,
but if you do, schemes other than direct storage of giant strings
should be considered. For example, store file paths, and then you
can naturally exploit your operating system's file caching.
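To make the "store file paths" suggestion concrete, here is a minimal sketch, not a recommended design. The names store_body, load_body, index and blob_dir are hypothetical, and a plain dict stands in for the BTree so the snippet runs without ZODB installed. The point is only that the mapping holds a short path string per entry while the page body lives in an ordinary file.

```python
import hashlib
import os
import tempfile


def store_body(index, blob_dir, url, body):
    """Write body to a file under blob_dir and record only its path in index."""
    # Hash the URL to get a filesystem-safe file name (an illustrative choice).
    name = hashlib.sha1(url.encode("utf-8")).hexdigest()
    path = os.path.join(blob_dir, name)
    with open(path, "wb") as f:
        f.write(body)
    index[url] = path  # a short string is stored, not the page body
    return path


def load_body(index, url):
    """Read the body back from the file the index points at."""
    with open(index[url], "rb") as f:
        return f.read()


# A plain dict stands in for the BTree here (assumption for illustration).
index = {}
blob_dir = tempfile.mkdtemp()
page = b"<html>" + b"x" * 100000 + b"</html>"
store_body(index, blob_dir, "http://example.com/", page)
```

With this layout the database records stay small, and reads of the bodies go through the operating system's file cache rather than through ZODB or ZEO.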
> I see there are still finer issues to consider, such as index size (??)
> and Storage backend, but now at least the cache can grow much bigger than
> memory available, so that's great.
Ya, there are lots of details, but they all pale compared to avoiding
len(BTree) (which can be disastrous). If you're going to use FileStorage
(most people do), you should find this helpful:
http://zope.org/Wikis/ZODB/FileStorageBackup
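On avoiding len(BTree): the counter Simon mentions can be kept alongside the mapping and updated on every insert and delete. The following is a rough sketch under stated assumptions. CountedMapping is a hypothetical name, and a plain dict stands in for the BTree so the snippet runs without ZODB. In a real ZODB application the BTrees package ships BTrees.Length.Length for this kind of counter, and the wrapper itself would need to be a persistent object.

```python
class CountedMapping:
    """Wrap a mapping and track its size without ever calling len() on it."""

    def __init__(self, mapping):
        self._data = mapping
        self._count = 0  # assumes the wrapped mapping starts out empty

    def __setitem__(self, key, value):
        # A membership test is a single lookup; only new keys bump the count.
        if key not in self._data:
            self._count += 1
        self._data[key] = value

    def __delitem__(self, key):
        del self._data[key]  # raises KeyError before the count is touched
        self._count -= 1

    def __getitem__(self, key):
        return self._data[key]

    def count(self):
        return self._count


# A plain dict stands in for the BTree here (assumption for illustration).
cache = CountedMapping({})
cache["a"] = "first page"
cache["b"] = "second page"
cache["a"] = "first page, refetched"  # overwrite, count unchanged
```

After these three assignments cache.count() is 2, and nothing has asked the underlying mapping for its length.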