[ZODB-Dev] BTree pickle size

Roché Compaan roche at upfrontsystems.co.za
Sat Aug 23 13:31:12 EDT 2008


On Sat, 2008-08-23 at 14:09 +0200, Dieter Maurer wrote:
> Roché Compaan wrote at 2008-8-22 14:49 +0200:
> >I've been doing some benchmarks on Plone and got some surprising stats
> >on the pickle size of btrees and their buckets that are persisted with
> >each transaction. Surprising in the sense that they are very big in
> >relation to the actual data indexed. I would appreciate it if somebody
> >can help me understand what is going on, or just take a look to see if
> >the sizes look normal.
> >
> >In the benchmark I add and index 10000 ATDocuments. I commit after each
> >document to simulate a transaction per request environment. Each
> >document has a 100 byte long description and 100 bytes in its body. The
> >total transaction size however is 40K in the beginning. The transaction
> >sizes grow linearly to about 350K when reaching 10000 documents.
> 
> The "Bucket" nodes usually store between 22 ("OOBucket") and 90 ("IIBucket")
> objects in a single bucket.
> 
> With any change, the transaction will contain unmodified data
> for several dozen other objects.

Are you saying *all* 22 OOBuckets and 90 IIBuckets will be persisted
again whether they are modified or not?

> 
> >What concerns me is that the footprint of indexed data in terms of
> >BTrees, Buckets and Sets is huge! The total amount of data committed
> >that relates directly to ATDocument is around 30 Mbyte. The total for
> >BTrees, Buckets and IISets is more than 2 Gbyte. Even taking into
> >account that Plone has a lot of catalog indexes and metadata columns (I
> >think 71 in total), this seems very high. 
> >
> >This is a summary of total data committed per class:
> >
> >Classname,Object Count,Total Size (Kbytes)
> >BTrees._IIBTree.IISet,640686,1024506
> 
> A typical "IISet" contains 90 value records and a persistent reference.
> 
> I expect that an integer is pickled in 5 bytes. Thus, about 0.5 kB
> should be expected as typical size of an "IISet".
> Your "IISet" instances seem to be about 1.5 kB large.
> 
> That is significantly larger than I would expect but maybe not
> yet something to worry about.
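
As a rough check of the 5-byte figure quoted above, here is a minimal
sketch using only Python's standard pickle module. It assumes pickle
protocol 1 and uses a plain tuple as a stand-in for the set's state; it
does not call the real BTrees pickling code, so treat the result as an
estimate only.

```python
import pickle

def pickled_int_bytes(n):
    """Bytes one int occupies in a protocol-1 pickle, excluding the STOP opcode."""
    return len(pickle.dumps(n, protocol=1)) - 1

def pickled_tuple_bytes(values):
    """Total size of a protocol-1 pickle of the values as a plain tuple."""
    return len(pickle.dumps(tuple(values), protocol=1))

# Hypothetical document ids, chosen large enough that the short
# 1- and 2-byte integer opcodes do not apply.
doc_ids = range(1_000_000, 1_000_090)

bytes_per_int = pickled_int_bytes(1_000_000)
set_estimate = pickled_tuple_bytes(doc_ids)

print("bytes per integer:", bytes_per_int)
print("estimate for 90 integers:", set_estimate)
```

Small integers pickle in fewer bytes, so 5 bytes per integer is an
upper bound for values that fit in 32 bits.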

It looks like there is something to be worried about since there are
quite a few IISet instances that are larger than 0.5 kB. Some are as
large as 50K! Here are some lines from fsdump:

  data #00033 oid=0000000000001d65 size=50058 class=BTrees._IIBTree.IISet
  data #00034 oid=0000000000001d66 size=50058 class=BTrees._IIBTree.IISet
  data #00111 oid=0000000000001e0b size=50023 class=BTrees._IIBTree.IISet
  data #00033 oid=0000000000001d65 size=50063 class=BTrees._IIBTree.IISet
  data #00034 oid=0000000000001d66 size=50063 class=BTrees._IIBTree.IISet
  data #00109 oid=0000000000001e0b size=50028 class=BTrees._IIBTree.IISet
  data #00035 oid=0000000000001d65 size=50068 class=BTrees._IIBTree.IISet
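
The same oids appear repeatedly in these records. The sketch below
groups the sizes by oid and computes how much each set grew between
rewrites. It assumes the records are listed in commit order, and the
helper names are made up for illustration. A constant growth of 5 bytes
per rewrite would be consistent with one more integer being added per
committed document and the whole set being pickled again each time, but
that is an inference from these numbers, not something confirmed here.

```python
import re
from collections import defaultdict

# Records copied from the fsdump excerpt, one record per line.
FSDUMP = """\
data #00033 oid=0000000000001d65 size=50058 class=BTrees._IIBTree.IISet
data #00034 oid=0000000000001d66 size=50058 class=BTrees._IIBTree.IISet
data #00111 oid=0000000000001e0b size=50023 class=BTrees._IIBTree.IISet
data #00033 oid=0000000000001d65 size=50063 class=BTrees._IIBTree.IISet
data #00034 oid=0000000000001d66 size=50063 class=BTrees._IIBTree.IISet
data #00109 oid=0000000000001e0b size=50028 class=BTrees._IIBTree.IISet
data #00035 oid=0000000000001d65 size=50068 class=BTrees._IIBTree.IISet
"""

RECORD = re.compile(r"oid=([0-9a-f]+)\s+size=(\d+)")

def sizes_by_oid(text):
    """Map each oid to the list of record sizes, in order of appearance."""
    sizes = defaultdict(list)
    for oid, size in RECORD.findall(text):
        sizes[oid].append(int(size))
    return dict(sizes)

def growth_steps(sizes):
    """Differences between consecutive sizes of the same object."""
    return [later - earlier for earlier, later in zip(sizes, sizes[1:])]

history = sizes_by_oid(FSDUMP)
for oid, sizes in history.items():
    # Dividing by 5 bytes per integer gives a rough element count.
    print(oid, sizes, growth_steps(sizes), sizes[-1] // 5)
```

At roughly 5 bytes per integer, a 50 kB record would hold on the order
of 10000 integers, which is about the number of documents in the
benchmark.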


> >BTrees._IIBTree.IIBucket,252121,163524
> 
> The same size reasoning applies to "IIBucket"s: 90 records, but
> now consisting of key and value (about 10 bytes).
> 
> Your "IIBuckets" are smaller than one would expect.

But that is supposedly ok?
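
For reference, the per-object averages implied by the two summary rows
quoted above can be worked out as follows. The counts and totals are
copied from the summary; the expected sizes are derived from the quoted
estimates (90 records at about 5 bytes for an IISet, 90 records at
about 10 bytes for an IIBucket) and are rough guesses, not measurements.

```python
# (object count, total size in Kbytes) from the summary rows
SUMMARY = {
    "BTrees._IIBTree.IISet": (640686, 1024506),
    "BTrees._IIBTree.IIBucket": (252121, 163524),
}

# Expected size in Kbytes, derived from the quoted estimates (assumption).
EXPECTED_KBYTES = {
    "BTrees._IIBTree.IISet": 90 * 5 / 1000,
    "BTrees._IIBTree.IIBucket": 90 * 10 / 1000,
}

def average_kbytes(count, total_kbytes):
    """Average committed size per object record, in Kbytes."""
    return total_kbytes / count

averages = {name: average_kbytes(*row) for name, row in SUMMARY.items()}

for name, avg in averages.items():
    ratio = avg / EXPECTED_KBYTES[name]
    print(f"{name}: {avg:.2f} Kbytes average, {ratio:.1f}x the estimate")
```

This gives about 1.6 Kbytes per IISet record and about 0.65 Kbytes per
IIBucket record.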

I am curious to know if you can explain why the proportion of actual
document data to total transaction size is so small?

-- 
Roché Compaan
Upfront Systems                   http://www.upfrontsystems.co.za


