[ZODB-Dev] Re: Use of fsync in FileStorage

Tim Peters tim at zope.com
Fri Jul 30 15:42:53 EDT 2004


[Casey Duncan]
> Yeah, I wondered how adding one integer to the BTree (which presumably
> mutates one bucket in the common case) resulted in a 2Kb transaction.

Yup, "the average" transaction there commits just one bucket.  But it's an
IIBucket, and those are fat.  Given the specific pattern of additions, "on
average" the bucket contains 90 key+value pairs, or 180 integers, and at 8
bytes apiece I figured that would be close to 2KB.  Heh.  I worked too many
years on 64-bit boxes!  Thanks to pickle magic, it's closer to 3 bytes per
integer (via the BININT2 pickle code, and all the ints in the test driver
were small enough to use BININT2).
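
A quick way to check the 3-bytes-per-integer figure is to ask the standard
pickle and pickletools modules which opcode a small int gets. This is only a
sketch on a modern Python, not the original test driver; int_opcode is a
made-up helper name, and protocol 1 is assumed here because it is the binary
protocol that emits BININT2 without a protocol header.

```python
import pickle
import pickletools

def int_opcode(n, protocol=1):
    """Return (opcode name, encoded size in bytes) for pickling the int n.

    Hypothetical helper for illustration; not part of ZODB or BTrees.
    """
    data = pickle.dumps(n, protocol=protocol)
    ops = list(pickletools.genops(data))
    # With protocol 1 the first opcode is the int itself and the next is STOP,
    # so the int's encoded size is the distance between their positions.
    opcode, _arg, pos = ops[0]
    next_pos = ops[1][2]
    return opcode.name, next_pos - pos

# Rough bucket payload estimate: 90 key+value pairs = 180 ints.
name, size = int_opcode(1000)
print(name, size, "bytes per int;", 180 * size, "bytes vs", 180 * 8, "at 8 bytes apiece")
```

BININT2 is one opcode byte plus a 2-byte unsigned value, so it covers ints
from 256 through 65535; ints below 256 get the even smaller BININT1.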

> I actually tried larger transactions (about 100Kb).

I think that's much closer to "the average" transaction size in a production
Zope site, but also that the mean is a misleading measure.  Let's see ... I
have 4 FileStorages here from 4 very different production sites:

                        txn size
               -------------------------------
#txn  .fs size  min       max     mean  median
---- --------- ---- --------- --------  ------
7535 995683830  155 156552882 132141.2    5137
  51   5879455  233   3697063 115283.4    4627
2120 145630793  157  10552388  68693.8    6339
1592 150777347  157  10055707  94709.4    5889

So half the txns in each storage are smaller than roughly 4.5-6.3KB (the
medians), but very large txns inflate the mean.
By "txn size" here, I mean the whole thing, including transaction header
overheads:  every byte on disk.
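
As a sanity check on the table, here is a small sketch that feeds the four
rows back through Python. Since "txn size" counts every byte on disk,
#txn times the mean should land very close to the .fs size, and the
mean/median ratio shows how far the big transactions drag the mean. The
variable names are mine; the numbers are copied from the table.

```python
# (n_txns, fs_size_bytes, mean_txn_bytes, median_txn_bytes), from the table.
rows = [
    (7535, 995683830, 132141.2, 5137),
    (51,     5879455, 115283.4, 4627),
    (2120, 145630793,  68693.8, 6339),
    (1592, 150777347,  94709.4, 5889),
]

results = []
for n_txns, fs_size, mean, median in rows:
    total = n_txns * mean      # bytes accounted for by transactions
    gap = fs_size - total      # small leftover: rounding of the mean, file header
    ratio = mean / median      # how badly the mean overstates a typical txn
    results.append((gap, ratio))
    print(f"{n_txns:5d} txns: total={total:13.1f} gap={gap:8.1f} mean/median={ratio:5.1f}")
```

In every storage the mean is more than ten times the median, which is why
the mean alone says little about a typical transaction.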

Have to note the relatively low total #txn counts too.  

> As I increased the
> transaction size, the difference between having fsync and not shrunk
> considerably. This is not too surprising since presumably larger writes
> fill the disk cache quicker resulting in more flushing and syncing by the
> OS, making ZODB's syncing less aggressive by comparison.

That makes good sense.  "Even on Windows", the difference between 0 and 2
fsyncs on txn rate dropped from a factor of 200 to a factor of 12 as the
average txn size went from 1K to 100K.  So the test driver is showing fsync
in the worst possible light.
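
For anyone who wants to poke at this on their own box, a toy timing loop
along these lines shows the effect. It is emphatically not the test driver
discussed in this thread and not FileStorage's commit code: txn_rate, the
one-byte "status" write, and where the fsync calls sit are all assumptions
made for illustration. Only os.fsync, tempfile and time.perf_counter are
real standard-library calls.

```python
import os
import tempfile
import time

def txn_rate(txn_size, n_txns, fsyncs_per_txn):
    """Append n_txns fake transactions of txn_size bytes to a temp file and
    return transactions per second. fsyncs_per_txn may be 0, 1 or 2.

    Hypothetical model: write the payload, optionally fsync, write a
    one-byte status marker, optionally fsync again.
    """
    payload = b"x" * txn_size
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            start = time.perf_counter()
            for _ in range(n_txns):
                f.write(payload)
                f.flush()                  # push Python's buffer to the OS
                if fsyncs_per_txn >= 1:
                    os.fsync(f.fileno())   # ask the OS to push it to disk
                f.write(b"c")              # stand-in for a status byte
                f.flush()
                if fsyncs_per_txn >= 2:
                    os.fsync(f.fileno())
            elapsed = time.perf_counter() - start
        return n_txns / max(elapsed, 1e-9)
    finally:
        os.remove(path)

if __name__ == "__main__":
    for size in (1024, 100 * 1024):
        r0 = txn_rate(size, 50, 0)
        r2 = txn_rate(size, 50, 2)
        print(f"{size:7d}-byte txns: 0 fsyncs / 2 fsyncs rate ratio = {r0 / r2:.1f}")
```

The ratios will vary wildly with OS, filesystem and disk, so treat any
numbers it prints as local observations only.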

Enough data.  Who's got a production site where they're willing to *try*
running with 0 fsyncs, or 2?  It would be nice if you were also willing to
pull the plug while it's running <wink>.


