[ZODB-Dev] Re: Use of fsync in FileStorage

Casey Duncan casey at zope.com
Thu Jul 29 10:17:53 EDT 2004


Some timing on a 700MHz notebook with a 30gig drive running FreeBSD
5.1/ufs:

Stock ZODB:  57.9809 seconds, 172.471 txn/sec
Without fsync: 36.0513 seconds, 277.383 txn/sec

(I didn't bother with two fsyncs)

The former was I/O bound and the latter CPU bound.
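
A rough back-of-the-envelope on those two numbers, as a sketch only. It
assumes the driver was run unchanged with N = 10000 transactions, which is
inferred from the reported rates rather than stated, and the function name
is just for illustration:

```python
# Assumption: N = 10000, inferred from elapsed time * txn/sec above.
N = 10000

def fsync_overhead(with_fsync_secs, without_fsync_secs, n=N):
    """Return (extra milliseconds per transaction, overall slowdown factor)."""
    extra_ms_per_txn = (with_fsync_secs - without_fsync_secs) / n * 1000.0
    slowdown = with_fsync_secs / without_fsync_secs
    return extra_ms_per_txn, slowdown

# Timings reported above for FreeBSD 5.1/ufs.
extra_ms, slowdown = fsync_overhead(57.9809, 36.0513)
print("fsync adds about %.2f ms per txn, a %.2fx slowdown" % (extra_ms, slowdown))
```

So on this box the fsync costs on the order of a couple of milliseconds
per commit.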

I definitely think this means we should consider a knob for those
willing to run fast and loose versus those wanting to trade performance
for better data integrity.
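
To make the idea concrete, here is a minimal sketch of what such a knob
could look like. This is not FileStorage code: append_record(), its sync
flag, and demo() are hypothetical names made up for illustration, and the
only real calls are file.flush() and os.fsync().

```python
import os
import tempfile

def append_record(f, data, sync=True):
    """Append data to an open binary file; fsync only when sync is true.

    Hypothetical helper: `sync` stands in for the proposed knob.
    """
    f.write(data)
    f.flush()                   # hand Python's buffered bytes to the OS
    if sync:
        os.fsync(f.fileno())    # ask the OS to push them to the disk

def demo(sync, count=100, size=2048):
    """Write `count` records of `size` bytes and return the file size."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            for _ in range(count):
                append_record(f, b"x" * size, sync=sync)
        return os.path.getsize(path)
    finally:
        os.remove(path)
```

Either setting writes the same bytes; the difference is only whether each
record is forced to disk before the call returns, which is where the
integrity-versus-speed trade comes from.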

For development, testing, loading, and database-intensive tasks like
cataloging, the "fast and loose" mode could be pretty desirable, with
the caveat that there may be little real difference between the two with
a decent disk controller.

-Casey

On Wed, 28 Jul 2004 17:47:21 -0400
"Tim Peters" <tim at zope.com> wrote:

> OK, I timed different fsync() strategies on WinXP Pro.  Remember that
> Windows doesn't actually have fsync(); Python maps its os.fsync() to
> the Win32 FlushFileBuffers(), indirectly via MS C's _commit()
> function.
> 
> The box here is a beefy laptop, WinXP Pro SP1, 3.2GHz P4
> hyper-threaded, 1GB RAM, 80GB IDE disk w/ 8MB cache.
> 
> Using current (Zope 2.7 branch HEAD) FileStorage code:
> 
> C:\Code\ZODB3.2>timefsync.py
> Doing 10000 transactions, timed with time.clock()
> 323.649 seconds, 30.8977 txn/sec
> 
> The process never got above 1% CPU usage, and the disk was busy the
> whole time, so this was clearly I/O-bound.
> 
> After adding a second fsync, in tpc_vote():
> 
>                 self._file.flush()   # existing line
>                 if fsync is not None: fsync(self._file.fileno()) # new
> 
> C:\Code\ZODB3.2>timefsync.py
> Doing 10000 transactions, timed with time.clock()
> 666.169 seconds, 15.0112 txn/sec
> 
> So, contrary to hopes, adding a second fsync() cut the txn rate in
> half.
> 
> Finally, commenting out both fsync()'s in FileStorage.py:
> 
> C:\Code\ZODB3.2>timefsync.py
> Doing 10000 transactions, timed with time.clock()
> 3.49118 seconds, 2864.36 txn/sec
> 
> Yikes!  That's near a factor of 100 higher than the one-fsync case. 
> In this case it appeared CPU-bound.
> 
> Here's the driver.  Before running it each time, I deleted all Data.*
> files:
> 
> """
> import sys
> if sys.platform == "win32":
>     from time import clock as now
> else:
>     from time import time as now
> 
> import ZODB
> from ZODB.FileStorage import FileStorage
> 
> from BTrees.IIBTree import IIBTree
> 
> N = 10000
> 
> st = FileStorage('Data.fs')
> db = ZODB.DB(st)
> cn = db.open()
> rt = cn.root()
> 
> rt['tree'] = t = IIBTree()
> get_transaction().commit()
> 
> print "Doing %d transactions, timed with time.%s()" % (N, now.__name__)
> start = now()
> for i in xrange(N):
>     t[i] = i
>     get_transaction().commit()
> finish = now()
> 
> elapsed = finish - start
> print "%g seconds, %g txn/sec" % (elapsed, N / elapsed)
> """
> 
> This commits more-or-less "average size" transactions, but on the high
> side (about 2KB per transaction record).
> 
> Any use of os.fsync is clearly a txn-rate disaster on my WinXP box. 
> It would be great if readers here tried it on their boxes and reported
> results: various Linux flavors with various filesystems, Solaris
> variants, Windows boxes with "serious" disk systems, whatever matters
> to you in practice.
> 
> _______________________________________________
> For more information about ZODB, see the ZODB Wiki:
> http://www.zope.org/Wikis/ZODB/
> 
> ZODB-Dev mailing list  -  ZODB-Dev at zope.org
> http://mail.zope.org/mailman/listinfo/zodb-dev
> 


