[ZODB-Dev] Berkley Transactions slow to commit?
Chris Withers
chrisw@nipltd.com
Tue, 30 Oct 2001 14:32:46 +0000
Hi Barry,
Having had more than my fair share of fun'n'games with FileStorage, I'm now
experiencing different fun'n'games with BerkeleyStorage ;-)
The problems I face are these:
I need to index about 30,000 documents. I'm doing this using a Python script
(not a (script) python ;-) that imports Zope and hence uses custom_zodb.py to
open a Full Berkeley storage.
I figured doing all 30,000 documents in one transaction wasn't a good idea, so I
was trying to do them in batches of 500. After each batch I'd do a
get_transaction().commit().
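In outline, the batching looks something like the following. This is only a
minimal sketch, not my actual script: index_in_batches, index_one and commit are
hypothetical names, and in the real script commit would be
get_transaction().commit.

```python
def index_in_batches(documents, index_one, commit, batch_size=500):
    """Index documents, committing after every batch_size of them.

    index_one and commit are hypothetical callables supplied by the caller.
    Returns the number of commits performed.
    """
    commits = 0
    pending = 0
    for doc in documents:
        index_one(doc)
        pending += 1
        if pending == batch_size:
            # A full batch has been indexed: commit it.
            commit()
            commits += 1
            pending = 0
    if pending:
        # Commit the final, partial batch.
        commit()
        commits += 1
    return commits
```

With 30,000 documents and batches of 500, that means 60 commits over the
whole run.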
First problem: I kept running out of locks doing this. So, I bumped the lock
settings up to:
set_lk_max_locks 1000000
set_lk_max_objects 100000
set_lk_max_lockers 100
...this stopped the error, but the Python process chewed through 220MB of RAM.
...so I dropped it down to only 50 documents per batch and dropped the lock
settings down by a factor of 10.
Now I'm only using 100MB of memory but still:
- Indexing 50 documents takes, on average, 3 minutes
- Calling get_transaction().commit() takes, on average, 13-20 minutes(!!)
- app._p_jar.cacheMinimize(3) takes, on average, 20 seconds.
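To put those numbers in perspective, here is a rough back-of-the-envelope
estimate for the whole run. It assumes every batch behaves like the averages
above; estimated_hours is just a hypothetical helper for the arithmetic.

```python
def estimated_hours(total_docs, batch_size, index_min, commit_min, minimize_sec):
    """Estimate total wall-clock hours if every batch matches the averages."""
    batches = total_docs / float(batch_size)
    minutes_per_batch = index_min + commit_min + minimize_sec / 60.0
    return batches * minutes_per_batch / 60.0

# 30,000 documents, 50 per batch, 3 minutes indexing, 20 seconds cacheMinimize
best = estimated_hours(30000, 50, 3, 13, 20)   # commit at 13 minutes
worst = estimated_hours(30000, 50, 3, 20, 20)  # commit at 20 minutes
```

That works out to somewhere between roughly 163 and 233 hours in total, i.e.
about a week to ten days, nearly all of it spent in commit.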
Here's a snapshot of the top of a top output:
load average: 1.20, 1.16, 1.11
47 processes: 45 sleeping, 2 running, 0 zombie, 0 stopped
CPU states: 0.0% user, 4.3% system, 6.1% nice, 89.5% idle
Mem: 899980K av, 896924K used, 3056K free, 0K shrd, 4480K buff
Swap: 2097136K av, 393452K used, 1703684K free 647220K cached
  PID USER     PRI  NI  SIZE  RSS SHARE STAT  LIB %CPU %MEM   TIME COMMAND
13883 root      17   5  117M 117M 17944 R N     0  8.5 13.3 125:53 python2.1
Can you (or anyone else) enlighten me as to what's going on here?
Why is the commit taking so long? How can I speed it up?
Also, in general, should you try to have a few big transactions or many small
transactions when using Berkeley DB? Does this vary depending on whether you use
Minimal or Full?
Oh, and while I remember, should I use Minimal or Full if I want a simple,
efficient, non-versioning storage?
cheers,
Chris