(cc the direstorage-users list too) On Friday 31 January 2003 5:07 pm, Paul Winkler wrote:
> **HOWEVER**, you are concerned about performance, and DS is even slower than FileStorage for writes. From the DS FAQ:
> """Intermittent writes are a factor of 1.5 slower. ... Under high write pressure the journal queue becomes a bottleneck, and performance degrades to 3 times slower than FileStorage."""
> The question then becomes: what is "high write pressure"?
A benchmark that bombards the storage with nothing but writes, sufficient to saturate the disk interface. Most production loads don't look like that: the storage probably spends some of its time handling reads, and some (most?) of its time idle. DirectoryStorage is optimised for writes that come in bursts. It reduces the latency of individual writes within the burst, on the assumption that it can do the rest of the work asynchronously once the burst is over. The 3x slowdown applies if the 'burst' goes on too long. Yes, you can configure the size of a burst. (For what it's worth, I expect to be able to improve on that 1.5x with the latest reiserfs kernel patches.)
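To make "burst" concrete, here is a rough sketch of the kind of load I mean. It is written against a plain FileStorage using the current ZODB spelling of the transaction API (older releases use get_transaction()); you would substitute a DirectoryStorage instance for the storage object, and the burst sizes, payload, and idle gap are made-up numbers, not a recommendation:

    # Sketch: time each commit inside a write burst, then go idle so the
    # storage can catch up on its deferred work. All constants are arbitrary.
    import time
    import transaction
    from ZODB import DB
    from ZODB.FileStorage import FileStorage

    storage = FileStorage('burst-test.fs')   # stand-in; swap in DirectoryStorage here
    db = DB(storage)
    conn = db.open()
    root = conn.root()

    BURSTS, BURST_SIZE, IDLE_SECONDS = 5, 200, 10
    latencies = []
    for b in range(BURSTS):
        for i in range(BURST_SIZE):
            root['obj-%d-%d' % (b, i)] = 'x' * 1024    # ~1KB of new data
            start = time.time()
            transaction.commit()                       # latency of this commit only
            latencies.append(time.time() - start)
        time.sleep(IDLE_SECONDS)                       # idle gap between bursts

    latencies.sort()
    print('median %.4fs  p99 %.4fs  worst %.4fs' % (
        latencies[len(latencies) // 2],
        latencies[int(len(latencies) * 0.99)],
        latencies[-1]))
    db.close()

The interesting question is whether the per-commit latencies stay low while each burst is short enough to be absorbed, and what happens when you make BURST_SIZE large enough that the burst "goes on too long".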
> And what does 3x slower than FS feel like to the user?
A typical human Zope user won't notice. Most of the time in a Zope request is spent in DTML processing, application logic, traversal, and security checks; in my experience only a small proportion is spent in the storage, and 3 times small is still small. Expect something like 3x for scripts that perform many writes. The write response profile changes once the storage is pushed into this 3x mode in its default configuration: some writes will be much slower than others, and that will be noticeable to a human. The cause and effect are analogous to virtual memory thrashing. This can be tweaked, but I doubt anyone will need to.
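To put rough numbers on "3 times small is still small" (both figures below are invented, purely to illustrate the arithmetic):

    # Illustration only: if the storage is a small slice of a request,
    # tripling that slice barely moves the total. Both numbers are assumed.
    app_time = 0.200        # seconds of DTML / app logic / security checks
    storage_time = 0.010    # seconds spent in the storage itself
    for factor in (1, 3):
        total = app_time + storage_time * factor
        print('storage %dx slower -> request takes %.3fs' % (factor, total))

With those assumed numbers the request goes from 0.210s to 0.230s. A script that is nearly all writes is the case where the full 3x shows through.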
> Given what the DS FAQ says about write performance, I'd look into setting up a test server and bombarding it with automated writes to see if it will handle the load you anticipate. But of course you were going to do that anyway. ;)
Indeed. Please share your results. Note that for modern storages it is important to measure performance under a realistic load, rather than applying a huge load and seeing where it saturates. The original Berkeley storage benchmarks were bogus (imo) for this reason.
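As a sketch of what I mean by "realistic load": pace the writes at the rate you actually expect and measure how far behind schedule each one finishes, rather than running a tight loop and reading off the saturation throughput. do_write() below is a hypothetical placeholder for one committed transaction against the storage under test (for example, the body of the burst loop above), and the rate and duration are made-up:

    # Sketch of an open-loop test: writes arrive on their own (roughly
    # Poisson) schedule, not as fast as the disk allows.
    import random
    import time

    def do_write():
        pass    # hypothetical: write one object and commit the transaction

    TARGET_RATE = 5.0       # assumed writes per second
    DURATION = 60.0         # seconds of test
    lags = []
    start = time.time()
    due = start
    while due < start + DURATION:
        due += random.expovariate(TARGET_RATE)   # schedule the next write
        delay = due - time.time()
        if delay > 0:
            time.sleep(delay)
        do_write()
        lags.append(time.time() - due)           # how late this write finished

    lags.sort()
    print('median lag %.4fs  p99 lag %.4fs' % (
        lags[len(lags) // 2], lags[int(len(lags) * 0.99)]))

A storage is keeping up with the offered load when the lag stays near zero; when it falls behind, the lag grows without bound, which tells you more about real behaviour than a saturation throughput figure.

--
Toby Dickenson
http://www.geminidataloggers.com/people/tdickenson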