[Zodb-checkins] CVS: ZODB3/ZEO - README.txt:1.5
Jeremy Hylton
jeremy at zope.com
Tue May 27 17:11:45 EDT 2003
Update of /cvs-repository/ZODB3/ZEO
In directory cvs.zope.org:/tmp/cvs-serv5512/ZEO
Modified Files:
README.txt
Log Message:
Move cache trace documentation to doc directory.
=== ZODB3/ZEO/README.txt 1.4 => 1.5 ===
--- ZODB3/ZEO/README.txt:1.4 Thu Nov 21 13:55:27 2002
+++ ZODB3/ZEO/README.txt Tue May 27 16:11:44 2003
@@ -3,7 +3,7 @@
=======
What's ZEO?
-===========
+-----------
ZEO stands for Zope Enterprise Objects. ZEO is an add-on for Zope
that allows a multiple processes to connect to a single ZODB storage.
@@ -15,13 +15,12 @@
directory (ZEO1). Some documentation for ZEO is available in the ZODB
3 package in the Doc subdirectory. ZEO depends on the ZODB software;
it can be used with the version of ZODB distributed with Zope 2.5.1 or
-later. More information about ZEO can be found on its home on the
-web:
+later. More information about ZEO can be found in the ZODB Wiki:
- http://www.zope.org/Products/ZEO/
+ http://www.zope.org/Wikis/ZODB
What's here?
-============
+------------
This list of filenames is mostly for ZEO developers::
@@ -43,129 +42,3 @@
version.txt text file indicating the ZEO version
zrpc/ subpackage implementing Remote Procedure Call (RPC)
-Client Cache Tracing
-====================
-
-An important question for ZEO users is: how large should the ZEO
-client cache be? ZEO 2 (as of ZEO 2.0b2) has a new feature that lets
-you collect a trace of cache activity and tools to analyze this trace,
-enabling you to make an informed decision about the cache size.
-
-Don't confuse the ZEO client cache with the Zope object cache. The
-ZEO client cache is only used when an object is not in the Zope object
-cache; the ZEO client cache avoids roundtrips to the ZEO server.
-
-Enabling Cache Tracing
-----------------------
-
-To enable cache tracing, set the environment variable ZEO_CACHE_TRACE
-to the name of a file to which the ZEO client process can write. ZEO
-will append a hyphen and the storage name to the filename, to
-distinguish different storages. If the file doesn't exist, the ZEO
-will try to create it. If there are problems with the file, a log
-message is written to the standard Zope log file. To start or stop
-tracing, the ZEO client process (typically a Zope application server)
-must be restarted.
-
-The trace file can grow pretty quickly; on a moderately loaded server,
-we observed it growing by 5 MB per hour. The file consists of binary
-records, each 24 bytes long; a detailed description of the record
-lay-out is given in stats.py. No sensitive data is logged.
-
-Analyzing a Cache Trace
------------------------
-
-The stats.py command-line tool is the first-line tool to analyze a
-cache trace. Its default output consists of two parts: a one-line
-summary of essential statistics for each segment of 15 minutes,
-interspersed with lines indicating client restarts and "cache flip
-events" (more about those later), followed by a more detailed summary
-of overall statistics.
-
-The most important statistic is probably the "hit rate", a percentage
-indicating how many requests to load an object could be satisfied from
-the cache. Hit rates around 70% are good. 90% is probably close to
-the theoretical maximum. If you see a hit rate under 60% you can
-probably improve the cache performance (and hence your Zope
-application server's performance) by increasing the ZEO cache size.
-This is normally configured using the cache_size keyword argument to
-the ClientStorage() constructor in your custom_zodb.py file. The
-default cache size is 20 MB.
-
-The stats.py tool shows its command line syntax when invoked without
-arguments. The tracefile argument can be a gzipped file if it has a
-.gz extension. It will read from stdin (assuming uncompressed data)
-if the tracefile argument is '-'.
-
-Simulating Different Cache Sizes
---------------------------------
-
-Based on a cache trace file, you can make a prediction of how well the
-cache might do with a different cache size. The simul.py tool runs an
-accurate simulation of the ZEO client cache implementation based upon
-the events read from a trace file. A new simulation is started each
-time the trace file records a client restart event; if a trace file
-contains more than one restart event, a separate line is printed for
-each simulation, and line with overall statistics is added at the end.
-
-Example, assuming the trace file is in /tmp/cachetrace.log::
-
- $ python simul.py -s 100 /tmp/cachetrace.log
- START TIME DURATION LOADS HITS INVALS WRITES FLIPS HITRATE
- Sep 4 11:59 38:01 59833 40473 257 20 2 67.6%
- $
-
-This shows that with a 100 MB cache size, the cache hit rate is
-67.6%. So let's try this again with a 200 MB cache size::
-
- $ python simul.py -s 200 /tmp/cachetrace.log
- START TIME DURATION LOADS HITS INVALS WRITES FLIPS HITRATE
- Sep 4 11:59 38:01 59833 40921 258 20 1 68.4%
- $
-
-This showed hardly any improvement. So let's try a 300 MB cache
-size::
-
- $ python2.0 simul.py -s 300 /tmp/cachetrace.log
- ZEOCacheSimulation, cache size 300,000,000 bytes
- START TIME DURATION LOADS HITS INVALS WRITES FLIPS HITRATE
- Sep 4 11:59 38:01 59833 40921 258 20 0 68.4%
- $
-
-This shows that for this particular trace file, the maximum attainable
-hit rate is 68.4%. This is probably caused by the fact that nearly a
-third of the objects mentioned in the trace were loaded only once --
-the cache only helps if an object is loaded more than once.
-
-The simul.py tool also supports simulating different cache
-strategies. Since none of these are implemented, these are not
-further documented here.
-
-Cache Flips
------------
-
-The cache uses two files, which are managed as follows:
-
- - Data are written to file 0 until file 0 exceeds limit/2 in size.
-
- - Data are written to file 1 until file 1 exceeds limit/2 in size.
-
- - File 0 is truncated to size 0 (or deleted and recreated).
-
- - Data are written to file 0 until file 0 exceeds limit/2 in size.
-
- - File 1 is truncated to size 0 (or deleted and recreated).
-
- - Data are written to file 1 until file 1 exceeds limit/2 in size.
-
-and so on.
-
-A switch from file 0 to file 1 is called a "cache flip". At all cache
-flips except the first, half of the cache contents is wiped out. This
-affects cache performance. How badly this impact is can be seen from
-the per-15-minutes summaries printed by stats.py. The -i option lets
-you choose a smaller summary interval which shows the impact more
-acutely.
-
-The simul.py tool shows the number of cache flips in the FLIPS column.
-If you see more than one flip per hour the cache may be too small.
More information about the Zodb-checkins
mailing list