[Zodb-checkins] SVN: ZODB/trunk/doc/ Documentation cleanup.
Jim Fulton
jim at zope.com
Sat Jan 19 13:53:09 EST 2008
Log message for revision 82954:
Documentation cleanup.
Changed:
D ZODB/trunk/doc/Makefile
D ZODB/trunk/doc/ZEO/
D ZODB/trunk/doc/zdctl.txt
A ZODB/trunk/doc/zeo-client-cache-tracing.txt
A ZODB/trunk/doc/zeo-client-cache.txt
A ZODB/trunk/doc/zeo.txt
-=-
Deleted: ZODB/trunk/doc/Makefile
===================================================================
--- ZODB/trunk/doc/Makefile 2008-01-19 18:53:02 UTC (rev 82953)
+++ ZODB/trunk/doc/Makefile 2008-01-19 18:53:09 UTC (rev 82954)
@@ -1,36 +0,0 @@
-MKHOWTO=mkhowto
-
-MKHTML=$(MKHOWTO) --html --iconserver=. --split=4 --dvips-safe
-
-ZODBTEX = guide/gfdl.tex guide/introduction.tex guide/modules.tex \
- guide/prog-zodb.tex guide/storages.tex guide/transactions.tex \
- guide/zeo.tex guide/zodb.tex
-
-default: pdf
-all: pdf ps html
-
-pdf: storage.pdf zodb.pdf
-ps: storage.ps zodb.ps
-
-html: storage/storage.html zodb/zodb.html
-
-storage.pdf: storage.tex
- $(MKHOWTO) --pdf $<
-
-storage.ps: storage.tex
- $(MKHOWTO) --ps $<
-
-storage/storage.html: storage.tex
- $(MKHTML) storage.tex
-
-zodb.pdf: $(ZODBTEX)
- $(MKHOWTO) --pdf guide/zodb.tex
-
-zodb.ps: $(ZODBTEX)
- $(MKHOWTO) --ps guide/zodb.tex
-
-zodb/zodb.html: $(ZODBTEX)
- $(MKHTML) guide/zodb.tex
-
-clobber:
- rm -rf storage.pdf storage.ps storage/ zodb.pdf zodb.ps zodb/
Deleted: ZODB/trunk/doc/zdctl.txt
===================================================================
--- ZODB/trunk/doc/zdctl.txt 2008-01-19 18:53:02 UTC (rev 82953)
+++ ZODB/trunk/doc/zdctl.txt 2008-01-19 18:53:09 UTC (rev 82954)
@@ -1,335 +0,0 @@
-Using zdctl and zdrun to manage server processes
-================================================
-
-
-Summary
--------
-
-Starting with Zope 2.7 and ZODB 3.2, Zope has a new way to configure
-and control server processes. This file documents the new approach to
-server process management; the new approach to configuration is
-documented elsewhere, although some examples will be given here. We
-use the ZEO server as a running example, although this isn't a
-complete manual for configuring or running ZEO.
-
-This documentation applies to Unix/Linux systems; zdctl and zdrun do
-not work on Windows.
-
-
-Prerequisites
--------------
-
-This document assumes that you have installed the ZODB3 software
-(version 3.2 or higher) using a variation on the following command,
-given from the root directory of the ZODB3 distribution::
-
- $ python setup.py install
-
-This installs the packages ZConfig, ZEO, zdaemon, zLOG, ZODB and
-various other needed packages and extension modules in the Python
-interpreter's site-packages directory, and installs scripts including
-zdctl.py, zdrun.py, runzeo.py and mkzeoinst.py in /usr/local/bin
-(actually the bin directory from which the python interpreter was
-loaded).
-
-When you receive ZODB as a part of Zope (version 2.7 or higher), the
-installation instructions will explain how to reach a similar state.
-
-
-Introduction
-------------
-
-The most basic way to run a ZEO server is using the following
-command::
-
- $ runzeo.py -a 9999 -f Data.fs
-
-Here 9999 is the ZEO port (you can pick your own unused TCP port
-number in the range 1024 through 65535, inclusive); Data.fs is the
-storage file. Again, you can pick any filename you want; the
-ZODB.FileStorage module code creates this file and various other files
-with additional extensions, like Data.fs.index, Data.fs.lock, and
-Data.fs.tmp.
-
-If something's wrong, for example if you picked a bad port number or
-filename, you'll get an error message or an exception right away and
-runzeo.py will exit with a non-zero exit status. The exit status is 2
-for command line syntax errors, 1 for other errors.
-
-If all's well, runzeo.py will emit a few logging messages to stderr
-and start serving, until you hit ^C. For example::
-
- $ runzeo.py -a 9999 -f Data.fs
- ------
- 2003-01-24T11:49:27 INFO(0) RUNSVR opening storage '1' using FileStorage
- ------
- 2003-01-24T11:49:27 INFO(0) ZSS:23531 StorageServer created RW with
- storages: 1:RW:Data.fs
- ------
- 2003-01-24T11:49:27 INFO(0) zrpc:23531 listening on ('', 9999)
-
-At this point you can hit ^C to stop it; runzeo.py will catch the
-interrupt signal, emit a few more log messages and exit::
-
- ^C
- ------
- 2003-01-24T12:11:15 INFO(0) RUNSVR terminated by SIGINT
- ------
- 2003-01-24T12:11:15 INFO(0) RUNSVR closing storage '1'
- $
-
-This may be fine for testing, but a bad idea for running a ZEO server
-in a production environment. In production, you want the ZEO server
-to be run as a daemon process, you want the log output to go to a
-file, you want the ZEO server to be started when the system is
-rebooted, and (usually) you want the ZEO server to be automatically
-restarted when it crashes. You should also have a log rotation policy
-in place so that your disk doesn't fill up with log messages.
-
-The zdctl/zdrun combo can take care of running a server as a daemon
-process and restarting it when it crashes. It can also be used to
-start it when the system is rebooted. Sending log output to a file is
-done by adjusting the ZEO server configuration. There are many fine
-existing tools to rotate log files, so we don't provide this
-functionality; zdctl has a command to send the server process a
-SIGUSR2 signal to tell it to reopen its log file after log rotation
-has taken place (the ZEO server has a signal handler that catches
-SIGUSR2 for this purpose).
-
-In addition, zdctl lets a system administrator or developer control
-the server process. This is useful to deal with typical problems like
-restarting a hanging server or adjusting a server's configuration.
-
-The zdctl program can be used in two ways: in one-shot mode it
-executes a single command (such as "start", "stop" or "restart"); in
-interactive mode it acts much like a typical Unix shell or the Python
-interpreter, printing a prompt to standard output and reading commands
-from standard input. It currently cannot be used to read commands
-from a file; if you need to script it, you can use a shell script
-containing repeated one-shot invocations.
-
-zdctl can be configured using command line options or a configuration
-file. In practice, you'll want to use a configuration file; but first
-we'll show some examples using command line options only. Here's a
-one-shot zdctl command to start the ZEO server::
-
- $ zdctl.py -p "runzeo.py -a 9999 -f Data.fs" start
-
-The -p option specifies the server program; it is the runzeo
-invocation that we showed before. The start argument tells it to
-start the process. What actually happens is that zdctl starts zdrun,
-and zdrun now manages the ZEO server process. The zdctl process exits
-once zdrun has started the ZEO server process; the zdrun process stays
-around, and when the ZEO server process crashes it will restart it.
-
-To check that the ZEO server is now running, use the zdctl status
-command::
-
- $ zdctl.py -p "runzeo.py -a 9999 -f Data.fs" status
-
-This prints a one-line message telling you that the program is
-running. To stop the ZEO server, use the zdctl stop command::
-
- $ zdctl.py -p "runzeo.py -a 9999 -f Data.fs" stop
-
-To check that it is no longer running, use the zdctl status command
-again.
-
-
-Daemon mode
------------
-
-If you are playing along on your computer, you cannot have missed that
-some log output has been spewing to your terminal window. While this
-may give you a warm and fuzzy feeling that something is actually
-happening, after a while it can get quite annoying (especially if
-clients are actually connecting to the server). This can be avoided
-by using the -d flag, which enables "daemon mode"::
-
- $ zdctl.py -d -p "runzeo.py -a 9999 -f Data.fs" start
-
-Daemon mode does several subtle things; see for example section 13.3
-of "Advanced Programming in the UNIX Environment" by Richard Stevens
-for a good explanation of daemon mode. For now, the most important
-effect is that the standard input, output and error streams are
-redirected to /dev/null, and that the process is "detached" from your
-controlling tty, which implies that it won't receive a SIGHUP signal
-when you log out.
-
-
-Using a configuration file
---------------------------
-
-I hope you are using a Unix shell with command line history, otherwise
-entering the examples above would have been quite a pain. But a
-better way to control zdctl and zdrun's many options without having to
-type them over and over again is to use a configuration file. Here's
-a small configuration file; place this in the file "zeoctl.conf" (the
-name is just a convention; you can call it "foo" if you prefer)::
-
- # Sample zdctl/zdrun configuration
- <runner>
- program runzeo.py -a 9999 -f Data.fs
- daemon true
- directory /tmp/zeohome
- socket-name /tmp/zeohome/zdsock
- </runner>
-
-The "program" and "daemon" lines correspond to the -p and -d command
-line options discussed above. The "directory" line is new. It
-specifies a directory into which zdrun (but not zdctl!) chdirs. This
-directory should exist; zdctl won't create it for you. The Data.fs
-filename passed to runzeo.py is interpreted relative to this
-directory. Finally, the "socket-name" line names the Unix domain
-socket that is used for communication between zdctl and zdrun. It
-defaults to zdsock in the current directory, a default you definitely
-want to override for production usage.
-
-To invoke zdctl with a configuration file, use its -C option to name
-the configuration file, for example::
-
- $ zdctl.py -C zeoctl.conf start
-
- $ zdctl.py -C zeoctl.conf status
-
- $ zdctl.py -C zeoctl.conf stop
-
-
-Interactive mode
-----------------
-
-Using a configuration file makes it a little easier to repeatedly
-start, stop and request status of a particular server, but it still
-requires typing the configuration file name on each command.
-Fortunately, zdctl.py can be used as an interactive "shell" which lets
-you execute repeated commands for the same server. Simply invoke
-zdctl.py without the final argument ("start", "status" or "stop" in
-the above examples)::
-
- $ zdctl.py -C zeoctl.conf
- program: runzeo.py -a 9999 -f Data.fs
- daemon manager not running
- zdctl>
-
-The first two lines of output are status messages (and could be
-different in your case); the final line is the interactive command
-prompt. At this prompt, you can type commands::
-
- zdctl> help
-
- Documented commands (type help <topic>):
- ========================================
- EOF fg foreground help kill
- logreopen logtail quit reload restart
- shell show start status stop
- wait
-
- zdctl> help start
- start -- Start the daemon process.
- If it is already running, do nothing.
- zdctl> start
- daemon process started, pid=31580
- zdctl> status
- program running; pid=31580
- zdctl> stop
- daemon process stopped
- zdctl> quit
- daemon manager not running
- $
-
-In short, the commands you can type at the interactive prompt are the
-same commands (with optional arguments) that you can use as positional
-arguments on the zdctl.py command line.
-
-The interactive shell has some additional features:
-
-- Line editing and command line history using the standard GNU
- readline module.
-
-- A blank line repeats the last command (especially useful for status).
-
-- Command and argument completion using the TAB key.
-
-One final note: some people don't like it that an invocation without
-arguments enters interactive mode. If this describes you, there's an
-easy way to disable this feature: add a line saying
-
- default-to-interactive false
-
-to the zeoctl.conf file. You can still enter interactive mode by
-using the -i option.
-
-
-Using mkzeoinst.py
-------------------
-
-If you still think that all of the above is a lot of typing, you're
-right. Fortunately, there's a simple utility that helps you create
-and configure a ZEO server instance. mkzeoinst.py requires one
-argument, the ZEO server's "home directory". After that, you can
-optionally specify a service port number; the port defaults to 9999.
-
-mkzeoinst.py creates the server home directory (and its ancestor
-directories if necessary), and then creates the following directory
-substructure:
-
- bin/ - directory for scripts (zeoctl)
- etc/ - directory for configuration files (zeo.conf, zeoctl.conf)
- log/ - directory for log files (zeo.log, zeoctl.log)
- var/ - directory for data files (Data.fs and friends)
-
-If the server home directory or any of its subdirectories already
-exist, mkzeoinst.py will note this and assume you are rebuilding an
-existing instance. (In fact, it prints a message for each directory
-it creates but is silent about existing directories.)
-
-It then creates the following files:
-
- bin/zeoctl - executable shell script to run zdctl.py
- etc/zeo.conf - configuration file for ZEO
- etc/zeoctl.conf - configuration file for zdrun.py and zdctl.py
-
-If any of the files it wants to create already exists and is
-non-empty, it does not write the file. (An empty file will be
-overwritten though.) If the existing contents differ from what it
-would have written if the file didn't exist, it prints a warning
-message; otherwise the skipping is silent.
-
-Other errors (e.g. permission errors creating or reading files or
-directories) cause mkzeoinst.py to bail with an error message; it does
-not clean up the work already done.
-
-The created files contain absolute path references to all of the
-programs, files, directories used. They also contain default values
-for most configuration settings that one might normally want to
-configure. Most configured settings are the same as the defaults;
-however, daemon mode is on while the regular default is off. Log
-files are configured to go into the log directory. It configures
-separate log files for zdrun.py/zdctl.py (log/zeoctl.log) and for the
-ZEO server itself (log/zeo.log). Once created, the files are yours;
-feel free to edit them to suit your taste.
-
-The bin/zeoctl script should be invoked with the positional arguments
-(e.g. "start", "stop" or "status") that you would pass to zdctl.py;
-the script hardcodes the configuration file so you don't have to pass
-that. It can also be invoked without arguments to enter interactive
-mode.
-
-One final detail: if you want the ZEO server to be started
-automatically when the machine is rebooted, and you're lucky enough to
-be using a recent Red Hat (or similar) system, you can copy the
-bin/zeoctl script into the /etc/rc.d/init.d/ directory and use
-chkconfig(8) to create the correct symlinks to it; the bin/zeoctl
-script already has the appropriate magical comments for chkconfig.
-
-
-zdctl reference
----------------
-
-TBD
-
-
-zdrun reference
----------------
-
-TBD
Copied: ZODB/trunk/doc/zeo-client-cache-tracing.txt (from rev 82950, ZODB/trunk/doc/ZEO/trace.txt)
===================================================================
--- ZODB/trunk/doc/zeo-client-cache-tracing.txt (rev 0)
+++ ZODB/trunk/doc/zeo-client-cache-tracing.txt 2008-01-19 18:53:09 UTC (rev 82954)
@@ -0,0 +1,144 @@
+ZEO Client Cache Tracing
+========================
+
+An important question for ZEO users is: how large should the ZEO
+client cache be? ZEO 2 (as of ZEO 2.0b2) has a new feature that lets
+you collect a trace of cache activity and tools to analyze this trace,
+enabling you to make an informed decision about the cache size.
+
+Don't confuse the ZEO client cache with the Zope object cache. The
+ZEO client cache is only used when an object is not in the Zope object
+cache; the ZEO client cache avoids roundtrips to the ZEO server.
+
+Enabling Cache Tracing
+----------------------
+
+To enable cache tracing, you must use a persistent cache (specify a ``client``
+name), and set the environment variable ZEO_CACHE_TRACE to a non-empty
+value. The path to the trace file is derived from the path to the persistent
+cache file by appending ".trace". If the file doesn't exist, ZEO will try to
+create it. If the file does exist, it's opened for appending (previous trace
+information is not overwritten). If there are problems with the file, a
+warning message is logged. To start or stop tracing, the ZEO client process
+(typically a Zope application server) must be restarted.
+
+The trace file can grow pretty quickly; on a moderately loaded server, we
+observed it growing by 7 MB per hour. The file consists of binary records,
+each 34 bytes long if 8-byte oids are in use; a detailed description of the
+record lay-out is given in stats.py. No sensitive data is logged: data
+record sizes (but not data records), and binary object and transaction ids
+are logged, but no object pickles, object types or names, user names,
+transaction comments, access paths, or machine information (such as machine
+name or IP address) are logged.
+
+Analyzing a Cache Trace
+-----------------------
+
+The stats.py command-line tool is the first-line tool to analyze a cache
+trace. Its default output consists of two parts: a one-line summary of
+essential statistics for each segment of 15 minutes, interspersed with lines
+indicating client restarts, followed by a more detailed summary of overall
+statistics.
+
+The most important statistic is the "hit rate", a percentage indicating how
+many requests to load an object could be satisfied from the cache. Hit rates
+around 70% are good. 90% is excellent. If you see a hit rate under 60% you
+can probably improve the cache performance (and hence your Zope application
+server's performance) by increasing the ZEO cache size. This is normally
+configured using key ``cache_size`` in the ``zeoclient`` section of your
+configuration file. The default cache size is 20 MB, which is small.
+
+The stats.py tool shows its command line syntax when invoked without
+arguments. The tracefile argument can be a gzipped file if it has a .gz
+extension. It will be read from stdin (assuming uncompressed data) if the
+tracefile argument is '-'.
+
+Simulating Different Cache Sizes
+--------------------------------
+
+Based on a cache trace file, you can make a prediction of how well the cache
+might do with a different cache size. The simul.py tool runs a simulation of
+the ZEO client cache implementation based upon the events read from a trace
+file. A new simulation is started each time the trace file records a client
+restart event; if a trace file contains more than one restart event, a
+separate line is printed for each simulation, and a line with overall
+statistics is added at the end.
+
+Example, assuming the trace file is in /tmp/cachetrace.log::
+
+ $ python simul.py -s 4 /tmp/cachetrace.log
+ CircularCacheSimulation, cache size 4,194,304 bytes
+ START TIME DURATION LOADS HITS INVALS WRITES HITRATE EVICTS INUSE
+ Jul 22 22:22 39:09 3218856 1429329 24046 41517 44.4% 40776 99.8
+
+This shows that with a 4 MB cache size, the cache hit rate is 44.4%: that is
+1429329 (the number of cache hits) as a percentage of 3218856 (the number of
+load requests). The cache simulated 40776 evictions, to make room for new object
+states. At the end, 99.8% of the bytes reserved for the cache file were in
+use to hold object state (the remaining 0.2% consists of "holes", bytes freed
+by object eviction and not yet reused to hold another object's state).
+
+Let's try this again with an 8 MB cache::
+
+ $ python simul.py -s 8 /tmp/cachetrace.log
+ CircularCacheSimulation, cache size 8,388,608 bytes
+ START TIME DURATION LOADS HITS INVALS WRITES HITRATE EVICTS INUSE
+ Jul 22 22:22 39:09 3218856 2182722 31315 41517 67.8% 40016 100.0
+
+That's a huge improvement in hit rate, which isn't surprising since these are
+very small cache sizes. The default cache size is 20 MB, which is still on
+the small side::
+
+ $ python simul.py /tmp/cachetrace.log
+ CircularCacheSimulation, cache size 20,971,520 bytes
+ START TIME DURATION LOADS HITS INVALS WRITES HITRATE EVICTS INUSE
+ Jul 22 22:22 39:09 3218856 2982589 37922 41517 92.7% 37761 99.9
+
+Again a very nice improvement in hit rate, and there's not a lot of room left
+for improvement. Let's try 100 MB::
+
+ $ python simul.py -s 100 /tmp/cachetrace.log
+ CircularCacheSimulation, cache size 104,857,600 bytes
+ START TIME DURATION LOADS HITS INVALS WRITES HITRATE EVICTS INUSE
+ Jul 22 22:22 39:09 3218856 3218741 39572 41517 100.0% 22778 100.0
+
+It's very unusual to see a hit rate so high. The application here frequently
+modified a very large BTree, so given enough cache space to hold the entire
+BTree it rarely needed to ask the ZEO server for data: this application
+reused the same objects over and over.
+
+More typical is that a substantial number of objects will be referenced only
+once. Whenever an object turns out to be loaded only once, it's a pure loss
+for the cache: the first (and only) load is a cache miss; storing the object
+evicts other objects, possibly causing more cache misses; and the object is
+never loaded again. If, for example, a third of the objects are loaded only
+once, it's quite possible for the theoretical maximum hit rate to be 67%, no
+matter how large the cache.
+
+The simul.py script also contains code to simulate different cache
+strategies. Since none of these are implemented, and only the default cache
+strategy's code has been updated to be aware of MVCC, these are not further
+documented here.
+
+Simulation Limitations
+----------------------
+
+The cache simulation is an approximation, and actual hit rate may be higher
+or lower than the simulated result. These are some factors that inhibit
+exact simulation:
+
+- The simulator doesn't try to emulate versions. If the trace file contains
+ loads and stores of objects in versions, the simulator treats them as if
+ they were loads and stores of non-version data.
+
+- Each time a load of an object O in the trace file was a cache hit, but the
+ simulated cache has evicted O, the simulated cache has no way to repair its
+ knowledge about O. This is more frequent when simulating caches smaller
+ than the cache used to produce the trace file. When a real cache suffers a
+ cache miss, it asks the ZEO server for the needed information about O, and
+ saves O in the client cache. The simulated cache doesn't have a ZEO server
+ to ask, and O continues to be absent in the simulated cache. Further
+ requests for O will continue to be simulated cache misses, although in a
+ real cache they'll likely be cache hits. On the other hand, the
+ simulated cache doesn't need to evict any objects to make room for O, so it
+ may enjoy further cache hits on objects a real cache would have evicted.
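The HITRATE column in the simul.py runs quoted in zeo-client-cache-tracing.txt
is the HITS column expressed as a percentage of the LOADS column. The following
is a minimal sketch that recomputes it from the figures printed in that file.
The helper names (hit_rate, max_hit_rate) are made up for this illustration and
are not part of stats.py or simul.py. The ceiling function assumes a simplified
model in which a given fraction of load requests are one-time loads, which can
never be cache hits.

```python
# Figures copied from the simul.py output quoted in the document.
LOADS = 3218856

# Simulated cache size in MB -> HITS reported for that run.
HITS_BY_CACHE_MB = {4: 1429329, 8: 2182722, 20: 2982589, 100: 3218741}


def hit_rate(hits, loads):
    """Return the hit rate as a percentage of load requests (hypothetical helper)."""
    return 100.0 * hits / loads


def max_hit_rate(once_only_fraction):
    """Upper bound on the hit rate, assuming the given fraction of load
    requests are first-and-only loads (simplified model, an assumption)."""
    return 100.0 * (1.0 - once_only_fraction)


# Hit rate per simulated cache size, rounded to one decimal like simul.py prints.
rates = {mb: round(hit_rate(hits, LOADS), 1) for mb, hits in HITS_BY_CACHE_MB.items()}

if __name__ == "__main__":
    for mb, rate in sorted(rates.items()):
        print("%4d MB cache: %.1f%% hit rate" % (mb, rate))
    print("ceiling with 1/3 one-time loads: %.0f%%" % max_hit_rate(1.0 / 3.0))
```

Under that simplified model, one-time loads making up a third of the requests
cap the hit rate at about 67%, however large the cache is made.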
Copied: ZODB/trunk/doc/zeo-client-cache.txt (from rev 82950, ZODB/trunk/doc/ZEO/cache.txt)
===================================================================
--- ZODB/trunk/doc/zeo-client-cache.txt (rev 0)
+++ ZODB/trunk/doc/zeo-client-cache.txt 2008-01-19 18:53:09 UTC (rev 82954)
@@ -0,0 +1,48 @@
+ZEO Client Cache
+
+ The client cache provides a disk based cache for each ZEO client. The
+ client cache allows reads to be done from local disk rather than by remote
+ access to the storage server.
+
+ The cache may be persistent or transient. If the cache is persistent, then
+ the cache file is retained for use after process restarts. A non-
+ persistent cache uses a temporary file.
+
+ The client cache is managed in a single file, of the specified size.
+
+ The life of the cache is as follows:
+
+ - The cache file is opened (if it already exists), or created and set to
+ the specified size.
+
+ - Cache records are written to the cache file, as transactions commit
+ locally, and as data are loaded from the server.
+
+ - Writes are to "the current file position". This is a pointer that
+ travels around the file, circularly. After a record is written, the
+ pointer advances to just beyond it. Objects starting at the current
+ file position are evicted, as needed, to make room for the next record
+ written.
+
+ A distinct index file is not created, although indexing structures are
+ maintained in memory while a ClientStorage is running. When a persistent
+ client cache file is reopened, these indexing structures are recreated
+ by analyzing the file contents.
+
+ Persistent cache files are created in the directory named in the ``var``
+ argument to the ClientStorage, or if ``var`` is None, in the current
+ working directory. Persistent cache files have names of the form::
+
+ client-storage.zec
+
+ where:
+
+ client -- the client name, as given by the ClientStorage's ``client``
+ argument
+
+ storage -- the storage name, as given by the ClientStorage's ``storage``
+ argument; this is typically a string denoting a small integer,
+ "1" by default
+
+ For example, the cache file for client '8881' and storage 'spam' is named
+ "8881-spam.zec".
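The naming rule for persistent cache files described in zeo-client-cache.txt
can be sketched as below. This is an illustration only: cache_file_path is a
hypothetical helper written for this note, not a function provided by ZEO. It
assumes the ``var``, ``client`` and ``storage`` values are the ones passed to
the ClientStorage, with ``storage`` defaulting to "1" as the text states.

```python
import os


def cache_file_path(client, storage="1", var=None):
    """Build the persistent cache file path described in the text.

    Hypothetical helper, not part of ZEO: the file is named
    "<client>-<storage>.zec" and lives in ``var``, or in the current
    working directory when ``var`` is None.
    """
    directory = os.getcwd() if var is None else var
    return os.path.join(directory, "%s-%s.zec" % (client, storage))


if __name__ == "__main__":
    # The example from the text: client '8881', storage 'spam'.
    print(cache_file_path("8881", "spam"))
    # Default storage name, explicit directory (the path is made up).
    print(cache_file_path("8881", var="/tmp/zeo-cache"))
```

Because the file name depends only on the client and storage names, two
processes started with the same values in the same directory would point at
the same cache file, so it seems prudent to give each client a distinct name.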
Copied: ZODB/trunk/doc/zeo.txt (from rev 82950, ZODB/trunk/doc/ZEO/howto.txt)
===================================================================
--- ZODB/trunk/doc/zeo.txt (rev 0)
+++ ZODB/trunk/doc/zeo.txt 2008-01-19 18:53:09 UTC (rev 82954)
@@ -0,0 +1,415 @@
+==========================
+Running a ZEO Server HOWTO
+==========================
+
+Introduction
+------------
+
+ZEO (Zope Enterprise Objects) is a client-server system for sharing a
+single storage among many clients. Normally, a ZODB storage can only
+be used by a single process. When you use ZEO, the storage is opened
+in the ZEO server process. Client programs connect to this process
+using a ZEO ClientStorage. ZEO provides a consistent view of the
+database to all clients. The ZEO client and server communicate using
+a custom RPC protocol layered on top of TCP.
+
+There are several configuration options that affect the behavior of a
+ZEO server. This section describes how a few of these features
+work. Subsequent sections describe how to configure every option.
+
+Client cache
+~~~~~~~~~~~~
+
+Each ZEO client keeps an on-disk cache of recently used objects to
+avoid fetching those objects from the server each time they are
+requested. It is usually faster to read the objects from disk than it
+is to fetch them over the network. The cache can also provide
+read-only copies of objects during server outages.
+
+The cache may be persistent or transient. If the cache is persistent,
+then the cache files are retained for use after process restarts. A
+non-persistent cache uses temporary files that are removed when the
+client storage is closed.
+
+The client cache size is configured when the ClientStorage is created.
+The default size is 20MB, but the right size depends entirely on the
+particular database. Setting the cache size too small can hurt
+performance, but in most cases making it too big just wastes disk
+space. The document "Client cache tracing" describes how to collect a
+cache trace that can be used to determine a good cache size.
+
+ZEO uses invalidations for cache consistency. Every time an object is
+modified, the server sends a message to each client informing it of
+the change. The client will discard the object from its cache when it
+receives an invalidation. These invalidations are often batched.
+
+Each time a client connects to a server, it must verify that its cache
+contents are still valid. (It did not receive any invalidation
+messages while it was disconnected.) There are several mechanisms
+used to perform cache verification. In the worst case, the client
+sends the server a list of all objects in its cache along with their
+timestamps; the server sends back an invalidation message for each
+stale object. The cost of verification is one drawback to making the
+cache too large.
+
+Note that every time a client crashes or disconnects, it must verify
+its cache. Every time a server crashes, all of its clients must
+verify their caches.
+
+The cache verification process is optimized in two ways to eliminate
+costs when restarting clients and servers. Each client keeps the
+timestamp of the last invalidation message it has seen. When it
+connects to the server, it checks to see if any invalidation messages
+were sent after that timestamp. If not, then the cache is up-to-date
+and no further verification occurs. The other optimization is the
+invalidation queue, described below.
+
+Invalidation queue
+~~~~~~~~~~~~~~~~~~
+
+The ZEO server keeps a queue of recent invalidation messages in
+memory. When a client connects to the server, it sends the timestamp
+of the most recent invalidation message it has received. If that
+message is still in the invalidation queue, then the server sends the
+client all the missing invalidations. This is often cheaper than
+performing full cache verification.
+
+The default size of the invalidation queue is 100. If the
+invalidation queue is larger, it will be more likely that a client
+that reconnects will be able to verify its cache using the queue. On
+the other hand, a large queue uses more memory on the server to store
+the messages. Invalidation messages tend to be small, perhaps a few
+hundred bytes each on average; it depends on the number of objects
+modified by a transaction.
+
+Transaction timeouts
+~~~~~~~~~~~~~~~~~~~~
+
+A ZEO server can be configured to time out a transaction if it takes
+too long to complete. Only a single transaction can commit at a time;
+so if one transaction takes too long, all other clients will be
+delayed waiting for it. In the extreme, a client can hang during the
+commit process. If the client hangs, the server will be unable to
+commit other transactions until it restarts. A well-behaved client
+will not hang, but the server can be configured with a transaction
+timeout to guard against bugs that cause a client to hang.
+
+If any transaction exceeds the timeout threshold, the client's
+connection to the server will be closed and the transaction aborted.
+Once the transaction is aborted, the server can start processing other
+clients' requests. Most transactions should take very little time to
+commit. The timer begins for a transaction after all the data has
+been sent to the server. At this point, the cost of commit should be
+dominated by the cost of writing data to disk; it should be unusual
+for a commit to take longer than 1 second. A transaction timeout of
+30 seconds should tolerate heavy load and slow communications between
+client and server, while guarding against hung clients.
+
+When a transaction times out, the client can be left in an awkward
+position. If the timeout occurs during the second phase of the two
+phase commit, the client will log a panic message. This should only
+cause problems if the client transaction involved multiple storages.
+If it did, it is possible that some storages committed the client
+changes and others did not.
+
+Monitor server
+~~~~~~~~~~~~~~
+
+The ZEO server updates several counters while it is running. It can
+be configured to run a separate monitor server that reports the
+counter values and other statistics. If a client connects to the
+socket, the server sends a text report and closes the socket
+immediately. It does not read any data from the client.
+
+An example of a monitor server report is included below::
+
+ ZEO monitor server version 2.1a1
+ Fri Apr 4 16:57:42 2003
+
+ Storage: 1
+ Server started: Fri Apr 4 16:57:37 2003
+ Clients: 0
+ Clients verifying: 0
+ Active transactions: 0
+ Commits: 0
+ Aborts: 0
+ Loads: 0
+ Stores: 0
+ Conflicts: 0
+ Conflicts resolved: 0
+
+Connection management
+~~~~~~~~~~~~~~~~~~~~~
+
+A ZEO client manages its connection to the ZEO server. If it loses
+the connection, it attempts to reconnect. While
+it is disconnected, it can satisfy some reads by using its cache.
+
+The client can be configured to wait for a connection when it is created
+or to return immediately and provide data from its persistent cache.
+It usually simplifies programming to have the client wait for a
+connection on startup.
+
+When the client is disconnected, it polls periodically to see if the
+server is available. The rate at which it polls is configurable.
+
+The client can be configured with multiple server addresses. In this
+case, it assumes that each server has identical content and will use
+any server that is available. It is possible to configure the client
+to accept a read-only connection to one of these servers if no
+read-write connection is available. If it has a read-only connection,
+it will continue to poll for a read-write connection. This feature
+supports the Zope Replication Services product,
+http://www.zope.com/Products/ZopeProducts/ZRS. In general, it could
+be used with a system that arranges to provide hot backups of
+servers in the case of failure.
+
+Authentication
+~~~~~~~~~~~~~~
+
+ZEO supports optional authentication of client and server using a
+password scheme similar to HTTP digest authentication (RFC 2069). It
+is a simple challenge-response protocol that does not send passwords
+in the clear, but does not offer strong security. The RFC discusses
+many of the limitations of this kind of protocol. Note that this
+feature provides authentication only. It does not provide encryption
+or confidentiality.
+
+The challenge-response also produces a session key that is used to
+generate message authentication codes for each ZEO message. This
+should prevent session hijacking.
+
+Guard the password database as if it contained plaintext passwords.
+It stores the hash of a username and password. This does not expose
+the plaintext password, but it is sensitive nonetheless. An attacker
+with the hash can impersonate the real user. This is a limitation of
+the simple digest scheme.
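+
+The reason the stored hash is as sensitive as the password can be
+seen in a small model of a digest-style exchange. This is a sketch
+under stated assumptions, not ZEO's wire protocol: the hash function,
+the "user:realm:password" layout, and all function names are chosen
+for illustration.

```python
import hashlib
import hmac
import os


def stored_hash(username, realm, password):
    """What a digest-style password database keeps for one user."""
    data = ":".join([username, realm, password]).encode("utf-8")
    return hashlib.sha1(data).hexdigest()


def make_challenge():
    """Server side: a fresh random nonce sent to the client."""
    return os.urandom(16).hex()


def make_response(hashed, nonce):
    """Client side: answer the challenge using only the stored hash."""
    return hmac.new(hashed.encode("ascii"), nonce.encode("ascii"),
                    hashlib.sha1).hexdigest()


def verify(hashed, nonce, candidate):
    """Server side: recompute the expected response and compare."""
    return hmac.compare_digest(make_response(hashed, nonce), candidate)
```

+Note that make_response() never needs the plaintext password. Anyone
+who obtains the stored hash can answer any challenge, which is why the
+password database must be protected.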
+
+The authentication framework allows third-party developers to provide
+new authentication modules.
+
+Installing software
+-------------------
+
+ZEO is distributed as part of the ZODB3 package and with Zope,
+starting with Zope 2.7. You can download it from
+http://pypi.python.org/pypi/ZODB3.
+
+Configuring server
+------------------
+
+The script runzeo.py runs the ZEO server. The server can be
+configured using command-line arguments or a config file. This
+document only describes the config file. Run runzeo.py
+-h to see the list of command-line arguments.
+
+The runzeo.py script imports the ZEO package. ZEO must either be
+installed in Python's site-packages directory or be in a directory on
+PYTHONPATH.
+
+The configuration file specifies the underlying storage the server
+uses, the address it binds, and a few other optional parameters.
+An example is::
+
+ <zeo>
+ address zeo.example.com:8090
+ monitor-address zeo.example.com:8091
+ </zeo>
+
+ <filestorage 1>
+ path /var/tmp/Data.fs
+ </filestorage>
+
+ <eventlog>
+ <logfile>
+ path /var/tmp/zeo.log
+ format %(asctime)s %(message)s
+ </logfile>
+ </eventlog>
+
+This file configures a server to use a FileStorage from
+/var/tmp/Data.fs. The server listens on port 8090 of zeo.example.com.
+It also starts a monitor server that listens on port 8091. The ZEO
+server writes its log file to /var/tmp/zeo.log and uses a custom
+format for each line. Assuming the example configuration is stored in
+zeo.config, you can run a server by typing::
+
+ python /usr/local/bin/runzeo.py -C zeo.config
+
+A configuration file consists of a <zeo> section and a storage
+section, where the storage section can use any of the valid ZODB
+storage types. It may also contain an eventlog configuration. See
+the document "Configuring a ZODB database" for more information about
+configuring storages and eventlogs.
+
+The zeo section must list the address. All the other keys are
+optional.
+
+address
+ The address at which the server should listen. This can be in
+ the form 'host:port' to signify a TCP/IP connection or a
+ pathname string to signify a Unix domain socket connection (at
+ least one '/' is required). A hostname may be a DNS name or a
+ dotted IP address. If the hostname is omitted, the platform's
+ default behavior is used when binding the listening socket (''
+ is passed to socket.bind() as the hostname portion of the
+ address).
+
+read-only
+ Flag indicating whether the server should operate in read-only
+ mode. Defaults to false. Note that even if the server is
+ operating in writable mode, individual storages may still be
+ read-only. But if the server is in read-only mode, no write
+ operations are allowed, even if the storages are writable. Note
+ that pack() is considered a read-only operation.
+
+invalidation-queue-size
+ The storage server keeps a queue of the objects modified by the
+ last N transactions, where N is the value of invalidation-queue-size. This
+ queue is used to speed client cache verification when a client
+ disconnects for a short period of time.
+
+monitor-address
+ The address at which the monitor server should listen. If
+ specified, a monitor server is started. The monitor server
+ provides server statistics in a simple text format. This can
+ be in the form 'host:port' to signify a TCP/IP connection or a
+ pathname string to signify a Unix domain socket connection (at
+ least one '/' is required). A hostname may be a DNS name or a
+ dotted IP address. If the hostname is omitted, the platform's
+ default behavior is used when binding the listening socket (''
+ is passed to socket.bind() as the hostname portion of the
+ address).
+
+transaction-timeout
+ The maximum amount of time to wait for a transaction to commit
+ after acquiring the storage lock, specified in seconds. If the
+ transaction takes too long, the client connection will be closed
+ and the transaction aborted.
+
+authentication-protocol
+ The name of the protocol used for authentication. The
+ only protocol provided with ZEO is "digest," but extensions
+ may provide other protocols.
+
+authentication-database
+ The path of the database containing authentication credentials.
+
+authentication-realm
+ The authentication realm of the server. Some authentication
+ schemes use a realm to identify the logical set of usernames
+ that are accepted by this server.
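+
+The rule given for the address and monitor-address keys (a value
+containing '/' is a Unix domain socket path, anything else is
+'host:port' with an optional host) can be sketched as follows. This
+is not the parser that ZConfig uses. The function name is
+hypothetical, and accepting a bare port number is an assumption.

```python
def parse_address(value):
    """Return (host, port) for TCP/IP or a path string for a Unix socket.

    An empty host means "use the platform default", matching the ''
    that is passed to socket.bind() when the hostname is omitted.
    """
    if "/" in value:
        return value  # pathname: Unix domain socket
    # Split on the last colon; a value without a colon is a bare port.
    host, _, port = value.rpartition(":")
    return (host, int(port))
```

+For example, parse_address("zeo.example.com:8090") returns
+("zeo.example.com", 8090), while parse_address("/var/tmp/zeo.sock")
+returns the path unchanged.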
+
+Configuring clients
+-------------------
+
+The ZEO client can also be configured using ZConfig. The ZODB.config
+module provides several functions for opening a storage based on its
+configuration.
+
+- ZODB.config.storageFromString()
+- ZODB.config.storageFromFile()
+- ZODB.config.storageFromURL()
+
+The ZEO client configuration requires the server address be
+specified. Everything else is optional. An example configuration is::
+
+ <zeoclient>
+ server zeo.example.com:8090
+ </zeoclient>
+
+The other configuration options are listed below.
+
+storage
+ The name of the storage that the client wants to use. If the
+ ZEO server serves more than one storage, the client selects
+ the storage it wants to use by name. The default name is '1',
+ which is also the default name for the ZEO server.
+
+cache-size
+ The maximum size of the client cache, in bytes.
+
+name
+ The storage name. If unspecified, the address of the server
+ will be used as the name.
+
+client
+ Enables persistent cache files. The string passed here is
+ used to construct the cache filenames. If it is not
+ specified, the client creates a temporary cache that will
+ only be used by the current object.
+
+var
+ The directory where persistent cache files are stored. By
+ default cache files, if they are persistent, are stored in
+ the current directory.
+
+min-disconnect-poll
+ The minimum delay between attempts to connect to the server,
+ in seconds. Defaults to 5 seconds.
+
+max-disconnect-poll
+ The maximum delay between attempts to connect to the server,
+ in seconds. Defaults to 300 seconds.
+
+wait
+ A boolean indicating whether the constructor should wait
+ for the client to connect to the server and verify the cache
+ before returning. The default is true.
+
+read-only
+ A flag indicating whether this should be a read-only storage,
+ defaulting to false (i.e. writing is allowed by default).
+
+read-only-fallback
+ A flag indicating whether a read-only remote storage should be
+ acceptable as a fallback when no writable storages are
+ available. Defaults to false. At most one of read-only and
+ read-only-fallback should be true.
+
+realm
+ The authentication realm of the server. Some authentication
+ schemes use a realm to identify the logical set of usernames
+ that are accepted by this server.
+
+A ZEO client can also be created by calling the ClientStorage
+constructor explicitly. For example::
+
+ from ZEO.ClientStorage import ClientStorage
+ storage = ClientStorage(("zeo.example.com", 8090))
+
+Running the ZEO server as a daemon
+----------------------------------
+
+In an operational setting, you will want to run the ZEO server as a
+daemon process that is restarted when it dies. The zdaemon package
+provides two tools for running daemons: zdrun.py and zdctl.py. You can
+find zdaemon and its documentation at
+http://pypi.python.org/pypi/zdaemon.
+
+Rotating log files
+~~~~~~~~~~~~~~~~~~
+
+ZEO will re-initialize its logging subsystem when it receives a
+SIGUSR2 signal. If you are using the standard event logger, you
+should first rename the log file and then send the signal to the
+server. The server will continue writing to the renamed log file
+until it receives the signal. After it receives the signal, the
+server will create a new file with the old name and write to it.
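+
+The rename-then-signal sequence can be scripted. The sketch below
+assumes a Unix platform and that the caller already knows the server's
+process id. The function name and the ".1" suffix are hypothetical.

```python
import os
import signal


def rotate_log(log_path, server_pid, suffix=".1"):
    """Rename the log file, then ask the server to reopen its log.

    Returns the path of the renamed file.
    """
    rotated = log_path + suffix
    # Step 1: rename. The server keeps writing to the renamed file
    # because it still holds the open file.
    os.rename(log_path, rotated)
    # Step 2: signal. On SIGUSR2 the server re-initializes logging and
    # creates a new file under the original name.
    os.kill(server_pid, signal.SIGUSR2)
    return rotated
```

+The order matters: sending the signal before renaming would make the
+server reopen the same file.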
+
+Tools
+-----
+
+There are a few scripts that may help when running a ZEO server. The
+zeopack.py script connects to a server and packs the storage. It can
+be run as a cron job. The zeoup.py script attempts to connect to a
+ZEO server and verify that it is functioning. The zeopasswd.py script
+manages a ZEO server's password database.
+
+Diagnosing problems
+-------------------
+
+If an exception occurs on the server, the server will log a traceback
+and send an exception to the client. The traceback on the client will
+show a ZEO protocol library as the source of the error. If you need
+to diagnose the problem, you will have to look in the server log for
+the rest of the traceback.