[Zope-Checkins] CVS: Zope/lib/python/ZServer/medusa/docs - README.html:1.4 composing_producers.gif:1.4 data_flow.gif:1.4 data_flow.html:1.4 producers.gif:1.4 proxy_notes.txt:1.4
Fred L. Drake, Jr.
fred@zope.com
Tue, 18 Mar 2003 16:16:41 -0500
Update of /cvs-repository/Zope/lib/python/ZServer/medusa/docs
In directory cvs.zope.org:/tmp/cvs-serv23895
Added Files:
README.html composing_producers.gif data_flow.gif
data_flow.html producers.gif proxy_notes.txt
Log Message:
Move ZServer into new location, including configuration support from the
new-install-branch.
=== Zope/lib/python/ZServer/medusa/docs/README.html 1.3 => 1.4 ===
--- /dev/null Tue Mar 18 16:16:41 2003
+++ Zope/lib/python/ZServer/medusa/docs/README.html Tue Mar 18 16:16:40 2003
@@ -0,0 +1,238 @@
+<html>
+<body>
+
+Medusa is Copyright 1996-1997, Sam Rushing (rushing@nightmare.com)
+<hr>
+
+
+<pre>
+Medusa is provided free for all non-commercial use. If you are using
+Medusa to make money, or you would like to distribute Medusa or any
+derivative of Medusa commercially, then you must arrange a license
+with me. Extension authors may either negotiate with me to include
+their extension in the main distribution, or may distribute under
+their own terms.
+
+You may modify or extend Medusa, but you may not redistribute the
+modified versions without permission.
+
+<b>
+NIGHTMARE SOFTWARE AND SAM RUSHING DISCLAIM ALL WARRANTIES WITH REGARD
+TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY
+AND FITNESS, IN NO EVENT SHALL NIGHTMARE SOFTWARE OR SAM RUSHING BE
+LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY
+DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS,
+WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
+ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS
+SOFTWARE.
+</b>
+
+</pre>
+
+For more information please contact me at <a href="mailto:rushing@nightmare.com">
+rushing@nightmare.com</a>
+
+<h1> What is Medusa? </h1>
+<hr>
+
+<p>
+Medusa is an architecture for very-high-performance TCP/IP servers
+(like HTTP, FTP, and NNTP). Medusa is different from most other
+servers because it runs as a single process, multiplexing I/O with its
+various client and server connections within a single process/thread.
+
+<p>
+It is capable of smoother and higher performance than most other
+servers, while placing a dramatically reduced load on the server
+machine. The single-process, single-thread model simplifies design
+and enables some new persistence capabilities that are otherwise
+difficult or impossible to implement.
+
+<p>
+Medusa is supported on any platform that can run Python and includes a
+functional implementation of the <socket> and <select>
+modules. This includes the majority of Unix implementations.
+
+<p>
+During development, it is constantly tested on Linux and Win32
+[Win95/WinNT], but the core asynchronous capability has been shown to
+work on several other platforms, including the Macintosh. It might
+even work on VMS.
+
+
+<h2>The Power of Python</h2>
+
+<p>
+A distinguishing feature of Medusa is that it is written entirely in
+Python. Python (<a href="http://www.python.org/">http://www.python.org/</a>) is a
+'very-high-level' object-oriented language developed by Guido van
+Rossum (currently at CNRI). It is easy to learn, and includes many
+modern programming features such as storage management, dynamic
+typing, and an extremely flexible object system. It also provides
+convenient interfaces to C and C++.
+
+<p>
+The rapid prototyping and delivery capabilities are hard to exaggerate;
+for example
+<ul>
+
+ <li>It took me longer to read the documentation for persistent HTTP
+ connections (the 'Keep-Alive' connection token) than to add the
+ feature to Medusa.
+
+ <li>A simple IRC-like chat server system was written in about 90 minutes.
+
+</ul>
+
+<p> I've heard similar stories from alpha test sites, and other users of
+the core async library.
+
+<h2>Server Notes</h2>
+
+<p>Both the FTP and HTTP servers use an abstracted 'filesystem object' to
+gain access to a given directory tree. One possible server extension
+technique would be to build behavior into this filesystem object,
+rather than directly into the server: Then the extension could be
+shared with both the FTP and HTTP servers.
+
+<h3>HTTP</h3>
+
+<p>The core HTTP server itself is quite simple - all functionality is
+provided through 'extensions'. Extensions can be plugged in
+dynamically. [i.e., you could log in to the server via the monitor
+service and add or remove an extension on the fly]. The basic
+file-delivery service is provided by a 'default' extension, which
+matches all URI's. You can build more complex behavior by replacing
+or extending this class.
+
+
+<p>The default extension includes support for the 'Connection: Keep-Alive'
+token, and will re-use a client channel when requested by the client.
+
+<h3>FTP</h3>
+
+<p>On Unix, the ftp server includes support for 'real' users, so that it
+may be used as a drop-in replacement for the normal ftp server. Since
+most ftp servers on Unix use the 'forking' model, each child process
+changes its user/group persona after a successful login. This is a
+appears to be a secure design.
+
+
+<p>Medusa takes a different approach - whenever Medusa performs an
+operation for a particular user [listing a directory, opening a file],
+it temporarily switches to that user's persona _only_ for the duration
+of the operation. [and each such operation is protected by a
+try/finally exception handler].
+
+
+<p>To do this Medusa MUST run with super-user privileges. This is a
+HIGHLY experimental approach, and although it has been thoroughly
+tested on Linux, security problems may still exist. If you are
+concerned about the security of your server machine, AND YOU SHOULD
+BE, I suggest running Medusa's ftp server in anonymous-only mode,
+under an account with limited privileges ('nobody' is usually used for
+this purpose).
+
+
+<p>I am very interested in any feedback on this feature, most
+especially information on how the server behaves on different
+implementations of Unix, and of course any security problems that are
+found.
+
+<hr>
+
+<h3>Monitor</h3>
+
+<p>The monitor server gives you remote, 'back-door' access to your server
+while it is running. It implements a remote python interpreter. Once
+connected to the monitor, you can do just about anything you can do from
+the normal python interpreter. You can examine data structures, servers,
+connection objects. You can enable or disable extensions, restart the server,
+reload modules, etc...
+
+<p>The monitor server is protected with an MD5-based authentication
+similar to that proposed in RFC1725 for the POP3 protocol. The server
+sends the client a timestamp, which is then appended to a secret
+password. The resulting md5 digest is sent back to the server, which
+then compares this to the expected result. Failed login attempts are
+logged and immediately disconnected. The password itself is not sent
+over the network (unless you have foolishly transmitted it yourself
+through an insecure telnet or X11 session. 8^)
+
+<p>For this reason telnet cannot be used to connect to the monitor
+server when it is in a secure mode (the default). A client program is
+provided for this purpose. You will be prompted for a password when
+starting up the server, and by the monitor client.
+
+<p>For extra added security on Unix, the monitor server will
+eventually be able to use a Unix-domain socket, which can be protected
+behind a 'firewall' directory (similar to the InterNet News server).
+
+<hr>
+<h2>Performance Notes</h2>
+
+<h3>The <code>select()</code> function</h3>
+
+<p>At the heart of Medusa is a single <code>select()</code> loop.
+This loop handles all open socket connections, both servers and
+clients. It is in effect constantly asking the system: 'which of
+these sockets has activity?'. Performance of this system call can
+vary widely between operating systems.
+
+<p>There are also often builtin limitations to the number of sockets
+('file descriptors') that a single process, or a whole system, can
+manipulate at the same time. Early versions of Linux placed draconian
+limits (256) that have since been raised. Windows 95 has a limit of
+64, while OSF/1 seems to allow up to 4096.
+
+<p>These limits don't affect only Medusa, you will find them described
+in the documentation for other web and ftp servers, too.
+
+<p>The documentation for the Apache web server has some excellent
+notes on tweaking performance for various Unix implementations. See
+<a href="http://www.apache.org/docs/misc/perf.html">
+http://www.apache.org/docs/misc/perf.html</a>
+for more information.
+
+<h3>Buffer sizes</h3>
+
+<p>
+The default buffer sizes used by Medusa are set with a bias toward
+Internet-based servers: They are relatively small, so that the buffer
+overhead for each connection is low. The assumption is that Medusa
+will be talking to a large number of low-bandwidth connections, rather
+than a smaller number of high bandwidth.
+
+<p>This choice trades run-time memory use for efficiency - the down
+side of this is that high-speed local connections (i.e., over a local
+ethernet) will transfer data at a slower rate than necessary.
+
+<p>This parameter can easily be tweaked by the site designer, and can
+in fact be adjusted on a per-server or even per-client basis. For
+example, you could have the FTP server use larger buffer sizes for
+connections from certain domains.
+
+<p>If there's enough interest, I have some rough ideas for how to make
+these buffer sizes automatically adjust to an optimal setting. Send
+email if you'd like to see this feature.
+
+<hr>
+
+<p>See <a href="medusa.html">./medusa.html</a> for a brief overview of
+some of the ideas behind Medusa's design, and for a description of
+current and upcoming features.
+
+<p><h3>Enjoy!</h3>
+
+<hr>
+<br>-Sam Rushing
+<br><a href="mailto:rushing@nightmare.com">rushing@nightmare.com</a>
+
+<!--
+ Local Variables:
+ indent-use-tabs: nil
+ end:
+-->
+
+</body>
+</html>
=== Zope/lib/python/ZServer/medusa/docs/composing_producers.gif 1.3 => 1.4 ===
<Binary-ish file>
=== Zope/lib/python/ZServer/medusa/docs/data_flow.gif 1.3 => 1.4 ===
<Binary-ish file>
=== Zope/lib/python/ZServer/medusa/docs/data_flow.html 1.3 => 1.4 ===
--- /dev/null Tue Mar 18 16:16:41 2003
+++ Zope/lib/python/ZServer/medusa/docs/data_flow.html Tue Mar 18 16:16:40 2003
@@ -0,0 +1,83 @@
+
+<h1>Data Flow in Medusa</h1>
+
+<img src="data_flow.gif">
+
+<p>Data flow, both input and output, is asynchronous. This is
+signified by the <i>request</i> and <i>reply</i> queues in the above
+diagram. This means that both requests and replies can get 'backed
+up', and are still handled correctly. For instance, HTTP/1.1 supports
+the concept of <i>pipelined requests</i>, where a series of requests
+are sent immediately to a server, and the replies are sent as they are
+processed. With a <i>synchronous</i> request, the client would have
+to wait for a reply to each request before sending the next.</p>
+
+<p>The input data is partitioned into requests by looking for a
+<i>terminator</i>. A terminator is simply a protocol-specific
+delimiter - often simply CRLF (carriage-return line-feed), though it
+can be longer (for example, MIME multi-part boundaries can be
+specified as terminators). The protocol handler is notified whenever
+a complete request has been received.</p>
+
+<p>The protocol handler then generates a reply, which is enqueued for
+output back to the client. Sometimes, instead of queuing the actual
+data, an object that will generate this data is used, called a
+<i>producer</i>.</p>
+
+<img src="producers.gif">
+
+<p>The use of <code>producers</code> gives the programmer
+extraordinary control over how output is generated and inserted into
+the output queue. Though they are simple objects (requiring only a
+single method, <i>more()</i>, to be defined), they can be
+<i>composed</i> - simple producers can be wrapped around each other to
+create arbitrarily complex behaviors. [now would be a good time to
+browse through some of the producer classes in
+<code>producers.py</code>.]</p>
+
+<p>The HTTP/1.1 producers make an excellent example. HTTP allows
+replies to be encoded in various ways - for example a reply consisting
+of dynamically-generated output might use the 'chunked' transfer
+encoding to send data that is compressed on-the-fly.</p>
+
+<img src="composing_producers.gif">
+
+<p>In the diagram, green producers actually generate output, and grey
+ones transform it in some manner. This producer might generate output
+looking like this:
+
+<pre>
+ HTTP/1.1 200 OK
+ Content-Encoding: gzip
+ Transfer-Encoding: chunked
+ Header ==> Date: Mon, 04 Aug 1997 21:31:44 GMT
+ Content-Type: text/html
+ Server: Medusa/3.0
+
+ Chunking ==> 0x200
+ Compression ==> <512 bytes of compressed html>
+ 0x200
+ <512 bytes of compressed html>
+ ...
+ 0
+
+</pre>
+
+<p>Still more can be done with this output stream: For the purpose of
+efficiency, it makes sense to send output in large, fixed-size chunks:
+This transformation can be applied by wrapping a 'globbing' producer
+around the whole thing.</p>
+
+<p>An important feature of Medusa's producers is that they are
+actually rather small objects that do not expand into actual output
+data until the moment they are needed: The <code>async_chat</code>
+class will only call on a producer for output when the outgoing socket
+has indicated that it is ready for data. Thus Medusa is extremely
+efficient when faced with network delays, 'hiccups', and low bandwidth
+clients.
+
+<p>One final note: The mechanisms described above are completely
+general - although the examples given demonstrate application to the
+<code>http</code> protocol, Medusa's asynchronous core has been
+applied to many different protocols, including <code>smtp</code>,
+<code>pop3</code>, <code>ftp</code>, and even <code>dns</code>.
=== Zope/lib/python/ZServer/medusa/docs/producers.gif 1.3 => 1.4 ===
<Binary-ish file>
=== Zope/lib/python/ZServer/medusa/docs/proxy_notes.txt 1.3 => 1.4 ===
--- /dev/null Tue Mar 18 16:16:41 2003
+++ Zope/lib/python/ZServer/medusa/docs/proxy_notes.txt Tue Mar 18 16:16:40 2003
@@ -0,0 +1,36 @@
+
+# we can build 'promises' to produce external data. Each producer
+# contains a 'promise' to fetch external data (or an error
+# message). writable() for that channel will only return true if the
+# top-most producer is ready. This state can be flagged by the dns
+# client making a callback.
+
+# So, say 5 proxy requests come in, we can send out DNS queries for
+# them immediately. If the replies to these come back before the
+# promises get to the front of the queue, so much the better: no
+# resolve delay. 8^)
+#
+# ok, there's still another complication:
+# how to maintain replies in order?
+# say three requests come in, (to different hosts? can this happen?)
+# yet the connections happen third, second, and first. We can't buffer
+# the entire request! We need to be able to specify how much to buffer.
+#
+# ===========================================================================
+#
+# the current setup is a 'pull' model: whenever the channel fires FD_WRITE,
+# we 'pull' data from the producer fifo. what we need is a 'push' option/mode,
+# where
+# 1) we only check for FD_WRITE when data is in the buffer
+# 2) whoever is 'pushing' is responsible for calling 'refill_buffer()'
+#
+# what is necessary to support this 'mode'?
+# 1) writable() only fires when data is in the buffer
+# 2) refill_buffer() is only called by the 'pusher'.
+#
+# how would such a mode affect things? with this mode could we support
+# a true http/1.1 proxy? [i.e, support <n> pipelined proxy requests, possibly
+# to different hosts, possibly even mixed in with non-proxy requests?] For
+# example, it would be nice if we could have the proxy automatically apply the
+# 1.1 chunking for 1.0 close-on-eof replies when feeding it to the client. This
+# would let us keep our persistent connection.