On Tue, 24 Oct 2000 20:31:52 +0200, seant@superchannel.org wrote:
If Zope objects know how to produce the data themselves, they can push producer(s) directly to the channel. I added a single check in ZServer.HTTPResponse (around line 256) where a temporary file is only created if the data is larger than the in-memory buffer *and* doesn't already look like a producer with 'more' as a method.
Wahay! That's been on my todo list for ages. I'll take a look when I get some time. Toby Dickenson tdickenson@geminidataloggers.com
I should also note that if you create a producer, you will have to override the __len__ method to return the entire length of the data. This is because RESPONSE.write doesn't allow you to set the length of a write, and there is code during output that checks the size of the written object.
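For illustration, here is a minimal sketch of such a producer (the class name, chunk size, and details are mine, not code from the patch):

    import os

    class FileProducer:
        # Minimal producer sketch: 'more' follows the medusa producer
        # protocol, and __len__ reports the full content length so the
        # output code that checks the size of the written object sees
        # the whole file, not just one chunk.
        def __init__(self, path, chunk_size=1 << 16):
            self.size = os.path.getsize(path)
            self.file = open(path, 'rb')
            self.chunk_size = chunk_size

        def more(self):
            # Return the next chunk; an empty string signals exhaustion.
            if self.file is None:
                return b''
            data = self.file.read(self.chunk_size)
            if not data:
                self.file.close()
                self.file = None
            return data

        def __len__(self):
            return self.size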
-Sean
How does this differ from Local FS?

cheers,

Chris

seant@superchannel.org wrote:
I have been building an "ExternalFile" class which stores the body of the file in an external file, mirroring the Zope path/hierarchy. This will allow easy integration with servers that can mount the external representation of the content and serve it with a consistent namespace.
To make life simple, I tried to move all file manipulation into Zope, including upload/download/copy/cut/paste/delete and permissions. These external files are also transaction aware.
Working with files > 20 MB I noticed some serious performance/scalability issues and investigated. Here are the results.
A diff with my changes against version 2.2.2 is available at <http://www.superchannel.org/Playground/large_file_zope2.2.2_200010241.diff>
Concerns:
Zope objects like File require data as a seekable file or as a coherent block, rather than as a stream. Initializing/updating these objects *may* require loading the entire file into memory.
In-memory buffering of request or response data could cause excessive swapping of the working set.
Multi-service architecture (ZServer->ZPublisher) could limit the reuse of stream handles.
Creating temporary files as FIFO buffers between the services causes significant swapping.
Modifications:
Using pipes

I found that FTPServer.ContentCollector was using a StringIO to buffer the uploads from FTP clients. I changed this into a TemporaryFile for a while, which revealed the leaked file descriptor bug (see below). This intermediary temp file caused one extra file copy for each request. The goal is to have no intermediary files at all, and to pipeline the content directly into the Zope objects.
To remove this FTP upload file buffer, I converted the FTP collector again from a TemporaryFile into a pipe with a reader and writer file objects. The FTPRequest receives the reader from which it can process the input on the publish thread in processInputs.
Since we are dealing with blocking pipes, it is OK to have a reader on the publish thread and a writer on the ZServer thread. The major consideration was how to read properly from a pipe throughout the chain of control, especially in cgi.FieldStorage.
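A bare-bones, stand-alone sketch of the reader/writer arrangement (with dummy data standing in for the FTP socket):

    import os
    import threading

    r_fd, w_fd = os.pipe()
    reader = os.fdopen(r_fd, 'rb')   # handed to the FTPRequest as stdin
    writer = os.fdopen(w_fd, 'wb')   # kept by the FTP content collector

    def zserver_side():
        # The ZServer thread writes data as it arrives from the client.
        for chunk in (b'a' * 8192, b'b' * 8192):
            writer.write(chunk)
        writer.close()               # EOF tells the reader the upload is done

    def publish_side():
        # The publish thread consumes the same data; read() blocks until
        # the writer produces more, so no explicit synchronization is needed.
        while True:
            data = reader.read(8192)
            if not data:
                break
        reader.close()

    t = threading.Thread(target=zserver_side)
    t.start()
    publish_side()
    t.join()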
Stdin is treated as the reader of the pipe throughout the code. All seek()s and tell()s on sys.stdin type objects (a tty, not a seekable file) should be considered illegal and removed.
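One way to flush those out during testing is to wrap stdin in a proxy that forbids them (a debugging aid I'm sketching here, not part of the patch):

    class NonSeekable:
        # Wraps a file object and raises on seek()/tell(), mimicking
        # the behaviour of a pipe so illegal calls fail fast.
        def __init__(self, fileobj):
            self._file = fileobj

        def read(self, size=-1):
            return self._file.read(size)

        def readline(self, size=-1):
            return self._file.readline(size)

        def seek(self, *args):
            raise IOError('seek() called on a pipe-backed stdin')

        def tell(self):
            raise IOError('tell() called on a pipe-backed stdin')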
Usage of FieldStorage from FTP (Unknown content-length)
To gain access to the body of a request, one typically calls REQUEST['BODY'] or REQUEST['BODYFILE']. This returns the file object the FieldStorage copied from stdin.
To prevent FieldStorage from copying the file from stdin to a temporary file, we can set the CONTENT_LENGTH header to '0' in the FTP _get_env for a STOR.
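Roughly, the environment built for a STOR then looks like this (abbreviated; the REQUEST_METHOD shown is my assumption about how the STOR is presented to ZPublisher):

    env = {
        'REQUEST_METHOD': 'PUT',   # assumption: how a STOR reaches ZPublisher
        'CONTENT_LENGTH': '0',     # keeps FieldStorage from draining stdin
        # ... the rest of the CGI variables built by _get_env ...
    }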
In this case, FieldStorage creates a temporary file but doesn't read any data from stdin, so we can return stdin directly when BODYFILE is requested and 'content-length' is '0'. However, BODYFILE could be a pipe, which doesn't support 'seek' or 'tell'. The code used to suck the data off the BODYFILE needs to be modified to adapt to the possibility of being passed a pipe.
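The adaptation amounts to reading sequentially in fixed-size chunks instead of using seek()/tell() to size the body first, for example (a hypothetical helper):

    def copy_body(bodyfile, out, chunk_size=1 << 16):
        # Works for both a temporary file and a pipe: rely on read()
        # returning '' at EOF instead of seeking to measure the body.
        copied = 0
        while True:
            data = bodyfile.read(chunk_size)
            if not data:
                break
            out.write(data)
            copied += len(data)
        return copied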
Updating Image.File to play with pipes
The _read_data method of Image.File pulls the data out of the BODYFILE and sticks it in the instance as a string, a pdata object, or a linked list of pdata objects. The existing code reads and builds the list in one clean sweep, back-to-front. I believe this keeps the pdata.data chunks out of memory by quickly (sub)committing and then deactivating (_p_changed = None) them.
Since we can no longer safely assume 'seek' is valid for BODYFILE, I tried to read and build the list front-to-back. This kept the data in memory, even though I tried to deactivate the objects quickly.
As a tradeoff, I read the data front-to-back, building the list in reverse, and then take another pass to reverse the list so it ends up in the correct order.
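In outline (a simplified sketch; the real code works on persistent pdata objects, subcommitting and deactivating each chunk as it goes):

    class Chunk:
        # Stand-in for the persistent pdata object: a data string plus
        # a 'next' pointer forming a singly linked list.
        def __init__(self, data):
            self.data = data
            self.next = None

    def read_chunks(infile, chunk_size=1 << 16):
        head = None
        while True:
            data = infile.read(chunk_size)  # front-to-back: no seek needed
            if not data:
                break
            node = Chunk(data)
            node.next = head                # prepend, so the list is reversed
            head = node
        prev = None                         # second pass: reverse the links
        while head is not None:
            nxt = head.next
            head.next = prev
            prev = head
            head = nxt
        return prev                         # head, now in correct order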
Memory usage appears to be steady, meaning the whole file is not loaded into the working set. This also prevents unnecessary reading into a temporary FieldStorage file during an FTP upload.
Web based uploads...
...suck. I do not recommend doing a web-based upload for files > 1 MB. First, a content-length is known, so we don't get the advantage of pipelining the data directly from the socket: a temporary file must be created, written, and read back. Second, I believe the content is encoded, so the transferred byte count is much higher than with FTP.
Plus, most browsers today do not support a progress bar for posts, so there is no indication of status, causing most people to click 'Upload' multiple times.
I haven't done any optimization for this case, but I have tested that it still works properly.
Cleaning up (leaked file descriptor bug)
I noticed that after uploading 20+ MB files a couple of times, I ran out of hard drive space. This didn't make sense, so I looked into which files were open by Zope. Doing an 'lsof', I found that the temporary files, which are immediately unlinked after creation, were still open until the end of the Zope process. These files (created by tempfile.TemporaryFile) needed to be closed after the end of the REQUEST and RESPONSE, rather than at the end of the Zope process.
After publishing, the close method of the REQUEST gets called. Here I added closing of stdin and of the TemporaryFile '_file' created by FieldStorage.
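Sketched as a method (illustrative only; the attribute handling is approximate, the real change lives in the REQUEST's close()):

    class RequestCleanup:
        # Explicitly closing these releases the unlinked temporary
        # files per request instead of at process exit.
        def close(self):
            stdin = getattr(self, 'stdin', None)
            if stdin is not None:
                try:
                    stdin.close()
                except (IOError, OSError):
                    pass
            fs_file = getattr(self, '_file', None)  # FieldStorage temp file
            if fs_file is not None:
                fs_file.close()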
Output producers
The ZServer.HTTPResponse object makes a good attempt at keeping large results out of memory, but does so by creating a temporary file, copying any written data into it, and then pushing a file_part_producer onto the channel output queue.
If Zope objects know how to produce the data themselves, they can push producer(s) directly to the channel. I added a single check in ZServer.HTTPResponse (around line 256) where a temporary file is only created if the data is larger than the in-memory buffer *and* doesn't already look like a producer with 'more' as a method.
If the temporary file doesn't exist, the rest of the code simply writes the data to the channel, and the channel produces the output directly from the producer created by the Zope object.
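The heuristic boils down to something like this (simplified; the limit shown is a placeholder, not the actual buffer size in HTTPResponse):

    def looks_like_producer(data):
        # Anything with a callable 'more' method is treated as a
        # medusa-style producer and passed straight to the channel.
        return hasattr(data, 'more') and callable(data.more)

    def should_spool_to_tempfile(data, memory_limit=1 << 17):
        # Producers pass through untouched; only large plain data is
        # copied to a temporary file.
        if looks_like_producer(data):
            return False
        return len(data) > memory_limit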
Using a file producer from my Zope object cuts out a file copy, and those get expensive when one is dealing with 20+ MB files. The response time is also dramatically reduced because the file copy step before streaming to the client was removed.
I would like to apply the same concept to Image.File.index_html: rather than creating a temporary file in the RESPONSE to queue the contents, create a producer that pulls the data directly out of the backend when the channel is ready to write. With the current code I am seeing a 10 second latency (233 MHz laptop) between requesting a 10 MB file and receiving the first byte. With an output producer, this latency should drop below 1 second.
I made an attempt to create a pdata_producer but failed because of ZODB errors reloading the object. I get a traceback like:
2000-10-24T09:19:08 ERROR(200) ZODB Couldn't load state for '\000\000\000\000\000\000&\370'
Traceback (innermost last):
  File /usr/local/zope/lib/python/ZODB/Connection.py, line 442, in setstate
AttributeError: 'None' object has no attribute 'load'
My hunch is that the Image, pdata_producer or pdata object gets deactivated and can't find its DB to load itself. I tried setting a _p_jar on the pdata_producer, but I don't really know what happens when the object context leaves publish_module. Since the object activation happens in the ZServer thread, some voodoo may be needed to get the proper state in the pdata_producer.... any takers?
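For reference, here is the shape of what I was attempting (hypothetical names; the caveat described above applies: more() runs on the ZServer thread after the publish transaction is gone, so the pdata chunks would need to be activated, or the connection kept alive, before control leaves publish_module):

    class PdataProducer:
        def __init__(self, pdata, size):
            self.next_chunk = pdata      # head of the pdata linked list
            self.size = size

        def more(self):
            if self.next_chunk is None:
                return b''
            # This attribute access is what blows up if the pdata object
            # has been ghosted and can no longer find its database.
            data = self.next_chunk.data
            self.next_chunk = self.next_chunk.next
            return data

        def __len__(self):
            return self.size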
Testing...
I have only tested these changes with FTPServer and HTTPServer, not PCGIServer or FCGIServer.
I have tested round-trip coherency because of the change in Image.File._read_data.
I haven't completely tested the boundary conditions where Image.File._read_data makes decisions. The extent has been large files (10+ MB) and small files (< 64K).
I haven't tested HTTPRequest.retry, which will probably fail because HTTPRequest.stdin may now be a pipe.
Third-party products which treat BODYFILE as a seekable file object may fail during FTP uploads.
Summary:
Most of these efforts are geared towards FTP, as HTTP form uploads don't seem to be worth the effort.
I haven't taken a look at HTTP PUT for WebDAV clients, etc. Similar pipelining could be used, but I doubt it would be possible without modifying cgi.FieldStorage.
Zope seems to be doing a lot with TempStorage and other ZODB magic that I didn't investigate. Some performance improvements could probably be made there as well.
With my changes, including my ExternalFile custom output producer, FTP I/O shows a dramatic increase in Zope's performance and scalability.
-Sean
On Wed, 25 Oct 2000 12:35:23 +0100, Chris Withers <chrisw@nipltd.com> wrote:
How does this differ from Local FS?
I don't recall exactly how LocalFS worked, but without this patch it basically had three options for handling its output:

1. Copy the whole file into memory before sending the first byte. (This is ZPublisher's normal publishing of a function's return value.)

2. Copy the whole file into memory a chunk at a time, and start sending the first chunk as soon as it is available. (This is normal RESPONSE.write.)

3. Copy the whole file into a temporary file a chunk at a time, and as soon as the first chunk is available, read it back and send it. (This is RESPONSE.write after a Content-Length header has been set, as used by File objects.)
Working with files > 20 MB I noticed some serious performance/scalability issues and investigated.
Mmmmmmmm Toby Dickenson tdickenson@geminidataloggers.com
There is not much difference between the ExternalFile class I'm working with and the File objects produced by LocalFS, except that ExternalFiles can be put anywhere in the Zope hierarchy while LocalFS files need to be under a LocalFS. Each approach has its pros and cons. This proposal mostly deals with the Zope framework, which will affect both products.

Chris Withers(chrisw@nipltd.com)@Wed, Oct 25, 2000 at 12:35:23PM +0100:
How does this differ from Local FS?
cheers,
Chris