[Zope-dev] opinion: speeding up large PUT uploads
Florent Guillaume
fg at nuxeo.com
Mon Apr 4 08:42:46 EDT 2005
Chris McDonough <chrism at plope.com> wrote:
> Zope's ZPublisher.HTTPRequest.HTTPRequest class has a method named
> "processInputs". This method is responsible for parsing the body of all
> requests. It parses all upload bodies regardless of method: PUT, POST,
> GET, HEAD, etc. In doing so, it uses Python's cgi.FieldStorage class to
> potentially break apart multipart/* bodies into their respective parts.
> Every invocation of FieldStorage creates a tempfile that is a copy of
> the entire upload body.
>
> So in the common case, when a large file is uploaded via HTTP PUT (both
> DAV and external editor use PUT exclusively), here's what happens:
>
> - ZServer creates a tempfile T1 to hold the file body as it gets
>   pulled in.
>
> - When the request makes it to the publisher, processInputs is called
>   and it hands off tempfile T1 to FieldStorage.
>
> - FieldStorage reads the entire body and creates another tempfile
>   T2 (an exact copy of T1*, in the case of a PUT request).
>
> - T2 is eventually put into REQUEST['BODYFILE'].
>
> (*) At least I can't imagine a case where it's not an exact copy.
>
> This is costly on large uploads. I'd like to change the top of the
> processInputs method to do this:
>
>     if method == 'PUT':
>         # we don't need to do any real input processing if we are
>         # handling a PUT request.
>         self._file = self.stdin
>         return
>
> Can anyone think of a reason I shouldn't do this?
Is stdin the medusa stream or T1 at this point? On a ConflictError
retry we need an input that is seekable, because HTTPRequest.retry
calls self.stdin.seek(0).
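
To make the concern concrete, here is a minimal sketch of how the proposed shortcut interacts with a retry. SketchRequest and NonSeekableStream are hypothetical stand-ins written for this illustration, not Zope's HTTPRequest or a medusa class, and the non-PUT branch only approximates the copy that FieldStorage makes.

```python
import io
import tempfile


class SketchRequest:
    """Hypothetical stand-in for HTTPRequest; not Zope's actual class."""

    def __init__(self, stdin, method):
        self.stdin = stdin
        self.method = method
        self._file = None

    def processInputs(self):
        if self.method == 'PUT':
            # Proposed shortcut: reuse the incoming body, no second copy.
            self._file = self.stdin
            return
        # Other methods: copy the body into a new tempfile (assumed
        # approximation of what FieldStorage does with the upload).
        self._file = tempfile.TemporaryFile()
        self._file.write(self.stdin.read())
        self._file.seek(0)

    def retry(self):
        # Mirrors the self.stdin.seek(0) call in HTTPRequest.retry;
        # this raises if stdin cannot be rewound.
        self.stdin.seek(0)
        return SketchRequest(self.stdin, self.method)


class NonSeekableStream:
    """Hypothetical socket-like stream that can be read only once."""

    def __init__(self, data):
        self._buf = io.BytesIO(data)

    def read(self, size=-1):
        return self._buf.read(size)

    def seek(self, pos, whence=0):
        raise io.UnsupportedOperation("stream is not seekable")


def body_after_retry(stream):
    """Process a PUT, consume the body, retry once, and re-read the body."""
    request = SketchRequest(stream, 'PUT')
    request.processInputs()
    request._file.read()        # the first attempt consumes the body
    retried = request.retry()   # e.g. after a ConflictError
    retried.processInputs()
    return retried._file.read()
```

If stdin is the tempfile T1, the retry sees the full body again. If it is a raw stream, the retry fails at seek(0), so the shortcut is only safe in the first case.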
Florent
--
Florent Guillaume, Nuxeo (Paris, France) CTO, Director of R&D
+33 1 40 33 71 59 http://nuxeo.com fg at nuxeo.com