[Zope-dev] Adding gzip compression to HTTPResponse.py
Brad Clements
bkc@murkworks.com
Tue, 5 Feb 2002 17:34:26 -0500
I'm looking for architectural suggestions for adding gzip compression to
HTTPResponse for text types.
First, I just wanted to compress xml-rpc output, since I'm returning lots of table data as
XML text (not objects), then loading that text/xml into a DOM for XSLT processing.
I hacked the attached code into HTTPResponse, at the end of setBody. It works for
xml-rpc responses and I suppose any text output, so long as the response object has a
header named "dogzip" set.
I know "dogzip" is a stupid name, but this is just a testing thing.
Representative compressions:
compress oldlen 150366 new len 11926
compress oldlen 204382 new len 14170
compress oldlen 12746 new len 1364
As you can see, the compression is very worthwhile for xml-rpc output.
But for HTML output, what's really needed, I think, is a special kind of Cache Object:
one that combines HTTP caching with RAM caching to keep gzip-compressed objects
in memory.
Some HTML pages are really quite large, and gzip compression can make a noticeable
difference; the embedded JavaScript alone is really big. A rough sketch of the caching
idea follows.
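This is only a back-of-the-envelope sketch, not the real RAMCacheManager API: a
module-level dict of pre-gzipped bodies keyed by URL, with the uncompressed length as a
crude freshness check, consulted before compressing again.

    # Speculative sketch only -- names and structure are mine, not Zope's.
    _gzip_cache = {}

    def cached_gzip_body(path, body, compress):
        entry = _gzip_cache.get(path)
        if entry is not None and entry[0] == len(body):
            return entry[1]                 # reuse the cached gzip body
        z = compress(body)                  # compress() as in the patch below
        _gzip_cache[path] = (len(body), z)
        return z

A real Cache Object would obviously need proper invalidation and size limits; this just
shows where the compressed body would live.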
For xml-rpc, obviously every response should be compressed whenever it's "worth it", and I
can see that having to set a response property on a per-request basis is appropriate for
xml-rpc.
But for text File objects, Page Templates and the like: how does setBody interact with RAM
Cache objects? I have some ideas...
Anyone think this is worthwhile?
Also, RESPONSE.setBody really should have access to REQUEST.headers. What's
the clean way to do that? Just pass the request object to the response object's __init__
method?
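With access to the request, the decision could honour the client's Accept-Encoding header
instead of (or in addition to) the "dogzip" flag. A small sketch, assuming the request
object is available; the helper name is mine, but get_header is (I believe) the existing
HTTPRequest accessor:

    def client_accepts_gzip(request):
        # Fall back to '' when the client sent no Accept-Encoding at all.
        accept = request.get_header('Accept-Encoding') or ''
        return accept.find('gzip') >= 0

setBody would then only compress when client_accepts_gzip(request) is true.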
Here's the quick gzip compression hack, based on code posted by Neil Schemenauer.
Thanks Neil.
Added at about line 265 in HTTPResponse.py (near the end of setBody) in Zope 2.5 B3:
    try:
        dogzip = self.headers['dogzip']
        del self.headers['dogzip']
        if dogzip and content_type.split('/')[0] == 'text':
            body = self.body
            startlen = len(body)
            import zlib, struct
            # Minimal gzip header (RFC 1952).
            _gzip_header = ("\037\213"          # magic number
                            "\010"              # compression method: deflate
                            "\000"              # flags
                            "\000\000\000\000"  # modification time
                            "\002"              # extra flags: maximum compression
                            "\377")             # operating system: unknown
            # Raw deflate stream (negative wbits suppresses the zlib wrapper).
            co = zlib.compressobj(6, zlib.DEFLATED, -zlib.MAX_WBITS,
                                  zlib.DEF_MEM_LEVEL, 0)
            # gzip trailer: CRC32 of the uncompressed data, then its length.
            chunks = [_gzip_header, co.compress(body), co.flush(),
                      struct.pack("<ll", zlib.crc32(body), startlen)]
            z = "".join(chunks)
            newlen = len(z)
            print "compress oldlen", startlen, "new len", newlen
            # Only use the compressed body if it is actually smaller.
            if newlen < startlen:
                self.body = z
                self.setHeader('content-length', newlen)
                self.setHeader('content-encoding', 'gzip')
    except:
        # Swallow everything (including a missing 'dogzip' header) so a
        # compression failure never breaks the response.
        pass
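A quick way to sanity-check the stream this produces (standalone Python 2 snippet, the
names are mine) is to run it back through the standard gzip module:

    import gzip, StringIO

    def check_roundtrip(z, original):
        # GzipFile should accept our hand-built header, body and trailer.
        f = gzip.GzipFile(fileobj=StringIO.StringIO(z))
        return f.read() == original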
Brad Clements, bkc@murkworks.com (315)268-1000
http://www.murkworks.com (315)268-9812 Fax
netmeeting: ils://ils.murkworks.com AOL-IM: BKClements