byteserving vs. Adobe plugin (was Re: [Zope] kludge around byteserving problem (2.4))

Thu, 19 Jul 2001 10:09:53 -0500

>ZServer does not support HTTP/1.1 completely, hence the
>need for a hack.
></speculation>
>I haven't looked at the HTTP requests and headers to verify.

[With my kludge in place, I have time to breathe and
look into this some more.  I've changed the subject
line in an attempt to be more accurate.]

I have set up a File that exhibits the bad behavior.
	https://engineering.purdue.edu/test/WRRC.pdf
For testing, it's also available through HTTP at
	http://maverick.ecn.purdue.edu:8080/test/WRRC.pdf
I verified that it results in a blank page (without
Adobe plugin controls) in MS Windows browsers.  The
requests look like this in the Apache proxy's log.
	[19/Jul/2001:09:37:50 -0500] 128.46.125.148 SSLv3 RC4-MD5 "GET /test/WRRC.pdf HTTP/1.0" 40960
	[19/Jul/2001:09:37:59 -0500] 128.46.125.148 SSLv3 RC4-MD5 "GET /test/WRRC.pdf HTTP/1.0" 2539462

The total file size is 2547654 bytes.  The sum of
the bytes sent in the two responses above is
2580422, which is 32768 bytes more than the file
itself.  It *could* have gotten everything using a
byterange request.  (When saved to disk, the file is
perfectly viewable by the non-plugin Reader.)
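
For anyone who wants to check the arithmetic, here is a
small sketch in Python 3 (not the Python of the script
below).  The three numbers are copied from the log lines
and the file size above; the variable names are just
mine.  It only does the sums and says nothing about what
the plugin actually asked for.

```python
# Byte counts copied from the two proxy log lines and the stated file size.
first_response = 40960
second_response = 2539462
file_size = 2547654

# Total bytes the proxy logged for the two GETs.
total_sent = first_response + second_response

# How far the total overshoots a single copy of the file.
excess = total_sent - file_size

print("total sent: %d" % total_sent)
print("excess over file size: %d" % excess)
print("excess is exactly 32 KiB: %s" % (excess == 32 * 1024))
```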

I decided to test Zope's byterange handling.  I
used a terrible little Python script (included
below for your amusement) to grab random ranges of
the problem file and check the contents against the
original file.  I let it beat on the server for
a while, but it did not come up with any
discrepancies.

It appears that Zope can properly serve ranges of
a File in some situations.  I suspect that there is
a more complicated interaction with the Adobe
plugin.  My first guess is that it has to do with
keep-alive.  I'm not sure how to test that, though.
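
One possible way to poke at the keep-alive guess is to
send several Range requests over a single connection
and compare each reply with the local copy.  This is a
rough sketch in Python 3 using http.client, not
something I have run against the server.  The function
names fetch_ranges_keepalive and check_ranges are made
up for illustration.  Note that http.client quietly
reconnects if the server closes the connection, so this
alone does not prove the socket was reused.

```python
import http.client


def fetch_ranges_keepalive(conn, path, ranges):
    """Request each (start, end) byte range over the one connection `conn`.

    Returns a list of (status, content_range_header, body) tuples.
    """
    results = []
    for start, end in ranges:
        conn.request('GET', path, headers={
            'Range': 'bytes=%d-%d' % (start, end),
            'Connection': 'keep-alive',
        })
        resp = conn.getresponse()
        # The body must be read completely before the next request is sent.
        body = resp.read()
        results.append((resp.status, resp.getheader('Content-Range'), body))
    return results


def check_ranges(local, results, ranges):
    """Return the ranges whose reply was not a 206 matching the local bytes."""
    bad = []
    for (start, end), (status, _content_range, body) in zip(ranges, results):
        if status != 206 or body != local[start:end + 1]:
            bad.append((start, end))
    return bad


# Example use (assumes the HTTP test URL from this post and a local copy):
#   conn = http.client.HTTPConnection('maverick.ecn.purdue.edu', 8080)
#   local = open('WRRC.pdf', 'rb').read()
#   ranges = [(0, 32767), (32768, 40959), (40960, len(local) - 1)]
#   print(check_ranges(local, fetch_ranges_keepalive(conn, '/test/WRRC.pdf', ranges), ranges))
```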

I suspect that my next test will involve setting up
a proxy server that lets me monitor the entire
transaction.  Looking at the headers and returned
content *should* shine some light on this.
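
As a starting point for that kind of monitoring, here
is a minimal sketch of a logging TCP relay in Python 3.
It is an assumption about how such a proxy could look,
not a tested tool: pump, handle and serve are names I
made up, it only listens on 127.0.0.1, and it only
helps with the plain HTTP URL, since the SSL traffic
would show up encrypted.

```python
import socket
import threading


def pump(src, dst, label, log):
    """Copy bytes from src to dst until src closes, recording each chunk."""
    while True:
        chunk = src.recv(4096)
        if not chunk:
            break
        log.append((label, chunk))
        dst.sendall(chunk)
    # Pass the end-of-stream on; ignore sockets that are already closed.
    try:
        dst.shutdown(socket.SHUT_WR)
    except OSError:
        pass


def handle(client, upstream_addr, log):
    """Relay one client connection to the upstream server in both directions."""
    upstream = socket.create_connection(upstream_addr)
    t = threading.Thread(target=pump, args=(client, upstream, 'C>S', log))
    t.start()
    pump(upstream, client, 'S>C', log)
    t.join()
    client.close()
    upstream.close()


def serve(listen_port, upstream_addr, log):
    """Accept connections on 127.0.0.1:listen_port forever and relay them."""
    listener = socket.socket()
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(('127.0.0.1', listen_port))
    listener.listen(5)
    while True:
        client, _addr = listener.accept()
        threading.Thread(target=handle, args=(client, upstream_addr, log),
                         daemon=True).start()


# Example use (the listening port 8888 is an arbitrary choice):
#   log = []
#   serve(8888, ('maverick.ecn.purdue.edu', 8080), log)
```

Pointing the browser at the listening port would then
leave every request and response chunk in log, tagged
with its direction, so the Range and Connection headers
can be read afterwards.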

--kyler

========================================================

#!/usr/bin/env python

import httplib
import time
import random

def http_get_range(host='maverick.ecn.purdue.edu:8080', path='/test/WRRC.pdf', range=None):
	h = httplib.HTTP(host)
	h.putrequest('GET', path)
	if range:
		h.putheader('Range', range)

	h.endheaders()

	(errcode, errmsg, headers) = h.getreply()

	# print errcode
	print headers

	f = h.getfile()
	content = f.read()
	f.close()

	return content

# Read in binary mode so the bytes are not altered on any platform.
local = open('WRRC.pdf', 'rb').read()

# size of complete file
size=len(local)
print "size=%d" % size

while (1):
	start = random.randrange(size)
	# Concentrate around 32K boundary.
	#start = random.randrange(32000,size)
	end = random.randrange(start,size)
	range = 'bytes=%d-%d' % (start, end)

	print range

	original = local[start:end+1]
	zope = http_get_range(range=range)

	# print 'original=%s' % (original)
	# print 'zope=%s' % (zope)

	if (zope != original):
		print 'difference for %s' % range
		break

	time.sleep(5)
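
========================================================

The script above compares only the returned bodies.  A
stricter check might also confirm that the reply is a
206 and that its Content-Range header matches what was
asked for.  Here is a hedged sketch in Python 3; the
helper names parse_content_range and range_reply_ok are
mine, and it only handles the simple single-range
"bytes start-end/total" form.

```python
import re

# Single-range form only, e.g. "bytes 0-99/2547654" or "bytes 0-99/*".
_CONTENT_RANGE = re.compile(r'^bytes (\d+)-(\d+)/(\d+|\*)$')


def parse_content_range(value):
    """Return (start, end, total) from a Content-Range value, or None.

    total is None when the server reports the length as '*'.
    """
    match = _CONTENT_RANGE.match(value.strip()) if value else None
    if not match:
        return None
    start, end = int(match.group(1)), int(match.group(2))
    total = None if match.group(3) == '*' else int(match.group(3))
    return start, end, total


def range_reply_ok(status, content_range, body, start, end, size):
    """True if the reply is a 206 whose header and body length fit the request."""
    if status != 206:
        return False
    if parse_content_range(content_range) != (start, end, size):
        return False
    return len(body) == end - start + 1
```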