RE: [Zope] kludge around byteserving problem (2.4)
<speculation type="mere">
Methinks the Adobe (boo, hiss) plugin is using HTTP/1.1 to retrieve _parts_ of the .pdf file. ZServer does not support HTTP/1.1 completely, hence the need for a hack.
</speculation>

I haven't looked at the HTTP requests and headers to verify.

Troy

-----Original Message-----
From: Kyler B. Laird [mailto:laird@ecn.purdue.edu]
Sent: Wednesday, July 18, 2001 3:52 PM
To: zope@zope.org
Subject: [Zope] kludge around byteserving problem (2.4)

It seems that 2.4.0b3 has a problem with byteserving, but I haven't figured it out enough to submit a problem report. The symptom, however, is that large (>32KB) PDF objects can be downloaded and viewed in an external helper app, but don't appear in the Adobe plugin.

My kludge is to make a simple Python Script that simply serves up the object's data.

    request = context.REQUEST
    RESPONSE = request.RESPONSE
    objectname=str(request.other['traverse_subpath'][0])
    object=context[objectname]
    RESPONSE.setHeader('Content-type', object.getProperty('content_type'))
    return object.data

This guarantees that no byteserving will take place (because it's dynamic). Named "viewfilter", it's called as
    .../viewfilter/foo.pdf

I'll try to figure out what's really going on and submit it to the tracker.

--kyler

_______________________________________________
Zope maillist - Zope@zope.org
http://lists.zope.org/mailman/listinfo/zope
** No cross posts or HTML encoding! **
(Related lists -
 http://lists.zope.org/mailman/listinfo/zope-announce
 http://lists.zope.org/mailman/listinfo/zope-dev )
On Wed, 18 Jul 2001 15:56:51 -0500 you wrote:
<speculation type="mere"> Methinks the Adobe (boo, hiss) plugin is using HTTP/1.1 to retrieve _parts_ of the .pdf file.
I'm not familiar with HTTP/1.1 "parts", but I am fairly confident that it's using byterange requests. I can see it request 32KB then make several other requests for different sizes. I tried eliminating the Accept-Ranges header in the response and other kludges, but it continued.

Some things to check for more info on Adobe workings...
http://www.adobe.com/support/techdocs/3d76.htm
http://www.adobe.com/support/techguides/acrobat/byteserve/byteservmain.html

Incidentally, we didn't notice this before 2.4 (but I didn't have all of my PDF tools installed then, so maybe the files weren't so large?).

--kyler
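To make the byterange discussion concrete, here is a minimal Python 3 sketch of how a `Range: bytes=...` header value maps onto inclusive byte offsets. The function name `parse_byte_ranges` is hypothetical (it is not part of Zope or the Acrobat plugin), and the sketch covers only the simple forms `a-b`, `a-` and `-n`, with no validation beyond that.

```python
def parse_byte_ranges(header_value, size):
    """Parse 'bytes=a-b, c-d' into a list of inclusive (start, end) pairs.

    `size` is the total length of the object being served.
    Hypothetical helper for illustration only.
    """
    unit, _, spec = header_value.partition('=')
    if unit.strip() != 'bytes':
        raise ValueError('unsupported range unit: %r' % unit)
    ranges = []
    for part in spec.split(','):
        first, _, last = part.strip().partition('-')
        if first == '':
            # Suffix form "-N": the final N bytes of the object.
            length = int(last)
            start, end = max(size - length, 0), size - 1
        else:
            start = int(first)
            # Open-ended form "N-": from offset N to the end of the object.
            end = int(last) if last else size - 1
            end = min(end, size - 1)
        if start <= end:
            ranges.append((start, end))
    return ranges


# Offsets are inclusive, so a 32KB request starting at 0 ends at byte 32767.
print(parse_byte_ranges('bytes=0-32767', 2547654))
```

The example request is only illustrative; the exact ranges the plugin asks for would have to be read from a trace of the real traffic.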
Hi,

this seems to be another good reason to put Squid in front of Zope. It handles byte-range requests on its own, loading the whole object from Zope at once.

Regards
Tino

--On Wednesday, 18 July 2001 16:26 -0500 "Kyler B. Laird" <laird@ecn.purdue.edu> wrote:
On Wed, 18 Jul 2001 15:56:51 -0500 you wrote:
<speculation type="mere"> Methinks the Adobe (boo, hiss) plugin is using HTTP/1.1 to retrieve _parts_ of the .pdf file.
I'm not familiar with HTTP/1.1 "parts", but I am fairly confident that it's using byterange requests. I can see it request 32KB then make several other requests for different sizes. I tried eliminating the Accept-Ranges header in the response and other kludges, but it continued.
Some things to check for more info on Adobe workings...
http://www.adobe.com/support/techdocs/3d76.htm
http://www.adobe.com/support/techguides/acrobat/byteserve/byteservmain.html
Incidentally, we didn't notice this before 2.4 (but I didn't have all of my PDF tools installed then, so maybe the files weren't so large?).
--kyler
On Thu, 19 Jul 2001 10:07:40 +0200 you wrote:
this seems to be another good reason to put Squid in front of Zope. It handles byte-range requests on its own, loading the whole object from Zope at once.
Ah...I hadn't thought of this. Yes, it would not only get around this (temporary?!) problem, but it could be a way to move some processing away from our Zope servers (ZEO clients) - something I'm constantly trying to do.

I'm pretty locked in to Apache right now (for features like SSL and PostgreSQL logging), but I think I can see how you might instruct a proxy to only provide this byterange service when Zope signals static content by returning an Accept-Ranges header.

It's tempting to build this into Apache's proxy module. It's been several years since I touched that, though. I doubt I'll do it anytime soon, but I'll be watching for it as an option later.

Thanks for the pointer.

--kyler
ZServer does not support HTTP/1.1 completely, hence the need for a hack. </speculation> I haven't looked at the HTTP requests and headers to verify.
[With my kludge in place, I have time to breathe and look into this some more. I've changed the subject line in an attempt to be more accurate.]

I have set up a File that exhibits the bad behavior.
    https://engineering.purdue.edu/test/WRRC.pdf
For testing, it's also available through HTTP at
    http://maverick.ecn.purdue.edu:8080/test/WRRC.pdf

I verified that it results in a blank page (without Adobe plugin controls) using MS Windows browsers.

A request looks like this to the Apache proxy.
    [19/Jul/2001:09:37:50 -0500] 128.46.125.148 SSLv3 RC4-MD5 "GET /test/WRRC.pdf HTTP/1.0" 40960
    [19/Jul/2001:09:37:59 -0500] 128.46.125.148 SSLv3 RC4-MD5 "GET /test/WRRC.pdf HTTP/1.0" 2539462

The total file size is 2547654 bytes. The sum of the bytes received in the requests above is 2580422. It *could* have gotten everything using a byterange request. (When stored, the file is perfectly viewable by the non-plugin Reader.)

I decided to test Zope's byterange handling. I used a terrible little Python script (included below for your amusement) to grab random ranges of the problem file and check the contents against the original file. I let it beat on the server for a while, but it did not come up with any discrepancies.

It appears that Zope can properly serve ranges of a File in some situations. I suspect that there is a more complicated interaction with the Adobe plugin. My first guess is that it has to do with keep-alive. I'm not sure how to test that, though.

I suspect that my next test will involve setting up a proxy server that lets me monitor the entire transaction. Looking at the headers and returned content *should* shine some light on this.
--kyler

========================================================
#!/usr/bin/env python

import httplib
import time
import random

def http_get_range(host='maverick.ecn.purdue.edu:8080', path='/test/WRRC.pdf', range=None):
    h = httplib.HTTP(host)
    h.putrequest('GET', path)
    if range:
        h.putheader('Range', range)
    h.endheaders()
    (errcode, errmsg, headers) = h.getreply()
    # print errcode
    print headers
    f = h.getfile()
    content = f.read()
    f.close()
    return content

file = open('WRRC.pdf')
local = file.read()

# size of complete file
size=len(local)
print "size=%d" % size

while (1):
    start = random.randrange(size)
    # Concentrate around 32K boundary.
    #start = random.randrange(32000,size)
    end = random.randrange(start,size)
    range = 'bytes=%d-%d' % (start, end)
    print range
    original = local[start:end+1]
    zope = http_get_range(range=range)
    # print 'original=%s' % (original)
    # print 'zope=%s' % (zope)
    if (zope != original):
        print 'difference for %s' % range
        break
    time.sleep(5)
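The byte counts from the two Apache log entries in the message above can be checked with a few lines of arithmetic (Python 3 here, unlike the original script). The numbers are taken directly from the log and the stated file size. What the surplus means for the plugin's behavior is not established by this check; it only confirms the totals.

```python
# Byte counts reported in the two Apache log lines.
first_response = 40960
second_response = 2539462

# Size of WRRC.pdf as stated in the message.
file_size = 2547654

total = first_response + second_response
surplus = total - file_size

print(total)    # 2580422
print(surplus)  # 32768
```

The surplus is exactly 32 * 1024 bytes, the same 32KB figure that shows up in the symptom description.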
It took a while, but I finally discovered that Zope handles multiple byte ranges with a shortcut that Adobe's plugin does not seem to like.

If I ask for
    Range: bytes=1-2, 2-3
from Apache, I'll get

    206
    Date: Tue, 24 Jul 2001 23:34:17 GMT
    Server: Apache/1.3.9 (Unix) PHP/3.0.12 FrontPage/4.0.4.3 secured_by_Raven/1.4.1
    Last-Modified: Tue, 24 Jul 2001 23:27:06 GMT
    ETag: "57acf-26dfc6-3b5e044a"
    Accept-Ranges: bytes
    Content-Length: 185
    Connection: close
    Content-Type: multipart/byteranges; boundary=3b5e05f9e

    --3b5e05f9e
    Content-type: application/pdf
    Content-range: bytes 1-2/2547654

    PD
    --3b5e05f9e
    Content-type: application/pdf
    Content-range: bytes 2-3/2547654

    DF
    --3b5e05f9e--

If I ask for the same thing from Zope, I get

    206
    Server: Zope/(Zope 2.4.0 (source release, python 2.1, linux2), python 2.1.1, linux2) ZServer/1.1b1
    Date: Tue, 24 Jul 2001 23:39:01 GMT
    Content-Type: application/pdf
    Accept-Ranges: bytes
    Connection: close
    Content-Range: bytes 1-3/2547654
    Last-Modified: Thu, 19 Jul 2001 14:39:42 GMT
    Content-Length: 3

    PDF

Looking over the HTTP/1.1 spec., it's not clear to me that this is illegal, but it sure does suck to have it break the PDF plugin.

I'm not sure I can call this a bug. Any advice?

I've included my test program below.
--kyler

============================
#!/usr/bin/env python

import httplib
import time
import random

def http_get_range(host, path, range=None):
    h = httplib.HTTP(host)
    h.putrequest('GET', path)
    if range:
        h.putheader('Range', range)
    h.endheaders()
    (errcode, errmsg, headers) = h.getreply()
    print errcode
    print headers
    f = h.getfile()
    content = f.read()
    f.close()
    return content

if (1):
    range = 'bytes=2546630-2547653, 2342854-2372853, 2372854-2402853'
    range = 'bytes=2546630-2547653, 2342854-2372853'
    range = 'bytes=1-2, 2-3'

    print '===== Apache ====='
    apache = http_get_range(host='lairds.com', path='/Kyler/tmp/WRRC.pdf', range=range)
    print apache

    print '===== Zope ====='
    zope = http_get_range(host='maverick.ecn.purdue.edu:8080', path='/test/WRRC.pdf', range=range)
    print zope
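For reference, the multipart/byteranges layout that Apache returned in the comparison can be reproduced with a short Python 3 sketch. The function `build_multipart_byteranges` is a hypothetical helper written for illustration, not code from Apache or Zope. It assumes CRLF line endings and a leading CRLF before each boundary, and it uses a stand-in byte string in place of the real PDF.

```python
def build_multipart_byteranges(data, ranges, content_type, boundary, total=None):
    """Build a multipart/byteranges body for inclusive (start, end) ranges.

    `total` is the full object size reported in Content-range; it defaults
    to len(data). Hypothetical helper for illustration only.
    """
    if total is None:
        total = len(data)
    chunks = []
    for start, end in ranges:
        header = (
            '\r\n--%s\r\n'
            'Content-type: %s\r\n'
            'Content-range: bytes %d-%d/%d\r\n'
            '\r\n' % (boundary, content_type, start, end, total)
        )
        chunks.append(header.encode('ascii'))
        # Ranges are inclusive, so the slice runs to end + 1.
        chunks.append(data[start:end + 1])
    # Closing delimiter ends the multipart body.
    chunks.append(('\r\n--%s--\r\n' % boundary).encode('ascii'))
    return b''.join(chunks)


# Stand-in for the first bytes of the PDF; only offsets 1-3 are used.
sample = b'%PDF-1.2'
body = build_multipart_byteranges(
    sample, [(1, 2), (2, 3)], 'application/pdf', '3b5e05f9e', total=2547654)
print(len(body))
```

Under these assumptions the body for `bytes=1-2, 2-3` comes to 185 bytes, which agrees with the Content-Length in the Apache response quoted earlier in the thread.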
I found that by disabling HTTPRangeSupport.optimizeRanges() I can keep Zope from messing with the desired ranges as sent by the Acrobat plugin.

This still didn't fix my problem, so I ended up commenting out the "Accept-Ranges" headers and disabling the "if range is not None:" block in OFS/Image.py.

I suggest that if you are serving up PDF files from Zope, you do this too. If you don't usually use the Acrobat _plugin_ you might not notice that some users cannot see your content.

I'm not sure what the incompatibility is now. I'm giving up on it for a while (so my eyes can get back into alignment - too much staring at trace dumps).

--kyler
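The single `Content-Range: bytes 1-3/2547654` reply that Zope gave to `bytes=1-2, 2-3` is what one would expect if overlapping ranges are coalesced before the response is built. The following Python 3 sketch shows that idea only; it is not the actual code of HTTPRangeSupport.optimizeRanges(), and `merge_ranges` is a hypothetical name. It assumes ranges are inclusive (start, end) pairs and merges only ranges that actually overlap, not ones that are merely adjacent.

```python
def merge_ranges(ranges):
    """Coalesce overlapping inclusive (start, end) byte ranges.

    Hypothetical illustration. Note that sorting discards the order in
    which the client originally listed the ranges.
    """
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous range: extend it instead of adding a part.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


print(merge_ranges([(1, 2), (2, 3)]))  # [(1, 3)]
```

With a single merged range left, a server can answer with a plain 206 and one Content-Range header instead of a multipart/byteranges body, which is the difference seen between the Zope and Apache responses.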
participants (3)
- Farrell, Troy
- Kyler B. Laird
- Tino Wildenhain