Isolating changes that can only be done globally
Hello All,

This is probably more a Python question, but maybe you will have a quick solution for me - I guess the greatest *multi-threaded* Python application provides the greatest basis for this problem domain :-)

The situation: an external method makes an HTTP request using urllib2. I don't care about the response, or whether there is any response at all; I just need to send the request through - so I want the request to time out very quickly (say, 2 seconds). I found no documented way to set a timeout for urllib2; after some googling I found advice to manipulate the timeout on the lower-level socket module, something like this:

import socket
import urllib2

def do_request():
    timeout = 2
    socket.setdefaulttimeout(timeout)
    req = urllib2.Request(url='http://my.site.com/do_something_quick')
    response = urllib2.urlopen(req)

The problem is that this default timeout applies to _all_ new sockets created by this Python process through the socket module. Even restoring the previous default right after the request won't help much - there are three more threads in my Zope that should keep working with the default timeout, and they may create sockets during the window in which mine is lowered. So there are two possible solutions: find a (preferably documented) way to access and parameterize the socket object urllib2 creates for the request, or find a way to isolate(??) global module settings between Zope threads.

Zope 2.8.3, Python 2.4.2, FreeBSD 4.10, if that is relevant.

Any hints/TFMs?

Many thanks
Michael
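A small standalone demonstration (no network needed) of why the module-level default is risky - every socket created afterwards, in any thread, inherits it:

```python
import socket

# Before touching anything, the process-wide default is "no timeout".
print(socket.getdefaulttimeout())   # None

socket.setdefaulttimeout(2)

# Every socket created from now on, in ANY thread, inherits the 2s timeout.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
print(s.gettimeout())               # 2.0

# Restoring the default does not help sockets created in the window above,
# and other threads may have created sockets while the default was lowered.
socket.setdefaulttimeout(None)
s.close()
```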
How about a different approach: use an os.fork or spawn call that launches a 'wget' (or equivalent) command in a new process.

Caution: depending on how you do this, you may end up with a bunch of dead 'child' processes (zombies) cluttering up the process table, and you will need to implement some kind of reaper (google is your friend here).

hth
Jonathan

----- Original Message -----
From: "Michael Vartanyan" <pycry@doli.biz>
To: <zope@zope.org>
Sent: Monday, May 29, 2006 7:27 PM
Subject: [Zope] Isolating changes that can only be done globally
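(A rough sketch of the spawn-a-child approach, for illustration only - the wget flags shown are examples, and it uses today's subprocess module, which in Python 2.4 lacks DEVNULL; open(os.devnull, 'w') works there:)

```python
import subprocess

def fire_and_forget(cmd):
    """Launch cmd without blocking on it.

    The caller should eventually wait() or poll() the returned process,
    otherwise it lingers as a zombie in the process table.
    """
    return subprocess.Popen(
        cmd,
        stdout=subprocess.DEVNULL,   # we do not care about the response
        stderr=subprocess.DEVNULL,
    )

# Illustrative only: ask wget to give up after 2 seconds and a single try.
# proc = fire_and_forget(['wget', '-q', '-T', '2', '-t', '1',
#                         '-O', '/dev/null',
#                         'http://my.site.com/do_something_quick'])
# ... later, reap it:
# proc.wait()
```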
_______________________________________________
Zope maillist - Zope@zope.org
http://mail.zope.org/mailman/listinfo/zope
** No cross posts or HTML encoding! **
(Related lists -
 http://mail.zope.org/mailman/listinfo/zope-announce
 http://mail.zope.org/mailman/listinfo/zope-dev )
I suspect you're going to have to subclass the urllib2 opener class(es) and use a timeout-aware socket where they create sockets. Making this an option upstream would be a valuable addition to Python, FWIW.

The other way to do this would be to ditch urllib2 and write a very simple HTTP client using asyncore (which is far more malleable). One that I created, based on some code I found floating around in RDFLib, is attached.
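The point of a timeout-aware socket class is that socket objects carry an individual timeout, so nothing global needs to change; a minimal illustration:

```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.settimeout(2)                       # applies to this one socket only

print(s.gettimeout())                 # 2.0
print(socket.getdefaulttimeout())     # None: other threads' sockets
                                      # keep the untouched default
s.close()
```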
FWIW, the last time I did something like this I ended up using pycurl (http://pycurl.sourceforge.net/). I don't have that code any more (it belongs to a former employer), but it was pretty easy to get working, it supported timeouts, it was fast, and it worked with a cantankerous proxy server setup that broke httplib. (Discussion here: http://mail.python.org/pipermail/python-list/2005-April/275678.html)

-PW

On Tue, May 30, 2006 at 12:21:48AM -0400, Chris McDonough wrote:
# this code based on Daniel Krech's RDFLib HTTP client code (see rdflib.net)

import sys
import socket
import asyncore
import asynchat
import base64
from urlparse import urlparse

CR = "\x0d"
LF = "\x0a"
CRLF = CR + LF


class Listener(object):

    def status(self, url, status):
        pass

    def error(self, url, error):
        pass

    def response_header(self, url, name, value):
        pass

    def done(self, url):
        pass

    def feed(self, url, data):
        print data

    def close(self, url):
        pass


class HTTPHandler(object, asynchat.async_chat):

    def __init__(self, listener, username='', password=None):
        super(HTTPHandler, self).__init__()
        asynchat.async_chat.__init__(self)
        self.listener = listener
        self.user_agent = 'Supervisor HTTP Client'
        self.buffer = ''
        self.set_terminator(CRLF)
        self.connected = 0
        self.part = self.status_line
        self.chunk_size = 0
        self.chunk_read = 0
        self.length_read = 0
        self.length = 0
        self.encoding = None
        self.username = username
        self.password = password
        self.url = None
        self.error_handled = False

    def get(self, url):
        # note: assert takes a bare expression; asserting a (expr, msg)
        # tuple is always true
        assert self.url is None, "Already doing a get"  #@@
        self.url = url
        scheme, host, path, params, query, fragment = urlparse(url)
        if not scheme == "http":
            raise NotImplementedError
        self.host = host
        if ":" in host:
            hostname, port = host.split(":", 1)
            port = int(port)
        else:
            hostname = host
            port = 80
        self.path = "?".join([path, query])
        self.port = port
        ip = hostname
        self.create_socket(socket.AF_INET, socket.SOCK_STREAM)
        self.connect((ip, self.port))

    def close(self):
        self.listener.close(self.url)
        self.connected = 0
        self.del_channel()
        self.socket.close()
        self.url = "CLOSED"

    def header(self, name, value):
        self.push('%s: %s' % (name, value))
        self.push(CRLF)

    def handle_error(self):
        if self.error_handled == True:
            return
        if 1 or self.connected:
            t, v, tb = sys.exc_info()
            print t, v, tb
            msg = 'Cannot connect to %s, error: %s' % (self.url, t)
            self.part = self.ignore
            self.close()
            print msg
            self.error_handled = True

    def handle_connect(self):
        self.connected = 1
        method = "GET"
        version = "HTTP/1.1"
        self.push("%s %s %s" % (method, self.path, version))
        self.push(CRLF)
        self.header("Host", self.host)
        self.header('Accept-Encoding', 'chunked')
        self.header('Accept', '*/*')
        self.header('User-agent', self.user_agent)
        if self.password:
            auth = '%s:%s' % (self.username, self.password)
            auth = base64.encodestring(auth).strip()
            self.header('Authorization', 'Basic %s' % auth)
        self.push(CRLF)
        self.push(CRLF)

    def feed(self, data):
        self.listener.feed(self.url, data)

    def collect_incoming_data(self, bytes):
        self.buffer = self.buffer + bytes
        if self.part == self.body:
            self.feed(self.buffer)
            self.buffer = ''

    def found_terminator(self):
        self.part()
        self.buffer = ''

    def ignore(self):
        self.buffer = ''

    def status_line(self):
        line = self.buffer
        version, status, reason = line.split(None, 2)
        status = int(status)
        if not version.startswith('HTTP/'):
            raise ValueError(line)
        self.listener.status(self.url, status)
        if status == 200:
            self.part = self.headers
        else:
            self.part = self.ignore
            print 'Cannot read %s, status code %s' % (self.url, status)
            self.close()
        return version, status, reason

    def headers(self):
        line = self.buffer
        if not line:
            # blank line ends the header section
            if self.encoding == "chunked":
                self.part = self.chunked_size
            else:
                self.part = self.body
                self.set_terminator(self.length)
        else:
            name, value = line.split(":", 1)
            if name and value:
                name = name.lower()
                value = value.strip()
                if name == "transfer-encoding":
                    self.encoding = value
                elif name == "content-length":
                    self.length = int(value)
                self.response_header(name, value)

    def response_header(self, name, value):
        self.listener.response_header(self.url, name, value)

    def body(self):
        self.done()
        self.close()

    def done(self):
        self.listener.done(self.url)

    def chunked_size(self):
        line = self.buffer
        if not line:
            return
        chunk_size = int(line.split()[0], 16)
        if chunk_size == 0:
            self.part = self.trailer
        else:
            self.set_terminator(chunk_size)
            self.part = self.chunked_body
        self.length += chunk_size

    def chunked_body(self):
        line = self.buffer
        self.set_terminator(CRLF)
        self.part = self.chunked_size
        self.feed(line)

    def trailer(self):
        # http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.6.1
        # trailer = *(entity-header CRLF)
        line = self.buffer
        if line == CRLF:
            self.done()
            self.close()


if __name__ == '__main__':
    url = sys.argv[1]
    listener = Listener()
    handler = HTTPHandler(listener)
    try:
        handler.get(url)
    except Exception, e:
        listener.error(url, "Error connecting '%s'" % e)
    asyncore.loop()
    print "-"
--
Paul Winkler
http://www.slinkp.com
Thank you Paul and others for all your invaluable input! I considered using pycurl, and it looks like a bit of overkill for my purpose (which is simply to tickle an external URL). At the same time, I guess nobody would argue that this is the most straightforward way. I agree with Chris, though, that it would do no harm to add the possibility of adjusting the timeout at the level of the Request object in urllib2. Thanks again for your kind and quick help.
On Tue, May 30, 2006 at 04:11:33PM +0200, Michael Vartanyan wrote:
... I agree with Chris though that it would not harm to add a possibility to adjust the timeout on the level of Request object in urllib2.
Yes please! I've been annoyed by the lack of this for years, but never quite annoyed enough to fix it myself :)

--
Paul Winkler
http://www.slinkp.com
Why can you not use something along the following lines? (Note: per-socket timeout support is built into Python 2.4's socket objects.)

robert

import socket
import httplib

class Connection(httplib.HTTPConnection):

    def __init__(self, host, port=None, strict=None, scheme="http",
                 timeout=5):
        self.scheme = scheme
        if scheme == "http":
            self.default_port = httplib.HTTP_PORT
        elif scheme == "https":
            self.default_port = httplib.HTTPS_PORT
        else:
            raise ValueError, "Invalid scheme '%s'" % (scheme,)
        httplib.HTTPConnection.__init__(self, host, port, strict)
        self.timeout = timeout

    def connect(self):
        # We want to connect in normal blocking mode, else Python on Windows
        # will time out even for 'connection refused' style errors (fixed in
        # Python 2.4 on Windows)
        if self.scheme == "http":
            httplib.HTTPConnection.connect(self)
        elif self.scheme == "https":
            # Clone of httplib.HTTPSConnection.connect
            sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            sock.connect((self.host, self.port))
            key_file = cert_file = None
            ssl = socket.ssl(sock, key_file, cert_file)
            self.sock = httplib.FakeSocket(sock, ssl)
        else:
            raise ValueError, "Invalid scheme '%s'" % (self.scheme,)
        # Once we have connected, set the timeout.
        self.sock.settimeout(self.timeout)
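The key trick above - connect in blocking mode, then narrow the timeout on that one socket - can be seen in action without any network, using a locally connected socket pair (socket.socketpair is POSIX-only, which matches the original FreeBSD setup):

```python
import socket

a, b = socket.socketpair()     # already "connected", like after connect()
a.settimeout(0.1)              # narrow the timeout on this one socket only

try:
    a.recv(1)                  # the peer sends nothing, so this times out
except socket.timeout:
    print('recv timed out')

a.close()
b.close()
```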
participants (5)
- Chris McDonough
- Jonathan
- Michael Vartanyan
- Paul Winkler
- robert rottermann