There's also KebasData, although I don't think it does much rewriting of the retrieved content.

A warning, though: with any of these solutions, you will want to test what happens when the remote resource is unavailable, e.g. very slow to respond, blocked by a firewall, etc.

For example: I had an external method using urllib2 to retrieve data from another server and embed it in a Zope page. This worked fine until something went wonky on the network and requests to the remote page never got any response. The result was that requests to my Zope page would hang forever. urllib2 apparently blocks while waiting for a response, so once a few requests had come in for this page, all my worker threads were blocked there. Zope was effectively dead. I used the "Debug spinning zope" recipe to confirm that all the threads were waiting in urllib2.

I changed this to use LocalFS pointing at copies of the data on the hard drive, which are updated periodically via cron and wget. A quick hack, but it fixed the symptom.

This was all Zope 2.6.2 / Python 2.1.3. In Python 2.3 you can now set timeouts via socket.setdefaulttimeout(), and this should (I hope) affect urllib2, but I have not tested it.

On Thu, Jul 01, 2004 at 11:21:07PM +1000, Anthony Baxter wrote:
On Thu, 01 Jul 2004 20:02:02 +0900, Grant Morganryuuguu <grant@ryuuguu.com> wrote:
I am considering Zope/Python for a project and would like some pointers to see if it is a reasonable fit. I need to get a URL from the web, parse the HTML, extract some data from the page, rewrite the <a href> tags and display the result on the website. I found the HTML parser in the library, http://www.python.org/doc/current/lib/markup.html, and http://www.crummy.com/software/BeautifulSoup/ (which is down now but was up a couple of days ago).
BeautifulSoup, ClientCookie and ClientForm together make a very, very nice web-scraping package.

_______________________________________________
Zope maillist - Zope@zope.org
http://mail.zope.org/mailman/listinfo/zope
** No cross posts or HTML encoding! **
(Related lists -
 http://mail.zope.org/mailman/listinfo/zope-announce
 http://mail.zope.org/mailman/listinfo/zope-dev )
-- Paul Winkler http://www.slinkp.com
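
For reference, a minimal sketch of the socket.setdefaulttimeout() idea mentioned in the reply above. It has not been tested inside Zope. The function name fetch_or_fallback, its fallback argument and the 5-second default are assumptions made for illustration, and the import fallback to urllib.request is only there so the sketch also runs on later Pythons where urllib2 was renamed.

```python
import socket

try:
    import urllib2  # Python 2, as discussed in this thread
except ImportError:
    import urllib.request as urllib2  # assumption: later Pythons, where urllib2 moved

def fetch_or_fallback(url, fallback=None, timeout=5.0):
    """Return the body of url, or fallback if the remote end fails or stalls.

    Hypothetical helper: the name and arguments are illustrative only.
    """
    # setdefaulttimeout() is process-wide, so remember the old value
    # and restore it afterwards.
    old_timeout = socket.getdefaulttimeout()
    socket.setdefaulttimeout(timeout)
    try:
        try:
            response = urllib2.urlopen(url)
            try:
                return response.read()
            finally:
                response.close()
        except (urllib2.URLError, socket.error):
            # Covers refused connections, DNS failures and timeouts.
            return fallback
    finally:
        socket.setdefaulttimeout(old_timeout)
```

Because the default timeout is global to the process, changing it briefly affects sockets created by other threads during the call. In a threaded server it may be simpler to set it once at startup and leave it alone. The fallback value could be a copy of the data cached on disk, in the spirit of the LocalFS workaround described above.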