Caching/http-acceleration and proxying Zope-served content
I have a question for anyone experienced in working with Zope and caching proxies:

I'm setting up a load-balanced server farm whose nodes will run Apache and proxy (via mod_proxy) to ZEO clients running ZServer. This farm is routed (both ways) through a layer 4 load-balancing appliance, and all these boxes (both the nodes and the balancer) sit inside a DMZ with private IP addresses. The public world will access these servers via a firewall box running a transparent proxy (actually, I guess, similar to Squid's http_accel mode; the semantics here are a bit tricky, as it's more of an inverse trans-proxy). Between Apache and Zope there would be several virtual hosts, and I'd be using the SiteAccess product. It gets a bit tricky in that I need to reach several different virtual hosts inside the DMZ (one for the ZEO farm, and another for a dedicated CGI-based ad server on another box) via the proxy. A more detailed (ASCII art) diagram of what I am trying to do is at http://209.132.8.98/server_ascii_art.txt

My question is this: does anybody have any thoughts on the merits of Squid (http accelerator mode) versus Apache/mod_proxy in terms of caching, virtual hosts, and the like when working with Zope sites? Any big pitfalls to this kind of setup with Zope sites?

Also, somewhat related: does anybody know whether Squid can even handle multiple virtual hosts running on different boxes? Or is Squid not suitable for an inverse proxy running on the border of a DMZ containing more than a single host?

I'd really appreciate anyone's thoughts. Once I get this thing working, I'll likely end up writing a howto on how I did it, as this seems like a useful setup for a high-volume, heterogeneous (not ideal, but reality) production media site running Zope (and other legacy CGI apps / static content as well)...
Much thanks in advance,
Sean

=========================
Sean Upton
Senior Programmer/Analyst
SignOnSanDiego.com
The San Diego Union-Tribune
619.718.5241
sean.upton@uniontrib.com
=========================
sean.upton@uniontrib.com wrote:
My question is this: does anybody have any thoughts on the merits of Squid (http accelerator mode) versus Apache/mod_proxy in terms of caching, virtual hosts, and the like when working with Zope sites? Any big pitfalls to this kind of setup with Zope sites?
I would prefer Squid since its only purpose in life is caching. It follows the "do one thing and do it well" mantra.

But whatever your choice, I hope you make use of the new CacheManagement feature in Zope 2.3. It is designed to make things like this straightforward and easy. There's a recent news announcement that links to everything you need--including complete help docs!

Shane
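For readers wondering what "making a page cacheable" amounts to on the wire: a front-end cache such as Squid decides how long it may keep a response from headers like Cache-Control and Expires. The following is only a minimal sketch of building such headers, not the code Zope's cache management actually runs; the function name `cache_headers` and the one-hour lifetime are assumptions made for illustration.

```python
from email.utils import formatdate


def cache_headers(max_age_seconds, now):
    """Build headers that let a shared cache keep a page for max_age_seconds.

    `now` is a Unix timestamp supplied by the caller, which keeps the
    function deterministic and easy to test.
    """
    return {
        # HTTP/1.1 caches read the lifetime from max-age.
        "Cache-Control": "max-age=%d" % max_age_seconds,
        # Older HTTP/1.0 caches rely on an absolute expiry date instead.
        "Expires": formatdate(now + max_age_seconds, usegmt=True),
        # Treat the render time as the modification time (an assumption).
        "Last-Modified": formatdate(now, usegmt=True),
    }


if __name__ == "__main__":
    import time

    for name, value in cache_headers(3600, time.time()).items():
        print("%s: %s" % (name, value))
```

In a Zope method, the same name/value pairs could be handed to RESPONSE.setHeader one at a time.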
I used to use Squid and was very happy with it. You can do a lot of fine tuning as to which objects you want cached and how, and it's _very_ fast. I didn't get that with Apache/mod_proxy; to tell the truth, I never was sure how much got cached by mod_proxy (or why), since it doesn't write a log.

Actually I use a combination of Squid and Apache because I need some rewriting. You could just as well use Squid for caching and Apache for (name-based) virtual hosting. This of course introduces additional latency, but that shouldn't be a problem if your objects are fairly cacheable, i.e. most content would be served out of Squid anyway.

cheers, peter.

--
_________________________________________________
peter sabaini, mailto: sabaini@niil.at
-------------------------------------------------
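The idea of name-based virtual hosting in front of several boxes comes down to looking at the Host header and choosing a back end from it. Below is a rough sketch of that lookup only; it is not Squid or Apache configuration, and the host names, private addresses, and the `backend_for` helper are all made up for illustration.

```python
# Hypothetical mapping from public host names to private DMZ addresses.
BACKENDS = {
    "www.example.com": ("10.0.0.10", 80),  # balancer in front of the ZEO farm
    "ads.example.com": ("10.0.0.20", 80),  # dedicated CGI ad server
}


def backend_for(host_header, default=None):
    """Return the (address, port) a reverse proxy would forward to."""
    # Drop an optional ":port" suffix and normalise case, because host
    # names are case-insensitive.
    name = host_header.split(":", 1)[0].strip().lower()
    return BACKENDS.get(name, default)


if __name__ == "__main__":
    print(backend_for("ads.example.com"))
    print(backend_for("WWW.Example.com:80"))
```

Whatever does the proxying needs an equivalent of this table, plus a decision about what to do with unknown host names (here: the `default` argument).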
On Tue, 9 Jan 2001 09:31:35 +0100 (CET), Peter Sabaini <sabaini@niil.at> wrote:
actually i use a combination of squid / apache because i need some re-writing, you could as well use squid for caching and apache for (name-based) virtual hosting. this of course introduces additional latency, but this shouldn't be a problem if your objects are fairly cacheable, i.e. most content would be served out of squid anyway.
That's an interesting configuration. For a while I've been considering a solution based on longer-than-usual chains of HTTP proxies and a "do one thing well" principle. In my case:

Apache (for rewriting and SSL)
 -> Squid accelerator
 -> A custom load-balancing redirector
 -> Multiple Zopes

Have you had any significant latency, or other problems?

Toby Dickenson
tdickenson@geminidataloggers.com
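Toby doesn't say how his load-balancing redirector works. One way such a stage could be built is as a Squid redirector helper, which (as I understand that interface) reads one request per line in the form "URL client_ip/fqdn ident method" and answers with a rewritten URL, or a blank line for "no change". The sketch below assumes that line format; the back-end addresses and the hash-by-client-IP policy are purely illustrative.

```python
import zlib
from urllib.parse import urlsplit, urlunsplit

# Hypothetical ZServer back ends behind the cache.
ZOPES = ["10.0.0.11:8080", "10.0.0.12:8080", "10.0.0.13:8080"]


def pick_backend(client_ip, backends=ZOPES):
    """Map a client to a back end with a stable hash of its address."""
    return backends[zlib.crc32(client_ip.encode("utf-8")) % len(backends)]


def rewrite(line, backends=ZOPES):
    """Turn one redirector input line into the URL to fetch instead."""
    fields = line.split()
    if len(fields) < 2:
        return ""  # blank answer: leave the request unchanged
    url, client = fields[0], fields[1]
    client_ip = client.split("/", 1)[0]  # strip the "/fqdn" part
    parts = urlsplit(url)
    target = pick_backend(client_ip, backends)
    return urlunsplit((parts.scheme, target, parts.path, parts.query, parts.fragment))


def serve(lines, out):
    """Answer each input line with exactly one output line, unbuffered."""
    for line in lines:
        out.write(rewrite(line) + "\n")
        out.flush()
```

Run as a helper, this would be driven with `serve(sys.stdin, sys.stdout)`. Hashing on the client address keeps one visitor on one Zope; a plain round-robin counter would spread load more evenly but lose that stickiness.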
On Tue, 9 Jan 2001, Toby Dickenson wrote:
:Have you had any significant latency, or other problems?

My configuration was:

--> Squid
 --> Apache w/ rewriting and logging
 --> ZServer / Zope

There was of course some latency added, but nothing significant: < 0.5s as far as I can remember. And since Squid was the first stage, most content would be served out of Squid (and thus with no added latency) anyway. SSL was not an issue.

The problem I had with ZServer-only was that the most-requested pages (front page and index pages) were also the most expensive to render (drawing in content from diverse categories etc.). With the proxy setup, the most-requested pages would be in the cache, and little-requested pages (article views) are a) simple to render and therefore don't contribute much to load and latency, and b) users are more willing to wait half a second longer for a detailed view than for an index page, where they decide whether they want to read anything at all (imho).

I needed Apache for logging and also served all static images via Apache. All those small GIFs can have quite an impact on subjective load times, and since they don't change often you don't need manageability via Zope. You just have to write <img src="&dtml-spacer_gif_path;"> instead of <dtml-var spacer_gif> (or whatever)...

ru, peter.

--
_________________________________________________
peter sabaini, mailto: sabaini@niil.at
-------------------------------------------------
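The latency argument above can be put into numbers: cache hits cost only the cache's own response time, while misses pay for every stage in the chain. Here is a small back-of-the-envelope sketch; all timings in it are invented for illustration, except that 0.5s is used as the upper bound Peter quotes for the added proxy overhead.

```python
def expected_latency(hit_ratio, cache_time, backend_time, proxy_overhead):
    """Average response time for a cache in front of a slower back end.

    hit_ratio      -- fraction of requests answered from the cache (0..1)
    cache_time     -- seconds to serve a cached object
    backend_time   -- seconds for the back end to render a page
    proxy_overhead -- extra seconds the intermediate stages add on a miss
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    # A miss still passes through the cache before reaching the back end.
    miss_time = cache_time + proxy_overhead + backend_time
    return hit_ratio * cache_time + (1.0 - hit_ratio) * miss_time


if __name__ == "__main__":
    # Hypothetical figures: 10 ms cache hits, 1 s renders, 0.5 s overhead.
    for ratio in (0.0, 0.5, 0.9):
        print(ratio, round(expected_latency(ratio, 0.01, 1.0, 0.5), 3))
```

With these made-up figures, a 90% hit ratio brings the average down to about 0.16s even though every miss costs 1.51s, which is why the extra hop matters little when the expensive pages are the cacheable ones.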
participants (4)
- Peter Sabaini
- sean.upton@uniontrib.com
- Shane Hathaway
- Toby Dickenson