Will ZEO solve my problem?
[This is a terribly fuzzy question/situation. I'm looking for insight/suggestions, not guarantees.]

We've been migrating everything we do to Zope for the past few months. We are just starting to get buy-in and funding for equipment. Already, however, we have lots of people using our Zope server, which is hosted on a dual P-850 I was able to hijack for the purpose. (I found that the Sun E450 I had available was way too slow.)

We're planning to split things up into a cluster that consists of multiple HTTPS proxies, some ZEO clients, a ZEO server, and a hot ZEO server spare. Until then, I need to make the current system usable. (We have all of our Heads and Deans using ZWiki Pages to put together our strategic plan.)

The problem we've been encountering is that every so often (several times a day) the server goes nuts and I end up restarting it. I can watch "top" and see a single Python process take all of one CPU's time for an extended period.

I've been unable to determine from the logs (and the server-status of the HTTPS proxy) what's causing this to happen. Because we have so many applications running on this host, the traditional debugging methods I've seen don't seem to (easily) apply.

Since I'm going to use ZEO eventually, I thought I would try it this week to see if it helps with this problem. At least I should be able to move the problem to either the ZEO server or a client.

If it ends up in a ZEO client (as I suspect it will), I can use the front-end proxy to route requests to other ZEO clients while I restart the one that's spinning. I can also restart the spinning ZEO client quickly.

If it's in the ZEO server, I'm still screwed. I have been avoiding packing the database (so that I have the history and the ability to undo some of the actions), so it takes a while to start up.

Any thoughts? Is there a way to tell exactly what a process is doing (in Zope/Python terms)? I just need a clue where to start looking.

Thank you.
--kyler

Some versions:

Mandrake 8.0
Linux 2.4.4-ac11 (will change on reboot)
version['ZDChart'] = '0.5.1b'
version['MSWordMunger'] = '0.1.1'
version['ZipFolder'] = '0.2.2'
version['Formulator'] = '0.9.3'
version['Distutils'] = '1.0.2'
version['MXDateTime'] = '2.0.1'
version['psycopg'] = '0.99.2'
version['HTMLDocument'] = '0.2'
version['PoPy'] = '2.0.2' #'3.0-beta1'
version['ZPoPyDA'] = '1.2'
version['ParsedXML'] = '1.1b1'
version['PageTemplates'] = '1.3.2'
version['ZTUtils'] = '1.0.0'
version['TAL'] = '1.3.2'
version['OpenLDAP'] = '1.2.11'
version['Python'] = '1.5.2'
version['Python_lib'] = string.join(string.split(version['Python'], '.')[:2], '.')
version['PythonLDAP'] = '1.10alpha3'
version['Zope'] = '2.3.3' # '2.3.3b1' # '2.3.2' # '2.3.2b2' # '2.3.2b1' # '2.3.1' # '2.3.0'
version['ZopeLDAP'] = '1-1-0'
version['ZWiki'] = '0.9.3' #'0.8.1'
version['CMF'] = '1.1beta' # '1.0' #'1.0beta'
version['DCOracle2'] = 'Beta3' # 'beta2'
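(The Python 1.5.2 in use here had no good answer to "what is this process doing right now?", but as a sketch of the idea in modern Python, where sys._current_frames() exists, one can dump the live stack of every thread on demand; the function name and output format below are invented for illustration:)

```python
import sys
import threading
import traceback

def dump_all_thread_stacks(out=sys.stderr):
    """Write the current stack of every live thread to `out`.

    Handy for answering "what is this process doing right now?"
    when a single thread is pegging a CPU for an extended period.
    """
    frames = sys._current_frames()  # {thread_id: topmost frame}
    for thread in threading.enumerate():
        frame = frames.get(thread.ident)
        if frame is None:
            continue
        out.write("\n--- %s (id %s) ---\n" % (thread.name, thread.ident))
        out.write("".join(traceback.format_stack(frame)))

if __name__ == "__main__":
    dump_all_thread_stacks()
```

Tools like py-spy do the same kind of stack inspection against an already-running process, without modifying it.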
Kyler,

You may want to make use of the detailed debug log (see the -M switch in z2.py) in combination with the requestprofiler.py script that ships in the "utilities" directory of the 2.4.0 and trunk series. The use of requestprofiler was discussed on this list a while back; searching the archives should turn it up.

- C

_______________________________________________
Zope maillist - Zope@zope.org
http://lists.zope.org/mailman/listinfo/zope
** No cross posts or HTML encoding! **
(Related lists - http://lists.zope.org/mailman/listinfo/zope-announce http://lists.zope.org/mailman/listinfo/zope-dev )
-- Chris McDonough Digital Creations, Inc. Publishers of Zope http://www.zope.org """ Killing hundreds of birds with thousands of stones """
On Mon, 02 Jul 2001 07:14:43 -0500, "Kyler B. Laird" <laird@ecn.purdue.edu> wrote:

Sorry, I've no suggestions for your problem, but...
We're planning to split things up into a cluster that consists of multiple HTTPS proxies, some ZEO clients, a ZEO server and a hot ZEO server spare.
Do you have a plan for replicating data onto the second ZEO server? We are looking to implement a similar system and there seems to be no obviously good choice.

Toby Dickenson
tdickenson@geminidataloggers.com
On Mon, 02 Jul 2001 13:58:23 +0100 you wrote:
Do you have a plan for replicating data onto the second ZEO server?
Yeah, but it's not pretty.
We are looking to implement a similar system and there seems to be no obviously good choice.
That's what I determined about a week ago. I was sad to see that work on Replicated Storage had been halted, but I'm not sure that it was the ideal solution for my needs.

Then I had a little problem and became more intimate with the Data.fs file. Again, this is *not* pretty...

Right now, our Zope server moves around (it even uses DHCP sometimes) and our HTTPS server stays in one place. They're even in different buildings. I use the HTTPS server to convert from HTTPS to HTTP and to handle some other things, like automatic selection of the source port (for WebDAV things).

When my Zope host comes up, it sets up an SSH session to the HTTPS server. As part of the session, it makes tunnels for the Zope ports. Now I also have it run a command. It's essentially this:

    tail -f -n +0 Data.fs | ssh proxyhost "cat >somepath/Data.fs"

(The real server name, account, and tunnel info are all in the SSH config. Instead of "Data.fs", for now I use a filename that has the current date in it. That gives me multiple versions.)

This gives me an up-to-the-second copy of the database. If the building with the Zope server is suddenly destroyed, I should not lose anything.

My plan is to do something similar between two ZEO servers. One will run and stream its updates to the other. If the primary server fails, I will switch to the other one (by notifying the HTTPS proxy or by twiddling the SSH sessions).

Without any modifications, the secondary server would have to start Zope and read through the Data.fs before becoming active. That's quite a delay for us. My plan is to modify the Zope startup so that it does something similar to a "tail -f" on Data.fs, so that it can build its internal database as it goes along (mirroring the primary server). I would then send a signal to it to halt reading (when it reaches EOF) and start serving.

I might be able to get away with *not* modifying the Zope startup by using a FIFO.
I would either write directly to it from my SSH process or use tail to duplicate the streamed file. When the primary server dies, I'll kill whatever is writing to the FIFO (thus closing the write end) and Zope should discontinue reading and start serving.

Of course, this is going to require some manual intervention before the primary server is made available again. For now, I am happy to have it killed and forgotten until someone puts everything right.

If my primary and secondary ZEO servers are identical (in terms of performance), I'll probably just flip-flop them and call the running server the primary. After I fix the dead server, I'll replace its Data.fs file with a stream of the running server's file. It will then be ready to go as the fail-over server.

Note that although I am streaming the Data.fs file now, everything else I'm planning is untested. I'm hoping to get to it this week, but I've got some other things in the way.

--kyler
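The "tail -f" half of this scheme is simple enough to sketch in Python. This is an illustration rather than anything Kyler actually ran (the function name and polling interval are invented); it relies on the same assumption, namely that Data.fs is append-only between packs:

```python
import time

def follow(src_path, dst_path, poll_interval=1.0, stop_at_eof=False):
    """Stream an append-only file to a replica, like `tail -f -n +0`.

    Copies everything already in src_path, then keeps polling for
    newly appended bytes. With stop_at_eof=True it returns as soon
    as it catches up (useful for a standby about to start serving).
    """
    with open(src_path, "rb") as src, open(dst_path, "ab") as dst:
        dst.seek(0, 2)          # position at the end of the replica
        src.seek(dst.tell())    # resume from wherever it left off
        while True:
            chunk = src.read(64 * 1024)
            if chunk:
                dst.write(chunk)
                dst.flush()
            elif stop_at_eof:
                return
            else:
                time.sleep(poll_interval)
```

A standby could run this continuously with stop_at_eof=False; at takeover time, stop it and run it once more with stop_at_eof=True to catch any remaining bytes before serving.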
On Mon, 2 Jul 2001, Kyler B. Laird wrote:
On Mon, 02 Jul 2001 13:58:23 +0100 you wrote:
Do you have a plan for replicating data onto the second ZEO server?
Yeah, but it's not pretty.
We are looking to implement a similar system and there seems to be no obviously good choice.
That's what I determined about a week ago. I was sad to see that work on Replicated Storage had been halted, but I'm not sure that it was the ideal solution for my needs.
I don't think it's permanently halted; the main project leader is on paternity leave.

-Michel
On Mon, 2 Jul 2001 15:12:10 -0700 (PDT) you wrote:
That's what I determined about a week ago. I was sad to see that work on Replicated Storage had been halted, but I'm not sure that it was the ideal solution for my needs.
I don't think it's permanently halted; the main project leader is on paternity leave.
I shouldn't have said "halted". It's waiting for an interrupt that could be a while.

http://www.zope.org//Wikis/DevSite/Projects/ZEOReplicatedStorage/CurrentStat...

"Work on ReplicatedStorage will be delayed until work on ZEO 1.0 and ZEO 2.0 is finished. ZEO 2.0 will be the basis for ReplicatedStorage."

--kyler
Many replies in one message....

On Mon, 02 Jul 2001 09:44:43 -0500, "Kyler B. Laird" <laird@ecn.purdue.edu> wrote:
Now I also have it run a command. It's essentially this:

    tail -f -n +0 Data.fs | ssh proxyhost "cat >somepath/Data.fs"
That's a low-cost trick I hadn't thought of before. However, there is still a small window between: a) data being synced onto the main server, and the transaction succeeding, and b) data being written to the backup server. Data can be lost if your main machine explodes at that point.

Another disadvantage to this solution: current versions of FileStorage are not purely logging; *undo* will write a single byte in the middle of the database file, which will corrupt your replica. A recent change to FileStorage handles undo a different way, which I guess might fix this problem... I'll ask over on the zodb-dev list.

On Mon, 2 Jul 2001 15:11:14 -0700 (PDT), Michel Pelletier <michel@digicool.com> wrote:
Have you looked at Coda or Intermezzo? Both are distributed file systems, although I've used neither.
I've looked at them as candidates for this task, but never used them. Both of them replicate in the same 'direction' as ZEO, on reads rather than on writes, so I don't think they are suitable for this task.

On 02 Jul 2001 12:13:03 -0600, Bill Anderson <bill@libc.org> wrote:
If you are running the ZSS on Linux, you can use a network block device setup that establishes a RAID1 over the network, and have it do a split read/write such that it only reads from the primary, but writes over the RAID.
Have you tried this with FileStorage? Testing this configuration is approaching the top of my to-do list. I am a little worried that it may not be entirely stable under high-write usage, such as during a pack (although at the moment that is just unsubstantiated FUD).

If the network block device doesn't work out, I am considering patching FileStorage to distribute writes across the network, and only allow the transaction commit to complete once data is consistent across multiple remote mirrors.

Of course replicated-ZEO would solve this and many other problems, but I am hoping that by concentrating on just this one use case I can develop something useful, sooner.

http://www.zope.org/Wikis/DevSite/Projects/ZEOReplicatedStorage/SurviveTotal...

I take it there would be willing beta testers for such a system?

Toby Dickenson
tdickenson@geminidataloggers.com
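The commit-only-when-all-mirrors-agree discipline Toby describes can be sketched in a few lines of Python. This is not FileStorage code; the Mirror class and its write() acknowledgement are hypothetical stand-ins for a real replica connection:

```python
class Mirror:
    """Hypothetical stand-in for a connection to a remote replica."""

    def __init__(self, name):
        self.name = name
        self.data = b""

    def write(self, payload):
        # A real mirror would ship this over the network and return
        # True only after the remote side has durably written it.
        self.data += payload
        return True

def replicated_commit(mirrors, payload):
    """Apply `payload` to every mirror; succeed only if all acknowledge.

    If any mirror fails to ack, report failure so the caller can abort
    the transaction instead of committing it locally.
    """
    acks = [m.write(payload) for m in mirrors]
    return all(acks)
```

A real storage would abort the transaction whenever replicated_commit reports failure, so no replica could silently diverge from the primary.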
On Tue, 03 Jul 2001 09:49:34 +0100 you wrote:
tail -f -n +0 Data.fs | ssh proxyhost "cat >somepath/Data.fs"
That's a low-cost trick I hadn't thought of before. However, there is still a small window between: a) data being synced onto the main server, and the transaction succeeding, and b) data being written to the backup server.
Data can be lost if your main machine explodes at that point.
Certainly. There's also time between when data is written to a buffer and when it ends up on the local disk. I can live with it. It's *much* better than just hoping that nothing goes wrong and making periodic backups.
Another disadvantage to this solution: current versions of FileStorage are not purely logging; *undo* will write a single byte in the middle of the database file, which will corrupt your replica.
Ack! I didn't realize this. (Undo usually doesn't work for me, but I would like to use it.)
A recent change to FileStorage handles undo a different way, which I guess might fix this problem... I'll ask over on the zodb-dev list.
Thank you. I am quite interested in this. (It reminds me of the undo in Berkeley Storage.)
If you are running the ZSS on Linux, you can use a network block device setup that establishes a RAID1 over the network, and have it do a split read/write such that it only reads from the primary, but writes over the RAID.
Have you tried this with FileStorage? Testing this configuration is approaching the top of my to-do list. I am a little worried that it may not be entirely stable in high-write usage, such as during a pack. (although at the moment that is just unsubstantiated FUD)
NBD could cause performance problems, but in theory it should work. (It's pretty cool.) It should have very little network overhead beyond that of the "tail" method, but it would support writing to random locations within the file. It would not be a simple cross-platform solution. In the most basic usage, it would also be blocking; it would slow down the server.
If the network block device doesn't work out, I am considering patching FileStorage to distribute writes across the network, and only allow the transaction commit to complete once data is consistent across multiple remote mirrors.
This would be great, and I do hope that you do it. I'm still interested in a simple method that does not force the primary server to wait for updates to be made. If someone uploads a 1GB PDF file to my primary server, I'd rather not wait while every backup is made. I wouldn't mind it for my on-site hosts, but I'd rather sacrifice some transactions in a (rare) failure than have them always delayed.
Of course replicated-ZEO would solve this and many other problems, but I am hoping that by concentrating on just this one use case I can develop something useful, sooner.
I appreciate your efforts.
http://www.zope.org/Wikis/DevSite/Projects/ZEOReplicatedStorage/SurviveTotal...
I take it there would be willing beta testers for such a system?
Count on me. --kyler
On Tue, 03 Jul 2001 07:21:09 -0500, "Kyler B. Laird" <laird@ecn.purdue.edu> wrote:
If the network block device doesn't work out, I am considering patching FileStorage to distribute writes across the network, and only allow the transaction commit to complete once data is consistent across multiple remote mirrors.
This would be great, and I do hope that you do it.
I'm still interested in a simple method that does not force the primary server to wait for updates to be made. If someone uploads a 1GB PDF file to my primary server, I'd rather not wait while every backup is made. I wouldn't mind it for my on-site hosts, but I'd rather sacrifice some transactions in a (rare) failure than have them always delayed.
Yes, this patch would need a flexible configuration of what degree of replication is 'enough': whether that is high (at least one on-site plus one off-site replica) or low (as you describe).
Of course replicated-ZEO would solve this and many other problems, but I am hoping that by concentrating on just this one use case I can develop something useful, sooner.
I appreciate your efforts.
http://www.zope.org/Wikis/DevSite/Projects/ZEOReplicatedStorage/SurviveTotal...
I take it there would be willing beta testers for such a system?
Count on me.
Toby Dickenson
tdickenson@geminidataloggers.com
This seems more germane to zodb-dev; cc'ed there, so please send further replies to that list.

On 03 Jul 2001 07:21:09 -0500, Kyler B. Laird wrote:
If you are running the ZSS on Linux, you can use a network block device setup that establishes a RAID1 over the network, and have it do a split read/write such that it only reads from the primary, but writes over the RAID.
Have you tried this with FileStorage? Testing this configuration is approaching the top of my to-do list. I am a little worried that it may not be entirely stable in high-write usage, such as during a pack. (although at the moment that is just unsubstantiated FUD)
Testing with FileStorage really is just icing. The process _does_ work; I know people using it for always-on hot backup. It occurs at the OS level, and as far as FileStorage is concerned, it is just looking at a file on the filesystem. It has no clue that the filesystem is actually running on a RAID device, let alone across a network.
NBD could cause performance problems, but in theory it should work. (It's pretty cool.) It should have very little network overhead beyond that of the "tail" method, but it would support writing to random locations within the file.
It would not be a simple cross-platform solution.
Thus the 'if on Linux' remark. But then again, remember this is only for the ZSS, and only until ZEO replication is accomplished. The ZEO clients can still be on anything ZEO runs on.
In the most basic usage, it would also be blocking; it would slow down the server.
How do you figure this? From the server's standpoint, it is just writing to a file on a filesystem. Writes do go out on the network, but they occur simultaneously. Reads are limited to 'local', so there is no performance loss there. I don't know where you get this 'blocking' bit. Writes may suffer a slight drop in speed, but this will not increase the 'blockiness' of the ZSS. Any replication method that occurs in real time, or even damned-near-real time, will cause this to happen.
If the network block device doesn't work out, I am considering patching FileStorage to distribute writes across the network, and only allow the transaction commit to complete once data is consistent across multiple remote mirrors.
This would be great, and I do hope that you do it.
I'm still interested in a simple method that does not force the primary server to wait for updates to be made. If someone uploads a 1GB PDF file to my primary server, I'd rather not wait while every backup is made. I wouldn't mind it for my on-site hosts, but I'm willing to sacrifice some transactions in a (rare) failure than have them always delayed.
Hmmm, it seems you may be misunderstanding ZEO and backing up/replicating the ZODB. If someone uploads a 1GB PDF, you _will_ be waiting a while for that update to span networks in a replicated ZSS, just as you will be waiting for that PDF to be uploaded in the first place. Your ZEO clients will only be waiting on the server to receive the file.

So, with a single ZSS and 5 ZEO clients (let's pretend they are Zope servers), the file is uploaded once. Now, when a ZEO client receives a request, it will indeed request the data from the server and ship it to the client. There is possibly, depending on network bandwidth, a delay, but unless the end-user's browser is on a network connection to the Zope server that is as fast or faster than the ZSS->Zope connection, they will not notice a difference. I don't think ZEO was meant for such large file transfers anyway, but I could be wrong.

In any event, distributing large files across a network will always cost, at a minimum, network delays, regardless of the method used. If there are network delays between the primary and any secondaries, and the relay is not done 'offline' (that is to say, not in full sync with the primary), there will be delays on the primary. The reason is that if you are, in your example, uploading a 1GB file, and ZEO is replicating that in sync, it has to monitor the progress, in the event that a) network data corruption occurs, or b) the user cancels the upload (or it times out, or similar activities/events), at which point it needs to cancel the transaction across primaries.

With NBD, you would only be waiting once, as the writes occur in unison. In a replicated ZSS, I cannot say, as I don't know much of the details (does anyone ;") ?

Bill
On 02 Jul 2001 13:58:23 +0100, Toby Dickenson wrote:
On Mon, 02 Jul 2001 07:14:43 -0500, "Kyler B. Laird" <laird@ecn.purdue.edu> wrote:
Sorry, I've no suggestions for your problem, but...
We're planning to split things up into a cluster that consists of multiple HTTPS proxies, some ZEO clients, a ZEO server and a hot ZEO server spare.
Do you have a plan for replicating data onto the second ZEO server? We are looking to implement a similar system and there seems to be no obviously good choice.
If you are running the ZSS on Linux, you can use a network block device setup that establishes a RAID1 over the network, and have it do a split read/write such that it only reads from the primary, but writes over the RAID. Then, through the use of tools such as a heartbeat/mon/Linux Virtual Server setup that detects failure of the primary, the secondary then remounts its half of the RAID as read/write. Then, through use of the RAID rebuild tools, when the first comes back up, you re-sync the RAID set and switch back to the primary.

Just one means of doing it. :)

Bill
On Mon, 2 Jul 2001, Toby Dickenson wrote:
On Mon, 02 Jul 2001 07:14:43 -0500, "Kyler B. Laird" <laird@ecn.purdue.edu> wrote:
Sorry, I've no suggestions for your problem, but...
We're planning to split things up into a cluster that consists of multiple HTTPS proxies, some ZEO clients, a ZEO server and a hot ZEO server spare.
Do you have a plan for replicating data onto the second ZEO server? We are looking to implement a similar system and there seems to be no obviously good choice.
Have you looked at Coda or Intermezzo? Both are distributed file systems, although I've used neither.

-Michel
On Mon, 2 Jul 2001 15:11:14 -0700 (PDT) you wrote:
Have you looked at Coda or Intermezzo? Both are distributed file systems, although I've used neither.
I've looked at these for other purposes. They're pretty complex solutions for such a simple problem. Because Data.fs is a transaction log, and thus only grows by having bits added at the end (except when packed - certainly an exception that must be handled), it is ripe for this simple "tail -f" solution. There is no need for something that is good at recognizing changes within a file or handling all the other filesystem goodies (like ownership/modes/locks/...).

After talking about it more today, it looks like we will probably even stream the updates to at least two machines. One will be off-site in case things *really* go bad. It will not be used as a ZEO server, but will only serve to hold the data in case the other two go bad. (This is especially important when the primary has already gone down.) I might even encrypt the outgoing stream and store it at another location that I do not administer.

Hmmm...it shouldn't take much hacking to be able to specify one "file" as input to Zope and one as output. That would mean that I could, for example, write each day's transactions to a separate file, concatenating all of them any time I start Zope. This would allow me to easily back the database out a day (as I did recently) or stream only updates to another server. (I could even compress the input files.)

--kyler
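The one-file-per-day idea is easy to sketch. This is hypothetical (the Data.fs.YYYY-MM-DD naming is invented, not anything Zope supports); it shows concatenating dated transaction segments into a single Data.fs at startup, with an optional cutoff for backing the database out to the end of an earlier day:

```python
import glob
import os

def assemble_datafs(segment_dir, out_path, up_to=None):
    """Concatenate per-day transaction segments into one Data.fs.

    Segments are assumed to be named Data.fs.YYYY-MM-DD, so a
    lexicographic sort gives chronological order. Passing, e.g.,
    up_to="2001-07-02" stops after that day's segment, effectively
    backing the database out to the end of that day.
    """
    segments = sorted(glob.glob(os.path.join(segment_dir, "Data.fs.*")))
    with open(out_path, "wb") as out:
        for seg in segments:
            day = seg.rsplit(".", 1)[-1]
            if up_to is not None and day > up_to:
                break
            with open(seg, "rb") as f:
                out.write(f.read())
    return out_path
```

Compressed input segments would only need a gzip.open in place of the plain open.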
Kyler B. Laird writes:
.... The problem we've been encountering is that every so often (several times a day) the server goes nuts and I end up restarting it....

.... Since I'm going to use ZEO eventually, I thought I would try it this week to see if it helps with this problem. At least I should be able to move the problem to either the ZEO server or client.
If it ends up in the ZEO client (as I suspect it will), I can use the front-end proxy to route requests to other ZEO clients while I restart the one that's spinning. I can also restart the spinning ZEO client quickly.
If it's in the ZEO server, I'm still screwed. I have been avoiding packing the database (so that I have the history and ability to undo some of the actions), so it takes a while to start up.

I would not expect it there, as the ZEO server is really occupied only with reading and writing object state and does not execute application-specific code.
Dieter
<stuff snipped>
The problem we've been encountering is that every so often (several times a day) the server goes nuts and I end up restarting it. I can watch "top" and see a single python process take all of one CPU time for an extended period.
Are you using any RDB? We used to have this problem where suddenly only one Zope thread was active. We were using Postgres 6.x; after an upgrade to Postgres 7.0.x and PoPy, we managed to stop having that problem. I don't know whether that has anything to do with the "single active Zope thread". We are now considering upgrading to 7.1.x and psycopg. On a devel box, Postgres 7.1.2 and psycopg really flies.
If it ends up in the ZEO client (as I suspect it will), I can use the front-end proxy to route requests to other ZEO clients while I restart the one that's spinning. I can also restart the spinning ZEO client quickly.
Yes, ZEO did save our backend when we were having this intermittent trouble.
participants (7)
- bak
- Bill Anderson
- Chris McDonough
- Dieter Maurer
- Kyler B. Laird
- Michel Pelletier
- Toby Dickenson