Dear all,

I'm planning to rebuild one of my commercial portal sites using Zope, with more than 300,000 users. The portal has been running on APM (Apache + PHP + MySQL) without a glitch for three years. I've been worried about Zope's scalability since I already have a lot of users and a comparable amount of data. I came to the conclusion that ZEO might help, with a bit more hardware thrown in. Here's my plan (OS: Linux).

1. Get a rock-solid machine for the ZEO server: dual Intel Xeon CPUs (an SR2300, if you want to know the Intel server model) with 2G memory + six 73G Seagate SCA SCSI hard disks + RAID5 for fault tolerance (Intel SRCU32 dual-channel RAID controller).

2. Throw in at least three less-powerful machines for ZEO clients: 1 P3 Tualatin + 1G memory + one 36G hard disk with NO RAID.

I plan to buy one No. 1 machine plus one No. 2 machine, since I already have two machines that are candidates for the other two ZEO clients.

I came under the impression, using Zope for the past year (and from Python/Zope docs), that Python uses only one CPU no matter how many I have. One of the two machines I already have has four Xeon CPUs, but Zope on that machine runs way slower than on my single-P3 desktop with a bit more horsepower.

Suppose you have 300,000 users with tons of data and expect they'd start adding tons more data once you open this portal based on Zope + heavily customized CMF/Plone. Would this setup seem reasonable? Or better still, how would you set your portal up if you got extra dough for hardware? Specifically I want to know...

1. Would the above setup seem reasonable?

2. Some say the P3 Tualatin is better for Python than P4 or even Xeon processors. Is that true?

3. I'm leaning towards DirectoryStorage over FileStorage for tons of reasons, the reasons you might easily guess. I'll have six 73G disks with RAID5, which means I'd have to let go of 73G for storing parity information. That leaves me about 365G. At least 300G will be allocated for the ZODB.
Would FileStorage maintain its integrity if the db grows to 300G? What I'm worried about most is that I can't make it versionless. DirectoryStorage has that option. Comments?

4. Would I really get no benefit from running Python on SMP machines? That is, would I be better off with two single-CPU machines than with one dual-CPU machine? If true, do I really have to buy a dual-Xeon Intel server for the central ZEO storage server? Why don't I just settle for a single-CPU machine for that, with more RAM?

5. Can I run a ZEO client on the ZEO server? To sum up, I'd get a total of four machines: one ZEO server and three ZEO clients. I want to make it four clients. I assume I can also run a ZEO client on the ZEO server, but I have a hunch that some would probably say no to that... May I ask why?

6. You have the following machines. How would you set up your ZEO server and clients? Remember, you have 300,000 users eager to add data :-)

Machine No. 1 - 2 Xeon 2.0GHz CPUs + 2G mem + six 73G disks + Intel high-end RAID controller
Machine No. 2 - 1 P3 Tualatin CPU + 1G mem + 1 36G disk
Machine No. 3 - 2 Xeon CPUs (older model) + 1G mem + 3 36G disks (disks old, will be recycled for backup storage or something)
Machine No. 4 - 4 Xeon CPUs (older model) + 2G mem + 4 36G disks (disks old, will be recycled for backup storage or something)

I'd appreciate any comments. Thanks in advance.

Regards,
Wankyu Choi

---------------------------------------------------------------
To the dedicated staff at NeoQuest, language is not a problem to be dealt with, but an art waiting to be performed.
---------------------------------------------------------------
Wankyou Choi
CEO/President
NeoQuest Communications, Inc.
3rd Floor, HMC Bldg., 730-14, Yoksam-dong, Kangnam-gu
Seoul, Korea
Tel: 82-2-501-7124 Fax: 82-2-501-7058
Corporate Home: http://www.neoqst.com
Personal Home: http://www.zoper.net, http://www.neoboard.net
e-mail: wankyu@neoqst.com
---------------------------------------------------------------
--On Wednesday, 29 January 2003 13:44 +0900 Wankyu Choi <wankyu@neoqst.com> wrote:
3. I'm leaning towards DirectoryStorage over FileStorage for tons of reasons, the reasons you might easily guess. I'll have six 73G disks with RAID5, which means I'd have to let go of 73G for storing parity information. That leaves me about 365G. At least 300G will be allocated for the ZODB. Would FileStorage maintain its integrity if the db grows to 300G? What I'm worried about most is that I can't make it versionless. DirectoryStorage has that option. Comments?
With 300GB you are at the point where I would think about an RDBMS as backend. I assume maintaining or packing a 300GB Data.fs might be a pain (20-30GB is already hard to handle). -aj
With 300GB you are at the point where I would think about an RDBMS as backend. I assume maintaining or packing a 300GB Data.fs might be a pain (20-30GB is already hard to handle).
You lost me there. I don't think you mean an **RDBMS storage**... Then you mean you'd think about setting up an RDBMS backend outside of the ZODB? (alert! newbie factors ;-) I've been thinking:

1. Take out as much binary data as possible via External File (my version, not the ones found on the Zope products page).

2. Store textual data (raw data + its rendered version, etc.) in MySQL.

3. User data goes into MySQL.

Then again... why should I use Zope if I have to go to all this trouble? I already have APM set up to do just that. Given all the benefits Zope offers, it might be worth the effort... but I'd need to heavily modify all the content types to store data outside of the ZODB, such as External File. I tried once, but that seemed a lot of hassle, so I decided to wait for other solutions like DirectoryStorage.

If most Zope users (including yourself) feel FileStorage is not a solution for more than 20GB of data, wouldn't ZC feel the same too? It seems stock Zope is not up to a large-scale web site (if you call a 300,000-user web site large-scale... I wouldn't, but...). Some even say a couple of thousand users would be the limit. I've posted similar queries about Zope's scalability on a number of occasions, but the replies suggest "one might do this" kind of stuff. There's been no concrete answer to these queries, an answer out of real experience, not a guesstimation. That confuses me. Does that mean nobody has reached the limit using Zope? 20GB of data is so normal these days. I already have double that amount of data on my site. Guess I'll start worrying about Zope's scalability again... Please convince me. Anyone?

Someone mentioned months ago that FileStorage is so robust that it could withstand most abuse I could throw at it. Okay, I believe it. But the question still remains. Why couldn't we have a FileStorage that can split over partitions (in multiple files, I mean; why one single goliath?) and that has an option to turn off versioning?
One might say, "It's open source, please yourself. Write one yourself." ;-) But I just really wanted to know why the ZODB guys haven't done that. Is there a reason I'm missing? Or is it still on the to-do list?

Maybe I'm assuming wrong. Would you please elaborate on what you mean by 'RDBMS backend'? Do you really mean I should write my own products to use MySQL as a backend, bypassing the ZODB, for example? Or is there an RDBMS storage (Oracle is not an option, if you mean Oracle storage)?

Thanks anyway for your comments.

Best Regards,
Wankyu Choi

-----Original Message-----
From: Andreas Jung [mailto:lists@andreas-jung.com]
Sent: Wednesday, January 29, 2003 2:55 PM
To: Wankyu Choi; zope@zope.org
Subject: Re: [Zope] Hardware for Zope + ZEO
--On Wednesday, 29 January 2003 15:32 +0900 Wankyu Choi <wankyu@neoqst.com> wrote:
If most Zope users (including yourself) feel FileStorage is not a solution for more than 20GB of data, wouldn't ZC feel the same too? It seems stock Zope is not up to a large-scale web site (if you call a 300,000-user web site large-scale... I wouldn't, but...). Some even say a couple of thousand users would be the limit. I've posted similar queries about Zope's scalability on a number of occasions, but the replies suggest "one might do this" kind of stuff. There's been no concrete answer to these queries, an answer out of real experience, not a guesstimation. That confuses me. Does that mean nobody has reached the limit using Zope? 20GB of data is so normal these days. I already have double that amount of data on my site. Guess I'll start worrying about Zope's scalability again... Please convince me. Anyone?
The largest Data.fs files I have seen so far and heard of were up to 30-40GB. It is not a question of scalability but a question of handling. I don't like to handle a single 40GB file.
Someone mentioned months ago that FileStorage is so robust that it could withstand most abuse I could throw at it. Okay, I believe it. But the question still remains. Why couldn't we have a FileStorage that can split over partitions (in multiple files, I mean; why one single goliath?) and that has an option to turn off versioning? One might say, "It's open source, please yourself. Write one yourself." ;-) But I just really wanted to know why the ZODB guys haven't done that. Is there a reason I'm missing? Or is it still on the to-do list?
Maybe I'm assuming wrong. Would you please elaborate on what you mean by 'RDBMS backend'? Do you really mean I should write my own products to use MySQL as a backend, bypassing the ZODB, for example? Or is there an RDBMS storage (Oracle is not an option, if you mean Oracle storage)?
I was not thinking about an RDBMS as a ZODB storage, but about a normal relational database in which to store your data directly. -aj
But as I mentioned, storing data in MySQL (or other RDBMSs) would mean I'd have to modify the content types, user folder and such. Did you mean the following, for example?

1. A user registers as a member of a CMF site: his member data and credentials all go into MySQL; the user folder and member data tool are just wrappers that pull data from MySQL. I wrote a user folder doing exactly this months ago, but realized that the member data schema couldn't be as flexible as a normal member data tool would be as a dynamic dictionary :-(

2. That user starts adding content: internally, binary data such as Images and Files are saved as external files; textual data such as Documents and Articles go into MySQL, which means the Document and Article content types are simple wrappers that pull data from MySQL.

Do you suggest these are the only reasonable options I'd have? In fact, I dreaded this since it's a lot more hassle than I wanted to go through. Plus, I'd still need a dedicated machine for MySQL even if I chose to use ZEO. Running MySQL and ZEO on the same machine seems too much of a burden on the machine in question (my current MySQL DB machine is a 4-way, 2G RAM machine and it's already feeling the hit). Separating them would mean two costly fault-tolerant machines. Too much cost...

Well, if you say yes, I'd need to go about taking apart all the content types and tools I've created so they don't rely 100% on the ZODB.

Thanks again, and I apologize for these constant stupid questions^^ But they're really important to me since I have to shell out the dough soon for new machines anyway, and I need a concrete picture of what I'd do with Zope, ZEO and such.

Best Regards,
Wankyu Choi
----- Original Message ----- From: "Wankyu Choi" <wankyu@neoqst.com> <..znip..>
1. A user registers as a member of a CMF site: his member data and credentials all go into MySQL; the user folder and member data tool are just wrappers that pull data from MySQL. I wrote a user folder doing exactly this months ago, but realized that the member data schema couldn't be as flexible as a normal member data tool would be as a dynamic dictionary :-(
No, but if you look at, for instance, how exUserFolder does it, you can have a single table called properties that you can use to store info (one row per user and property/data snippet). It would also be possible to use this setup to build hierarchies of properties. These can very simply be translated to dictionaries. <..znip..>
Thanks again, and I apologize for these constant stupid questions^^ But they're really important to me since I have to shell out the dough soon for new machines anyway, and I need a concrete picture of what I'd do with Zope, ZEO and such.
hehe... I think they are not stupid at all. I agree with you that it is very hard to find information on how to use Zope in a Very Large Scale scenario. IMHO, they just haven't been addressed properly before. I am very interested in what you will be able to conclude, since I have similar concerns (with about 1/10 of your user base :-)
Ditto,

/dario

--------------------------------------------------------------------
Dario Lopez-Kästen, dario@ita.chalmers.se IT Systems & Services
System Developer/System Administrator Chalmers University of Tech.
No, but if you look at, for instance, how exUserFolder does it, you can have a single table called properties that you can use to store info (one row per user and property/data snippet). It would also be possible to use this setup to build hierarchies of properties. These can very simply be translated to dictionaries.
I did look into exUserFolder when I wrote a user folder using MySQL. The problem was, yes, a row per property. That was the reason I gave up on it and wrote my own MySQL user folder. A user of mine has about 30 varchar/text properties, which translates into 30 rows. Multiply that by 300,000 ;-) It just doesn't seem that good a solution performance-wise.

I could pull a row of user data with one read with the current schema. With exUserFolder, to get the same results as "select * from user_properties where userid='wankyu'", you'd need 30 reads. I'd rather create a new table by copying the existing user table and adding new fields. Cumbersome, but it would be way better in terms of performance. Just a guesstimation. Correct me if I'm wrong.
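To make the two schemas under discussion concrete, here is a minimal sketch using Python's sqlite3 as a stand-in for MySQL. The table names, columns and values are all hypothetical, not taken from exUserFolder's actual schema. In the row-per-property layout a single query can return all of a user's property rows, which then have to be pivoted back into a dictionary; the fixed-schema layout returns one wide row in one piece.

```python
import sqlite3

# In-memory stand-in for the MySQL database; all names are made up.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# exUserFolder-style layout: one row per user property.
cur.execute("CREATE TABLE user_properties (userid TEXT, prop TEXT, value TEXT)")
cur.executemany(
    "INSERT INTO user_properties VALUES (?, ?, ?)",
    [("wankyu", "email", "wankyu@example.com"),
     ("wankyu", "city", "Seoul"),
     ("wankyu", "lang", "ko")])

# One query returns all property rows; they are then pivoted into a dict.
rows = cur.execute(
    "SELECT prop, value FROM user_properties WHERE userid = ?",
    ("wankyu",)).fetchall()
props = dict(rows)

# Fixed-schema layout: one wide row per user, read back in one piece.
cur.execute("CREATE TABLE users (userid TEXT, email TEXT, city TEXT, lang TEXT)")
cur.execute("INSERT INTO users VALUES (?, ?, ?, ?)",
            ("wankyu", "wankyu@example.com", "Seoul", "ko"))
row = cur.execute("SELECT * FROM users WHERE userid = ?",
                  ("wankyu",)).fetchone()
```

The trade-off is roughly as described above: the property table stores 30 rows per user and needs the pivot step, while the wide table needs a schema change whenever a property is added.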
hehe... I think they are not stupid at all. I agree with you that it is very hard to find information on how to use Zope in a Very Large Scale scenario. IMHO, they just haven't been addressed properly before.
Indeed. I've been after this issue for the past year :-) Nobody seems to have a definite answer. I'm surprised to find that few have gone down that path with Zope. (Should I go that untrodden path? I get to wondering :-( Here's my plan anyway (99% sure now):

- ZEO storage server (ZSS) on ReiserFS with RAID5 enforced + a couple of clients: the ZSS will be a single point of failure, and it'll be protected by the RAID5 setup

- ZSS on DirectoryStorage or AdaptableStorage + content types that can store data in external files and MySQL (I rewrote all the CMF content types and then some as the NeoPortal Content Pak. Will throw in a MySQL adapter or something for this.)

I just tested AdaptableStorage and tried adding a Plone instance. It failed miserably with a non-picklable object error. No problem with a CMF instance, though. Guess it'd be a rough uphill battle... sigh... No other solution seems viable as of now if I want to stick with Zope ;-)
I am very interested in what you will be able to conclude since I have similar concerns (with about 1/10 of your user base :-)
Well, no matter what solution I settle for, I have to do it in two months. I'll let you know the results. Cross your fingers for me :-)

Best Regards,
Wankyu Choi
Received Wednesday, 29 January 2003 07:32: <snip>
'RDBMS backend'? Do you really mean I should write my own products to use MySQL as backend bypassing ZODB , for example?
Maybe you can find an easy-to-use connector at an open-source site, e.g. O2DB, a web-object API from Zope to MySQL (sourceforge.net). <hth> fritz (-:fs)
On Wed, Jan 29, 2003 at 03:32:20PM +0900, Wankyu Choi wrote:
I've been thinking:
1. Take out as much binary data as possible via External File ( my version, not the ones found in Zope products page ).
I can understand doing this if you're running FileStorage and want to avoid a single bloated Data.fs file, but since you're leaning towards DirectoryStorage, I think there's no reason to use ExternalFile at all - it just complicates things.
2. Store textual data ( raw data + its rendered version, etc ) in MySQL.
ditto
3. User data goes into MySQL.
ditto
once but that seemed a lot of hassle, so I decided to wait for other solutions like the Directory Storage.
Wait no longer. ;)
If most Zope users ( including yourself ) feel Filestorage is not a solution for more than 20GB of data,
Hell, I don't like it for 2 GB of data. (See the recent thread "POSKeyError II: Dead by Dawn"... I wasn't kidding when I titled it that!)

The problem is not that FileStorage won't scale - I have no evidence either way about that, but others have reported 100 GB (*) - the problem is that it's very hard to manage properly when it gets big. Incremental backups are impossible, and repair tools (fsrecover.py) take forever to run. I can only imagine the pain of doing this with ~100GB.

Another solution that hasn't been mentioned in this thread: multiple mounted storages. Never done this, but I understand zope.org does it (one storage for wikis, another for everything else).
http://www.zope.org/Members/jim/ZODB/MountedDatabases

Shane's DBTab allows you to mix-and-match storages, running several at once.
http://www.my-zope.org/exp/20030116101458
http://hathaway.freezope.org/Software/DBTab

--
Paul Winkler http://www.slinkp.com
Look! Up in the sky! It's THE MYSTICAL SATIRE!
(random hero from isometric.spaceninja.com)
On Wed, Jan 29, 2003 at 03:32:20PM +0900, Wankyu Choi wrote:
I've been thinking:
1. Take out as much binary data as possible via External File ( my version, not the ones found in Zope products page ).
I can understand doing this if you're running FileStorage and want to avoid a single bloated Data.fs file, but since you're leaning towards DirectoryStorage, I think there's no reason to use ExternalFile at all - it just complicates things.
Hm.. your reply pretty much convinces me^^ Now I'll be trying DirectoryStorage in earnest.
2. Store textual data ( raw data + its rendered version, etc ) in MySQL.
ditto
Why shouldn't I let DirectoryStorage also take care of this? Toby Dickenson seems to prefer putting everything into the storage. (I need to keep searchable textual data in MySQL tables anyway, since Zope's catalog engine is pie in the sky for me: Korean simply doesn't work, as there's no reliable Korean splitter, so I'll have to rely on the old '%word%' select tricks.) :(
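For what it's worth, the '%word%' fallback mentioned above looks something like this, with sqlite3 standing in for MySQL (the documents and the search word are made up for illustration). A leading-wildcard LIKE cannot use an ordinary index, so every row is scanned, but it needs no word splitter, which is the point for Korean text here:

```python
import sqlite3

# sqlite3 stands in for MySQL; table, rows and word are hypothetical.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE documents (docid TEXT, body TEXT)")
cur.executemany("INSERT INTO documents VALUES (?, ?)",
                [("d1", "Zope and ZEO scale out with more clients"),
                 ("d2", "MySQL stores the searchable text")])

# Substring search: full table scan, but no tokenizer/splitter needed.
word = "ZEO"
hits = cur.execute("SELECT docid FROM documents WHERE body LIKE ?",
                   ("%" + word + "%",)).fetchall()
```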
3. User data goes into MySQL.
ditto
Again why? In fact, I want to keep the existing user data where they are but am just curious what I would be missing if I didn't^^
once but that seemed a lot of hassle, so I decided to wait for other solutions like the Directory Storage.
Wait no longer. ;)
Okay, but is there anything I should be aware of before deciding on DirectoryStorage as my storage of choice? The data will be both read- and write-intensive.
If most Zope users ( including yourself ) feel Filestorage is not a solution for more than 20GB of data,
hell, I don't like it for 2 GB of data. (see recent thread "POSKeyError II: Dead by Dawn" ... I wasn't kidding when I titled it that!)
Yeah, that POSKeyError thing scares the hell out of me ;-)
Another solution that hasn't been mentioned in this thread: multiple mounted storages. Never done this, but I understand zope.org does it (one storage for wikis, another for everything else). http://www.zope.org/Members/jim/ZODB/MountedDatabases
Can I mix it with the DirectoryStorage?
Shane's DBTab allows you to mix-and-match storages, running several at once. http://www.my-zope.org/exp/20030116101458 http://hathaway.freezope.org/Software/DBTab
I tried Shane's AdaptableStorage with DBTab. Works with stock Zope and CMF. Doesn't work with Plone. Will have to wait for it to stabilize a bit more. Thanks for your comments.

Cheers,
Wankyu Choi
On Friday 31 January 2003 7:00 am, Wankyu Choi wrote:
mounted storages. Never done this, but I understand zope.org does it (one storage for wikis, another for everything else). http://www.zope.org/Members/jim/ZODB/MountedDatabases
Can I mix it with the DirectoryStorage?
I don't see why not, but I haven't tried it. I understand this is used on zope.org to prevent packing the main database from destroying the wiki history.

DirectoryStorage 1.1 will allow you to do this *in* *one* *storage*, provided the objects with extra history can be identified by class name.
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/dirstorage/DirectoryStorage/doc/changes.diff?r1=1.20&r2=1.21

--
Toby Dickenson
http://www.geminidataloggers.com/people/tdickenson
On Fri, Jan 31, 2003 at 04:00:39PM +0900, Wankyu Choi wrote:
On Wed, Jan 29, 2003 at 03:32:20PM +0900, Wankyu Choi wrote:
I've been thinking:
1. Take out as much binary data as possible via External File ( my version, not the ones found in Zope products page ).
I can understand doing this if you're running FileStorage and want to avoid a single bloated Data.fs file, but since you're leaning towards DirectoryStorage, I think there's no reason to use ExternalFile at all - it just complicates things.
Hm.. your reply pretty much convinces me^^
Well, I wouldn't go that far. :) I should add the disclaimer that I have not actually used DirectoryStorage in production yet. I made the above statement based on the DS docs. I have only started evaluating options myself, after our recent troubles with FS corruption. All this is IM-very-humble-O, and YMMV and all that.
Now I'll be trying DirectoryStorage in earnest.
2. Store textual data ( raw data + its rendered version, etc ) in MySQL.
ditto
Why shouldn't I let DirectoryStorage also take care of this?
Sorry, by "ditto" I meant to convey "same answer I gave to the last point". If DS performs as reliably as advertised, it would completely obviate the need for storing anything outside the ZODB, from a scalability & maintenance standpoint at least.

**HOWEVER**, you are concerned about performance, and DS is even slower than FileStorage for writes. From the DS FAQ:

"""Intermittent writes are a factor of 1.5 slower. ... Under high write pressure the journal queue becomes a bottleneck, and performance degrades to 3 times slower than FileStorage."""

The question then becomes: what is "high write pressure"? And what does 3x slower than FS feel like to the user?
3. User data goes into MySQL.
ditto
Again why? In fact, I want to keep the existing user data where they are but am just curious what I would be missing if I didn't^^
Again, by "ditto" I meant to convey "same answer I gave to the last point" :)
Okay, but is there anything I should be aware of before deciding on the directory storage as my storage of choice? The data will be both read- and write-intensive.
Given what the DS FAQ says about write performance, I'd look into setting up a test server and bombarding it with automated writes to see if it will handle the load you anticipate. But of course you were going to do that anyway. ;)
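A rough sketch of such a bombardment harness, in plain Python. A real test would point `commit` at a ZEO client committing actual ZODB transactions; here an fsync'd append to a temp file stands in for a storage commit so the sketch is self-contained:

```python
import os, tempfile, time

def bombard(commit, n):
    """Time n back-to-back writes; return (mean, worst) latency in seconds."""
    latencies = []
    for i in range(n):
        t0 = time.perf_counter()
        commit(i)
        latencies.append(time.perf_counter() - t0)
    return sum(latencies) / n, max(latencies)

# Stand-in "storage": one fsync'd append per simulated transaction.
fd, path = tempfile.mkstemp()

def commit(i):
    os.write(fd, ("txn %d\n" % i).encode())
    os.fsync(fd)  # force the write to disk, as a storage commit would

mean, worst = bombard(commit, 200)
os.close(fd)
os.remove(path)
print("mean %.6fs, worst %.6fs" % (mean, worst))
```

Watching the worst-case latency as well as the mean matters here, since the DS FAQ's "3x" figure is about what happens when a sustained burst saturates the journal queue.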
If most Zope users ( including yourself ) feel Filestorage is not a solution for more than 20GB of data,
hell, I don't like it for 2 GB of data. (see recent thread "POSKeyError II: Dead by Dawn" ... I wasn't kidding when I titled it that!)
Yeah, that POSKeyError thing scares the hell out of me ;-)
I could almost feel the server screaming "JOIIINNNN USSSSSS" in an eerie DSP-treated voice.

--
Paul Winkler http://www.slinkp.com
Look! Up in the sky! It's SNAZZY DEVIANT GIRL!
(random hero from isometric.spaceninja.com)
(cc the dirstorage-users list too) On Friday 31 January 2003 5:07 pm, Paul Winkler wrote:
**HOWEVER** you are concerned about performance, and DS is even slower than FileStorage for writes. From the DS FAQ:
"""Intermittant writes are a factor of 1.5 slower. ... Under high write pressure the journal queue becomes a bottleneck, and performance degrades to 3 times slower than FileStorage. """
The question then becomes, what is "high write pressure"?
A benchmark that bombards the storage with nothing but writes, sufficient to saturate the disk interface. Most production loads don't look like that. The storage probably spends some of its time handling reads, and some (most?) of its time idle.

DirectoryStorage is optimised for writes that come in bursts. It reduces the latency of individual writes within the burst, under the assumption that it can do the rest of the work asynchronously once the burst is over. The 3x slowdown applies if the 'burst' goes on too long. Yes, you can configure the size of a burst.

(For what it's worth, I expect to be able to improve on that 1.5x with the latest ReiserFS kernel patches.)
And what does 3x slower than FS feel like to the user?
A typical human Zope user won't notice. Most of the time in a Zope request is spent in DTML processing, application logic, traversal, and security checks. In my experience only a small proportion of the time is spent in the storage. Three times small is still small. Expect something like 3x for scripts that perform many writes.

The write response profile changes once the storage is pushed into this 3x mode in its default configuration. Some writes will be much slower than others, and this will be noticeable to a human. The cause and effect are analogous to virtual memory thrashing. This can be tweaked, but I doubt anyone will need to.
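A back-of-envelope calculation with hypothetical numbers (not measurements from this thread) illustrates why "three times small is still small": if the storage accounts for, say, 10% of a 200ms request, tripling the storage time adds only 20% to the whole request:

```python
# Hypothetical numbers: storage is 10% of a 200ms request.
request_ms = 200.0
storage_share = 0.10
storage_ms = request_ms * storage_share       # 20ms spent in the storage
other_ms = request_ms - storage_ms            # 180ms everywhere else

slowed_ms = other_ms + 3 * storage_ms         # storage now 3x slower
overhead = slowed_ms / request_ms - 1.0       # fractional increase overall
print("request grows from %.0fms to %.0fms (+%.0f%%)"
      % (request_ms, slowed_ms, overhead * 100))
```

A write-heavy script, where the storage dominates the request time, would instead see close to the full 3x.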
given what the DS FAQ says about write performance, I'd look into setting up a test server and bombard it with automated writes to see if it will handle the load you anticipate. But of course you were going to do that anyway. ;)
Indeed. Please share your results. Note that for modern storages it is important to measure the performance under a realistic load, rather than applying a huge load and seeing where it saturates. The original Berkeley storage benchmarks were bogus (IMO) for this reason.

--
Toby Dickenson
http://www.geminidataloggers.com/people/tdickenson
At 13:44 29-01-2003 +0900, Wankyu Choi wrote:
I'm planning to rebuild one of my commercial portal sites using Zope with more than 300,000 users. The portal has been running on APM( apache + php + mysql ) without a glitch for three years.
I wouldn't worry much about the number of users, but more about the number of concurrent users and the number of hits per second/hour/day. We had a similar setup (IIS + ASP + MSSQL) that was crashing at least once a day. Once converted to Linux + Zope + MySQL, it ran happily on the same hardware (we actually took out the dual-processor frontends and put in single-processor frontends, and there was no problem) and had uptimes of 70 days. This was for a consumer portal serving about 2 million hits per day across about 50,000 visits per day.
I came to a conclusion that ZEO might help with a bit more hardware thrown in.
Yes, it definitely would...
Here's my plan. (OS: Linux)
A Unix OS is definitely a good idea....
1. Get a rock-solid machine for ZEO server: Intel XEON dual cpus (SR2300 if you want to know this Intel server model) with 2G mem + six 73G segate SCA scsi hard disks + RAID5 for fault-tolerance (intel srcu32 dual channel raid controller)
Great. Also put the MySQL server on this machine. You could/should do it on a separate machine, but I think there's no need.
2. Throw in at least three less-powerful machines for ZEO clients: 1 P3 tualatin + 1G mem + 1 36G hard disk with NO RAID.
OK. Make sure that the clients are connected to the backend server (the one above) on a private network.
I plan to buy one No.1 machine plus one No.2 machine since I already have two machines, candidates for the other two ZEO clients.
OK, although I think that since those machines are multiprocessors they might be put to better use.
I came under the impression using Zope for the past year ( and from python/zope docs ) that python uses only one CPU no matter how many I have. One of the two machines I already have has four Xeon cpus but Zope on that machine runs way slower than one on my single P3 desktop with a bit more horsepower.
I would use this machine as a backend server (with an upgrade to the disks and RAM perhaps), saving the expense of buying machine nr 1.
Would this setup seem reasonable? Or better still, how would you set your portal up if you got extra dough for hardware?
The setup is OK; you just have to take care with the type of storage, the network setup, etc. If you intend to spend some extra money, spend it on a front-end load-balancing/caching system. You would put this in front of your ZEO client machines, spreading the load and caching most of the pages.
1. Would the above setup seem reasonable?
Yes, see above.
2. Some say P3 tualatin is better for Python than P4 or even Xeon processors. Is that true?
I'm not aware of this. But even if it's true, your machines won't be running Python only; they'll also be running the OS, system tasks, etc.
3. I'm looking towards DirectoryStorage over FileStorage for tons of reasons, the reasons you might easily guess. I'll have six 73G disks with RAID5, which means I'd have to let go of 73G for storing parity information. That leaves me about 365G. At least 300G will be allocated for the ZODB. Would FileStorage maintain its integrity if the db grows to 300G? What I'm worried about most is that I can't make it versionless. DirectoryStorage has that option. Comments?
I definitely wouldn't use FileStorage, mainly because when Data.fs grows big, Zope startup times go to hell. We're using BerkeleyDB Storage with great success: you can run it versionless or not, and it's quite fast. We've been looking at DirectoryStorage, but right now it's still not final, and we're afraid that at some point (i.e. large sites with lots of objects) the OS might run out of file handles.
4. Would I really get no benefit from running Python on SMP machines? That is, would I be better off with two single-CPU machines than one dual-CPU machine? If true, do I really have to buy a dual-Xeon Intel server for the central ZEO storage server? Why don't I just settle for a single-CPU machine with more RAM?
You can of course run several Python/Zope instances, each on its own processor and on a different IP port, and then load-balance between them. But I guess that's just complicating things. I would use dual/multi-processor machines for the backend(s) (ZEO + MySQL) and single-processor machines for the frontend ZEO clients. Also keep in mind that dual-processor systems help eliminate single points of failure: you wouldn't want a nice setup with a single-processor backend server and have that processor fail :-)
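For what it's worth, the round-robin idea behind load-balancing several single-CPU Zope instances can be sketched in a few lines of plain Python. The backend addresses below are made up for illustration; a real setup would use a proxy such as Squid or Pound rather than hand-rolled code.

```python
from itertools import cycle

# Hypothetical ZEO client addresses: one Zope instance per CPU,
# each bound to its own port (addresses are illustrative only).
BACKENDS = ["127.0.0.1:8081", "127.0.0.1:8082", "127.0.0.1:8083"]

_next_backend = cycle(BACKENDS)

def pick_backend():
    """Return the next backend in round-robin order."""
    return next(_next_backend)
```

Each incoming request would be handed to `pick_backend()`'s result, so the three instances share the load evenly regardless of which CPUs they sit on.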
5. Can I run a ZEO client on the ZEO server? To sum up, I'd get a total of four machines: one ZEO server and three ZEO clients. I want to make it four clients. I assume I can also run a ZEO client on the ZEO server, but I have a hunch that some would probably say no to that... May I ask why?
First, because you don't need the clients' processing interfering with the server serving Zope objects and MySQL queries. Second, because you would have to expose the backend server to the internet, and you don't want to do that (remember: put it on a private network to which only the clients are connected).
6. You have the following machines. How would you set up your ZEO server and clients? Remember, you have 300,000 users eager to add data :-)
Machine No.1 - 2 Xeon 2.0GHZ CPUs + 2G mem + six 73G disks + Intel high-end RAID controller Machine No.2 - 1 P3 tualatin CPU + 1G mem + 1 36G disk Machine No.3 - 2 Xeon CPUs (older model) + 1G mem + 3 36G disks (disks old, will be recycled for backup storage or something)
Use these as your front end servers (Zeo clients). Put a load balancer in front of these.
Machine No.4 - 4 Xeon CPUs (older model) + 2G mem + 4 36G disks (disks old, will be recycled for backup storage or something)
Use this as your backend server (MySQL + ZEO). Use BerkeleyDB or DirectoryStorage. You should only store in Zope what is "Zope stuff" (DTML methods, Python methods, ZClasses, ZSQL queries, etc.). All the original site data should be kept in MySQL, and all images should be stored as External Files (as should any other BLOBs such as PDF and DOC files). Take some of the RAM out of the three client machines (1 GB is enough) and put it into the server. Also take the RAID controller from machine no. 1 and use it in machine no. 4. Hope this helps. C U! -- Mario Valente
On Wednesday 29 January 2003 1:37 pm, Mario Valente wrote:
OK. Make sure that the clients are connected to the backend server (the one above) on a private network.
For security or performance reasons?
If you intend to spend some extra money, spend it on a front end load balancer caching system. You would put this in front of your ZEO client machines, spreading the load and caching most of the pages.
I would say that some kind of front-end proxy is essential for this setup. There are happy users of Squid, Apache, and Pound.
fast. We've been looking at DirectoryStorage but right now its still not final
FYI, we hit 1.0.0 a few weeks ago.
and we're afraid that at some point (ie large sites with lots of objects) the OS might run out of file handles.
Apart from two lock files, DirectoryStorage only opens file handles momentarily. On some filesystems you may want to be concerned about running out of inodes. On Linux I strongly recommend reiserfs, which does not have this problem because it can create inodes on demand. (Thanks for raising this concern - I will add these points to the DirectoryStorage FAQ.)
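As a quick way to keep an eye on the inode concern, a small Python sketch using the standard library's os.statvfs (POSIX-only) can report free versus total inodes for the filesystem holding the storage directory:

```python
import os

def free_inodes(path="."):
    """Return (free, total) inode counts for the filesystem holding `path`.

    Filesystems that allocate inodes on demand (reiserfs, for example)
    may report a total of 0, meaning there is no fixed limit to run out of.
    """
    st = os.statvfs(path)
    return st.f_ffree, st.f_files
```

A cron job could call this against the DirectoryStorage directory and alert when the free count falls below some threshold.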
All images should be stored as External Files (as well as any other BLOBs like PDF files, DOC files, etc).
I'm not sure that advice is universally true, unless your blobs never change or are mastered elsewhere. Putting everything in ZODB/ZEO simplifies backups, replication, and distribution of changes. -- Toby Dickenson http://www.geminidataloggers.com/people/tdickenson
At 14:17 29-01-2003 +0000, Toby Dickenson wrote:
On Wednesday 29 January 2003 1:37 pm, Mario Valente wrote:
OK. Make sure that the clients are connected to the backend server (the one above) on a private network.
For security or performance reasons?
Both, of course.
fast. We've been looking at DirectoryStorage but right now its still not final
FYI, we hit 1.0.0 a few weeks ago.
Yes, I'm aware of that. Nonetheless we prefer to "road test" it for a while on less important, less trafficked sites before we go whole hog :-)
and we're afraid that at some point (ie large sites with lots of objects) the OS might run out of file handles.
Apart from two lock files, DirectoryStorage only opens file handles momentarily.
On some filesystems you may want to be concerned with running out of inodes. On linux I strongly recommend reiserfs, which does not have this problem because it can create inodes on demand.
I was actually thinking inodes when I wrote file handles. Thanks for the additional info on reiserfs.
All images should be stored as External Files (as well as any other BLOBs like PDF files, DOC files, etc).
I'm not sure that advice is universally true, unless your blobs never change or are mastered elsewhere. Putting everything in ZODB/ZEO simplifies backups, replication, and distribution of changes.
If you put all images/docs/whatever in a single directory or directory tree, you have simpler solutions for backup, replication, and distribution. We found out by experience that it's not beneficial to store large binary files in Zope, especially when there are constant file uploads going on. C U! -- Mario Valente
From: Toby Dickenson [mailto:tdickenson@geminidataloggers.com] Sent: Wednesday, January 29, 2003 11:18 PM To: Mario Valente; Wankyu Choi; zope@zope.org Subject: Re: [Zope] Hardware for Zope + ZEO <snip>
All images should be stored as External Files (as well as any other BLOBs like PDF files, DOC files, etc).
Im not sure that advice is universally true, unless your blobs are never change or mastered elsewhere. Putting everything in ZODB/ZEO simplifies backups, replication, and distribution of changes.
If possible, that's exactly what I want to do: put everything into the ZODB. Yes, that simplifies everything: I wouldn't have to rewrite NeoBoard (a web-based discussion board) and the other products, for one thing ;-) I've been developing Zope products with one thing in mind in terms of storage: some day, they'll come up with a better storage model; don't reinvent the wheel. Yes, most of my blobs will be constantly changing.

What would be the pros and cons of 1. putting everything in the ZODB using DirectoryStorage as the backend versus 2. getting some help from an RDBMS like MySQL? I mean everything, including acl_users.

While we're on the subject... I'd be pulling, say, 10,000 objects from DirectoryStorage in one go if a NeoBoard forum gets heavily populated (10,000 articles, or 5,000 articles with each article having one attachment object). How slow would that be compared to an SQL "select * from heavily_populated_board limit 50" (supposing a pageful of the article list would spew out 50 articles)? (I just assume it'd be slower than using MySQL. Correct me if I'm wrong.)

Thanks in advance.

Regards, Wankyu Choi --------------------------------------------------------------- To the dedicated staff at NeoQuest, language is not a problem to be dealt with, but an art waiting to be performed. --------------------------------------------------------------- Wankyou Choi CEO/President NeoQuest Communications, Inc. 3rd Floor, HMC Bldg., 730-14, Yoksam-dong, Kangnam-gu Seoul, Korea Tel: 82-2-501-7124 Fax: 82-2-501-7058 Corporate Home: http://www.neoqst.com Personal Home: http://www.zoper.net, http://www.neoboard.net e-mail: wankyu@neoqst.com ---------------------------------------------------------------
On Friday 31 January 2003 7:17 am, Wankyu Choi wrote:
While we're at the subject... I'd be pulling, say, 10,000 objects from directory storage in one go if a NeoBoard forum gets heavily populated ( 10,000 articles or 5,000 articles with each article having one attachment object ).
Either I misunderstand this statement, or your application design may be flawed. What is 'one go'? How many 'goes per second' do you expect to sustain?
How slow would it be compared to an SQL "select * from heavily_populated_board limit 50" (supposing a pageful of article list would spew out 50 articles )? ( I just assume it'd be slower than using MySQL. Correct me if I'm wrong. )
The DirectoryStorage overhead for this is tiny: nothing more than reading 100 files from some nested directories and some simple validation on each file header. Not much different from FileStorage, which would do 50 seeks and 50 reads in one file. Both storages would transfer the same number of data bytes. The key performance issue is how fast your filesystem can traverse those directories - or, for FileStorage, the cost of keeping that directory/dictionary in memory. If you have tried FileStorage and it lacked raw speed, then I doubt DirectoryStorage will be faster. -- Toby Dickenson http://www.geminidataloggers.com/people/tdickenson
On Friday 31 January 2003 7:17 am, Wankyu Choi wrote:
While we're at the subject... I'd be pulling, say, 10,000 objects from directory storage in one go if a NeoBoard forum gets heavily populated ( 10,000 articles or 5,000 articles with each article having one attachment object ).
Either I misunderstand this statement, or your application design may be flawed. What is 'one go'? How many 'goes per second' do you expect to sustain?
I meant (basically the same, though the actual code is slightly different from the following):

--- code excerpt from NeoBoard ---
def getArticles(...):
    ... prepare ...
    articles = [article for article in board.ZopeFind(obj_metatypes='NeoBoard Article', search_sub=1)]
    # I use ZopeFind instead of objectValues since articles get nested
    ... sort articles ...
    ... return articles ...
---------------------------

Note that my Python/Zope experience is limited. I'm still learning :-)

Suppose the 'board' folderish object (let's assume it's a BTreeFolder; if present, NeoBoard uses BTreeFolder2, otherwise a plain Zope Folder) holds about 10,000 articles. Whenever a list of articles gets displayed, this statement gets executed. Most RDBMS-based boards do the same thing with a select/limit combo, as the PHP NeoBoard does:

"SELECT article_field_list from board limit start_num, end_num"

Am I missing something? Is there a way in Zope to limit the returned results as a limit clause would in an SQL statement? Even the Zope core source code seems to get every subobject from folderish objects no matter how many there might be. That gave me the impression that Zope doesn't give you that option. Correct me if I'm wrong.

With an RDBMS, I can limit the number of returned rows with a limit clause constructed from a request variable, say 'b_start'. In Zope, I have to pull all the rows (objects) first and then get a specified batch from them by wrapping them into batches.

I didn't mean 'goes per second'. I just wanted to know how much slower this one go of pulling too many objects from a folder would be in comparison to an RDBMS's select/limit combo.
How slow would it be compared to an SQL "select * from heavily_populated_board limit 50" (supposing a pageful of article list would spew out 50 articles )? ( I just assume it'd be slower than using MySQL. Correct me if I'm wrong. )
The DirectoryStorage overhead for this is tiny: nothing more than reading 100 files from some nested directories and some simple validation on each file
header. Not much different to FileStorage, which would have 50 seeks and 50 reads in one file. Both storages would transfer the same number of data bytes.
The key performance issue is how fast your filesystem can traverse those directories. Or for FileStorage, the cost of keeping that directory/dictionary in memory. If you have tried FileStorage and it lacked raw speed, then I doubt DirectoryStorage will be faster.
I apologize for not being specific with the query. I didn't mean to compare DS with FS. I already know DS could be slower than FS, and I guess that's acceptable given its benefits. I was just curious how much slower NeoBoard might get if I rely solely on the ZODB (regardless of its storage model) rather than on an RDBMS. In fact, the PHP/MySQL NeoBoard is pretty fast. Given the skins and stuff, Zope NeoBoard should be slower. But I found (and am a bit worried) that pulling data from the ZODB is much slower than selecting rows from MySQL. I'd have to re-design NeoBoard to use MySQL as the storage backend like I did in the good ol' days. Bummer...

Bear with my ignorance^^ You have every right not to answer this stupid question ;-) Thanks for your time anyway.

Best Regards, Wankyu Choi
On Friday 31 January 2003 8:40 am, Wankyu Choi wrote:
On Friday 31 January 2003 7:17 am, Wankyu Choi wrote:
--- code excerpt from NeoBoard ---
def getArticles(...):
... prepare....
articles = [article for article in board.ZopeFind( obj_metatypes='NeoBoard Article' , search_sub=1)]
# I use ZopeFind instead of objectValues since articles get nested
...sort articles... ...return articles...
---------------------------
Note that my python/zope experience is limited. I'm still learning :-)
Suppose the 'board' folderish object ( let's assume it's a BTreeFolder; if it's present NeoBoard uses BTreeFolder2, otherwise Zope Folder ) holds about 10,000 articles.
Whenever a list of articles gets displayed, this statement should be executed. Most RDBMS-based boards do the same with a select/limit combo like the PHP NeoBoard does:
"SELECT article_field_list from board limit start_num, end_num"
Am I missing something?
Yes. Your SQL approach accesses only the 50 relevant articles; the other 9950 stay untouched on disk. Your ZODB approach loads all 10000 into memory, and then your presentation logic (I guess) ignores 99.5% of them.
Is there a way in Zope to limit the returned results as a limit clause would do in an SQL statement?
The best approach is to use ZCatalog. ZCatalog means you only need to load the articles into memory when you need them. -- Toby Dickenson http://www.geminidataloggers.com/people/tdickenson
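The point being made here - only realise the page you display - can be illustrated with a storage-agnostic Python sketch. ZCatalog itself returns lazy result sequences; the names below are made up for illustration only.

```python
from itertools import islice

def page(records, start, size):
    """Return one page of records, consuming at most start + size items.

    `records` can be any iterable (a lazy catalog result, a generator...);
    everything past the requested page stays untouched.
    """
    return list(islice(records, start, start + size))

# Hypothetical board with 10,000 article ids. Building page 3 (items
# 100-149) realises only the first 150 generator items, never all 10,000.
article_ids = (f"article-{n:05d}" for n in range(10000))
page_3 = page(article_ids, 100, 50)
```

This mirrors what an SQL `limit start, size` clause achieves: the cost is proportional to the page position, not to the size of the whole board.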
Is there a way in Zope to limit the returned results as a limit clause would do in an SQL statement?
The best approach is to use ZCatalog. ZCatalog means you only need to load the articles into memory when you need them.
A threaded board needs to maintain its threaded structure. That's rather cumbersome with SQL but efficient. In Zope, it's a snap but gives poor performance: a folder (parent article) and subfolders (its replies) structure turns directly into a thread hierarchy. I do use ZCatalog for all views of NeoBoard except its default threaded look. Maybe I was stupid in not using ZCatalog for the threaded look; I'll give it a try. But I just wanted to know if there was a way to pull a limited set of objects from a **folderish object** without using ZCatalog.

I'm not sure your suggestion would work with NeoBoard yet, but I've run into a number of situations where I had to pull all objects from a folder. I posted a similar query months ago when I was trying to write my own user folder. No definite answer yet. The stock user folder, for example, returns every user object no matter how many there are in response to a getUsers method call. I wouldn't use it with my 300,000 users. Nobody would, I suspect. There was no info on how to pull a limited set of users from a user folder, and I guess we can't use ZCatalog to do that job.

Simply put, are Zope folderish objects, including BTreeFolderish objects, **not** designed to hold this many objects? Well, if so, I'll just rewrite NeoBoard to use MySQL again. That would be a lot easier :(

Thanks again for your time.

Regards, Wankyu Choi
From: Mario Valente [mailto:mvalente@ruido-visual.pt] Sent: Wednesday, January 29, 2003 10:38 PM To: Wankyu Choi; zope@zope.org Subject: Re: [Zope] Hardware for Zope + ZEO First of all, thanks for your detailed and informative feedback. I could hardly find any feedback on this issue for months.
We had a similar setup (IIS + ASP + MSSQL) that was crashing at least once a day. Once converted to Linux+Zope+MySQL it ran happily on the same hardware (we actually took out dual processor frontends and put in single processor frontends and there was no problem) and had uptimes of 70 days.
My servers have been up and running for a year and a half :-) I'm pretty much satisfied with the existing APM setup. The only reason I'm switching to Zope/Python is better application productivity: what I could do in the APM setup, I can do in a tenth of the time and effort. With these ZODB storage worries gone, I'd have no complaints about Zope/Python ;-)
I plan to buy one No.1 machine plus one No.2 machine since I already have two machines, candidates for the other two ZEO clients.
OK, although I think that since those machines are multiprocessors they might be put to better use.
The problem is that the existing machines should keep serving users while I toy around with the new ones. That is, I plan to slap together a beta testing environment on the new machines, installing everything from scratch, while the existing ones keep running. I'll need at least two months to finish the setup and the necessary Zope/CMF/Plone applications like NeoBoard, the NeoPortal Content Pak, and more.
I came under the impression using Zope for the past year ( and from python/zope docs ) that python uses only one CPU no matter how many I have. One of the two machines I already have has four Xeon cpus but Zope on that machine runs way slower than one on my single P3 desktop with a bit more horsepower.
I would use this machine as a backend server (with an upgrade to the disks and RAM perhaps), saving the expense of buying machine nr 1.
Like I mentioned above, it should keep running while I set up the other machines. When the two new machines are properly set up and my application development projects are complete, I'll go public with the new machines and take down the existing ones; clear up the mess in the old machines (they've been running too long with old OS stuff), reinstalling everything from scratch as I will have done with the new machines; then put them back in service. The No. 4 machine with 4 CPUs will also keep doubling as a mail server, as it does now, so I won't be wasting their horsepower. I run several mailing lists with an average of 100,000 subscribers, for one thing ;-)

This might look like a stupid plan... but whenever I did a major overhaul, something went wrong. About this time of year in 2002, I even tried an automatic setup process of my own creation with tons of shell scripts, tested again and again on my Linux box before actually running them on the production machines: an hour of a job, I thought. It took a day and a half (no service was available; users were screaming down my neck). Never did I dream that RedHat wouldn't install on an Intel 440 machine with a particular Adaptec SCSI controller; it just hung. Hours of searching the RedHat Bugzilla turned up an entry only days old on this issue. I even passed out after that day and a half of nightmare :(

With two months of leeway and the latest server models, I won't have to worry about such nonsensical, unexpected glitches ;-)
Would this setup seem reasonable? Or better still, how would you set your portal up if you got extra dough for hardware?
If you intend to spend some extra money, spend it on a front end load balancer caching system. You would put this in front of your ZEO client machines, spreading the load and caching most of the pages.
Can't I just use Linux Virtual Server? It seems like a lot of waste to buy an expensive load balancer for so few machines... Zope will be running behind Apache + Squid front ends. Bad idea? Just a thought; I haven't tried it.
2. Some say P3 tualatin is better for Python than P4 or even Xeon processors. Is that true?
I'm not aware of this. But even if true, your machines wont be running Python only. They'll be running the OS, system tasks, etc.
I meant a ZEO client machine. I don't plan to hog its CPU with any other serious tasks. The other two existing machines already have multiple CPUs, and they'll double as mail servers and such.
I definitely woulnd use FileStorage. Mainly because when Data.fs grows big, Zope startup times go to hell. We're using BerkeleyDB Storage with great success. You can either run it versionless or not. And its quite fast. We've been looking at DirectoryStorage but right now its still not final and we're afraid that at some point (ie large sites with lots of objects) the OS might run out of file handles.
Thanks for bringing the BDB Storage to my attention. ( ReiserFS has no file handle/inode problems, though. )
Also keep in mind that dual processor systems help in eliminating single points of failure. You woulnt want to have a nice setup with a single processor backend server and have that processor fail :-)
Hm... that's a good point. I hadn't thought of that :-) An extra CPU hardly makes a dent in the total cost of a server anyway :(.
Machine No.4 - 4 Xeon CPUs (older model) + 2G mem + 4 36G disks (disks old, will be recycled for backup storage or something)
Use this as your backend server (MySQL + ZEO). Use BerkeleyDB or DirectoryStorage. You should only store in Zope what is "Zope stuff" (DTML methods, Python methods, ZClasses, ZSQL queries, etc). All the original site data should be kept in MySQL. All images should be stored as External Files (as well as any other BLOBs like PDF files, DOC files, etc).
Which means... you wouldn't put any serious heavyweight data into your ZODB storage no matter how good it might be? That's the part I'm not exactly sure about. For example, I have tons of textual data stored in MySQL via the PHP NeoBoard (a web-based discussion board). I rewrote NeoBoard from scratch as a Zope product; it stores all its data in the ZODB. I was thinking of pulling all the existing articles into the ZODB to use Zope NeoBoard. Now that plan looks like a silly idea... I seem to have to do just the opposite: take all the data out of the ZODB and put it into MySQL. That way, Zope NeoBoard Article objects would only act as wrappers pulling the actual data from MySQL tables. Plus, I rewrote all the CMF content types to add extra features and stuff. Now I'll need to do the same with these content types to make use of MySQL. Is my assumption right? That makes the ZODB look like a metadata holder, not a database :(
Hope this helps.
It surely helped. Thanks a lot. Cheers, Wankyu Choi
At 15:36 31-01-2003 +0900, Wankyu Choi wrote:
From: Mario Valente [mailto:mvalente@ruido-visual.pt] Sent: Wednesday, January 29, 2003 10:38 PM To: Wankyu Choi; zope@zope.org Subject: Re: [Zope] Hardware for Zope + ZEO
Use this as your backend server (MySQL + ZEO). Use BerkeleyDB or DirectoryStorage. You should only store in Zope what is "Zope stuff" (DTML methods, Python methods, ZClasses, ZSQL queries, etc). All the original site data should be kept in MySQL. All images should be stored as External Files (as well as any other BLOBs like PDF files, DOC files, etc).
which means... you wouldn't put any serious heavy-weight data into your ZODB storage no matter how good it might be?
Plus, I rewrote all the CMF content types to add extra features and stuff. Now I'll need to do the same with these content types to make use of MySQL.
Is my assumption right? That makes ZODB look like a metadata holder, not a database :(
The ZODB is an object-oriented database. It should be used to store object-oriented data (especially data that depends on hierarchical relations and on inheritance). The ZODB shouldn't be used to store structured tabular data; for that you should use a relational database management system. It's just a question of using the right tool for the problem. I wouldn't store binary files in either the ZODB or a relational database; likewise I wouldn't put text files/fields into either the ZODB or a relational database (although in Zope, using ZCatalog or NGTextIndex to index text, it wouldn't be so bad). C U! -- Mario Valente
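A minimal sketch of the split being advocated here - small structured metadata on the object side, big binaries as external files referenced by path - might look like this in plain Python. This is not a real Zope product; all names are illustrative.

```python
import os
import tempfile

# Stand-in for a real external-file directory (illustrative only).
BLOB_DIR = tempfile.mkdtemp()

class Article:
    """Metadata-only object; the binary attachment lives on the filesystem."""

    def __init__(self, article_id, title, body):
        self.id = article_id
        self.title = title           # small structured fields: fine for ZODB
        self.body = body
        self.attachment_path = None  # blob is referenced, never embedded

    def attach(self, filename, data):
        # Write the blob to the external directory and keep only its path.
        path = os.path.join(BLOB_DIR, f"{self.id}-{filename}")
        with open(path, "wb") as fh:
            fh.write(data)
        self.attachment_path = path

    def attachment(self):
        # Read the blob back from the filesystem on demand.
        with open(self.attachment_path, "rb") as fh:
            return fh.read()

a = Article("0001", "Hello", "First post")
a.attach("spec.pdf", b"%PDF-1.4 ...")
```

The object store then carries only ids, titles, and paths - cheap to pack, back up, and replicate - while constant file uploads hit the filesystem directly, which was the pain point Mario described.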
participants (7):
- Andreas Jung
- Dario Lopez-Kästen
- fritz
- Mario Valente
- Paul Winkler
- Toby Dickenson
- Wankyu Choi