At 13:44 29-01-2003 +0900, Wankyu Choi wrote:
I'm planning to rebuild one of my commercial portal sites using Zope with more than 300,000 users. The portal has been running on APM( apache + php + mysql ) without a glitch for three years.
I wouldn't worry much about the number of users, but rather about the number of concurrent users and the number of hits per second/hour/day. We had a similar setup (IIS + ASP + MSSQL) that was crashing at least once a day. Once converted to Linux+Zope+MySQL it ran happily on the same hardware (we actually took out the dual processor frontends and put in single processor frontends and there was no problem) and had uptimes of 70 days. This was for a consumer portal serving about 2 million hits per day to about 50,000 visits per day.
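To put those daily figures in per-second terms, here is a rough back-of-the-envelope sketch. The peak factor of 3 is purely an assumption for illustration (real traffic is bursty and the actual ratio has to be measured); only the 2 million hits and 50,000 visits come from the numbers above.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_hits_per_second(hits_per_day):
    """Average request rate if the load were spread evenly over the day."""
    return hits_per_day / SECONDS_PER_DAY


hits_per_day = 2_000_000   # figure quoted above
visits_per_day = 50_000    # figure quoted above

# Assumption: peak traffic is about 3x the daily average.
ASSUMED_PEAK_FACTOR = 3

avg_rate = average_hits_per_second(hits_per_day)
peak_rate = avg_rate * ASSUMED_PEAK_FACTOR
hits_per_visit = hits_per_day / visits_per_day

print(f"average: {avg_rate:.1f} hits/s")
print(f"assumed peak: {peak_rate:.1f} hits/s")
print(f"hits per visit: {hits_per_visit:.0f}")
```

The point is that the size of the user base does not appear anywhere in the calculation; the request rate is what the hardware has to sustain.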
I came to a conclusion that ZEO might help with a bit more hardware thrown in.
Yes, it definitely would...
Here's my plan. (OS: Linux)
A Unix OS is definitely a good idea....
1. Get a rock-solid machine for ZEO server: Intel XEON dual cpus (SR2300 if you want to know this Intel server model) with 2G mem + six 73G Seagate SCA scsi hard disks + RAID5 for fault-tolerance (intel srcu32 dual channel raid controller)
Great. Also put the MySQL server on this machine. You could/should do it on a separate machine, but I think there's no need.
2. Throw in at least three less-powerful machines for ZEO clients: 1 P3 tualatin + 1G mem + 1 36G hard disk with NO RAID.
OK. Make sure that the clients are connected to the backend server (the one above) on a private network.
I plan to buy one No.1 machine plus one No.2 machine since I already have two machines, candidates for the other two ZEO clients.
OK, although I think that since those machines are multiprocessors they might be put to better use.
I came under the impression using Zope for the past year ( and from python/zope docs ) that python uses only one CPU no matter how many I have. One of the two machines I already have has four Xeon cpus but Zope on that machine runs way slower than one on my single P3 desktop with a bit more horsepower.
I would use this machine as a backend server (with an upgrade to the disks and RAM perhaps), saving the expense of buying machine nr 1.
Would this setup seem reasonable? Or better still, how would you set your portal up if you got extra dough for hardware?
The setup is OK. You just have to take care with the types of storage, network setup, etc. If you intend to spend some extra money, spend it on a front-end caching load balancer. You would put this in front of your ZEO client machines, spreading the load and caching most of the pages.
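To illustrate why caching in front of the ZEO clients helps, here is a minimal sketch of a time-based page cache. It is not how any particular load balancer or proxy is implemented; the class `TTLCache`, the function `render_page` and the 60 second lifetime are all hypothetical names and values chosen for the example.

```python
import time


class TTLCache:
    """Keep rendered pages for a fixed time so repeat hits skip the backend."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock          # injectable so expiry can be tested
        self._store = {}            # path -> (expiry_time, page)

    def get(self, path, fetch):
        now = self.clock()
        entry = self._store.get(path)
        if entry is not None and entry[0] > now:
            return entry[1]         # still fresh: served from the cache
        page = fetch(path)          # miss or expired: ask the backend
        self._store[path] = (now + self.ttl, page)
        return page


backend_calls = []


def render_page(path):
    """Hypothetical stand-in for a request handled by a ZEO client."""
    backend_calls.append(path)
    return f"<html>{path}</html>"


cache = TTLCache(ttl_seconds=60)
first = cache.get("/index", render_page)
second = cache.get("/index", render_page)
print(len(backend_calls), "backend call(s) for 2 requests")
```

Every request answered from the cache is one the Zope machines never see, which is where the extra money pays off.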
1. Would the above setup seem reasonable?
Yes, see above.
2. Some say P3 tualatin is better for Python than P4 or even Xeon processors. Is that true?
I'm not aware of this. But even if true, your machines won't be running Python only. They'll be running the OS, system tasks, etc.
3. I'm looking towards the Directory Storage over FileStorage for tons of reasons, the reasons you might easily guess. I'll have six 73G disks with RAID5, which means I'd have to let go of 73G for storing parity information. That leaves me about 365G. At least 300G will be allocated for ZODB. Would FileStorage maintain its integrity if the db grows to 300G? What I'm worried about most is that I can't make it versionless. Directory Storage has that option. Comments?
I definitely wouldn't use FileStorage, mainly because when Data.fs grows big, Zope startup times go to hell. We're using BerkeleyDB Storage with great success. You can run it either versionless or not, and it's quite fast. We've been looking at DirectoryStorage, but right now it's still not final and we're afraid that at some point (i.e. large sites with lots of objects) the OS might run out of file handles.
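If file handles are a concern, the limits on the box can at least be inspected before committing to a storage. The sketch below uses the standard `resource` module (Unix only) for the per-process limit, and reads `/proc/sys/fs/file-max` for the system-wide limit, which assumes a Linux kernel. It says nothing about how many handles a given storage actually keeps open; that would have to be measured.

```python
import resource


def fd_limits():
    """Return the (soft, hard) per-process limit on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe(limit):
    """Render a limit, treating RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


def system_wide_max(path="/proc/sys/fs/file-max"):
    """System-wide handle limit on Linux, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().split()[0])
    except (OSError, ValueError, IndexError):
        return None


soft, hard = fd_limits()
print("per-process soft limit:", describe(soft))
print("per-process hard limit:", describe(hard))
print("system-wide limit:", system_wide_max())
```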
4. Would I really get no benefits from running Python on SMP machines? That is, would I be better off with two 1 cpu machines than one dual cpu machine? If true, do I really have to buy Dual XEON intel server for ZEO central storage server? Why don't I just settle for a single CPU machine for that with more RAM?
You can of course run several Python/Zope instances, each on its own processor and on a different IP port, and then load balance between them. But I guess that's just complicating things. I would use dual/multi processor machines for the backend(s) (ZEO+MySQL) and use single processor machines for the frontend ZEO clients. Also keep in mind that dual processor systems help in eliminating single points of failure. You wouldn't want to have a nice setup with a single processor backend server and have that processor fail :-)
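For what it's worth, spreading requests over several instances on different ports amounts to something like the round-robin sketch below. The address and port numbers are made up for the example, and a real load balancer would also do health checks and connection handling that are left out here.

```python
import itertools

# Assumption: two Zope instances on one SMP box, each on its own port.
ASSUMED_INSTANCES = [("127.0.0.1", 8080), ("127.0.0.1", 8081)]


def round_robin(backends):
    """Yield backends in turn, forever, so load is spread evenly."""
    return itertools.cycle(backends)


picker = round_robin(ASSUMED_INSTANCES)

# Four incoming requests get assigned alternately to the two instances.
assigned = [next(picker) for _ in range(4)]
for host, port in assigned:
    print(f"request -> {host}:{port}")
```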
5. Can I run a ZEO client on ZEO server? To sum up, I'd get a total of four machines: one ZEO server and three ZEO clients. I want to make it four clients. I assume I can also run a ZEO client on the ZEO server but I have a hunch that some would probably say no to that... . May I ask why?
First, because you don't need the client's processing interfering with the server serving Zope objects and MySQL queries. Second, because you would have to expose the backend server to the internet and you don't want to do that (remember: put it on a private network to which only the clients are connected).
6. You have the following machines. How would you set up your ZEO server and clients? Remember, you have 300,000 users eager to add data :-)
Machine No.1 - 2 Xeon 2.0GHZ CPUs + 2G mem + six 73G disks + Intel high-end RAID controller
Machine No.2 - 1 P3 tualatin CPU + 1G mem + 1 36G disk
Machine No.3 - 2 Xeon CPUs (older model) + 1G mem + 3 36G disks (disks old, will be recycled for backup storage or something)
Use these as your front end servers (Zeo clients). Put a load balancer in front of these.
Machine No.4 - 4 Xeon CPUs (older model) + 2G mem + 4 36G disks (disks old, will be recycled for backup storage or something)
Use this as your backend server (MySQL + ZEO). Use BerkeleyDB or DirectoryStorage. You should only store in Zope what is "Zope stuff" (DTML methods, Python methods, ZClasses, ZSQL queries, etc). All the original site data should be kept in MySQL. All images should be stored as External Files (as well as any other BLOBs like PDF files, DOC files, etc).

Take some of the RAM out of the 3 client machines (1 GB is enough) and put it into the server. Also take the RAID controller from nr 1 and use it on machine nr 4.

Hope this helps.

C U!

-- Mario Valente