On Monday 01 Jul 2002 10:56 pm, Phil Glaser wrote:
Hi,
I am beginning the planning process for a Zope content management system that will support approximately 8,000 users. It is hard to predict the day-to-day use of the system, so I am trying to think ahead about scalability issues, and I have a number of questions.
STATIC FILES One of the suggestions in the literature for improving performance is to allow Apache to serve static files. It would seem, however, that doing so completely takes away the meta data and permission management features of Zope.
The heart of this suggestion is using the right tool for the job. Apache is great if your web content is just a bunch of files in a directory. A similar approach to performance is to put a caching proxy in front of Zope, either Apache/mod_proxy or Squid. The front-end proxy can take some of the load if your pages are "loosely dynamic", that is, they don't change on *every* request.
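To illustrate why a front-end cache helps with "loosely dynamic" pages, here is a toy sketch in Python. It is not how mod_proxy or Squid are implemented; the names `TTLCache` and `render_page` are made up for the example, and `render_page` merely stands in for a request that would otherwise reach Zope.

```python
import time


class TTLCache:
    """Toy front-end cache: reuse a rendered page until it is max_age seconds old."""

    def __init__(self, render, max_age, clock=time.time):
        self.render = render      # hypothetical stand-in for a request to Zope
        self.max_age = max_age    # seconds a cached page is considered fresh
        self.clock = clock        # injectable so the behaviour can be tested
        self.store = {}           # url -> (fetched_at, body)
        self.backend_hits = 0     # how often the backend actually did work

    def get(self, url):
        now = self.clock()
        cached = self.store.get(url)
        if cached is not None and now - cached[0] < self.max_age:
            return cached[1]      # answered by the cache; backend untouched
        self.backend_hits += 1
        body = self.render(url)
        self.store[url] = (now, body)
        return body


def render_page(url):
    # Pretend this is an expensive dynamic page render.
    return "<html>page for %s</html>" % url


cache = TTLCache(render_page, max_age=60)
for _ in range(100):
    cache.get("/news")
print(cache.backend_hits)
```

As long as the hundred requests arrive within the 60 second window, only the first one reaches the backend. A page that really does change on every request gets no benefit from this.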
The LocalFS product, on the other hand, enables you to serve content from the file system and maintain meta data and apply user permissions from Zope. Is there any performance advantage with LocalFS, or is it basically the same as storing the content in Data.fs?
I'll leave LocalFS considerations to someone else.
DATA.FS LIMITATIONS If all the site's content is stored in Data.fs, I'm concerned that it would quickly grow to a size that would result in performance drag.
FileStorage (the component which manages the Data.fs file) is damn fast as long as its index fits in memory. If it doesn't, it sucks. One easy approach is to use FileStorage for as long as you can. Migrate to a different storage when you need to, not before.
Since I'm used to the RDBMS world, it seems odd to store all that data in one file. Is there a rule of thumb with respect to the amount of data you can put into Data.fs before performance becomes an issue?
As always, "it depends". Generally, the amount of RAM needed by FileStorage for normal use is about 1/10th of its disk space, so that's 200M RAM if you have a 2G Data.fs. It needs at least the same again when packing.
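The arithmetic behind that rule of thumb can be sketched as a few lines of Python. The 1/10th ratio is only the rough estimate given above, not a measured constant, and the function name `estimate_ram_mb` is made up for the example.

```python
# Rule of thumb: FileStorage needs roughly 1/10th of the Data.fs size in RAM
# for normal use, and at least that much again while packing.

def estimate_ram_mb(data_fs_mb):
    """Return (normal_mb, packing_mb) for a Data.fs of the given size in MB."""
    normal = data_fs_mb / 10.0   # assumed ratio: the "1/10th" estimate
    packing = normal * 2         # "at least the same again", so a lower bound
    return normal, packing


# A 2G Data.fs, taking 1G as 1000M as the estimate above does.
normal, packing = estimate_ram_mb(2000)
print("normal use: ~%dM, while packing: at least ~%dM" % (normal, packing))
```

Treat the packing figure as a floor when sizing the machine, since real usage depends on the objects stored.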
ALTERNATIVES TO DATA.FS It seems like the following alternatives to DATA.FS in its default configuration are available:
***Distribution*** This option would involve separating the server that stores the .FS file from the one(s) running Zope. You would do this with a ZeoStorageServer.
Yes. You won't regret using ZEO.
A variation on this theme would be to use NAS/NFS to put the data on a separate server.
Using FileStorage over NFS is dangerous. Also, it doesn't solve the main problem; you still need all that RAM in your server machine.
***ExternalMount*** Here you would use the ExternalMount product to store the data for selected portions of ZODB (e.g., for a specific Product) in a separate .FS file, either on the same or a separate server. Presumably this option would mitigate performance issues resulting purely from the size of Data.FS.
That might get you under a filesystem/OS 2G limit, but that's about it.
***BerkeleyDB or Oracle*** Oracle or Berkeley DB can be used as the storage mechanism instead of .FS. But in doing so do you lose Zope functionality?
No, there is no loss of Zope functionality. The main cost is extra administration overhead. If you are already an Oracle or BerkeleyDB guru, I suggest going with this. All of these storage options have one common scalability limitation: they need RAM proportional to database size during packing. I am currently working on a new storage which doesn't, but it's not yet at production quality: http://dirstorage.sourceforge.net/