ZODB vs. Gadfly vs. ???
Hi,

First, thanks to everyone that has helped me along so far! It is *tremendously* appreciated! You can see what I've managed to hack together thus far at http://www.complete.org:8080/ACLUG/events

Now I need some advice rather than pointers on syntax (though I'm still learning that, too). Here's my situation. I want to put up information about some things (they happen to be single-time events, but that's not terribly relevant). Kevin Dangoor and others suggested that I make an Event ZClass, then give it an index_html or whatever to fetch the events. Pretty slick, and it was easier to implement than a gadfly thing, with which I had been failing due to not being aware of _.DateTime().

However, I am running into some problems. I have only a couple dozen Event objects in my eventsDb folder, but already there is a noticeable performance hit. It takes the server about 3-4 seconds to render the above page in calendar mode -- which is completely out of line on this server, which is a cream-of-the-crop 600MHz Alpha. This is, no doubt, due to inefficiency. To search for events on a given date, I have to iterate through *all* the objects in there, inspecting dates. This must be done for each day in a given month to determine whether there is an event on that day (for displaying on a calendar square). Ick. I'm using the Calendar product from the site, incidentally.

So:

* Am I missing out on some whiz-bang way to do searches through a directory full of custom objects?
* Should I be using Gadfly instead? Would it be faster? Why is ZODB so slow?
* In essence, because of the ZODB architecture or my own ignorance of how to do it better, I'm getting performance of less than 10 queries per second. This is not acceptable.

Also, I am having performance worries. If the server chokes this fast with only a couple dozen items, I am concerned. This server is normally capable of dishing out many thousands of documents a second, and even figuring worst-case here, (24 * 30), it's getting only 720 (and those aren't even complete documents, just lookups). Can someone help ease my mind on this one?

Many thanks,
John Goerzen

--
John Goerzen   Linux, Unix consulting & programming   jgoerzen@complete.org |
Developer, Debian GNU/Linux (Free powerful OS upgrade)       www.debian.org |
----------------------------------------------------------------------------+
The 49,581,309th prime number is 973,777,817.
* In essence, because of the ZODB architecture or my own ignorance of how to do it better, I'm getting performance of less than 10 queries per second. This is not acceptable.
Also, I am having performance worries. If the server chokes this fast with only a couple dozen items, I am concerned. This server is normally capable of dishing out many thousands of documents a second, and even figuring worst-case here, (24 * 30), it's getting only 720 (and those aren't even complete documents, just lookups). Can someone help ease my mind on this one?
I haven't used the Calendar product yet, but whenever speed issues like this crop up on the MySQL list (e.g. "MS Access takes 0.1 secs, MySQL takes > 2 minutes") it is invariably due to one thing alone: indexes (or the lack of them). So, shooting in the dark here: check the indexes in your ZCatalog instance.

chas
John Goerzen wrote:
Hi,
First, thanks to everyone that has helped me along so far! It is *tremendously* appreciated! You can see what I've managed to hack together thus far at http://www.complete.org:8080/ACLUG/events
Now I need some advice rather than pointers on syntax (though I'm still learning that, too). Here's my situation. I want to put up information about some things (they happen to be single-time events, but that's not terribly relevant). Kevin Dangoor and others suggested that I make an Event ZClass, then give it an index_html or whatever to fetch the events. Pretty slick, and it was easier to implement than a gadfly thing, with which I had been failing due to not being aware of _.DateTime().
However, I am running into some problems. I have only a couple dozen Event objects in my eventsDb folder, but already there is a noticeable performance hit. It takes the server about 3-4 seconds to render the above page in calendar mode -- which is completely out of line on this server, which is a cream-of-the-crop 600MHz Alpha. This is, no doubt, due to inefficiency. To search for events on a given date, I have to iterate through *all* the objects in there, inspecting dates. This must be done for each day in a given month to determine whether there is an event on that day (for displaying on a calendar square). Ick.
So you iterate over each object 28-31 times? Then your problem is purely algorithmic. It would be better to catalog the date attributes of each object and then ask the Catalog:

<dtml-in "Catalog.searchResults({'date_property' : [ZopeTime(), (ZopeTime() + 1)], 'date_property_usage' : 'range:min:max'})">

This will return all objects with a 'date_property' whose date is between now (ZopeTime()) and now + 24 hours (ZopeTime() + 1). This may not be exactly what you want, but you get the idea. It will happen very, very fast, and will scale to thousands of events.

One of our customers uses the Catalog to search over 10,000 objects in the blink of an eye. In fact, I think he uses the Catalog and Calendar together in a way like this. Jason?

In addition, you can teach your ZClass instances to automatically catalog and uncatalog themselves without management intervention.
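To make the algorithmic point concrete, here is a minimal sketch in plain Python, not Zope code. The events list and both function names are made up for illustration. It counts how many event inspections a month view costs when every event is scanned for every day, versus building a date lookup once and consulting it per day.

```python
import calendar
from collections import defaultdict
from datetime import date

# Hypothetical stand-ins for Event objects: (title, date) pairs.
events = [("Meeting %d" % i, date(2000, 3, 1 + (i * 5) % 31)) for i in range(24)]

def calendar_by_scanning(events, year, month):
    """Inspect every event once for each day of the month."""
    inspections = 0
    days = {}
    ndays = calendar.monthrange(year, month)[1]
    for day in range(1, ndays + 1):
        target = date(year, month, day)
        hits = []
        for title, when in events:
            inspections += 1
            if when == target:
                hits.append(title)
        days[day] = hits
    return days, inspections

def calendar_by_index(events, year, month):
    """Build a date -> titles lookup in one pass, then consult it per day."""
    inspections = 0
    index = defaultdict(list)
    for title, when in events:
        inspections += 1
        index[when].append(title)
    ndays = calendar.monthrange(year, month)[1]
    days = {day: index.get(date(year, month, day), [])
            for day in range(1, ndays + 1)}
    return days, inspections

scanned, scan_cost = calendar_by_scanning(events, 2000, 3)
indexed, index_cost = calendar_by_index(events, 2000, 3)
print(scan_cost, index_cost, scanned == indexed)
```

With 24 events and a 31-day month, the scanning version inspects 24 * 31 = 744 events and the indexed version inspects 24, while both produce the same calendar. The gap widens as the number of events grows.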
* Am I missing out on some whiz-bang way to do searches through a directory full of custom objects?
Yes, the Catalog. Imagine searching through millions of Oracle records iteratively. The Catalog uses indexes, just like relational databases do, to greatly speed up searches.
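As a rough illustration of why an index helps with date-range searches, here is a toy sorted index in plain Python. It is not how the Catalog is implemented; the DateIndex class, its method names, and the "event-N" ids are all hypothetical. A range query becomes two binary searches instead of a pass over every object.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime, timedelta

class DateIndex:
    """Toy sorted index: timestamps kept in order, with parallel object ids."""

    def __init__(self):
        self._keys = []  # sorted timestamps
        self._ids = []   # object id stored at the same position as its key

    def catalog(self, when, object_id):
        # Find the insertion point by binary search so the keys stay sorted.
        i = bisect_right(self._keys, when)
        self._keys.insert(i, when)
        self._ids.insert(i, object_id)

    def search_range(self, start, end):
        """Return ids whose timestamp satisfies start <= timestamp <= end."""
        lo = bisect_left(self._keys, start)
        hi = bisect_right(self._keys, end)
        return self._ids[lo:hi]

# Catalog 10,000 hypothetical events, one per hour starting 2000-03-01.
index = DateIndex()
base = datetime(2000, 3, 1)
for n in range(10000):
    index.catalog(base + timedelta(hours=n), "event-%d" % n)

# Roughly the "between now and now + 1 day" query from the DTML example.
today = datetime(2000, 3, 15)
hits = index.search_range(today, today + timedelta(days=1))
print(len(hits), hits[0], hits[-1])
```

The query locates both ends of the range in about log2(10000), or roughly 14, comparisons each, no matter how many events are cataloged. The cost of keeping the index sorted is paid once, when an object is cataloged.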
* Should I be using Gadfly instead? Would it be faster? Why is ZODB so slow?
ZODB isn't slow. It may be slowER than <insert your favorite database here>, but then again, it might be faster. Gadfly might be faster, it might not. From what I understand, Gadfly keeps all or part of its data in memory at all times and never writes to disk; if this is the case then it will be faster, until you run out of memory (I could be wrong, I haven't used Gadfly in a bit).
* In essence, because of the ZODB architecture or my own ignorance of how to do it better, I'm getting performance of less than 10 queries per second. This is not acceptable.
Also, I am having performance worries. If the server chokes this fast with only a couple dozen items, I am concerned. This server is normally capable of dishing out many thousands of documents a second, and even figuring worst-case here, (24 * 30), it's getting only 720 (and those aren't even complete documents, just lookups). Can someone help ease my mind on this one?
Your server is choking because of your choice of algorithm. This is unrelated to the fact that Zope cannot serve up information as fast as a static web server like Apache. That much is obvious when you take into account that Apache is a C program that serves static files, with gobs of optimizations to make that operation fast. There is no script evaluation, acquisition, advanced security model, etc. Zope is written in a higher-level language and serves up dynamic content, so it must go through a code path on every request that varies from simple to complex.

I run Zope on a nifty little P75 with 32MB of RAM and it works fairly dandy. Bruce Perens runs his popular technocrat website (http://www.technocrat.net) on a humble P120. The technocrat site survived a full-on Slashdot effect over the course of 24 hours with flying colors.

We are addressing performance, but I don't think Zope has anything like the kind of problem you are experiencing. Granted, there are many areas we can improve through cleaner design and conversion to C code, but none of these optimizations would help your case. I would suggest looking into the Catalog.

-Michel
Many thanks,
John Goerzen
participants (3): chas, John Goerzen, Michel Pelletier