Hello Zopistas out there!

I just stumbled upon the threadsafety of different SQL-DAs. I'm coming from mod_perl + MySQL so it seemed a natural choice for me to also run Zope with MySQL in the background. I'm doing research (= low budget) in clinical psychology and have people fill in all kinds of questionnaires regularly which I have to store somewhere later. Looking at some past messages it seems to me that free and speedy MySQL is awfully bottlenecked by MySQLdb and financially-out-of-range Oracle rules. But how big is the difference? Does anybody know?

It also came to my mind that I might be missing an important feature of Zope: ZODB. For some reason (lack of marketing?) I just didn't realize that there is a database behind Zope that I could use to store my stuff. Last night I read some documentation but I have some questions left open:

Can I access ZODB via TCP/IP socket to retrieve data for statistical calculations?

What about performance?

Ragnar Beer
At 9:01 am +0100 15/3/00, Ragnar Beer wrote:
Hello Zopistas out there!
I just stumbled upon the threadsafety of different SQL-DAs. I'm coming from mod_perl + MySQL so it seemed a natural choice for me to also run Zope with MySQL in the background. I'm doing research (= low budget) in clinical psychology and have people fill in all kinds of questionnaires regularly which I have to store somewhere later. Looking at some past messages it seems to me that free and speedy MySQL is awfully bottlenecked by MySQLdb and financially-out-of-range Oracle rules. But how big is the difference? Does anybody know?
I could be talking out of a hole in the top of my head here, but here goes:

Get Andy Dustman's MySQLdb module (http://www.zope.org/Members/adustman/MySQLdb). It's designed to be thread-safe. I've always taken that to mean that if a Zope request does a ZSQL query that uses this module, and another ZSQL query comes along (in a different Zope thread), then the second query is *not* blocked until the first one is finished, but runs alongside it.

If I'm wrong on this I would *really* appreciate a correction here.

The Digital Creations people aren't overly keen on MySQL because it doesn't support transactions, which the Zope system uses extensively (I only use it for 'Undo'). For the sort of queries I have (many, many SELECTs, few INSERTs), that isn't a major problem for me.
It also came to my mind that I might be missing an important feature of Zope: ZODB. For some reason (lack of marketing?) I just didn't realize that there is a database behind Zope that I could use to store my stuff. Last night I read some documentation but I have some questions left open:
Can I access ZODB via TCP/IP socket to retrieve data for statistical calculations?
I'd reckon so. Look into XML-RPC as an example. Of course you can just construct URLs that obtain the data you need (e.g. have a folder with lots of properties of the form prop_n and get the values as http://server/path/to/folder/prop_125) and perhaps use something like urllib (a Python library) to do some machinations at the client end.
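The URL approach could be sketched roughly as follows in Python. The base URL and the prop_n naming come from the example above; the helper names, and the assumption that the server answers each property URL with a plain-text value, are hypothetical.

```python
from urllib.request import urlopen

# Base URL taken from the example in the text; replace it with a real folder URL.
BASE_URL = "http://server/path/to/folder"


def property_url(index, base_url=BASE_URL):
    """Build the URL for the folder property named prop_<index>."""
    return f"{base_url.rstrip('/')}/prop_{index}"


def fetch_property(index, base_url=BASE_URL, timeout=10):
    """Fetch one property value over HTTP (assumes a plain-text response)."""
    with urlopen(property_url(index, base_url), timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


def fetch_properties(indices, base_url=BASE_URL):
    """Fetch several properties and return them as {index: value}."""
    return {i: fetch_property(i, base_url) for i in indices}
```

Only property_url can be tried without a running server; fetch_properties(range(1, 126)) would issue one HTTP request per property, which is simple but not efficient for large data sets.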
What about performance?
People seem to reckon ZODB is fast enough. I think it's Python limited (but again, I could be wrong here). That's possibly where the pystone benchmark comes in; check these out:

180MHz 603e (Linux) = 850 (my home Zope system)
300MHz UltraSparc (Solaris) = 2800 (our production machine! :(
266MHz G3 (MacOS) = 4600
500MHz PIII (SUSE Linux) = 5400
700MHz PIII (Win32) = 10150!! (this is really annoying, this box cost 1100 quid - the same as the yearly maintenance on our Sun iron :(

I have no idea why those last two numbers are so disparate for the want of 200MHz. We'll be putting Linux onto the 700MHz box to see what happens there...

hth
tone
------
Dr Tony McDonald, FMCC, Networked Learning Environments Project
http://nle.ncl.ac.uk/
The Medical School, Newcastle University
Tel: +44 191 222 5888
Fingerprint: 3450 876D FA41 B926 D3DD F8C3 F2D0 C3B9 8B38 18A2
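One way to compare the pystone figures above is to divide each score by the clock speed. The sketch below uses only the numbers quoted in the message; the per-MHz ratio is an illustrative normalisation, not part of pystone, and it ignores differences in Python build, compiler and operating system.

```python
# Figures quoted in the message: (clock speed in MHz, pystone score).
RESULTS = {
    "180MHz 603e (Linux)": (180, 850),
    "300MHz UltraSparc (Solaris)": (300, 2800),
    "266MHz G3 (MacOS)": (266, 4600),
    "500MHz PIII (SUSE Linux)": (500, 5400),
    "700MHz PIII (Win32)": (700, 10150),
}


def pystones_per_mhz(results):
    """Return {machine: pystone score divided by clock speed in MHz}."""
    return {name: score / mhz for name, (mhz, score) in results.items()}


for name, ratio in pystones_per_mhz(RESULTS).items():
    print(f"{name:30s} {ratio:5.1f} pystones/MHz")
```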
At 9:01 am +0100 15/3/00, Ragnar Beer wrote:
Hello Zopistas out there!
I just stumbled upon the threadsafety of different SQL-DAs. I'm coming from mod_perl + MySQL so it seemed a natural choice for me to also run Zope with MySQL in the background. I'm doing research (= low budget) in clinical psychology and have people fill in all kinds of questionnaires regularly which I have to store somewhere later. Looking at some past messages it seems to me that free and speedy MySQL is awfully bottlenecked by MySQLdb and financially-out-of-range Oracle rules. But how big is the difference? Does anybody know?
I could be talking out of a hole in the top of my head here, but here goes:
Get Andy Dustman's MySQLdb module (http://www.zope.org/Members/adustman/MySQLdb). It's designed to be thread-safe. I've always taken that to mean that if a Zope request does a ZSQL query that uses this module, and another ZSQL query comes along (in a different Zope thread), then the second query is *not* blocked until the first one is finished, but runs alongside it.
If I'm wrong on this I would *really* appreciate a correction here.
The Python DB API spec 2.0 talks about different levels of thread safety. MySQLdb has threadsafety level 1 (= threads may share the module but not connections), the Oracle DA has threadsafety level 3 (= threads may share the module, connections and cursors). I wonder what that means in % performance gain in high-traffic situations. I have no idea where the different levels of threadsafety kick in. Any experienced users?
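For reference, the DB API 2.0 spec defines four values for the module-level `threadsafety` attribute. A small sketch that reads it from any compliant module; the helper name is hypothetical, and sqlite3 from the standard library is used only as a stand-in because it is always available (the same call should work on MySQLdb or an Oracle module).

```python
import sqlite3  # stand-in: any DB API 2.0 module exposes `threadsafety`

# Meanings of the threadsafety levels as defined by the DB API 2.0 spec.
THREADSAFETY_MEANING = {
    0: "threads may not share the module",
    1: "threads may share the module, but not connections",
    2: "threads may share the module and connections",
    3: "threads may share the module, connections and cursors",
}


def describe_threadsafety(module):
    """Return (level, meaning) for a DB API 2.0 module."""
    level = getattr(module, "threadsafety", None)
    return level, THREADSAFETY_MEANING.get(level, "unknown or not declared")


print(describe_threadsafety(sqlite3))
```

The level only says what may be shared safely. It says nothing about whether queries in different threads actually run in parallel, which depends on how the driver handles blocking calls.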
Can I access ZODB via TCP/IP socket to retrieve data for statistical calculations?
I'd reckon so. Look into XML-RPC as an example. Of course you can just construct URLs that obtain the data you need (e.g. have a folder with lots of properties of the form prop_n and get the values as http://server/path/to/folder/prop_125) and perhaps use something like urllib (a Python library) to do some machinations at the client end.
I was rather thinking of connecting directly to ZODB (in a kind of server mode) via TCP/IP - not via Zope. But I guess that's an aberrant thought that would rather be applicable to connecting to ZEO. Still I feel like I'm only beginning to understand, but in about two weeks the new production site must be ready running Zope ;)

Ragnar
On Wed, 15 Mar 2000, Tony McDonald wrote:
Get Andy Dustman's MySQLdb module (http://www.zope.org/Members/adustman/MySQLdb). It's designed to be thread-safe. I've always taken that to mean that if a Zope request does a ZSQL query that uses this module, and another ZSQL query comes along (in a different Zope thread), then the second query is *not* blocked until the first one is finished, but runs alongside it.
If I'm wrong on this I would *really* appreciate a correction here.
I think that's probably wrong. The thread-safety-ness of MySQLdb is that two threads can share the module. You can't safely share connections, but this is more of a MySQL limitation than a MySQLdb limitation. Of course, you can share a connection if you wrap a mutex around it so that only one thread uses it at a time, but it is far better to have multiple connections: each connection is a separate MySQL thread on the server side.

The old MySQLmodule was probably thread-safe, but not thread-friendly: if you ran a blocking operation (like a query), it would block all the threads (i.e. it did not give up the Python interpreter lock during blocking calls). This apparently causes really bad performance with Zope.

I may be deficient on Zope zen, but it seems like you would definitely want to have several connection objects around, perhaps one for each ZSQL Method, so that these can all work in parallel. From a security perspective (i.e. you aren't just paranoid, you know they are out to get you), this may make sense, as you can set up separate users in MySQL with very specific privileges, so each connection object can only do one thing and can't be hijacked for something else.

--
andy dustman | programmer/analyst | comstar.net, inc.
telephone: 770.485.6025 / 706.549.7689 | icq: 32922760 | pgp: 0xc72f3f1d
"Therefore, sweet knights, if you may doubt your strength or courage, come no further, for death awaits you all, with nasty, big, pointy teeth!"
On Wed, 15 Mar 2000, Ragnar Beer wrote:
Hello Zopistas out there!
It also came to my mind that I might be missing an important feature of Zope: ZODB. For some reason (lack of marketing?) I just didn't realize that there is a database behind Zope that I could use to store my stuff. Last night I read some documentation but I have some questions left open:
Can I access ZODB via TCP/IP socket to retrieve data for statistical calculations?
If you are not planning on storing thousands of forms and having them accessed by many users per second, then ZODB is really a good solution and it is *easy* to use. For example (in the simplest case) you can design a ZClass that contains all the required fields and a form for the clients to fill. A method can instantiate the ZClass in a folder and add the parameters provided by the customer, and that's about it in terms of simple storage. You can have a method that returns the data in some convenient format as a text file which you can then input into your favorite stat package. Or you can get fancier and build an xmlrpc method in your stat package if it supports xmlrpc (i.e. if your stat package is a set of perl programs) and get the data directly into your package, or get even fancier and incorporate your statistical analysis routines inside your Zope product, but I am getting carried away here :-) (and yes it is doable, it has been done, and if you decide to take this route I have some PCA routines in python you might want)

Pavlos
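The store-then-export idea above can be sketched with a plain Python class rather than a ZClass. Everything here is illustrative: QuestionnaireResponse, export_csv and the item names are hypothetical, and the objects live only in memory (inside Zope they would be persistent objects stored in a folder).

```python
import csv
import io
from dataclasses import dataclass, field


@dataclass
class QuestionnaireResponse:
    """One filled-in questionnaire: who answered, and the answer per item."""
    participant_id: str
    answers: dict = field(default_factory=dict)


def export_csv(responses, item_names):
    """Return all responses as CSV text, one row per participant."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["participant_id", *item_names])
    for response in responses:
        # Unanswered items are written as empty fields.
        row = [response.answers.get(name, "") for name in item_names]
        writer.writerow([response.participant_id, *row])
    return buffer.getvalue()


responses = [
    QuestionnaireResponse("p01", {"q1": 3, "q2": 5}),
    QuestionnaireResponse("p02", {"q1": 1}),
]
print(export_csv(responses, ["q1", "q2"]))
```

CSV is chosen here only because most statistics packages can import it; any other text format would work the same way.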
On Wed, 15 Mar 2000, Ragnar Beer wrote:
Hello Zopistas out there!
It also came to my mind that I might be missing an important feature of Zope: ZODB. For some reason (lack of marketing?) I just didn't realize that there is a database behind Zope that I could use to store my stuff. Last night I read some documentation but I have some questions left open:
Can I access ZODB via TCP/IP socket to retrieve data for statistical calculations?
If you are not planning on storing thousands of forms and having them accessed by many users per second, then ZODB is really a good solution and it is *easy* to use. For example (in the simplest case) you can design a ZClass that contains all the required fields and a form for the clients to fill. A method can instantiate the ZClass in a folder and add the parameters provided by the customer, and that's about it in terms of simple storage. You can have a method that returns the data in some convenient format as a text file which you can then input into your favorite stat package. Or you can get fancier and build an xmlrpc method in your stat package if it supports xmlrpc (i.e. if your stat package is a set of perl programs) and get the data directly into your package, or get even fancier and incorporate your statistical analysis routines inside your Zope product, but I am getting carried away here :-) (and yes it is doable, it has been done, and if you decide to take this route I have some PCA routines in python you might want)
Actually I'm not planning anything except to be prepared. I'm only starting and have no idea what I will end up with. So of course I am looking for a scalable solution. From what I've learned from the list, scalability seems to be a weak spot of Zope (while e.g. the latest Enhydra includes a failover solution). I have no idea how much ZEO costs. (People from DC, could you help here? I couldn't find a price on the website.)

I tried to understand ZClasses and read the documentation a couple of times but it didn't help a lot. So now I'm using Python classes, which is very easy (maybe only because it's better documented). Can you e.g. simply add a dictionary or a list of dictionaries to a ZClass and access each element individually?

Would be nice if I could get back to you some time later to talk about the PCA. :)

Ragnar
Ragnar Beer wrote:
Hello Zopistas out there!
I just stumbled upon the threadsafety of different SQL-DAs. I'm coming from mod_perl + MySQL so it seemed a natural choice for me to also run Zope with MySQL in the background. I'm doing research (= low budget) in clinical psychology and have people fill in all kinds of questionnaires regularly which I have to store somewhere later. Looking at some past messages it seems to me that free and speedy MySQL is awfully bottlenecked by MySQLdb and financially-out-of-range Oracle rules. But how big is the difference? Does anybody know?
I'm not certain if the MySQL DA is level 3 (threaded) or not. I am certain that MySQL does not support transactions and Zope does. What this means is, if you call some SQL and later on an error occurs, Zope will "roll back the transaction" and discard any changes you made in Zope, but MySQL cannot undo the SQL that has already run. If Zope is used with Oracle, Sybase, ODBC, PostgreSQL or any other database that supports transactions, then those databases will synchronize transaction commits and rollbacks with Zope. This means that your data is always consistent across databases, which is a good thing. This may not affect you. Your database may be entirely read only, in which case it's not a problem at all. Just keep in mind that you are missing an important piece.
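What a transactional database gains in this situation can be shown in a few lines. This is only an illustration of rollback-on-error, not of Zope's transaction machinery: sqlite3 stands in for a transactional database, and the table and function names are made up.

```python
import sqlite3


def insert_then_fail(conn):
    """Insert a row, then hit an error before the transaction is committed."""
    try:
        with conn:  # commits on success, rolls back if the block raises
            conn.execute("INSERT INTO answers (item, score) VALUES (?, ?)", ("q1", 3))
            raise RuntimeError("simulated error later in the request")
    except RuntimeError:
        pass


def count_rows(conn):
    """Return the number of rows currently stored in the answers table."""
    return conn.execute("SELECT COUNT(*) FROM answers").fetchone()[0]


def demo():
    """Return how many rows survive the failed request."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE answers (item TEXT, score INTEGER)")
    conn.commit()
    insert_then_fail(conn)
    remaining = count_rows(conn)
    conn.close()
    return remaining


print(demo())
```

With a database that has no transactions, the INSERT would already be permanent by the time the error occurs, so the SQL data and the rolled-back Zope data could disagree.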
It also came to my mind that I might be missing an important feature of Zope: ZODB. For some reason (lack of marketing?) I just didn't realize that there is a database behind Zope that I could use to store my stuff.
Keep in mind that this is not a _relational_ database. It's an object database.
Last night I read some documentation but I have some questions left open:
Can I access ZODB via TCP/IP socket to retrieve data for statistical calculations?
Yep. You can access the ZODB through HTTP, FTP, XML-RPC and WebDAV. That's really what Zope is, it's a network aware application on top of an object database.
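The XML-RPC route can be tried out with Python's standard library alone. In the sketch below a small local XML-RPC server stands in for Zope, and get_scores is a hypothetical method name; against a real Zope site the client part would point ServerProxy at the URL of the object that publishes the method.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def get_scores():
    """Hypothetical server-side method returning stored questionnaire scores."""
    return [
        {"participant": "p01", "score": 12},
        {"participant": "p02", "score": 18},
    ]


def demo():
    """Serve get_scores locally, fetch it over XML-RPC, and return the mean score."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(get_scores, "get_scores")
    port = server.server_address[1]  # port 0 lets the OS pick a free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        # Client side: this is the part a statistics script would contain.
        with ServerProxy(f"http://127.0.0.1:{port}/") as proxy:
            rows = proxy.get_scores()
    finally:
        server.shutdown()
        server.server_close()
    return sum(row["score"] for row in rows) / len(rows)


if __name__ == "__main__":
    print(demo())
```

The data arrives on the client as ordinary Python lists and dictionaries, so it can be passed straight to whatever does the statistics.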
What about performance?
The question is really too vague to get a specific answer.

-Michel
participants (5)
- Andy Dustman
- Michel Pelletier
- Pavlos Christoforou
- Ragnar Beer
- Tony McDonald