Interest in continued Oracle support in Zope
[Sent to zope-db as well, this version better edited] Greets, ladies and gentlemen. I worry about the future of Oracle support in Zope, and in Python, too, for that matter. So I am wondering what interest there actually is for having Oracle support in Python/zope, I have a few questions: (Oracle support = having quality Oracle connectivity from python/Zope) 1) Does anyone else out there have a business case for using Oracle with Python/Zope? 2) Will those from (1) suffer badly, ie. will switch from Python or Zope to something else, if Oracle support in Python/Zope would dissapear? 3) Will those from (1) suffer badly, ie. will switch from Python or Zope to something else, if Oracle support is not improved? (ie less crashes, better throughput, etc). 4) Of those who say yes to at least one of 1-3, which ones can invest actual cash to create and maintain good quality Oracle Support in Python/Zope? 5) Of those who say yes to at least one of 1-3, which ones can invest *time* to create and maintain good quality Oracle Support in Python/Zope? What are the sentiments of other people on this list? I myself say yes to 1-3; however, at the moment I can only invest time (somewhat limited) and no cash. Also, I seem to be one of those that really stretch the limits on DCOracle2 - we use it super-extensivly for thousends of users, and seem to encounter all kinds of problems with it, ranging from lockups to connectivity loss. So I am interested in finding out if there actually are people using Oracle with Zope/Python and what their thoughts are. Thanks, /dario -- -- ------------------------------------------------------------------- Dario Lopez-Kästen, IT Systems & Services Chalmers University of Tech.
On Thu, 26 Aug 2004 08:10:28 +0200, Dario Lopez-Kästen <dario@ita.chalmers.se> wrote:
1) Does anyone else out there have a business case for using Oracle with Python/Zope?
It's an absolute requirement for us. We use it every day, from a large installation of ZEO clients. It's used for both end-user things, and for the session database (via SQLSession). I don't recall having many problems with it, and none at all recently.
2) Will those from (1) suffer badly, ie. will switch from Python or Zope to something else, if Oracle support in Python/Zope would dissapear?
What do you mean "disappear"? In our case, we're not going to rewrite the applications in something other than Zope, there's just no way to justify the expense. It's all open source, so I can't see how it could ever "disappear". If a new version of Zope came out that in some way made it impossible to access Oracle, we'd just stick with the version that works.
3) Will those from (1) suffer badly, ie. will switch from Python or Zope to something else, if Oracle support is not improved? (ie less crashes, better throughput, etc).
We don't have crashes. The throughput isn't an issue for us, it's proven to be fine.
4) Of those who say yes to at least one of 1-3, which ones can invest actual cash to create and maintain good quality Oracle Support in Python/Zope?
In the past we've spent money on various bits of Zope (via ZC) to improve database connectivity. I don't see any need to do so at the moment, as the Oracle interface isn't broken for us.
5) Of those who say yes to at least one of 1-3, which ones can invest *time* to create and maintain good quality Oracle Support in Python/Zope?
As I said - if Oracle was broken at some point, I'd fix it, in the same way that I fixed broken ZClasses in 2.6 (ObChrisWithersNote: Yes, yes, I know. Shut up.) That, to me, is part of the *reason* we use open source for critical infrastructure - we're not forced to switch to SQL Server because our vendor got bought by Microsoft, for instance.
Also, I seem to be one of those that really stretch the limits on DCOracle2 - we use it super-extensivly for thousends of users, and seem to encounter all kinds of problems with it, ranging from lockups to connectivity loss.
We don't see this. We run on Solaris, talking to Oracle 8.1.7, with a couple of dozen ZEO clients across the entire installation - they all have at least one connection to an Oracle database - in some cases, they have several different connections (for instance, SQLSession works far better if you give it it's own DB connection). The primary cluster has something like 9 ZEO clients, and it gets hammered 24x7. I cannot recall the last time we had problems with the Zope/Oracle interaction, it's at least a couple of years ago. This has been in use for something like 5+ years now. Anthony
Anthony Baxter wrote:
We don't see this. We run on Solaris, talking to Oracle 8.1.7, with a couple of dozen ZEO clients across the entire installation - they all have at least one connection to an Oracle database - in some cases, they have several different connections (for instance, SQLSession works far better if you give it it's own DB connection). The primary cluster has something like 9 ZEO clients, and it gets hammered 24x7. I cannot recall the last time we had problems with the Zope/Oracle interaction, it's at least a couple of years ago. This has been in use for something like 5+ years now.
this is certainly encouraging, in terms of opening the possibility of errors on our behalf that we can fix. We had *terrible* performace with Solaris in general with python and Zope so we reluctantly abondened Solaris about 2 years ago. This was probably due to not fully understanding the implications of the GIL at the time. On Linux we have had problems when running long, slow queries, and having several instances of DCOracel DA. We have expericend loss of connectivity and having 3 of 4 of zope's threads being completly non-responsive, even when the site is idle. -- -- ------------------------------------------------------------------- Dario Lopez-Kästen, IT Systems & Services Chalmers University of Tech.
Dario wrote:
...We have expericend loss of connectivity and having 3 of 4 of zope's threads being completly non-responsive, even when the site is idle.
gah... to quick with <ctrl>-<return> :-P These problems are on non-zeo sites. We have just deployed our heaviest site (oru student portal) on ZEO with directory storage. We are just now entering a period of heavy activity with, I believe 5-20 oracle queries for each request, so it will be interesting to see what impact ZEO has on performance of ZSQL stuff. Have you setup your zope's in any particualr way (ie more ZODB threads or so)? /dario -- -- ------------------------------------------------------------------- Dario Lopez-Kästen, IT Systems & Services Chalmers University of Tech.
On Thu, 26 Aug 2004 16:34:50 +0200, Dario Lopez-Kästen <dario@ita.chalmers.se> wrote:
Have you setup your zope's in any particualr way (ie more ZODB threads or so)?
No. Just the default - we can run one ZEO client for each CPU in the machine that's not tasked for something else. They all back onto a single ZEO server (using standard Data.fs). No user data is stored in the ZODB, only the website logic and pretty crap (like the silly amount of branding). The ZEO clients all talk fastcgi via pythondirector to a pile of machines running Apache. They, in turn are behind a pair of cisco LocalDirectors. Internet - LDs - Apache boxes - pythondirector - ZEO client cluster - ZEO server. There's a lot more things going on as well, but the other details are not relevant to this case.
Anthony Baxter wrote:
On Thu, 26 Aug 2004 16:34:50 +0200, Dario Lopez-Kästen <dario@ita.chalmers.se> wrote:
Have you setup your zope's in any particualr way (ie more ZODB threads or so)?
No. Just the default - we can run one ZEO client for each CPU in the machine that's not tasked for something else. They all back onto a single ZEO server (using standard Data.fs). No user data is stored in the ZODB, only the website logic and pretty crap (like the silly amount of branding).
The ZEO clients all talk fastcgi via pythondirector to a pile of machines running Apache. They, in turn are behind a pair of cisco LocalDirectors.
aha... may I ask if this means that the zope-clients then only run with one thread each? I have always assumed that running fastcgi or pcgi meant slower response because only one thread was used, and that running with ZServer behind rewriterules was better because then you get multithreaded? If this is so, then perhaps that may the difference - we do not use cgi, only zserver with rewrite rules. If the assumptions expressed above are correct, then perhaps it partly boils down to a threading issue? /dario -- -- ------------------------------------------------------------------- Dario Lopez-Kästen, IT Systems & Services Chalmers University of Tech.
On Tue, 31 Aug 2004 12:56:15 +0200, Dario Lopez-Kästen <dario@ita.chalmers.se> wrote:
aha... may I ask if this means that the zope-clients then only run with one thread each?
I have always assumed that running fastcgi or pcgi meant slower response because only one thread was used, and that running with ZServer behind rewriterules was better because then you get multithreaded?
Don't know about pcgi, but you're completely wrong about fastcgi. It does the multithreaded thing fine.
If this is so, then perhaps that may the difference - we do not use cgi, only zserver with rewrite rules. If the assumptions expressed above are correct, then perhaps it partly boils down to a threading issue?
Nope. To be honest, I'd suspect Oracle-on-Linux. We ran with that for a while, several years ago, but gave it up because it was too fragile. AFAIK, Solaris is still one of Oracle's primary platforms. It's possible, of course, that Oracle have fixed Oracle on Linux up in the meantime. Anthony
Anthony Baxter wrote:
Nope. To be honest, I'd suspect Oracle-on-Linux. We ran with that for a while, several years ago, but gave it up because it was too fragile. AFAIK, Solaris is still one of Oracle's primary platforms. It's possible, of course, that Oracle have fixed Oracle on Linux up in the meantime.
hm.. maybe you are right... though we only use the client libs - oru oracle servers are on Solaris atm. Here is an selection of a traceback that I got a short while ago: ----- 2004-08-31T18:36:23 ERROR(200) SiteError https://XXXX/search_icon.gif/index_html Traceback (most recent call last): File "/usr/local/zope/software/zope-2.7.1/lib/python/ZPublisher/Publish.py", line 92, in publish object=request.traverse(path, validated_hook=validated_hook) File "/usr/local/zope/software/zope-2.7.1/lib/python/ZPublisher/BaseRequest.py", line 423, in traverse else: user=v(request, auth, roles) .... File "/usr/local/zope/software/zope-2.7.1/lib/python/Shared/DC/ZRDB/DA.py", line 454, in __call__ else: result=DB__.query(query, self.max_rows_) File "/usr/local/zope/instances/cdks/Products/DCOracle2/db.py", line 175, in query r=c.execute(qs) # Returns 1 on SELECT File "/usr/local/zope/instances/cdks/Products/DCOracle2/DCOracle2/DCOracle2.py", line 1006, in execute result = self._cursor.execute() DatabaseError: (3113, 'ORA-03113: end-of-file on communication channel') ------ This happens a lot on RH 3.0, DCOracle2 from CVS HEAD, Oracle 9 libraries. We do not use Solaris to host Zope-servers anymore, since we had horrible perfromance problems on Solaris and could not get around them. Dunno if this actulayy gives any hints on where to begin with... What version of DCOracle2 do you use? /dario -- -- ------------------------------------------------------------------- Dario Lopez-Kästen, IT Systems & Services Chalmers University of Tech.
Dario Lopez-Kästen wrote at 2004-8-31 18:57 +0200:
... hm.. maybe you are right... though we only use the client libs - oru oracle servers are on Solaris atm.
Here is an selection of a traceback that I got a short while ago:
----- 2004-08-31T18:36:23 ERROR(200) SiteError https://XXXX/search_icon.gif/index_html Traceback (most recent call last): ... "/usr/local/zope/instances/cdks/Products/DCOracle2/DCOracle2/DCOracle2.py", line 1006, in execute result = self._cursor.execute() DatabaseError: (3113, 'ORA-03113: end-of-file on communication channel')
This problem does not look like a Zope or DCOracle problem but like an Oracle problem. When I had to work with Oracle 3 years ago, I have been *VERY* disappointed (this was Oracle 8i). Despite its high costs (100.000 USD), it was almost unfunctional: * memory corruption inside the Oracle client libraries brought our Zope down within minutes of use This was finally fixed by an alternate open protocol. * the Oracle server died often, non-deterministically, after some amount of work we had to regularly close the connection and reopen a new one in order not to let one server do too much work * the Oracle client library caught SIGCHILD making "system" and "popen" unreliable -- this could be fixed with a special configuration option, deeply hidden in the tons of Oracle documentation * full-text reindexing after large imports often crashed non-deterministically * upgrading from one Oracle version to the next Oracle subversion was a nightmare * some combinations of relational and full text subqueries let Oracle forget about its indexes. For example: while a query "Q1 and Q2" took seconds (Oracle recognized that it had indexes) the query "Q1 or Q2" (with the same "Q1" and "Q2") took days (!) (because Oracle had forgotten about the indexes and used full table scans -- which is a bad idea with hundread of millions of records and hundreds of gigabyte of data). The Oracle support was unable to fix this problem within half a year. Oracle recommended to avoid the combination of relational and full text subqueries ! Now, I use Postgres -- and I am *MUCH* happier: I have at most 1 per cent of the problems I have had with Oracle. And this with no costs and no support (but the amount of data is smaller now and there are no full text queries (that were responsible for most of Oracles problems)). -- Dieter
Dieter Maurer wrote:
This problem does not look like a Zope or DCOracle problem but like an Oracle problem.
perhaps - I am starting to suspect Oraxcle 8/9 client problems (on linux)
When I had to work with Oracle 3 years ago, I have been *VERY* disappointed (this was Oracle 8i). Despite its high costs (100.000 USD), it was almost unfunctional:
* memory corruption inside the Oracle client libraries brought our Zope down within minutes of use
This was finally fixed by an alternate open protocol.
Can't comment on this, since I have not your knowlegde on how to identify and understand these issues (I do whish I had, though :-)
* the Oracle server died often, non-deterministically, after some amount of work
we had to regularly close the connection and reopen a new one in order not to let one server do too much work
We had this problem briefly on versions like, for instance Oracle 8.1.6. Basically from what I remember of those days, the last version of Oracle 7 (7.3 or so) was rock stable, but we were not using Oracle 7, next "stable" version was 8.0.5, which I think was the first version we had in serious production. You really wanted to avoid like the plague everything before 8.1.5 and 8.1.7. (8.1.5/8.1.7 were pretty good, and we still use a 8.1.7 in production - mostly because we have not had the opportunity to upgrade it). 8.1.6 was... let's just say it was very good. :-P
* the Oracle client library caught SIGCHILD making "system" and "popen" unreliable -- this could be fixed with a special configuration option, deeply hidden in the tons of Oracle documentation
jup remember that too :-)
* full-text reindexing after large imports often crashed non-deterministically
Never did that. We never really did consider doing this kind of stuff at all, and the tentative plans we had were to use Oracle Context, as it was called then. We were in contact with people that were sucessuflly using Context to do large scale text indexing , and who where investigating the feasability to do stuff doing genome code indexing etc, for research purposes. This was in 1998-1999...
* upgrading from one Oracle version to the next Oracle subversion was a nightmare
Wtried that once, as an experiment and I can't remember if it was from 8.1.5 to 8.1.6 or from 8.1.6 to 8.1.7. We very quickly decided that, as we suspected, it was much less work to install a new instance with the new version and then migrate the data and structures to the new instance. Just like when installing windows :-)
* some combinations of relational and full text subqueries let Oracle forget about its indexes. For example: while a query "Q1 and Q2" took seconds (Oracle recognized that it had indexes) the query "Q1 or Q2" (with the same "Q1" and "Q2") took days (!) (because Oracle had forgotten about the indexes and used full table scans -- which is a bad idea with hundread of millions of records and hundreds of gigabyte of data).
The Oracle support was unable to fix this problem within half a year. Oracle recommended to avoid the combination of relational and full text subqueries !
Well, unless you have a personal contact with Oracle support, then it basically sucks, even today - though MetaLink has become a bit more uable lately. The thing with oralce is that you have to know how to use it; just the sam as with Zope. In versions 8 and 9, that means that you have to take special care of HOW you construct your SQL. I know this from personal experience, both from personal mistakes, and from extensive work trying to fix other peoples code and SQL. I kid you not when I say that a very simple rewrite of an sql query can minimize the execution time from 5 minutes to less than 5 seconds. I am at this moment struggling with precisely such an application that is very badly desinged. in newer and latest oracles, the situation for a programmer is mich easier. Since the DB automagically calculates things like index statistics, and even (If I remember what I've been told) manages it's tablespaces automagially. All in all this means that you have much more freedom in how to write your queries than you had before. There are lots of "hidden", or at least non-obvious tricks to fully taking advantage of Oracle, but this situation is no different than using Zope. Anyways, we are not going to go away from Oracle any time soon. For one ting, we have invested knowlegde on Oracle that we cannot just throw away. Oracle provides us with services that we feel we cannot be without, such as warm backups, replication, etc, that are more or less ready to use OOTB, just to name one at random. I am not debating whether postgress is good or bad, mind. I am just expressing my need to have Zope work well in conjuntion with Oracle. And this is not only to satisfy my own current needs. I think it is good for Zope, and indirectly for all of us, if Oracle was on the list of things that Zope interacts solidly with. After all - in the very unlikely event that it would boil down to it costing too much to use Zope with Oracle, we are not going to drop Oracle... we probably would still be using Zope but not for all the more high-profile sites and applications we would be needing to create. So, I think i'll try to stick with both Oracle and Zope as long as possible :-) /dario -- -- ------------------------------------------------------------------- Dario Lopez-Kästen, IT Systems & Services Chalmers University of Tech.
On Wed, 1 Sep 2004 21:26:07 +0200, Dieter Maurer <dieter@handshake.de> wrote:
When I had to work with Oracle 3 years ago, I have been *VERY* disappointed (this was Oracle 8i). Despite its high costs (100.000 USD), it was almost unfunctional:
I agree that Oracle is extremely expensive, but there's a few key advantages: - It's designed for large systems and has very good tools for managing databases. - It's easy to find DBAs that are extremely knowledgable about Oracle. The second, for me, is key. Software developers should not also be trying to manage an RDBMS.
* memory corruption inside the Oracle client libraries brought our Zope down within minutes of use
Haven't seen this since we stopped using Oracle on Linux, and even then, it was far far less frequent than that - maybe once a month or so.
* the Oracle server died often, non-deterministically, after some amount of work
Haven't seen that with Oracle on Solaris, ever. On Linux, we'd _very_ _very_ infrequently get strange things going on - every couple of months.
* the Oracle client library caught SIGCHILD making "system" and "popen" unreliable -- this could be fixed with a special configuration option, deeply hidden in the tons of Oracle documentation
Can't see I've ever seen that.
* full-text reindexing after large imports often crashed non-deterministically
Again, haven't seen that, ever.
* upgrading from one Oracle version to the next Oracle subversion was a nightmare
See, I've found the opposite - one of the strengths of Oracle is that this stuff is _well_ understood and documented. We do upgrades, sidegrades, point fixes and the like, all the time. Well, when I say "we", I mean "our DBAs". I wouldn't let them write code, and in return I don't expect to manage the database. They're different jobs, requiring different skills.
* some combinations of relational and full text subqueries let Oracle forget about its indexes. For example: while a query "Q1 and Q2" took seconds (Oracle recognized that it had indexes) the query "Q1 or Q2" (with the same "Q1" and "Q2") took days (!) (because Oracle had forgotten about the indexes and used full table scans -- which is a bad idea with hundread of millions of records and hundreds of gigabyte of data).
Did you use the "explain plan" functionality? Was this with the rule-based or cost-based optimiser?
The Oracle support was unable to fix this problem within half a year. Oracle recommended to avoid the combination of relational and full text subqueries !
Oracle's support can be annoying, yes. Fortunately, I just let the DBAs deal with it - they know how to get the information from Oracle's TAC.
Now, I use Postgres -- and I am *MUCH* happier: I have at most 1 per cent of the problems I have had with Oracle. And this with no costs and no support (but the amount of data is smaller now and there are no full text queries (that were responsible for most of Oracles problems)).
I like postgres, but my main problem is finding DBAs with sufficient PG knowledge. This, to me, is a key thing. If the business _depends_ on the database, you really really need a high clue level in the people managing it. Having said all that, and lest I seem like a reflexive Oracle supporter, it often drives me screaming with frustration up a wall. But I have the same sorts of issues when using postgres - usually it's the query optimiser doing something stupid, and you can examine the optimiser and figure out what's going on. Usually it's just an index hint that's needed. And Oracle's approach of "Oh, that _obvious_ piece of missing functionality? You need enterprise edition for that" pisses me off, no end. EE costs an utterly utterly indefensible amount of money. Anthony
Anthony Baxter wrote at 2004-9-2 18:15 +1000:
On Wed, 1 Sep 2004 21:26:07 +0200, Dieter Maurer <dieter@handshake.de> wrote:
When I had to work with Oracle 3 years ago, I have been *VERY* disappointed (this was Oracle 8i). Despite its high costs (100.000 USD), it was almost unfunctional: ... * memory corruption inside the Oracle client libraries brought our Zope down within minutes of use
Haven't seen this since we stopped using Oracle on Linux, and even then, it was far far less frequent than that - maybe once a month or so.
Our Oracle run on Solaris. All reported problems have been for this combination.
...
* some combinations of relational and full text subqueries let Oracle forget about its indexes. For example: while a query "Q1 and Q2" took seconds (Oracle recognized that it had indexes) the query "Q1 or Q2" (with the same "Q1" and "Q2") took days (!) (because Oracle had forgotten about the indexes and used full table scans -- which is a bad idea with hundread of millions of records and hundreds of gigabyte of data).
Did you use the "explain plan" functionality? Was this with the rule-based or cost-based optimiser?
The "explain plan" revealed that Oracle used indexes for the "and" query but does full table scans for the "or" query. I know longer know the details -- it is almost three years back.
... I like postgres, but my main problem is finding DBAs with sufficient PG knowledge.
There is almost no knowledge necessary. Unlike Oracle which has hundreds or even thousands of tuning parameters, Postgres has a few dozens -- and even when the parameters are not well chosen, Postgres runs acceptably. When Oracle is not well customized, you get catastrophic behaviour. -- Dieter
Anthony Baxter wrote:
We don't see this. We run on Solaris, talking to Oracle 8.1.7, with a couple of dozen ZEO clients across the entire installation - they all have at least one connection to an Oracle database - in some cases, they have several different connections (for instance, SQLSession works far better if you give it it's own DB connection). The primary cluster has something like 9 ZEO clients, and it gets hammered 24x7. I cannot recall the last time we had problems with the Zope/Oracle interaction, it's at least a couple of years ago. This has been in use for something like 5+ years now.
Cool. Any chance you could try my branch? It fixes problems with lots of DA's that are infrequently used, but I'd like to make sure it works in fully loaded environments too :-) I KNOW it currently has some problems, but I'm looking for help to try and nail those down... cheers, Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk
participants (4)
-
Anthony Baxter -
Chris Withers -
Dario Lopez-Kästen -
Dieter Maurer