On Nov 14, 2005, at 9:21 PM, Dennis Allison wrote:
ZEO 3.4.2 Zope 2.8.4 ZODB 3.4.2 as relesed with Zope 2.8.4B Python 2.4.2 or 2.3.5 MySQL 4.0.20 MySQL-Python 1.2.0 MYSQLDA 2.0.9
We have just moved from Zope 2.7.6 to Zope 2.8.4 motivated, in part, but the ability to avoid read conflicts under ZODB 3.4.2. We have been having a lot of problems: more conflict errors, release of unreleased locks in MySQL transactions, loss of session variables, and others.
Deja vu. ;-) Except for the InnoDB tables.
We use session variables and MySQL very heavily. All database tables are now transactional (innodb) migrated from the MyISAM tables in use earlier.
The connection string, by the way, does not use either the '-' or '+' prefix but relies on a run-time test for the table to be determined to be transactional.
All our Zope instances run with ZEO and use a Temporary Store instance for session variables. The ZEO configuration is:
<filestorage 1> path $INSTANCE/var/Data.fs </filestorage>
# temporary storage has to be ZEO side %import tempstorage <temporarystorage temp> name temporary storage for sessioning </temporarystorage>
and is matched to the Zope configuration:
<zodb_db main> mount-point / # ZODB cache, in number of objects cache-size 5000 <zeoclient> server 192.168.0.92:8301 storage 1 var $INSTANCE/var # ZEO client cache, in bytes cache-size 20MB # Uncomment to have a persistent disk cache client group1-zeo </zeoclient> </zodb_db> # # ZEO temporary storage # <zodb_db temporary> <zeoclient> server 192.168.0.92:8301 storage temp name zeostorage var $INSTANCE/var </zeoclient> mount-point /temp_folder container-class Products.TemporaryFolder.TemporaryContainer </zodb_db>
Are these configurations correct?
Looks fine to me.
The snippet of log shown below show the problem that plagues us:
------ 2005-11-14T06:20:40 ERROR txn.4100 Error in tpc_abort() on manager <MultiObjectResourceAdapter for <Products.ZMySQLDA.db.DB instance at 0x42c6a52c> at 1186514860> Traceback (most recent call last): File "/usr/local/src/zope/Zope2.8/lib/python/transaction/ _transaction.py", line 462, in _cleanup rm.tpc_abort(self) File "/usr/local/src/zope/Zope2.8/lib/python/transaction/ _transaction.py", line 548, in tpc_abort self.manager.tpc_abort(txn) File "/usr/local/src/zope/Zope2.8/lib/python/Shared/DC/ZRDB/ TM.py", line 64, in abort try: self._abort() File "/opt/zope/zproducts/standard/ZMySQLDA/db.py", line 389, in _abort self._tlock.release() error: release unlocked lock ------ 2005-11-14T06:20:40 INFO ZODB conflict error at /classes/class_applications/classaccess/topic_html (658 conflicts since startup at 2005-11-11T01:26:40) ------ 2005-11-14T06:20:40 ERROR txn.4100 Failed to abort object: Surrogate oid=<object object at 0x4009d470> Traceback (most recent call last): File "/usr/local/src/zope/Zope2.8/lib/python/transaction/ _transaction.py", line 562, in abort self.manager.abort(o, txn) File "/usr/local/src/zope/Zope2.8/lib/python/Shared/DC/ZRDB/ TM.py", line 64, in abort try: self._abort() File "/opt/zope/zproducts/standard/ZMySQLDA/db.py", line 389, in _abort self._tlock.release() error: release unlocked lock ------ 2005-11-14T06:20:40 ERROR txn.4100 Failed to abort resource manager: <MultiObjectResourceAdapter for <Products.ZMySQLDA.db.DB instance at 0x42c6a52c> at 1186514860> Traceback (most recent call last): File "/usr/local/src/zope/Zope2.8/lib/python/transaction/ _transaction.py", line 489, in abort rm.abort(self) File "/usr/local/src/zope/Zope2.8/lib/python/transaction/ _transaction.py", line 562, in abort self.manager.abort(o, txn) File "/usr/local/src/zope/Zope2.8/lib/python/Shared/DC/ZRDB/ TM.py", line 64, in abort try: self._abort() File "/opt/zope/zproducts/standard/ZMySQLDA/db.py", line 389, in _abort self._tlock.release() error: release unlocked lock
which is a pattern typical of the many such entries: a log entry for an abort which raises a lock error, a report of a conflict error, a report of a failure to abort a Surrogate oid, followed by a second abort of the first instance.
Is this reproducible on demand or does it only occur under some load?
We also see frequent KeyError failures in the session variable data structure which suggests that the copy-out/modify/copy-in mechanism used to tickle the persistence mechanism does not run to completion and so the Zope persistence fail and the session variable structure is pruned or removed at some random later time. Some diagnostics are seen which report that the REQUEST['SESSION'] is occasionally removed.
You'll need to provide some error messages for interpretation here. Otherwise it's guesswork. You also likely want to solve the abort problem you explain above first, because at least the first symptom you note above could be a symptom of that (although not the last).
Questions: ~~~~~~~~~~
What is the cause of the locking problem? Does ZMySQLDA/MySQL- Python/Zope have some path which should have called begin to get a lock and did not?
No idea, sorry. I assume this worked OK under 2.7.X and the only real meaningful difference is a move to 2.8?
What needs to be done to make session variables stable?
Making transactions work properly by tracking down the "abort" symptom above is the best start just so you don't go on a wild goose chase.
Does the Temporary Storage method need to be extended to allow for rolling back a transaction?
I don't think so. TemporaryStorage is already transactional.
Any other help or advice would be appreciated.
I'm afraid that the only way this "problem" (or set of two problems, or three, or however many it turns out to be) will be solved is to for someone to be able to replicate the symptom. I don't suppose you can tell us how to do that, can you? - C