Following discussions with Kapil, Christian and Martin I've developed zope.sqlalchemy. The aim is to provide a common base for transaction integration. It does not attempt to define any particular way to handle database configuration, as there is not yet consensus on the best way to handle it.

I've uploaded it to zope svn and pypi. See http://pypi.python.org/pypi/zope.sqlalchemy

Currently it depends on a development version of SQLAlchemy. I hope to make a release following the 0.4.6 release of SQLAlchemy.

See pypi or the readme for details, but briefly usage is something like:

>>> engine = create_engine('sqlite:///')
>>> Session = scoped_session(sessionmaker(
...     bind=engine, transactional=True, autoflush=True,
...     extension=ZopeTransactionExtension()))
>>> session = Session()
>>> session.save(User(name='bob'))
>>> transaction.commit()

Any comments appreciated.

Laurence
--On 6. Mai 2008 02:26:58 +0100 Laurence Rowe <l@lrowe.co.uk> wrote:
Following discussions with Kapil, Christian and Martin I've developed zope.sqlalchemy. The aim is to provide a common base for transaction integration. It does not attempt to define any particular way to handle database configuration as there is not yet consensus on the best way to handle it.
I've uploaded it to zope svn and pypi. See http://pypi.python.org/pypi/zope.sqlalchemy
Currently it depends on a development version of SQLAlchemy. I hope to make a release following the 0.4.6 release of SQLAlchemy.
See pypi or the readme for details, but briefly usage is something like:
>>> engine = create_engine('sqlite:///')
>>> Session = scoped_session(sessionmaker(
...     bind=engine, transactional=True, autoflush=True,
...     extension=ZopeTransactionExtension()))
>>> session = Session()
>>> session.save(User(name='bob'))
>>> transaction.commit()
Any comments appreciated.
Looks great (on the paper :-)). Trying to integrate it with z3c.sqlalchemy over the weekend. Thanks Laurence. Andreas
--On 6. Mai 2008 06:01:14 +0200 Andreas Jung <lists@zopyx.com> wrote:
--On 6. Mai 2008 02:26:58 +0100 Laurence Rowe <l@lrowe.co.uk> wrote:
Following discussions with Kapil, Christian and Martin I've developed zope.sqlalchemy. The aim is to provide a common base for transaction integration. It does not attempt to define any particular way to handle database configuration as there is not yet consensus on the best way to handle it.
Looks great (on the paper :-)). Trying to integrate it with z3c.sqlalchemy over the weekend. Thanks Laurence.
I got a bit stuck with the zope.sa integration in z3c.sa. Two tests of my testsuite fail (likely related to a session.save() operation):

Error in test testXXMapperGetMapper (z3c.sqlalchemy.tests.testSQLAlchemy.WrapperTests)
Traceback (most recent call last):
  File "/opt/python-2.4.4/lib/python2.4/unittest.py", line 260, in run
    testMethod()
  File "/Users/ajung/sandboxes/z3c.sqlalchemy/lib/python/z3c/sqlalchemy/tests/testSQLAlchemy.py", line 168, in testXXMapperGetMapper
    user = session.query(User).filter_by(firstname='foo')[0]
  File "/opt/python-2.4.4/lib/python2.4/site-packages/SQLAlchemy-0.4.5-py2.4.egg/sqlalchemy/orm/query.py", line 834, in __getitem__
    return list(self[item:item+1])[0]
  File "/opt/python-2.4.4/lib/python2.4/site-packages/SQLAlchemy-0.4.5-py2.4.egg/sqlalchemy/orm/query.py", line 925, in __iter__
    self.session._autoflush()
  File "/opt/python-2.4.4/lib/python2.4/site-packages/SQLAlchemy-0.4.5-py2.4.egg/sqlalchemy/orm/session.py", line 747, in _autoflush
    self.flush()
  File "/opt/python-2.4.4/lib/python2.4/site-packages/SQLAlchemy-0.4.5-py2.4.egg/sqlalchemy/orm/session.py", line 757, in flush
    self.uow.flush(self, objects)
  File "/opt/python-2.4.4/lib/python2.4/site-packages/SQLAlchemy-0.4.5-py2.4.egg/sqlalchemy/orm/unitofwork.py", line 236, in flush
    session.extension.after_flush(session, flush_context)
  File "/opt/python-2.4.4/lib/python2.4/site-packages/zope.sqlalchemy-0.1dev_r86482-py2.4.egg/zope/sqlalchemy/datamanager.py", line 187, in after_flush
    invalidate(session)
  File "/opt/python-2.4.4/lib/python2.4/site-packages/zope.sqlalchemy-0.1dev_r86482-py2.4.egg/zope/sqlalchemy/datamanager.py", line 170, in invalidate
    assert _SESSION_STATE[id(session)] is not STATUS_READONLY
KeyError: 32500656

The integration code follows your documentation:

<http://svn.zope.org/z3c.sqlalchemy/branches/zope.sqlalchemy-integration/src/z3c/sqlalchemy/base.py?rev=86543&view=auto>

And the tests are here:

<http://svn.zope.org/z3c.sqlalchemy/branches/zope.sqlalchemy-integration/src/z3c/sqlalchemy/tests/testSQLAlchemy.py?rev=86552&view=markup>

Any idea?

Andreas
You need an svn checkout of SQLAlchemy until 0.4.6 is released.

Laurence

Andreas Jung wrote:
--On 6. Mai 2008 06:01:14 +0200 Andreas Jung <lists@zopyx.com> wrote:
--On 6. Mai 2008 02:26:58 +0100 Laurence Rowe <l@lrowe.co.uk> wrote:
Following discussions with Kapil, Christian and Martin I've developed zope.sqlalchemy. The aim is to provide a common base for transaction integration. It does not attempt to define any particular way to handle database configuration as there is not yet consensus on the best way to handle it.
Looks great (on the paper :-)). Trying to integrate it with z3c.sqlalchemy over the weekend. Thanks Laurence.
I got a bit stuck with the zope.sa integration in z3c.sa. Two tests of my testsuite fail (likely related to a session.save() operation):
Error in test testXXMapperGetMapper (z3c.sqlalchemy.tests.testSQLAlchemy.WrapperTests)
Traceback (most recent call last):
  File "/opt/python-2.4.4/lib/python2.4/unittest.py", line 260, in run
    testMethod()
  File "/Users/ajung/sandboxes/z3c.sqlalchemy/lib/python/z3c/sqlalchemy/tests/testSQLAlchemy.py", line 168, in testXXMapperGetMapper
    user = session.query(User).filter_by(firstname='foo')[0]
  File "/opt/python-2.4.4/lib/python2.4/site-packages/SQLAlchemy-0.4.5-py2.4.egg/sqlalchemy/orm/query.py", line 834, in __getitem__
    return list(self[item:item+1])[0]
  File "/opt/python-2.4.4/lib/python2.4/site-packages/SQLAlchemy-0.4.5-py2.4.egg/sqlalchemy/orm/query.py", line 925, in __iter__
    self.session._autoflush()
  File "/opt/python-2.4.4/lib/python2.4/site-packages/SQLAlchemy-0.4.5-py2.4.egg/sqlalchemy/orm/session.py", line 747, in _autoflush
    self.flush()
  File "/opt/python-2.4.4/lib/python2.4/site-packages/SQLAlchemy-0.4.5-py2.4.egg/sqlalchemy/orm/session.py", line 757, in flush
    self.uow.flush(self, objects)
  File "/opt/python-2.4.4/lib/python2.4/site-packages/SQLAlchemy-0.4.5-py2.4.egg/sqlalchemy/orm/unitofwork.py", line 236, in flush
    session.extension.after_flush(session, flush_context)
  File "/opt/python-2.4.4/lib/python2.4/site-packages/zope.sqlalchemy-0.1dev_r86482-py2.4.egg/zope/sqlalchemy/datamanager.py", line 187, in after_flush
    invalidate(session)
  File "/opt/python-2.4.4/lib/python2.4/site-packages/zope.sqlalchemy-0.1dev_r86482-py2.4.egg/zope/sqlalchemy/datamanager.py", line 170, in invalidate
    assert _SESSION_STATE[id(session)] is not STATUS_READONLY
KeyError: 32500656
The integration code follows your documentation:
And the tests are here:
Any idea?
Andreas
Hey, Laurence Rowe wrote:
See pypi or the readme for details, but briefly usage is something like:
>>> engine = create_engine('sqlite:///')
>>> Session = scoped_session(sessionmaker(
...     bind=engine, transactional=True, autoflush=True,
...     extension=ZopeTransactionExtension()))
>>> session = Session()
>>> session.save(User(name='bob'))
>>> transaction.commit()
One thing I understood from Christian Theune is that with scoped sessions, explicit session.save() is not always necessary. Since I see it being used here, could you perhaps comment on this?

This is great news, also for megrok.rdb, which we started to develop at the Grokkerdam sprint.

Regards,

Martijn
Martijn Faassen wrote:
Hey,
Laurence Rowe wrote:
See pypi or the readme for details, but briefly usage is something like:
>>> engine = create_engine('sqlite:///')
>>> Session = scoped_session(sessionmaker(
...     bind=engine, transactional=True, autoflush=True,
...     extension=ZopeTransactionExtension()))
>>> session = Session()
>>> session.save(User(name='bob'))
>>> transaction.commit()
One thing I understood from Christian Theune is that with scoped sessions, explicit session.save() is not always necessary. Since I see it being used here, could you perhaps comment on this?
Registering a mapper with Session.mapper would work with this extension, but I'm reluctant to recommend it for two reasons: I don't know how it works with the declarative plugin; and it necessarily limits mapped classes to a single Session and therefore a single engine. In a zope context I think it's quite likely that you could have the same classes mapped to different databases (i.e. two instances of a single application).
This is great news, also for megrok.rdb, which we started to develop at the Grokkerdam sprint.
I read somewhere on one of the blog planets that you had discussed container implementations. Any more information/code about this? I'm quite hopeful that Zope 2.12 will let us share much more code now that Acquisition is being tamed. Laurence
On May 6, 2008, at 12:14 PM, Laurence Rowe wrote:
Martijn Faassen wrote:
One thing I understood from Christian Theune is that with scoped sessions, explicit session.save() is not always necessary. Since I see it being used here, could you perhaps comment on this?
Registering a mapper with Session.mapper would work with this extension, but I'm reluctant to recommend it for two reasons: I don't know how it works with the declarative plugin; and it necessarily limits mapped classes to a single Session and therefore a single engine. In a zope context I think it's quite likely that you could have the same classes mapped to different databases (i.e. two instances of a single application).
hi there -

a little background on the "save_on_init" option of Session.mapper. This behavior has its roots way back in SQLAlchemy 0.1, when there was no Session or Query or anything like that, and objects, when instantiated, went directly to a thread-local registry automatically. When SQLA 0.2 came out, we introduced all the additional constructs like Session and such which are familiar today, but extensions were provided which, when enabled, would re-enable the 0.1 behavior of "everything threadlocal/automatic" in a similar way. Ultimately that's where Session.mapper comes from.

Like all typing-savers, "save on init" by then was used by dozens of Pylons users who swore by it and would scream and yell at any hint of removing this already-legacy feature. At the same time, new users who were using Pylons tutorials (and therefore save_on_init, without really knowing it) in conjunction with creating their own Session objects were baffled by error messages like "object X is already in session Y".

By the time 0.4 came out, we had started automating Session a lot more, adding autoflush capability to it. This feature immediately had issues with save_on_init for this reason:

    class MyClass(object):
        def __init__(self):
            self.some_variable = session.query(Foobar).filter(xyz).one()

Where above, the query(Foobar) would fire off autoflush, MyClass would be flushed, and then an IntegrityError would be raised since MyClass would be missing some necessary state. Changing "save_on_init" to fire off *after* __init__ was a possibility there, but then other things could break.

So I've already not liked save_on_init for a couple of years due to its inherent intrusiveness, and because SA historically does not like being in the business of providing framework features (though we have decided to stay in that arena to some degree with declarative and scoped_session).
The "Session.mapper" feature is stressed a whole lot less in the 0.4 docs, and as I work on the 0.5 docs this week I'm feeling very much like I'm going to remove it from the main documentation altogether. We're consolidating the "save/update/save_or_update" names into just "add()" and "add_all()", so explicitly adding items to a Session should be a more pleasant experience which I wouldn't want anyone to miss.

The aspect of Session.mapper which is somewhat redundant vs. declarative is that they both want to add an automatic __init__(**kwargs) method which assigns all given keyword values to the instance. They are not incompatible, because Session.mapper only adds an __init__ if none is available already.

The final feature of Session.mapper which is more reasonable is the "query" attribute. This feature allows you to say:

    MyClass.query

as an equivalent for session.query(MyClass). For that specific attribute, instead of using Session.mapper, its functionality has been exported into its own descriptor-producing method, like so:

    class MyBaseClass(object):
        query = Session.query_property()

So this is a way to get that one aspect without buying into the Session.mapper thing.
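The query_property() behaviour Michael describes can be mimicked with a plain descriptor, which may make the mechanics clearer. This is a pure-Python sketch, not SQLAlchemy's implementation; FakeSession and FakeQuery are stand-ins, not SQLAlchemy API:

```python
class FakeQuery(object):
    """Stand-in for sqlalchemy.orm.Query, for illustration only."""
    def __init__(self, cls):
        self.cls = cls

class FakeSession(object):
    """Stand-in for a (scoped) Session."""
    def query(self, cls):
        return FakeQuery(cls)

session = FakeSession()

class QueryDescriptor(object):
    """Roughly what Session.query_property() returns: a class-level
    attribute whose access produces session.query(owner_class)."""
    def __get__(self, instance, owner):
        return session.query(owner)

class MyBaseClass(object):
    query = QueryDescriptor()

class User(MyBaseClass):
    pass

q = User.query  # equivalent to session.query(User)
assert q.cls is User
```

The key point is that the descriptor receives the owning class on attribute access, so every subclass of MyBaseClass gets a query attribute bound to itself without any per-class registration.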
Hey Michael, Thanks for the input! Michael Bayer wrote: [snip]
So I've already not liked save_on_init for a couple of years due to its inherent intrusiveness, and because SA historically does not like being in the business of providing framework features (though we have decided to stay in that arena to some degree with declarative and scoped_session).
I'll try to summarize the discussion so I can find out whether I understand it. Basically you're saying you don't think save on instantiation is a good idea generally, and that we should be using session.save(). This is going to be changed to session.add() in the future. What would session.add_all() do?

This ties into the mapper feature, which also offers other features. The one feature that will remain, but in a new shape without the mapper, is the ability to do MyClass.query.

Is that a correct summary?

Regards,

Martijn
On May 6, 2008, at 5:24 PM, Martijn Faassen wrote:
Hey Michael,
Thanks for the input!
Michael Bayer wrote: [snip]
So I've already not liked save_on_init for a couple of years due to its inherent intrusiveness, and because SA historically does not like being in the business of providing framework features (though we have decided to stay in that arena to some degree with declarative and scoped_session).
I'll try to summarize the discussion so I can find out whether I understand it. Basically you're saying you don't think save on instantiation is a good idea generally, and that we should be using session.save(). This is going to be changed to session.add() in the future. What would session.add_all() do?
session.add_all() is just:

    session.add_all([obj1, obj2, obj3, ...])

Also, session.save()/update/save_or_update will remain throughout 0.5 at least.
This ties into the mapper feature, which also offers other features. The one feature that will remain but in a new shape, without the mapper, is the ability to do MyClass.query.
Is that a correct summary?
Session.mapper and save_on_init have no plans of going away in 0.5, but I plan to de-emphasize them. The "query" descriptor function is also available in 0.4.

- mike
Hi, On Tue, May 06, 2008 at 12:53:05PM -0400, Michael Bayer wrote:
On May 6, 2008, at 12:14 PM, Laurence Rowe wrote:
Martijn Faassen wrote:
One thing I understood from Christian Theune is that with scoped sessions, explicit session.save() is not always necessary. Since I see it being used here, could you perhaps comment on this?
Registering a mapper with Session.mapper would work with this extension, but I'm reluctant to recommend it for two reasons: I don't know how it works with the declarative plugin; and it necessarily limits mapped classes to a single Session and therefore a single engine. In a zope context I think it's quite likely that you could have the same classes mapped to different databases (i.e. two instances of a single application).
hi there -
a little background on the "save_on_init" option of Session.mapper. This behavior has its roots way back in SQLAlchemy 0.1, when there was no Session or Query or anything like that, and objects, when instantiated, went directly to a thread-local registry automatically. When SQLA 0.2 came out, we introduced all the additional constructs like Session and such which are familiar today, but extensions were provided which, when enabled, would re-enable the 0.1 behavior of "everything threadlocal/automatic" in a similar way. Ultimately that's where Session.mapper comes from.
That's interesting, thanks for the heads-up. For some reason I didn't hit that and was quite happy with save on init (I obviously only use one database at a time ...) but the reasons and concerns given tell me that our core transaction integration shouldn't force this onto people, and we might not want to use it for grok at all.

In fact, at the sprint we discussed the similarities and differences of RDB versus ODB and found that the step of 'add an object to the db' is actually two functions (in one gesture) in the ODB: relate object B to object A and, therefore, add it to the same database as object A is in. Whereas (due to the mapping of classes to tables) in RDB we only need to tell which database to go to. Those are similar but different gestures and I'm not sure we had a definitive result when discussing this.

Christian

-- 
gocept gmbh & co. kg - forsterstrasse 29 - 06112 halle (saale) - germany
www.gocept.com - ct@gocept.com - phone +49 345 122 9889 7 -
fax +49 345 122 9889 1 - zope and plone consulting and development
Hi there (especially Christian),

I think we can work with explicit saves. In many cases the user won't have to worry about it anyway, as the container object will do it for them (besides making the relation), or this 'query container' we spoke of will do it for them (but just the 'save' bit).

One point is that the scoped session approach itself doesn't work very well for using multiple databases in the same app. We could consider passing the session along in the containers during object graph walking (or traversal) so an app can easily traverse into multiple databases. I'm not sure whether we can make the ORM do this for us though; does it initialize the mapping with a session?

Regards,

Martijn
Martijn Faassen wrote:
Hi there (especially Christian),
I think we can work with explicit saves. In many cases the user won't have to worry about it anyway, as the container object will do it for them (besides making the relation), or this 'query container' we spoke of will do it for them (but just the 'save' bit).
One point is that the scoped session approach itself doesn't work very well for using multiple databases in the same app. We could consider passing the session along in the containers during object graph walking (or traversal) so an app can easily traverse into multiple databases. I'm not sure whether we can make the ORM do this for us though; does it initialize the mapping with a session?
Registering a ScopedSession as a utility seems a good approach. I'm experimenting with ways of registering engines as local utilities. Hopefully the combination will allow something along the lines of:
Session = scoped_session(sessionmaker(bind=LookupEngine('my-engine')...))
provideUtility(Session, IScopedSession, 'my-app')

engine = EngineUtility(url='sqlite:///')
provideUtility(engine, IConnectable, 'my-engine')  # but normally a local utility registration
The code would get a session through:
Session = getUtility(IScopedSession, 'my-app')
session = Session()
Mappers are registered with the metadata, so nothing special need be done here. Laurence
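For illustration, the LookupEngine proxy Laurence sketches could defer the engine lookup until the engine is actually used, so the binding can be decided by whichever registration is active. This is a pure-Python sketch with a plain dict standing in for the zope.component utility registry; LookupEngine, registry and DummyEngine are hypothetical names, not zope.sqlalchemy API:

```python
# A dict standing in for the zope.component utility registry.
registry = {}

class LookupEngine(object):
    """Proxy that resolves the named engine at call time, so a
    Session can be configured before any engine is registered."""
    def __init__(self, name):
        self.name = name

    def _engine(self):
        # cf. getUtility(IConnectable, self.name)
        return registry[self.name]

    def connect(self):
        return self._engine().connect()

class DummyEngine(object):
    """Stand-in for a sqlalchemy Engine."""
    def __init__(self, url):
        self.url = url

    def connect(self):
        return 'connection to %s' % self.url

proxy = LookupEngine('my-engine')          # created at import time
registry['my-engine'] = DummyEngine('sqlite:///')  # registered later
assert proxy.connect() == 'connection to sqlite:///'
```

Because the lookup happens inside connect(), re-registering a different engine under 'my-engine' redirects the same proxy without touching the session configuration.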
Hi there, Laurence Rowe wrote: [snip]
The code would get a session through:
Session = getUtility(IScopedSession, 'my-app')
session = Session()
The drawback is that this is more typing. You do a utility lookup and an instantiation as opposed to simply importing the scoped session when needed:

from myapplication import session

session.query(...)

One topic we ran into during the megrok.kss discussion is doing multiple instances of the same application, each writing to a different database. You can't hardcode your database name in your application. I think sessions as local utilities would help us solve that problem, right?

What would be nice is if we could do the 'from myapp import session' pattern and have it use the local utility infrastructure underneath somehow... Possible?

Regards,

Martijn
Martijn Faassen wrote:
Hi there,
Laurence Rowe wrote: [snip]
The code would get a session through:
Session = getUtility(IScopedSession, 'my-app')
session = Session()
The drawback is that this is more typing. You do a utility lookup and an instantiation as opposed to simply importing the scoped session when needed:
from myapplication import session
session.query(...)
One topic we ran into during the megrok.kss discussion is doing multiple instances of the same application, each writing to a different database. You can't hardcode your database name in your application. I think sessions as local utilities would help us solve that problem, right?
What would be nice is if we could do the 'from myapp import session' pattern and have it use the local utility infrastructure underneath somehow... Possible?
We'll have to stick with scoped sessions because of threading, but the engine as local utility pattern should still work.

#myapplication/__init__.py
Session = scoped_session(sessionmaker(bind=LookupEngine('my-engine')...))

engine = EngineUtility(url='sqlite:///')
provideUtility(engine, IConnectable, 'my-engine')  # but normally a local utility registration

#myapplication/foo.py
from myapplication import Session
session = Session()

One (perhaps the only) advantage I can see with looking up the scoped session as a utility is that it gives the integrator control over whether to use one- or two-phase commit, as this is set in the session configuration. Normally one would prefer one-phase commit as it is faster, but if an integrator arranged for two applications to be modified in a single transaction she would want to configure two-phase commit.

Laurence
Hey Laurence, Laurence Rowe wrote: [snip]
We'll have to stick with scoped sessions because of threading, but the engine as local utility pattern should still work.
#myapplication/__init__.py
Session = scoped_session(sessionmaker(bind=LookupEngine('my-engine')...))

engine = EngineUtility(url='sqlite:///')
provideUtility(engine, IConnectable, 'my-engine')  # but normally a local utility registration
#myapplication/foo.py
from myapplication import Session
session = Session()
Here one still needs to instantiate the session each time you use it. Couldn't you simply do:

#myapplication/__init__.py
... [what you had]
session = Session()

# myapplication/foo.py
from myapplication import session

or wouldn't that be possible?
One (perhaps the only) advantage I can see with looking up the scoped session as a utility is that it gives the integrator control over whether to use one or two phase commit, as this is set in the session configuration. Normally one would prefer one-phase commit as it is faster, but if an integrator arranged for two applications to be modified in a single transaction she would want to configure two-phase commit.
How common would it be that the integrator would do this without the code itself needing to be changed for other reasons then, too? A WSGI setup, perhaps?

I imagine we could arrange something where we allow both: provide the engine as local utility scenario, but let people register sessions as local utilities should they want to.

Regards,

Martijn
Martijn Faassen wrote:
Hey Laurence,
Laurence Rowe wrote: [snip]
We'll have to stick with scoped sessions because of threading, but the engine as local utility pattern should still work.
#myapplication/__init__.py
Session = scoped_session(sessionmaker(bind=LookupEngine('my-engine')...))

engine = EngineUtility(url='sqlite:///')
provideUtility(engine, IConnectable, 'my-engine')  # but normally a local utility registration
#myapplication/foo.py
from myapplication import Session
session = Session()
Here one still needs to instantiate the session each time you use it. Couldn't you simply do:
#myapplication/__init__.py
... [what you had]
session = Session()
# myapplication/foo.py from myapplication import session
or wouldn't that be possible?
No. It would be similar to doing:

    txn = transaction.get()

(if we imagined for a moment that transactions were recyclable objects). Individual sessions are not thread safe. The point of a scoped session is that you get a different session depending on which thread you are in.
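The threading point can be demonstrated with a minimal registry in the spirit of scoped_session. This is a pure-Python sketch, not SQLAlchemy's implementation: the same thread always gets the same session object back, while another thread gets its own.

```python
import threading

class PlainSession(object):
    """Stand-in for a SQLAlchemy Session; not thread safe itself."""
    pass

class ScopedSessionSketch(object):
    """Minimal sketch of scoped_session: one session per thread."""
    def __init__(self, factory):
        self.factory = factory
        self.registry = threading.local()

    def __call__(self):
        # Create the calling thread's session lazily on first access.
        if not hasattr(self.registry, 'session'):
            self.registry.session = self.factory()
        return self.registry.session

Session = ScopedSessionSketch(PlainSession)

# Same thread: the same session object every time.
assert Session() is Session()

# Another thread gets its own session.
other = []
t = threading.Thread(target=lambda: other.append(Session()))
t.start()
t.join()
assert other[0] is not Session()
```

This is why `from myapplication import session` cannot hand out a single pre-instantiated session: the instantiation has to happen per thread, which is exactly what calling the scoped Session() does.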
One (perhaps the only) advantage I can see with looking up the scoped session as a utility is that it gives the integrator control over whether to use one or two phase commit, as this is set in the session configuration. Normally one would prefer one-phase commit as it is faster, but if an integrator arranged for two applications to be modified in a single transaction she would want to configure two-phase commit.
How common would it be that the integrator would do this without the code itself needing to be changed for other reasons then, too? A WSGI setup, perhaps?
How long is a piece of string ;-) Elsewhere in this thread I have an imaginary departmental address book, one instance of the app per department. In this example the integrator would not have to change anything in the Address Book app. But then they want to create another app that allows them to search and replace across all address books. For this to work correctly they should reconfigure the address book app to use two-phase commit.
I imagine we could arrange something where we allow both. Provide the engine as local utility scenario, but let people register sessions as local utilities should they want to.
Maybe this should be configured somewhere else than a local utility. I wonder how Pylons does it.

Laurence
Laurence Rowe wrote:
Martijn Faassen wrote:
Hey Laurence,
Laurence Rowe wrote: [snip]
We'll have to stick with scoped sessions because of threading, but the engine as local utility pattern should still work.
#myapplication/__init__.py
Session = scoped_session(sessionmaker(bind=LookupEngine('my-engine')...))

engine = EngineUtility(url='sqlite:///')
provideUtility(engine, IConnectable, 'my-engine')  # but normally a local utility registration
#myapplication/foo.py
from myapplication import Session
session = Session()
Would session = ISession(context) be a reasonable way for grok to handle this? Making transactions span multiple instances of a single app seems impossible otherwise, though maybe that is an edge case that need not be supported. Laurence
On May 7, 2008, at 7:08 AM, Martijn Faassen wrote:
Hi there (especially Christian),
I think we can work with explicit saves. In many cases the user won't have to worry about it anyway, as the container object will do it for them (besides making the relation), or this 'query container' we spoke of will do it for them (but just the 'save' bit).
One point is that the scoped session approach itself doesn't work very well for using multiple databases in the same app. We could consider passing the session along in the containers during object graph walking (or traversal) so an app can easily traverse into multiple databases. I'm not sure whether we can make the ORM do this for us though; does it initialize the mapping with a session?
SQLAlchemy's Session does support multiple engine binds itself, which most easily can be associated with particular mapped classes (i.e. vertical partitioning), so that a single session (or a scoped_session) can read and write data to the appropriate tables transparently (although things like joins across multiple databases will raise errors). There's a horizontally-partitioning version of Session as well, which obviously has a lot more caveats.

Using multiple sessions, one per DB, is a valid approach as well. I'm not sure if Grok has other things going on when multiple DBs are in use, but SA's multi-bind capability is something to be aware of.
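Conceptually, the multi-bind lookup Michael mentions is a mapping from mapped class (or its table) to an engine. A pure-Python sketch of that routing, not SQLAlchemy's implementation (BindRouter and the engine-name strings are illustrative only):

```python
class BindRouter(object):
    """Sketch of vertical partitioning: route each mapped class to
    the engine it was bound to, walking the MRO so subclasses
    inherit their base class's bind."""
    def __init__(self, binds):
        self.binds = binds  # {class: engine}

    def get_bind(self, cls):
        for klass in cls.__mro__:
            if klass in self.binds:
                return self.binds[klass]
        raise KeyError('no bind configured for %r' % cls)

class User(object):
    pass

class LogEntry(object):
    pass

class AdminUser(User):
    pass

router = BindRouter({
    User: 'postgres:///users',
    LogEntry: 'postgres:///logs',
})

assert router.get_bind(User) == 'postgres:///users'
assert router.get_bind(AdminUser) == 'postgres:///users'  # via base class
assert router.get_bind(LogEntry) == 'postgres:///logs'
```

The caveat from the message applies directly: each class resolves to exactly one engine at a time, so operations spanning two binds (like a cross-database join) cannot be expressed this way.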
Michael Bayer wrote:
On May 7, 2008, at 7:08 AM, Martijn Faassen wrote:
Hi there (especially Christian),
I think we can work with explicits saves. In many cases the user won't have to worry about it anyway as the container object will do it for them (besides making the relation), or this 'query container' we spoke of will do it for them (but just the 'save' bit).
One point is that the scoped session approach itself doesn't work very well for using multiple databases in the same app. We could consider passing the session along in the containers during object graph wakling (or traversal) so an app can easily traverse into multiple databases. I'm not sure whether we can make the ORM do this for us though; does it initialize the mapping with a session?
SQLAlchemy's Session does support multiple engine binds itself, which most easily can be associated with particular mapped classes (i.e. vertical partitioning), so that a single session (or a scoped_session) can read and write data to the appropriate tables transparently (although things like joins across multiple databases will raise errors). There's a horizontally-partitioning version of Session as well, which obviously has a lot more caveats.
Using multiple sessions, one per DB, is a valid approach as well. I'm not sure if Grok has other things going on when multiple DBs are in use, but SA's multi-bind capability is something to be aware of.
I'm thinking more about having the same classes mapped to different databases at different points in the application. Imagine a departmental address book app. Instances of the departmental address book are created for each department, each with a different database:

http://addressbook/sales -> postgres:///sales
http://addressbook/engineering -> postgres:///engineering

The way I imagine this working is to have a proxy engine object that looks up the real engine through a local utility. Each application would be a `site` and capable of local utility registrations. /sales would have Engine('postgres:///sales') registered and /engineering Engine('postgres:///engineering').

Only a single ScopedSession would be required. This would be bound to a proxy that performs the utility lookup. So when in the /sales context the proxy would point to the sales engine, and when in the /engineering context to the engineering engine.

The limitation of this approach is that it would not be possible to mix objects from /sales and objects from /engineering in the same transaction. So really we need a session per application instance. Perhaps this can be achieved through a custom scoping function:

def scopefunc():
    return thread.get_ident(), id(zope.component.getSiteManager())

Laurence
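A scope function like the one Laurence sketches keys the session registry on (thread, site) rather than on the thread alone. The following pure-Python sketch shows how a registry keyed by such a function behaves; ScopedRegistry here is a stand-in written for illustration, and current_site is a placeholder for zope.component.getSiteManager() (the sketch uses Python 3's threading.get_ident rather than the old thread module):

```python
import threading

class PlainSession(object):
    """Stand-in for a SQLAlchemy Session."""
    pass

class ScopedRegistry(object):
    """Session registry keyed by an arbitrary scope function."""
    def __init__(self, factory, scopefunc):
        self.factory = factory
        self.scopefunc = scopefunc
        self.sessions = {}

    def __call__(self):
        key = self.scopefunc()
        if key not in self.sessions:
            self.sessions[key] = self.factory()
        return self.sessions[key]

current_site = ['sales']  # stand-in for getSiteManager()

def scopefunc():
    # One session per (thread, site) pair.
    return threading.get_ident(), current_site[0]

registry = ScopedRegistry(PlainSession, scopefunc)

s1 = registry()
current_site[0] = 'engineering'
s2 = registry()          # same thread, different site: new session
assert s1 is not s2
current_site[0] = 'sales'
assert registry() is s1  # back in /sales: the original session again
```

With this scoping, each application instance gets its own session even within one request thread, which is what a session-per-site transaction boundary requires.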
On May 7, 2008, at 1:29 PM, Laurence Rowe wrote:
I'm thinking more about having the same classes mapped to different databases at different points in the application. Imagine a departmental address book app. Instances of the departmental address book are created for each department, each with a different database:
http://addressbook/sales -> postgres:///sales
http://addressbook/engineering -> postgres:///engineering
The way I imagine this working is to have a proxy engine object that looks up the real engine through a local utility. Each application would be a `site` and capable of local utility registrations. /sales would have Engine('postgres:///sales') registered and /engineering Engine('postgres:///engineering').
Only a single ScopedSession would be required. This would be bound to a proxy that performs the utility lookup. So when in the /sales context the proxy would point to the sales engine, and when in the /engineering context to the engineering engine.
The limitation of this approach is that it would not be possible to mix objects from /sales and objects from /engineering into the same transaction. So really we need a session per application instance. Perhaps this can be achieved through a custom scoping function:
    def scopefunc():
        return thread.get_ident(), id(zope.component.getSiteManager())
If you want to mix tables (and optionally engines) for the *same* class, we actually have a feature for that too. It's sort of a feature I've wanted to remove but Jason keeps arguing that it's worthy. It's called "entity_name" and it allows multiple primary mappers to be created for a single class. The entity_name has to be specified when you add the element to the session (yet another reason explicit add() is a good thing). This feature is taken directly from the Hibernate feature of the same name.

The limitation with entity_name, which needs some more fixing in 0.5, is that only one mapper gets to define the attribute instrumentation for the entity. If you are storing the same class in three different tables (across three different databases or just one), the attributes defined on the class need to be compatible with all three. This is reasonable since a class can only have one descriptor per attribute name. Querying is also slightly challenging since the descriptors need to be qualified for a particular mapper (i.e. you can't just say query.filter(Address.id==5)... which "id" are we talking about?).

The reason I'm not totally keen on this feature is that it seems to be a very exotic way of getting around making simple subclasses, and I've yet to see the use case for it where simple subclasses don't work, except for "cosmetic" reasons which I have a hard time swallowing (even if the reasons are "cosmetic", you can still create subclasses that are all "named" the same).

So I will ask you, why can't your application simply have a SalesAddress and an EngineeringAddress class? You could even produce them transparently using a custom __new__() method, i.e.
    class Address(object):
        def __new__(cls, *args, **kwargs):
            if my_scoped_thing.context == 'sales':
                return object.__new__(SalesAddress)
            elif my_scoped_thing.context == 'engineering':
                return object.__new__(EngineeringAddress)

This seems extremely straightforward to me as each object, once instantiated, is now bound for a specific destination. It doesn't seem like you'd want the *same* Address to be stored in one and then the other in a different instance (that seems *extremely* complex for no good reason). Isn't explicit better than implicit?
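[Editor's note: the __new__() sketch above, filled out into runnable form. `my_scoped_thing`, `SalesAddress`, and `EngineeringAddress` are illustrative names from the sketch, not part of any real API; the `street` attribute is invented for demonstration.]

```python
class _Context:
    # Hypothetical stand-in for "my_scoped_thing": records which
    # department's context the application is currently serving.
    context = 'sales'

my_scoped_thing = _Context()

class Address(object):
    def __new__(cls, *args, **kwargs):
        # Dispatch to a destination-specific subclass based on the
        # currently active department.
        if my_scoped_thing.context == 'sales':
            return object.__new__(SalesAddress)
        elif my_scoped_thing.context == 'engineering':
            return object.__new__(EngineeringAddress)
        return object.__new__(cls)

    def __init__(self, street):
        self.street = street

class SalesAddress(Address):
    pass

class EngineeringAddress(Address):
    pass

a1 = Address('1 Main St')        # comes back as a SalesAddress
my_scoped_thing.context = 'engineering'
a2 = Address('2 High St')        # comes back as an EngineeringAddress
```

Because __new__ returns an instance of a subclass of the requested class, Python still runs __init__ normally, so callers construct plain `Address(...)` and transparently get the destination-specific type.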
Michael Bayer wrote:
So I will ask you, why can't your application simply have a SalesAddress and an EngineeringAddress class ? You could even produce them transparently using a custom __new__() method, i.e.
    class Address(object):
        def __new__(cls, *args, **kwargs):
            if my_scoped_thing.context == 'sales':
                return object.__new__(SalesAddress)
            elif my_scoped_thing.context == 'engineering':
                return object.__new__(EngineeringAddress)
This seems extremely straightforward to me as each object, once instantiated, is now bound for a specific destination. It doesn't seem like you'd want the *same* Address to be stored in one and then the other in a different instance (that seems *extremely* complex for no good reason). Isn't explicit better than implicit?
When the generic address book application is built you don't know what the departments will be called or indeed how many departments there are. An address book may not be a great example, but I know of intranet portal sites where this is a requirement. You want to delegate control to each department, so you give each department their own instance of the portal. You only want to maintain one code base though, and you don't want to change it every time someone adds another departmental portal. I'd like to be able to create an add form that has fields for application name and database url.

This probably seems like an odd requirement -- why not just run multiple processes with different configurations -- but it's the way zope has traditionally worked. A single process can serve multiple instances of the same application (or `site`). When you get up to running tens of sites, the memory footprint of Zope2 and Plone (before the object cache) becomes significant.

Laurence
On May 7, 2008, at 2:29 PM, Laurence Rowe wrote:
When the generic address book application is built you don't know what the departments will be called or indeed how many departments there are. An address book may not be a great example, but I know of intranet portal sites where this is a requirement. You want to delegate control to each department, so you give each department their own instance of the portal. You only want to maintain one code base though, and you don't want to change it every time someone adds another departmental portal. I'd like to be able to create an add form that has fields for application name and database url.
This probably seems like an odd requirement -- why not just run multiple processes with different configurations -- but it's the way zope has traditionally worked. A single process can serve multiple instances of the same application (or `site`). When you get up to running tens of sites, the memory footprint of Zope2 and Plone (before the object cache) becomes significant.
If you are running different instances each connected to a different engine within one process you wouldn't need any awareness of engines at the object level (therefore no entity_name) and also no engine proxying - you should have separate Session instances for each "process" managed by scoped_session(), which was designed to handle this. Multiple apps on one codebase within one process was an original requirement of Pylons as well, though nobody has ever used it. The easiest way to do it is to set up the engine at the request level:

    Session = scoped_session(sessionmaker())

    # start of request
    engine = get_appropriate_engine()
    Session(bind=engine)
    try:
        ...  # do request
    finally:
        Session.remove()

If that isn't granular enough, then you use a custom scope func which maintains Session per-"process"-per-thread.
Laurence Rowe wrote: [snip]
This is great news, also for megrok.rdb, which we started to develop at the Grokkerdam sprint.
I read somewhere on one of the blog planets that you had discussed container implementations. Any more information/code about this? I'm quite hopeful that Zope 2.12 will let us share much more code now that Acquisition is being tamed.
Our sketchy code is in grokapps/rdbexample. megrok.rdb is a package in there right now. It's all cobbled together in unpolished form for the time being. We just basically reuse the mapping implementation SQLAlchemy already offers and give it an IContainer interface. It's not done yet but it's a large part of the solution. I'm writing a report on the grokkerdam sprint, but it isn't out yet. Regards, Martijn
Laurence Rowe wrote:
I've uploaded it to zope svn and pypi. See http://pypi.python.org/pypi/zope.sqlalchemy
Another comment: could you please use proper release numbers instead of the 0.1dev-r86482 stuff? And follow the official release procedure, which doesn't seem to be actually published on the web outside of grok.zope.org: http://grok.zope.org/documentation/how-to/releasing-software I realize that the package is very young, but I'm saying this for a reason; while doing it quickly seems convenient, we've had a lot of trouble in the past building on such quick & dirty releases. Regards, Martijn
Martijn Faassen wrote:
Laurence Rowe wrote:
I've uploaded it to zope svn and pypi. See http://pypi.python.org/pypi/zope.sqlalchemy
And yet another setup.py comment: the 'url' section right now points to 'http://svn.zope.org/zope.sqlalchemy'. I'm not sure, but I thought we had a policy not to use our SVN as the homepage URL in setup.py. Comments from others please? Regards, Martijn P.S. sorry for all the criticisms, they're intended to be constructive. I'm very excited this project is going ahead.
participants (6)
- Andreas Jung
- Brian Sutherland
- Christian Theune
- Laurence Rowe
- Martijn Faassen
- Michael Bayer