There seems to be some interest in the use of SQL databases with Zope.

Lovely Systems is now using SQL databases as the primary storage for their applications. We use Zope and Postgres with Storm as the ORM. The main reason for switching to an SQL database were speed issues with queries.

Here is a short summary of my thoughts and experiences from using Storm and Zope for about three months now.

RelStorage:
RelStorage doesn't solve the speed problems. Doing queries with SQL is much faster than doing them with the ZODB. If you work with a lot of large BTrees, you need to load them all into the memory of each Zope client, and this is the case with RelStorage too.

Indexes:
You don't need to implement catalog indexes; this is all done on the database side. When implementing and using your content types you don't need to think about indexes at first; later you optimize the database without touching your Python code.

A speed example:
We had to find similar users based on the items a user has collected. Doing this with the ZODB took minutes to calculate for users with a lot of items, and we had to implement a lot of code to do the calculation asynchronously so it would not block the user's request. Doing the same with SQL was possible with a single (of course complex) query within 300ms. No async machinery needed; just implement the query and optimize the indexes on the server, finished! RelStorage will not help you here.

Content implementation:
While porting our existing ZODB based packages to SQL, we found that implementing them with Storm is as easy as using the ZODB, and we can still use the full power of Zope's component architecture. This is because Storm objects are extremely easy to implement. You implement a Storm object like a Persistent object: just derive from Storm instead of Persistent, add __storm_table__ and define the properties as Storm properties.

For me, a big mistake when switching from the ZODB to SQL is trying to use the container pattern at any cost. A container is nothing but a 1:N relation, and that is exactly what an SQL database provides: relations.

    class Content(Storm):
        id = Int(primary=True)
        content = ReferenceSet(id, 'Contained.somethingId')

    c = Content()

Now you can:

- add data: c.content.add(content)
- iterate: for a in c.content: ...
- search: c.content.find(...)
- sort: c.content.find().order_by(...)
- do anything a Storm ResultSet provides

But of course it is possible to put an adapter around the Content class which will provide IContainer.
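Such an adapter could look roughly like this (an untested sketch; it assumes the hypothetical Contained table carries a unicode "name" column, and it implements only part of IContainer):

    from zope import interface
    from zope.app.container.interfaces import IContainer

    class ContentContainer(object):
        # wraps a Content row and exposes its 1:N relation as a container;
        # only a read-only subset of IContainer is sketched here
        interface.implements(IContainer)

        def __init__(self, context):
            self.context = context

        def __getitem__(self, name):
            item = self.context.content.find(Contained.name == name).one()
            if item is None:
                raise KeyError(name)
            return item

        def __iter__(self):
            # containers iterate over names, not objects
            return self.context.content.find().values(Contained.name)

        def __len__(self):
            return self.context.content.count()

Registered as an adapter for Content, IContainer(c) would then behave like a folder.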
Annotation:
Annotations are 1:1 relations, so it's as easy as the above. We use annotations like simple adapters to other tables.

    class ToBeAnnotated(Storm):
        interface.implements(ICanHaveData)

        id = Int(primary=True)

Note that the "annotated" Storm table is implemented as an adapter:

    class Data(Storm):
        interface.implements(IData)
        component.adapts(ICanHaveData)

        id = Int(primary=True)
        __parent__ = Reference(id, ToBeAnnotated.id)

        def __init__(self, context):
            # a dummy to make the adapter machinery happy
            pass

We can now register "Data" as an adapter. We use a special adapter factory, like zope.annotation.factory, to autocreate adapted content.

    def contentAdapter(table, autocreate=True):
        # An adapter factory for content contained in other tables. Just like
        # the annotation adapter, an instance is created if autocreate is True.
        adapts = component.adaptedBy(table)
        if adapts is None:
            raise TypeError("Missing 'zope.component.adapts' on table")

        @component.adapter(list(adapts)[0])
        @interface.implementer(list(interface.implementedBy(table))[0])
        def getAdapter(context):
            unsafeContext = removeSecurityProxy(context)
            obj = getStore().find(
                table, table.__parent__ == unsafeContext).one()
            if obj is None and autocreate:
                obj = table(context)
                obj.__parent__ = context
            return obj

        return getAdapter

Now you can define a factory for the adapter:

    dataFactory = contentAdapter(Data)

and register "dataFactory" as an adapter.

DublinCore:
If you want to use the full Dublin Core implementation from Zope you need to do a generic implementation. Usually only part of the Dublin Core interface is used; we usually implement IDCTimes and IDCDescriptiveProperties. All you need to do for this is:

    class DCStormContent(Storm):
        interface.implements(IDCTimes, IDCDescriptiveProperties)

        created = DateTime()
        modified = DateTime()
        title = Unicode()
        description = Unicode()

That's it! You can now use IDCTimes and IDCDescriptiveProperties for your formlib form_fields.

There are two ways to update "modified":

- write an event handler for ObjectModifiedEvent
- do it on the database side with triggers

I prefer using the event handler, because the database trigger does the update only when writing to the database, which can be too late.

Schemas and Storm objects:
We don't use schemas to create our Storm objects or the database tables from them. Right now it is not worth the time for us to implement such a feature.

Traversing and URLs:
Usually our customers want to have special URLs for their pages. Either way (ZODB or SQL) we need to implement special traversers to provide those URLs; usually we use z3c.traverser to do this. Because of the special URLs we also need to implement absolute URL adapters.

Transaction handling:
Storm already has a DataManager for Zope's transaction package. All you need to do is register a utility for each of the databases you want to use.

ZMI:
Hmm, doesn't work out of the box. If really needed, we build traversers for the ZMI.

Data transparency:
At any time you can use any database administration tool you like to directly view and/or manipulate your data in the database.

Jürgen
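PS: For completeness, the store and transaction wiring looks roughly like this (an untested sketch; the store name and DSN are illustrative, and the utility is normally registered via ZCML):

    from zope import component
    from storm.zope.interfaces import IZStorm
    from storm.zope.zstorm import global_zstorm

    # register the global ZStorm utility once at startup
    component.provideUtility(global_zstorm, IZStorm)

    def getStore():
        # stores handed out by ZStorm join Zope's transaction automatically
        zstorm = component.getUtility(IZStorm)
        return zstorm.get('main', 'postgres://user:password@localhost/app')

This is also where the getStore() used in the examples above would come from.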
--On 24 May 2008 10:47:54 +0200 Jürgen kartnaller <juergen.kartnaller@gmail.com> wrote:
There seems to be some interest in the use of SQL databases with Zope.
Lovely Systems is now using SQL databases as the primary storage for their applications. We use Zope and Postgres with Storm as the ORM. The main reason for switching to an SQL database were speed issues with queries.
Thanks for this interesting mail. Some questions:

- How do you deal with binary large objects?
- How do you deal with changes in the data model and related migrations?
- How do you deal with hierarchical or semi-structured data (e.g. SGML, XML)?

We have been building hybrid solutions on top of Zope 2 for years, where part of the data is stored within the ZODB and metadata is stored within Postgres, and this has worked out pretty well. I would disagree that an RDBMS is the best solution for every particular use case. The decision for the backend storage must be made for every project: either use the ZODB, an RDBMS, or a mixture of both.

Andreas
On Sat, May 24, 2008 at 11:28:37AM +0200, Andreas Jung wrote:
- how do you deal with hierarchical or semi-structured data (e.g. SGML, XML)?
As a side note, PostgreSQL 8.3 has a native XML column type. It's fairly basic still, but you can create indexes based on xpath queries. I think future versions will be able to use a schema as a constraint for a column. -- Brian Sutherland
--On 24 May 2008 17:57:17 +0200 Brian Sutherland <brian@vanguardistas.net> wrote:
On Sat, May 24, 2008 at 11:28:37AM +0200, Andreas Jung wrote:
- how do you deal with hierarchical or semi-structured data (e.g. SGML, XML)?
As a side note, PostgreSQL 8.3 has a native XML column type. It's fairly basic still, but you can create indexes based on xpath queries.
Yup, Postgres already had XML support as an add-on in version 7, but it was pretty much unusable for production, and the functionality of the XML type in Postgres 8 looks similar (feature-wise). Andreas
On Sat, May 24, 2008 at 06:23:01PM +0200, Andreas Jung wrote:
--On 24 May 2008 17:57:17 +0200 Brian Sutherland <brian@vanguardistas.net> wrote:
On Sat, May 24, 2008 at 11:28:37AM +0200, Andreas Jung wrote:
- how do you deal with hierarchical or semi-structured data (e.g. SGML, XML)?
As a side note, PostgreSQL 8.3 has a native XML column type. It's fairly basic still, but you can create indexes based on xpath queries.
Yup, Postgres already had XML support as an add-on in version 7, but it was pretty much unusable for production
Interesting, why not? I'm finding it incredibly useful, but then maybe my needs are very simple.
and the functionality of the XML type in Postgres 8 looks similar (feature-wise).
Andreas
-- Brian Sutherland
--On 24 May 2008 18:51:17 +0200 Brian Sutherland <brian@vanguardistas.net> wrote:
On Sat, May 24, 2008 at 06:23:01PM +0200, Andreas Jung wrote:
--On 24 May 2008 17:57:17 +0200 Brian Sutherland <brian@vanguardistas.net> wrote:
On Sat, May 24, 2008 at 11:28:37AM +0200, Andreas Jung wrote:
- how do you deal with hierarchical or semi-structured data (e.g. SGML, XML)?
As a side note, PostgreSQL 8.3 has a native XML column type. It's fairly basic still, but you can create indexes based on xpath queries.
Yup, Postgres already had XML support as an add-on in version 7, but it was pretty much unusable for production
Interesting, why not?
I'm finding it incredibly useful, but then maybe my needs are very simple.
An important piece of functionality would be the ability to modify nodes of an XML document in place. Storing and querying using XPath is only one part of the story. Andreas
Hi Jürgen. Thank you for this informative post. I am particularly interested in how this fits into the refactoring of legacy code, and I appreciate your sharing your experiences. I also think everyone has been paying much attention to the insights Lovely has been sharing on scaling and efficient delivery of Zope over the past couple of years.

As a means of mapping to one or more backends without mixing application logic with backend logic, schemas play an important role. I can see the benefit of providing plain SQL statements since they are clearly understood. The concern I have about not using schemas is the loss of integration potential for the different backends using a common pattern of mapping zope.schema to XML, RDF, RDB, whatever... In your opinion, is this abstraction simply costing too much, unnecessary, or a matter of application development and runtime speed?

For me, the crux of the RDB approach for legacy code is the container, location and traversal. You have been very generous with your examples. I am really hoping for a clearer idea of handling Container, OrderedContainer, and Location, which are prevalent in legacy code. Overall, I can say that I quite like the innovation here in getting to a 'leaner' concept of Zope.

Regards, David
On Sat, May 24, 2008 at 2:24 PM, David Pratt <fairwinds@eastlink.ca> wrote:
As a means of mapping to one or more backends without mixing application logic with backend logic, schemas play an important role. I can see the benefit of providing plain SQL statements since they are clearly understood. The concern I have about not using schemas is the loss of integration potential for the different backends using a common pattern of mapping zope.schema to XML, RDF, RDB, whatever... In your opinion, is this abstraction simply costing too much, unnecessary, or a matter of application development and runtime speed?
I'm not sure what you mean with schemas/XML in the context of SQL. You can still use your schemas as you already do. I just wrote a CSV importer based on schemas together with formlib's EditForm and AddForm. The only difference is that it is not possible to use "FieldProperty" in Storm classes.
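To illustrate (an untested sketch with made-up names; the schema drives the form while the Storm class carries plain Storm properties instead of FieldProperty):

    from zope import interface, schema
    from zope.formlib import form
    from storm.locals import Storm, Int, Unicode

    class IDocument(interface.Interface):
        title = schema.TextLine(title=u'Title')
        description = schema.Text(title=u'Description', required=False)

    class Document(Storm):
        interface.implements(IDocument)
        __storm_table__ = 'documents'

        id = Int(primary=True)
        # Storm properties instead of FieldProperty(IDocument['title'])
        title = Unicode()
        description = Unicode()

    class DocumentEditForm(form.EditForm):
        form_fields = form.Fields(IDocument)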
For me, the crux of the RDB approach for legacy code is the container, location and traversal. You have been very generous with your examples. I am really hoping for a clearer idea of handling Container, OrderedContainer, and Location, which are prevalent in legacy code. Overall, I can say that I quite like the innovation here in getting to a 'leaner' concept of Zope.
If you have your legacy code, you have a clear definition of what you need from your container, so it should be straightforward to implement IContainer and ILocation. Without going too deep into this, here is some code which should be usable:

    class Container(Storm):
        interface.implements(IContainer)
        __storm_table__ = 'containers'

        id = Int(primary=True)
        content = ReferenceSet(id, 'Content.parentId')

        def __iter__(self):
            return iter(self.content)

        def __getitem__(self, name):
            item = self.content.find(Content.name == name).one()
            if item is None:
                raise KeyError(name)
            return item

        def __setitem__(self, name, item):
            item.name = name
            # add namechooser things here
            item.parent = self

        def __len__(self):
            return self.content.count()

    class Content(Storm):
        __storm_table__ = 'contents'

        id = Int(primary=True)
        parentId = Int()
        name = Unicode()
        parent = Reference(parentId, Container.id)

Don't kill me if something is wrong here; this is an untested quick hack to demonstrate what's possible. Also, the IContainer interface is not fully implemented.
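To make the intent concrete, usage would look something like this (equally untested; getStore() as in my original post):

    store = getStore()
    folder = store.find(Container, Container.id == 1).one()

    page = Content()
    store.add(page)
    folder[u'front-page'] = page   # names the item and points it at the folder

    print len(folder)              # the count comes from the database
    print folder[u'front-page']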
Hi Jürgen. Thank you for keeping this thread going. It is helpful, and good to experiment with this further for light applications. I have not given much attention to storage with an ORM since the early days of SQLAlchemy. There have been many improvements since then to the various implementations. I believe Storm may be a bit more precise about what it wants to be, and that translates into a bit more speed by all accounts. Jürgen kartnaller wrote:
On Sat, May 24, 2008 at 2:24 PM, David Pratt <fairwinds@eastlink.ca> wrote:
As a means of mapping to one or more backends without mixing application logic with backend logic, schemas play an important role. I can see the benefit of providing plain SQL statements since they are clearly understood. The concern I have about not using schemas is the loss of integration potential for the different backends using a common pattern of mapping zope.schema to XML, RDF, RDB, whatever... In your opinion, is this abstraction simply costing too much, unnecessary, or a matter of application development and runtime speed?
I'm not sure what you mean with schemas/XML in the context of SQL. You can still use your schemas as you already do. I just wrote a CSV importer based on schemas together with formlib's EditForm and AddForm. The only difference is that it is not possible to use "FieldProperty" in Storm classes.

Schemas provide the potential of mapping the object structure to other storage backends with different notions of persistence, so not a direct reference for XML to SQL. As much as possible I do not want to change an app to use a different backend. With adaptation and classes that behave the same way in the CA, it ought to be possible to only have to change import statements to different implementations of Contained, Location etc. and generally change what you are subclassing from to persist data.
For me, the crux of the RDB approach for legacy code is the container, location and traversal. You have been very generous with your examples. I am really hoping for a clearer idea of handling Container, OrderedContainer, and Location, which are prevalent in legacy code. Overall, I can say that I quite like the innovation here in getting to a 'leaner' concept of Zope.
If you have your legacy code, you have a clear definition of what you need from your container, so it should be straightforward to implement IContainer and ILocation.
Yes, you are right. I'll experiment with what you have provided and see if I can get something basic working with the z3c.traverser package. BTW, I am assuming at this point that everything is registered in the global site manager for the app. I am not sure if it is possible to have a notion of local sites without the ZODB. This of course would require changes in legacy applications as well.
Don't kill me if something is wrong here; this is an untested quick hack to demonstrate what's possible. Also, the IContainer interface is not fully implemented.

No worry about killing anyone :-) I appreciate this sketch. I believe Martijn and Brian were discussing a generic implementation of Container for the different alchemy flavors a few weeks back, an implementation that could live outside of these packages. It would be useful to have a decent generic implementation for RDB backends. Many thanks. Regards, David
Hi,

I have released a Zope Storm container on PyPI: http://pypi.python.org/pypi/nva.stormcontainer/0.2

Christian
Hi there, David Pratt wrote: [snip]
For me, the crux of the RDB approach for legacy code is the container, location and traversal. You have been very generous with your examples. I am really hoping for a clearer idea of handling Container, OrderedContainer, and Location, which are prevalent in legacy code. Overall, I can say that I quite like the innovation here in getting to a 'leaner' concept of Zope.
I myself wouldn't be inclined to call Container and Location only relevant to legacy code. It's very convenient to have an RDB-backed object have a location, as that allows you to get the URL for it in the regular manner.

For Container, in the megrok.rdb prototype code we actually make containers be SQLAlchemy 'MappedCollection' objects. MappedCollection happens to implement the container API already. This way we can make relations be containers by writing something like:

    class Departments(rdb.Container):
        # this is a MappedCollection
        rdb.key('title')

    class Faculty(rdb.Model):
        # rdb.table_name('faculty') is the default
        __tablename__ = 'faculty'

        grok.traversable('departments')

        departments = relation('Department', backref='faculty',
                               collection_class=Departments)

The 'departments' attribute is a container you can traverse into this way. You can hook up views to Departments just like you would to any container.

Regards,

Martijn
Martijn Faassen wrote:
I myself wouldn't be inclined to call Container and Location only relevant to legacy code.
It's very convenient to have a RDB-backed object have a location, as that allows you to get the URL for it in the regular manner.
Right. ILocation makes sense everywhere you want to place any sort of object (whether ZODB-backed or not) in a hierarchy. Such a hierarchy can then be used by URL generation, security policies (for acquiring security settings), etc. Also, containers make a lot of sense since they essentially are like tables or table views.
For Container, in the megrok.rdb prototype code we actually make containers be SQLAlchemy 'MappedCollection' objects. MappedCollection happens to implement the container API already.
Note that naively speaking, the IContainer and dict APIs are the same. But technically, they're not entirely. dict keys can be anything hashable. IContainer keys must be unicode or str objects.
This way we can make relations be containers by writing something like:
    class Departments(rdb.Container):
        # this is a MappedCollection
        rdb.key('title')
I think here it would be great if the grokker for rdb.Container could somehow find out whether 'title' was a text column. If it's not, it's not a valid container key, unless perhaps rdb.Container contained some conversion magic.
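Such a check could look roughly like this (an illustrative sketch, assuming SQLAlchemy table metadata is available at that point):

    from sqlalchemy import types

    def is_valid_key_column(column):
        # container keys must map to strings, so require a text-ish column
        return isinstance(column.type, types.String)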
    class Faculty(rdb.Model):
        # rdb.table_name('faculty') is the default
        __tablename__ = 'faculty'

        grok.traversable('departments')

        departments = relation('Department', backref='faculty',
                               collection_class=Departments)
The 'departments' attribute is a container you can traverse into this way. You can hook up views to Departments just like you would to any container.
Yep. Very nice :)
Hey, Philipp von Weitershausen wrote: [snip]
For Container, in the megrok.rdb prototype code we actually make containers be SQLAlchemy 'MappedCollection' objects. MappedCollection happens to implement the container API already.
Note that naively speaking, the IContainer and dict APIs are the same. But technically, they're not entirely. dict keys can be anything hashable. IContainer keys must be unicode or str objects.
Good point.
This way we can make relations be containers by writing something like:
    class Departments(rdb.Container):
        # this is a MappedCollection
        rdb.key('title')
I think here it would be great if the grokker for rdb.Container could somehow find out whether 'title' was a text column. If it's not, it's not a valid container key, unless perhaps rdb.Container contained some conversion magic.
I think some form of key conversion magic is warranted in this case. Often enough we'd just want to expose the primary key in the URL, and these are typically integers. I don't think we can do any checks during grok time, by the way, as the database might not exist yet at that point. Regards, Martijn
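Such conversion could be as simple as the following sketch (purely illustrative; not actual megrok.rdb API):

    def nameToKey(name):
        # traversal names arrive as strings; map them to integer primary keys
        try:
            return int(name)
        except ValueError:
            raise KeyError(name)

    def keyToName(key):
        # and back: make a URL-safe container name from a primary key
        return unicode(key)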
Philipp von Weitershausen wrote at 2008-5-26 18:15 +0200:
... Also, containers make a lot of sense since they essentially are like tables or table views.
This note is surprising: in fact, few tables are like containers. And tables or table views are not the only things that make sense (so there is no need to claim "likeness" to tables or table views for something to "make sense"). -- Dieter
On 26 May 2008, at 20:51, Dieter Maurer wrote:
Philipp von Weitershausen wrote at 2008-5-26 18:15 +0200:
... Also, containers make a lot of sense since they essentially are like tables or table views.
This note is surprising: in fact, few tables are like containers.
I never said that. What I said was the reverse: Containers often are like tables (Employees table contains employee rows, Folder table contains document rows, etc.). I do realize that there are kinds of tables that do not map to containers.
Martijn. Actually I think I said prevalent, not relevant :-). I agree about the relevance in both circumstances, so I highlighted container, location and traversal as key issues for refactoring legacy code. I am concerned about retaining hierarchical arrangements and being able to use existing traversal methods. I really like what you are doing here and will look at your implementation. I appreciate your pointing this out. Many thanks.

Regards, David
David Pratt wrote:
Martijn. Actually I think I said prevalent, not relevant :-).
True, you did, it's just that you connected these topics particularly with refactoring legacy code, while I've been thinking more about them in the context of new code.
I agree about the relevance in both circumstances, so I highlighted container, location and traversal as key issues for refactoring legacy code. I am concerned about retaining hierarchical arrangements and being able to use existing traversal methods. I really like what you are doing here and will look at your implementation. I appreciate your pointing this out. Many thanks.
The sketchy implementation is here: http://svn.zope.org/grokapps/rdbexample/ (the declarative-directives branch is the only one that currently actually "works", by the way). In particular the interesting bits are in the sub-package 'megrok.rdb'. We're hoping to convert this code to use zope.sqlalchemy, and also start implementing some of the configuration details we discussed in another thread. Right now we're focusing on making this work for Grok, but you know we have the habit of splitting off more commonly useful code over time, and external contributions can of course accelerate that work. :) Regards, Martijn
Hi Martijn. This is very helpful and I appreciate your post. I'll be taking a closer look at the example for sure. I quite like what is happening with these solutions. I like z3c.dobbin's slick zope.schema implementation and would like to attempt a container implementation much along the lines of what you are demonstrating. It would make RDB storage quite transparent.

Another interesting point about RDB and WSGI is what to do about site managers without the ZODB. Typically you set up your authentication and core utilities either within the global site manager or a local site manager, and when you register utilities, many of them set up folders within the global or local site manager. I am thinking about authenticators here, but there are many others as well. It has me asking how I ought to proceed along these lines to refactor. I am aware of things such as repoze.who and AuthKit that could be of benefit here. I am hoping to hear of experience from folks moving in this direction to see how this is being handled. Many thanks.

Regards, David
participants (9)

- Andreas Jung
- Brian Sutherland
- Christian Klinger
- David Pratt
- Dieter Maurer
- fairwinds@eastlink.ca
- Jürgen kartnaller
- Martijn Faassen
- Philipp von Weitershausen