This was on the Zope list, and I've moved it to Zope-dev as I feel it is more appropriate there.
I have been making notes to myself for a while with ideas about things I would like to explore with Zope Storages when I am able. I have not posted any of them since talk is cheap. But I guess since I'm full of cheap talk I'll let it go. :)
I'll also post this to the ZODB zwiki after I see what's left standing after analysis. :)
As far as usage is concerned, I generally like the ZODB best because it is reasonably transparent to the building of a web app with Zope. However, there are areas in which it does not currently excel, and if your site depends on those, alternatives must be used. Two such areas are data size and heavy writes.
Some people use an RDBMS to solve these issues. While this will work it does expect more from the developer. Some do not have the skills or the tools. Even if one does it still requires leaving the transparency of developing with the ZODB.
Multiple file storage for the ZODB has been proposed as a solution and there are 2 proposals currently on the ZODB ZWiki. I will add another.
Class/Object based db files.
Each class gets its own db file. This could be similar to the current ZODB file except specific to a class. As objects are created they are appended to the db file for their class. This could be somewhat analogous to tables in an RDBMS.
Advantages would be spreading out the data space over multiple files, which would help with some OSes. Also, I think that each class has different characteristics which could be managed better if kept separate.
Example: AutoParts. You have an AutoParts class. The objects will change very little once created. However, there are a lot of objects, and new ones are added periodically. This file will seldom need packing. It will also be simple to back up, and will not need backing up often, as changes are periodic and regular.
RetailStore: In a retail store the product objects are very volatile. Vendors can change. Prices do change. A productObject file would have different usage characteristics than the AutoParts file.
Some classes are perfect for few writes and many reads. Others less so.
Earlier Andrew Kuchling was wanting to walk the object tree. This would provide a relatively easy way to walk the object tree.
This could be implemented with some support classes which have to be inherited from to create a class.db file. Any class not so doing would go into the standard ZODB. This could help provide desired management features for the characteristics of each. It would be nice if in the management you could set the path to the file. This would allow for multiple disks or partitions for data storage. This too would help with backups and such.
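A minimal sketch of the per-class file idea, assuming a hypothetical marker base class (ClassFileMixin) and a toy store (PerClassStore) that are not part of ZODB. Classes that inherit from the marker get their own append-only file; everything else falls through to a shared default file. The directory argument stands in for the configurable path mentioned above.

```python
import os
import pickle
import tempfile


class ClassFileMixin:
    """Hypothetical marker: subclasses are stored in their own db file."""


class PerClassStore:
    """Toy store with one append-only pickle file per opted-in class."""

    def __init__(self, directory):
        # The directory could sit on any disk or partition.
        self.directory = directory

    def path_for(self, obj):
        # Opted-in classes get "<ClassName>.db"; all others share one file.
        if isinstance(obj, ClassFileMixin):
            name = type(obj).__name__ + ".db"
        else:
            name = "default.db"
        return os.path.join(self.directory, name)

    def append(self, obj):
        # Records are only ever appended to the end of the class file.
        with open(self.path_for(obj), "ab") as f:
            pickle.dump(obj.__dict__, f)


def read_records(path):
    """Read every pickled state dict from a file, in write order."""
    records = []
    with open(path, "rb") as f:
        while True:
            try:
                records.append(pickle.load(f))
            except EOFError:
                break
    return records


class AutoPart(ClassFileMixin):
    def __init__(self, number, name):
        self.number = number
        self.name = name


class Note:
    def __init__(self, text):
        self.text = text
```

Appending two AutoPart objects and one Note to a PerClassStore would leave an AutoPart.db holding two records and a default.db holding one, so each file can be packed and backed up on its own schedule.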
Just a few ideas. They may not stand up to examination, but that's okay. I just thought I would put them on the table.
Jimmie Houchin
Chris McDonough wrote:
All,
I've put some stuff about a proposed RelationalStorage in the ZODB wiki at http://www.zope.org/Members/jim/ZODB
Its goal is to allow you to use a SQL database as a Storage, which somewhat coincidentally would also get around the single-file 2GB limitation.
I would appreciate comments (in the wiki or here). The table structure I'm not sure on, it's only a skeleton right now...
Jason Spisak wrote:
Jonothan:
Shouldn't be too difficult. (I know, famous last words.) I'd be interested in banging out a prototype.
I looked at FileStorage and the BasicStorage yesterday. I am trying to get a feel for it.
I'll have to see when I can get to it after responding to all these emails about LocalFS that piled up while I was on vacation.
That's because it's an amazing product.
All my best,
Jason Spisak CIO HireTechs.com 6151 West Century Boulevard Suite 900 Los Angeles, CA 90045 P. 310.665.3444 F. 310.665.3544
Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats.
Zope maillist - Zope@zope.org http://lists.zope.org/mailman/listinfo/zope ** No cross posts or HTML encoding! ** (Related lists - http://lists.zope.org/mailman/listinfo/zope-announce http://lists.zope.org/mailman/listinfo/zope-dev )
-- Chris McDonough Digital Creations Publishers of Zope - http://www.zope.org
Jimmie:
This was on the Zope list, and I've moved it to Zope-dev as I feel it is more appropriate there.
I agree.
I have been making notes to myself for a while with ideas about things I would like to explore with Zope Storages when I am able. I have not posted any of them since talk is cheap.
That never stopped me.
I'll also post this to the ZODB zwiki after I see what's left standing after analysis. :)
As far as usage is concerned I generally like the ZODB best because it is reasonably transparent to the building of a web app with Zope.
I agree.
However, there are areas in which it does not currently excel, and if your site depends on those, alternatives must be used. Two such areas are data size and heavy writes.
Yes.
Some people use an RDBMS to solve these issues. While this will work it does expect more from the developer. Some do not have the skills or the tools. Even if one does it still requires leaving the transparency of developing with the ZODB.
Which works for those with RDBMS experience or existing installations, but not for others.
Multiple file storage for the ZODB has been proposed as a solution and there are 2 proposals currently on the ZODB ZWiki. I will add another.
Class/Object based db files.
Maybe we are talking about the same thing here. Are you talking a db (like Berkeley) implementation of the MultipleFileStorage?
Each class gets its own db file. This could be similar to the current ZODB file except specific to a class. As objects are created they are appended to the db file for their class. This could be somewhat analogous to tables in an RDBMS.
Advantages would be spreading out the data space over multiple files, which would help with some OSes. Also, I think that each class has different characteristics which could be managed better if kept separate.
<snip Examples>
This could be implemented with some support classes which have to be inherited from to create a class.db file. Any class not so doing would go into the standard ZODB.
I think this is a terrific way to integrate it.
This could help provide desired management features for the characteristics of each. It would be nice if in the management you could set the path to the file.
I am wondering about keeping it OS/single file-system non-specific. Will this kind of thing work for the Global File System and/or distributed file systems? Probably.
This would allow for multiple disks or partitions for data storage. This too would help with backups and such.
Just a few ideas.
A few good ones.
All my best,
Jason Spisak
At 11:16 PM +0000 5/3/00, Jason Spisak wrote: [snip]
Multiple file storage for the ZODB has been proposed as a solution and there are 2 proposals currently on the ZODB ZWiki. I will add another.
Class/Object based db files.
Maybe we are talking about the same thing here. Are you talking a db (like Berkeley) implementation of the MultipleFileStorage?
Some type of MultipleFileStorage. As I understand it, the primary difference between our ideas is that mine uses a single file per class instead of a file per object, which is what I described below.
Each class gets its own db file. This could be similar to the current ZODB file except specific to a class. As objects are created they are appended to the db file for their class. This could be somewhat analogous to tables in an RDBMS.
[snip]
Thanks.
Jimmie Houchin
Jimmie Houchin:
At 11:16 PM +0000 5/3/00, Jason Spisak wrote: [snip]
Multiple file storage for the ZODB has been proposed as a solution and there are 2 proposals currently on the ZODB ZWiki. I will add another.
Class/Object based db files.
Maybe we are talking about the same thing here. Are you talking a db (like Berkeley) implementation of the MultipleFileStorage?
Some type of MultipleFileStorage. As I understand it, the primary difference between our ideas is that mine uses a single file per class instead of a file per object, which is what I described below.
What is the primary reasoning behind the per-class approach? Is that how the ZODB works now? I think it's transaction based. I think I understand now. It's like a row in an RDBMS. Yes, we are talking about two different animals. I think the important thing is the transparency of, as Phillip said, "persistence providers": a place to get your persistence. Whether stored by class, object, or transaction, they should all give Zope what it needs: an object with a current transaction/version state.
Each class gets its own db file. This could be similar to the current ZODB file except specific to a class. As objects are created they are appended to the db file for their class. This could be somewhat analogous to tables in an RDBMS.
[snip]
Thanks.
Jimmie Houchin
All my best
Jason Spisak
At 3:20 PM +0000 5/4/00, Jason Spisak wrote:
Jimmie Houchin:
[snip]
What is the primary reasoning behind the per-class approach? Is that how the ZODB works now? I think it's transaction based. I think I understand now. It's like a row in an RDBMS. Yes, we are talking about two different animals. I think the important thing is the transparency of, as Phillip said, "persistence providers": a place to get your persistence. Whether stored by class, object, or transaction, they should all give Zope what it needs: an object with a current transaction/version state.
Actually, it would be similar to a table in an RDBMS, with the table being in its own file. As per a reply from Phillip, it really isn't necessary for this to be in its own file. This doesn't really have anything to do with transactions per se. Transactions are somewhat independent of this idea and of storages. My idea had ZODB operating exactly as it currently does, except with multiple files based on classes, because certain objects will have different usage characteristics.
That said, I like what Phillip is proposing with Racks better, as it fully satisfies my thoughts and provides an overall framework and philosophy for development which is much more versatile and extensible.
Hope this helps.
Jimmie Houchin
Each class gets its own db file. This could be similar to the current ZODB file except specific to a class. As objects are created they are appended to the db file for their class. This could be somewhat analogous to tables in an RDBMS.
[snip]
Thanks.
Jimmie Houchin
All my best
Jason Spisak
Jimmie Houchin:
At 3:20 PM +0000 5/4/00, Jason Spisak wrote:
Jimmie Houchin:
[snip]
What is the primary reasoning behind the per-class approach? Is that how the ZODB works now? I think it's transaction based. I think I understand now. It's like a row in an RDBMS. Yes, we are talking about two different animals. I think the important thing is the transparency of, as Phillip said, "persistence providers": a place to get your persistence. Whether stored by class, object, or transaction, they should all give Zope what it needs: an object with a current transaction/version state.
Actually, it would be similar to a table in an RDBMS, with the table being in its own file. As per a reply from Phillip, it really isn't necessary for this to be in its own file. This doesn't really have anything to do with transactions per se. Transactions are somewhat independent of this idea and of storages. My idea had ZODB operating exactly as it currently does, except with multiple files based on classes, because certain objects will have different usage characteristics.
That said, I like what Phillip is proposing with Racks better, as it fully satisfies my thoughts and provides an overall framework and philosophy for development which is much more versatile and extensible.
Hope this helps.
Jimmie Houchin
I think the RIPP model will work fine as well. It's a solution to the propertysheet/instance data source anonymity problem, but not to the ZODB storage issue. Once you put things in Racks, it really ceases to matter that the ZODB contains your implementation, classes, etc., because it's the instance data that is a hog. I'm going to use RIPP as well.
All my best,
Jason Spisak
At 05:27 PM 5/3/00 -0500, Jimmie Houchin wrote:
As far as usage is concerned, I generally like the ZODB best because it is reasonably transparent to the building of a web app with Zope. However, there are areas in which it does not currently excel, and if your site depends on those, alternatives must be used. Two such areas are data size and heavy writes.
Agreed.
Some people use an RDBMS to solve these issues. While this will work it does expect more from the developer. Some do not have the skills or the tools. Even if one does it still requires leaving the transparency of developing with the ZODB.
Ty and I are working on fixing that. As of today, the Specialist+Rack subsystems of the ZPatterns library are now capable of managing objects with mixed-source attributes and property sheets. That is, objects stored in a Rack can have attributes and property sheets whose data comes from non-ZODB sources. Currently, no non-ZODB sources are implemented, alas, and the mechanism is only applicable to rack-mounted objects. In future releases, we'll be adding SQL and LDAP data sources, and the ability for any Zope object to take advantage of these abilities (as long as it subclasses RackMountable, and knows how to find the "designated Rack" which will provide support for the desired features).
Class/Object based db files.
Each class gets its own db file. This could be similar to the current ZODB file except specific to a class. As objects are created they are appended to the db file for their class. This could be somewhat analogous to tables in an RDBMS.
Advantages would be spreading out the data space over multiple files, which would help with some OSes. Also, I think that each class has different characteristics which could be managed better if kept separate.
Basically, what you've described could be considered an instance of the Specialist and Rack patterns of the RIPP model, but with a less flexible implementation. That is, you can do what you just described using the Specialist/Rack patterns, but what you described won't do everything Specialist/Rack can (such as mixed-source data for the same object or a sensible place to put class extent methods (e.g. find, listAll, that sort of thing)).
The only thing missing from Rack today that would be needed to implement a "ZODB per class" approach, is the ability to use an alternative "persistence provider" for storage. I'd like to see some discussion on a mechanism for providing such things. My thought is that they would sort of look like SQL Connection objects: something you can add, that can be acquired, and which Racks can select from a dropdown. (The dropdown even already exists in the about-to-be-released ZPatterns 0.3.0, it just is empty except for an option to use the primary ZODB.)
The thing about these "persistence providers", though, is that they need to provide some kind of root for each thing that wants to use them. I'm assuming each provider may be shared by more than one Rack or rack-like object, that security constraints need to exist on who can create root branches in these DB's, and that the branches need to be able to be removed when their owners go away. If there were a design (or implementation, better yet) for gizmos like these, hooking up Racks to use them would be a snap, since all Racks need to do is store a single BTree.
(By the way, Racks can be written to store the "base" object data in an RDBMS, db file, LDAP, or anything else that you can implement createItem(), retrieveItem(), and deleteItem() methods for.)
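To make the three-method idea concrete, here is a minimal sketch of a backend that exposes createItem(), retrieveItem(), and deleteItem(). Only the method names come from the message above; the argument lists, the class name InMemoryItemStore, and the dict backing are assumptions for illustration and are not the ZPatterns API.

```python
class InMemoryItemStore:
    """Hypothetical backend exposing the three methods named above.

    The signatures are assumed; a real backend could be an RDBMS,
    a db file, or LDAP behind the same three calls.
    """

    def __init__(self):
        self._items = {}

    def createItem(self, key, data):
        # Refuse to silently overwrite an existing item.
        if key in self._items:
            raise KeyError("item already exists: %r" % (key,))
        self._items[key] = dict(data)

    def retrieveItem(self, key):
        # Return a copy so callers cannot mutate stored state by accident.
        return dict(self._items[key])

    def deleteItem(self, key):
        del self._items[key]
```

Swapping the dict for SQL statements or LDAP calls would leave the calling code unchanged, which is the point of keeping the interface this small.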
Phillip J. Eby:
Ty and I are working on fixing that. As of today, the Specialist+Rack subsystems of the ZPatterns library are now capable of managing objects with mixed-source attributes and property sheets.
Can you use that code and extend it to allow Zope to pull its object data from multiple sources?
That is, objects stored in a Rack can have attributes and property sheets whose data comes from non-ZODB sources. Currently, no non-ZODB sources are implemented, alas, and the mechanism is only applicable to rack-mounted objects. In future releases, we'll be adding SQL and LDAP data sources, and the ability for any Zope object to take advantage of these abilities (as long as it subclasses RackMountable, and knows how to find the "designated Rack" which will provide support for the desired features).
All my best,
Jason Spisak
At 03:08 PM 5/4/00 +0000, Jason Spisak wrote:
Phillip J. Eby:
Ty and I are working on fixing that. As of today, the Specialist+Rack subsystems of the ZPatterns library are now capable of managing objects with mixed-source attributes and property sheets.
Can you use that code and extend it to allow Zope to pull its object data from multiple sources?
Not really. It operates at a different level of abstraction than that. It can, however, be extended to allow objects stored in a "primary" ZODB Folder or ObjectManager the ability to have mixed-source attributes and propertysheets. But there has to be some kind of object to "start with".
But that level of extension may be all you need, since most "application"-oriented usages lend themselves well to using Racks directly, and "content"-oriented usages can still have some of their attributes and propertysheets handled elsewhere.
There are some key issues that are NOT dealt with in the Rack model yet, however. Chief amongst them is Undo management. Jim Fulton made a change to ZODB to allow the use of a custom Undo manager, which would allow prevention of Undo for transactions that modified "outside" sources, or alternatively allow them to be rolled back if there was a source for the undo data. However, we have not yet begun to integrate this into the ZPatterns framework. Another, related issue is garbage collection, or making sure that objects are deleted in "both" places. I plan to add a GC function to Racks soon to allow them to clean up ZODB persistent data associated with non-ZODB objects which no longer exist, but that's only a beginning.
Phillip J. Eby:
At 03:08 PM 5/4/00 +0000, Jason Spisak wrote:
Phillip J. Eby:
Ty and I are working on fixing that. As of today, the Specialist+Rack subsystems of the ZPatterns library are now capable of managing objects with mixed-source attributes and property sheets.
Can you use that code and extend it to allow Zope to pull its object data from multiple sources?
Not really. It operates at a different level of abstraction than that. It can, however, be extended to allow objects stored in a "primary" ZODB Folder or ObjectManager the ability to have mixed-source attributes and propertysheets. But there has to be some kind of object to "start with".
This does go a long way toward solving the high-write and scalability problem just by itself. There is nothing wrong with FileStorage for class data, because most applications don't change it very often. It's the instance data that gets huge and changes a lot. Maybe I'm trying to address an issue that gets addressed by ZEO. Maybe you are not the one to answer this, but with Racks and ZEO could I have two different Zope installations accessing the same Rack for the same object, as long as its class definition was the same? For example, I have two Squishdots: one is in a Zope that has CyberCash in it, and one isn't. Can I have the Squishdots using the same Rack for their information?
But that level of extension may be all you need, since most "application"-oriented usages lend themselves well to using Racks directly, and "content"-oriented usages can still have some of their attributes and propertysheets handled elsewhere.
That may be true.
There are some key issues that are NOT dealt with in the Rack model yet, however. Chief amongst them is Undo management.
That's a hairy monster.
Jim Fulton made a change to ZODB to allow the use of a custom Undo manager, which would allow prevention of Undo for transactions that modified "outside" sources, or alternatively allow them to be rolled back if there was a source for the undo data. However, we have not yet begun to integrate this into the ZPatterns framework.
One step at a time.
Another, related issue is garbage collection, or making sure that objects are deleted in "both" places.
Doesn't ZEO have a facility for this in its framework? Invalidation messages, etc.?
I plan to add a GC function to Racks soon to allow them to clean up ZODB persistent data associated with non-ZODB objects which no longer exist, but that's only a beginning.
All my best,
Jason Spisak
At 03:54 PM 5/4/00 +0000, Jason Spisak wrote:
data that gets huge and changes a lot. Maybe I'm trying to address an issue that gets addressed by ZEO. Maybe you are not the one to answer this, but with Racks and ZEO could I have two different Zope installations accessing the same Rack for the same object, as long as its class definition was the same?
A Rack lives in a ZODB. It's just like any other Zope object. So if it's in a ZEO, it will be accessible like any other part of the same application.
For example, I have two Squishdots: one is in a Zope that has CyberCash in it, and one isn't. Can I have the Squishdots using the same Rack for their information?
If they share the same ZODB. Of course, keep in mind that the Rack may not store its data in ZODB at all - e.g. an SQL database. If so, then ZEO is irrelevant; you can have as many Zopes with copies of that Rack as you want, all accessing the data, so long as any additional sheets or attributes also come from shared external sources.
Another, related issue is garbage collection, or making sure that objects are deleted in "both" places.
Doesn't ZEO have a facility for this in its framework? Invalidation messages, etc.?
That's cache invalidation. Unrelated issue.
Jason Spisak:
[...] Maybe I'm trying to address an issue that gets addressed by ZEO. [...] Doesn't ZEO have a facility for this in its framework? Invalidation messages, etc.?
Any juicy rumors as to the progress of ZEO? (According to the Open-Source announcement, it should be nearing release. I haven't heard a peep since this announcement.)
As an aside, I have to say that this discussion is why I'm on the zope-dev mailing list...(Whereas most of the traffic has been about developing apps IN Zope, not developing architectural components FOR Zope.) Any possibility that there could be a different mailing list (perhaps "zope-internals" or somesuch?)
Phillip J. Eby:
At 03:54 PM 5/4/00 +0000, Jason Spisak wrote:
data that gets huge and changes a lot. Maybe I'm trying to address an issue that gets addressed by ZEO. Maybe you are not the one to answer this, but with Racks and ZEO could I have two different Zope installations accessing the same Rack for the same object, as long as its class definition was the same?
A Rack lives in a ZODB. It's just like any other Zope object. So if it's in a ZEO, it will be accessible like any other part of the same application.
Your previous post enlightened me in this respect.
For example, I have two Squishdots: one is in a Zope that has CyberCash in it, and one isn't. Can I have the Squishdots using the same Rack for their information?
If they share the same ZODB. Of course, keep in mind that the Rack may not store its data in ZODB at all - e.g. an SQL database. If so, then ZEO is irrelevant; you can have as many Zopes with copies of that Rack as you want, all accessing the data, so long as any additional sheets or attributes also come from shared external sources.
This is the best argument for keeping your data in an SQL database I've seen yet. I would love to see that hold true for FileStorage, and for something that doesn't have the DB overhead. Just to have a choice.
Another, related issue is garbage collection, or making sure that objects are deleted in "both" places.
Doesn't ZEO have a facility for this in its framework? Invalidation messages, etc.?
That's cache invalidation. Unrelated issue.
Righty-o.
Thanks again for taking your time to clear some things up. I really appreciate it.
All my best,
Jason Spisak
Phillip,
Just a quick comment: I re-read this posting today (it is more than a month old) and realized that, according to my understanding, mounted databases fulfill your requirement for an alternative persistence provider. Am I correct?
Shane
"Phillip J. Eby" wrote:
The only thing missing from Rack today that would be needed to implement a "ZODB per class" approach, is the ability to use an alternative "persistence provider" for storage. I'd like to see some discussion on a mechanism for providing such things. My thought is that they would sort of look like SQL Connection objects: something you can add, that can be acquired, and which Racks can select from a dropdown. (The dropdown even already exists in the about-to-be-released ZPatterns 0.3.0, it just is empty except for an option to use the primary ZODB.)
The thing about these "persistence providers", though, is that they need to provide some kind of root for each thing that wants to use them. I'm assuming each provider may be shared by more than one Rack or rack-like object, that security constraints need to exist on who can create root branches in these DB's, and that the branches need to be able to be removed when their owners go away. If there were a design (or implementation, better yet) for gizmos like these, hooking up Racks to use them would be a snap, since all Racks need to do is store a single BTree.
(By the way, Racks can be written to store the "base" object data in an RDBMS, db file, LDAP, or anything else that you can implement createItem(), retrieveItem(), and deleteItem() methods for.)
Jimmie Houchin wrote:
As far as usage is concerned, I generally like the ZODB best because it is reasonably transparent to the building of a web app with Zope. However, there are areas in which it does not currently excel, and if your site depends on those, alternatives must be used. Two such areas are data size and heavy writes.
This is often thought to be a deficiency in ZODB, but the root of this particular problem is really in FileStorage. Other Storages could support much more write-intensive workloads.
Some people use an RDBMS to solve these issues.
Right.
While this will work it does expect more from the developer. Some do not have the skills or the tools. Even if one does it still requires leaving the transparency of developing with the ZODB.
That means using a relational database to get an orthogonal benefit (write intensity), instead of using a relational database to solve relational tasks.
Multiple file storage for the ZODB has been proposed as a solution
to a different problem. The question is not using multiple file storages in Zope, but just using multiple storages. This way you could use a FileStorage when you want its properties, BerkeleyStorage when you want something different, or some sort of storage based on a relational database at the same time in the same Zope system. An analogy of allowing Zope to 'mount' various storages into the object tree has been proposed, but it's a very tricky problem and not the one that creating a relational based storage will solve.
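The 'mount' analogy can be sketched as a lookup from object path to storage. This is only an illustration under stated assumptions: MountTable and its methods are hypothetical names, storages are represented by plain labels, and the tricky parts of the real problem (references that cross storages, transactions spanning several storages) are deliberately left out.

```python
class MountTable:
    """Hypothetical table mapping path prefixes to storages."""

    def __init__(self, default):
        self._default = default
        self._mounts = {}

    def mount(self, prefix, storage):
        # Normalize so "/catalog/" and "/catalog" mean the same mount point.
        self._mounts[prefix.rstrip("/")] = storage

    def storage_for(self, path):
        # Pick the longest mount prefix that covers the path,
        # falling back to the default storage.
        best = ""
        chosen = self._default
        for prefix, storage in self._mounts.items():
            covers = path == prefix or path.startswith(prefix + "/")
            if covers and len(prefix) > len(best):
                best, chosen = prefix, storage
        return chosen
```

With a FileStorage as the default, a BerkeleyStorage mounted at one path, and a relational storage mounted below it, each object would be served by the most specific mount that covers its path.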
and there are 2 proposals currently on the ZODB ZWiki. I will add another.
Class/Object based db files.
Each class gets its own db file. This could be similar to the current ZODB file except specific to a class. As objects are created they are appended to the db file for their class. This could be somewhat analogous to tables in an RDBMS.
Advantages would be spreading out the data space over multiple files, which would help with some OSes. Also, I think that each class has different characteristics which could be managed better if kept separate.
I'm not anything of an expert on RDBMSs, but we have thought this through pretty thoroughly in-house, and this is not really the model we came up with.
The interface of a storage object makes no assumptions about how objects are stored. A Storage is, in simple terms, a mapping from object id to a pickle. This can easily be a relational database table that is keyed on object id and contains CLOBs or BLOBs or whatever that represent the pickle. When ZODB needs to resolve a reference to an object id into an object, it selects the pickle (or pickles, who knows) it is looking for out of the object table. When 'writes' are done, ZODB inserts new pickles with certain object ids. Some extra columns containing backlink references could allow undoing (and thus sharing the same quick-growing behavior as FileStorage).
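A minimal sketch of that mapping, using sqlite3 from the Python standard library as a stand-in for whatever relational database would actually be used. The class name TinyRelationalStorage, the table layout, and the store/load method names are assumptions for illustration and do not follow the real ZODB storage interface. This variant overwrites in place, so it is the non-undoable flavor.

```python
import pickle
import sqlite3


class TinyRelationalStorage:
    """Hypothetical storage: one table keyed on object id, holding pickles."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS objects ("
            "oid INTEGER PRIMARY KEY, pickle BLOB NOT NULL)"
        )

    def store(self, oid, obj):
        data = pickle.dumps(obj)
        # "with conn" commits on success and rolls back on error.
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO objects (oid, pickle) VALUES (?, ?)",
                (oid, data),
            )

    def load(self, oid):
        row = self.conn.execute(
            "SELECT pickle FROM objects WHERE oid = ?", (oid,)
        ).fetchone()
        if row is None:
            raise KeyError(oid)
        return pickle.loads(row[0])
```

Because each write replaces the row for its object id, the table holds one row per object and does not grow with every change.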
Also, the reason FileStorage *can't* be made non-undoable (and so not grow so rapidly) is that FileStorage just appends to the end of a file. A relational database does not have any such restriction, and a relational Storage could be either undoable or not undoable.
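The link between appending, undo, and growth can be shown with a toy append-only log. This is a sketch under assumptions, not FileStorage's actual format: AppendOnlyLog is a hypothetical name, records live in a list rather than a file, and undo is modeled as appending a copy of the previous revision.

```python
class AppendOnlyLog:
    """Hypothetical append-only store that keeps every revision."""

    def __init__(self):
        self._records = []  # (oid, data) pairs in write order

    def __len__(self):
        return len(self._records)

    def store(self, oid, data):
        # Writes never overwrite; they only add to the end.
        self._records.append((oid, data))

    def load(self, oid):
        # The newest record for an oid is its current state.
        for rec_oid, data in reversed(self._records):
            if rec_oid == oid:
                return data
        raise KeyError(oid)

    def undo(self, oid):
        # Old revisions are still in the log, so undo can re-append one.
        revisions = [d for rec_oid, d in self._records if rec_oid == oid]
        if len(revisions) < 2:
            raise ValueError("nothing to undo")
        self._records.append((oid, revisions[-2]))

    def pack(self):
        # Drop everything except the current state of each object.
        latest = {}
        for oid, data in self._records:
            latest[oid] = data
        self._records = list(latest.items())
```

Every store and every undo makes the log longer, which is why such a store needs packing, and after a pack the earlier revisions are no longer available to undo.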
I'm not sure how your suggestion would be better than this. I don't know much about RDBMSs; do they assume one file per table or database? That sounds like an implementation issue. Also, objects would not be very evenly distributed across classes. There would be gobs of ImplicitAcquirerWrappers and very few OFS.Application.Application. On Zope.org there are hundreds of classes, and over one hundred thousand objects in the database, not including previous versions. There could very well be over a million objects and their previous revisions.
Example: AutoParts You have an AutoParts class. The objects will change very little once created. However there are a lot of objects and new ones are added periodically. This file will seldom need packing. It will also be simple to back up and will not need backing up often, as changes are periodic and regular.
RetailStore In a retail store the product objects are very volatile. Vendors can change. Prices do change. A productObject file would have different usage characteristics than the AutoParts object.
Some classes are perfect for few writes and many reads. Others less so.
Earlier Andrew Kuchling was wanting to walk the object tree. This would provide a relatively easy way to walk the object tree.
This could be implemented with some support classes which have to be inherited from to create a class.db file. Any class not so doing would go into the standard ZODB. This could help provide desired management features for the characteristics of each. It would be nice if in the management you could set the path to the file. This would allow for multiple disks or partitions for data storage. This too would help with backups and such.
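The support-class idea above could look roughly like the following. This is only a sketch under stated assumptions: `ClassFileMixin`, `db_path_for`, and `append_object` are hypothetical names, the per-class file format is just a stream of pickled state dictionaries, and `Data.fs` stands in for the standard ZODB file that everything else falls back to.

```python
import os
import pickle
import tempfile


class ClassFileMixin:
    """Hypothetical marker base class: inherit from it to get a class.db file."""


def db_path_for(obj, base_dir, default_name="Data.fs"):
    """Pick the db file for an object based on its class."""
    if isinstance(obj, ClassFileMixin):
        return os.path.join(base_dir, type(obj).__name__ + ".db")
    # Any class that does not opt in goes to the standard file.
    return os.path.join(base_dir, default_name)


def append_object(obj, base_dir):
    """Append the object's pickled state to the end of its class's db file."""
    path = db_path_for(obj, base_dir)
    with open(path, "ab") as f:
        pickle.dump(vars(obj), f)
    return path


class AutoPart(ClassFileMixin):
    def __init__(self, name):
        self.name = name


class Folder:
    pass


base_dir = tempfile.mkdtemp()
part_path = append_object(AutoPart("brake pad"), base_dir)  # goes to AutoPart.db
other_path = append_object(Folder(), base_dir)              # goes to Data.fs
```

Making `base_dir` configurable per class would be one way to get the "set the path to the file" management option, so that different classes could live on different disks or partitions.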
Just a few ideas. They may not stand up to examination, but that's okay. I just thought I would put them on the table.
It absolutely does make sense to analyze your problems like this. Your needs here could be met if Zope supported multiple storages. Currently, BerkeleyStorage has proven itself under high write intensity, and FileStorage is, of course, wildly useful for often-read, seldom-written objects.
A relational storage is also definitely needed.
I'm surprised that more people don't use BerkeleyStorage today. Is this because it doesn't undo? I don't imagine it would be too difficult to extend it to support undo. Ty, what do you think?
-Michel
At 11:43 PM 5/3/00 -0700, Michel Pelletier wrote:
I'm surprised that more people don't use BerkeleyStorage today. Is this because it doesn't undo?
It's probably also because the docs are sparse to nonexistent. :)
I don't imagine it would be too difficult to extend it to support undo. Ty, what do you think?
Difficulty isn't the question as much as time budget. I've been keeping Ty busy with LoginManager and ZPatterns stuff. BerkeleyStorage was more a proof-of-concept for us, a kind of insurance that we really could use a ZODB for our future apps. We will probably revisit it, and the undo issue, when we get closer to building an app that needs undo as well as high-volume writes. The key complication is that having a Berkeley ZODB doesn't solve the various issues associated with ZCatalog performance in that environment, which generates a high volume of Berkeley logfiles. The logfiles are far easier to clean up than a FileStorage is to pack, but they actually generate *more* filesystem write activity than a FileStorage.
Our long-term solution to this is to not put Z BTrees inside of Berkeley b-trees, but instead to use Berkeley directly for indexing. This should result in a dramatic lowering of logging activity, because Berkeley will then only be logging that an index key or value changed, not logging the entire page on which it was changed. But to implement this long-term solution, we need to integrate an indexing plug-in mechanism into either ZCatalog or, more likely, the Specialist/Rack model, so that it's straightforward for an application developer to specify the right indexes for the application's needs.
Michel Pelletier wrote:
This is often thought to be a deficiency in ZODB, but the root of this particular problem is really in FileStorage. Other Storages could implement much more write-intensive abilities.
Just to correct myself a bit here, some more advanced conflict resolution in ZODB would also increase the write performance of Zope.
-Michel