OK, this has been rolling around in my head for a while, thought I'd get it out and take it for a walk.

For many needs, the ZODB fits much better than an RDBMS. Unfortunately, there are some issues.

Linux x86 32-bit VFS: Yes, the 2GB problem. Granted it can be fixed, but I believe there can be larger issues.

Just Plain Size: A big Data.fs can be a pain to back up, or to load. You may not necessarily want to store everything in your 'main' ZODB. Maybe being able to use a separate db for separate uses would be handy, e.g. an inventory that gives you ZODB benefits, but doesn't make your server take 'forever' to initialize when it reaches larger sizes?

I have this feeling it would totally rock. Does anyone else think this would be cool, or have I just spent too much time behind the keyboard? :-) It is true (I think) that it would likely need a ZODBMethod rather than an SQL method :-(

Of course, continuing the idea can lead to other cool concepts .... Roxen has the idea of mounted filesystems, such that you can have two physically separate filesystems on the machine, mapped under each other, i.e. ...

/foo (is /home/users/foo)
/foo/ftp (is /home/ftp/users/foo)

...and all requests are dealt with accordingly, no Regex'ing needed. It strikes me as useful that, given the above, one could carry it a step further, and have a 'directory' be a separate ZODB. Now, _that_ would be cool. Alas, I have doubts as to whether this would be a less than massive undertaking. But I like the idea :-)

As I said, the first idea has been percolating in my head for a few weeks now, had to get it out somewhere :-) ... The second, well, it just came up ... just random musings ...

Bill

--
In flying I have learned that carelessness and overconfidence are usually far more dangerous than deliberately accepted risks. -- Wilbur Wright in a letter to his father, September 1900
Bill Anderson wrote:
OK, this has been rolling around in my head for a while, thought I'd get it out and take it for a walk.
For many needs, the ZODB fits much better than an RDBMS. Unfortunately, there are some issues.
Linux x86 32-bit VFS: Yes, the 2GB problem. Granted it can be fixed, but I believe there can be larger issues.
Just Plain Size: A big Data.fs can be a pain to back up, or to load. You may not necessarily want to store everything in your 'main' ZODB. Maybe being able to use a separate db for separate uses would be handy, e.g. an inventory that gives you ZODB benefits, but doesn't make your server take 'forever' to initialize when it reaches larger sizes?
So there are two issues.

1. Concern over the approach of using a single file for the entire database. One of the benefits of ZODB is an open storage interface. This makes it possible to explore and implement the physical storage for the Zope database in a number of ways. So, if someone felt that the database should be stored in multiple files, or on top of some other storage manager, it is certainly possible and, if using an existing storage manager (e.g. RDBMS, BDB), even straightforward.

2. Separating data into multiple databases for organizational reasons and combining the multiple databases into a single logical object space. This is doable too and something we plan to do; however, it's harder than one might expect if one allows (non-trivial) cross-database object references. If someone is interested in working on this, possibly with simpler assumptions (like only supporting trivial cross-database references), contact me and I'll be happy to provide advice.
I have this feeling it would totally rock.
Maybe, but I think that there are more direct ways to address the issues above. Perhaps another approach that provides somewhat the same thing is to provide an XML-RPC or SOAP interface to another Zope.
Does anyone else think this would be cool, or have I just spent too much time behind the keyboard? :-) It is true (I think) that it would likely need a ZODBMethod rather than an SQL method :-(
I think that there are already XML-RPC methods of some sort. Wouldn't this, and multiple Zopes give you what you want?
Of course, continuing the idea can lead to other cool concepts .... Roxen has the idea of mounted filesystems, such that you can have two physically separate filesystems on the machine, mapped under each other, i.e. ...
/foo (is /home/users/foo)
/foo/ftp (is /home/ftp/users/foo)
...and all requests are dealt with accordingly, no Regex'ing needed. It strikes me as useful that, given the above, one could carry it a step further, and have a 'directory' be a separate ZODB. Now, _that_ would be cool. Alas, I have doubts as to whether this would be a less than massive undertaking. But I like the idea :-)
See issue 2 above. I wouldn't call it massive, but I wouldn't call it easy either.
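The mounting idea quoted above can be sketched in a few lines of Python. This is only an illustration of longest-prefix lookup: the `MountTable` class and its methods are hypothetical names invented here, the "stores" are plain strings standing in for separate databases, and nothing below reflects how Roxen or ZODB actually works.

```python
class MountTable:
    """Hypothetical longest-prefix mount table (illustration only)."""

    def __init__(self):
        self._mounts = {}  # tuple of path segments -> backing store label

    @staticmethod
    def _split(path):
        # "/foo/ftp/" -> ("foo", "ftp"); "/" -> ()
        return tuple(part for part in path.split("/") if part)

    def mount(self, mount_point, store):
        self._mounts[self._split(mount_point)] = store

    def resolve(self, path):
        """Return (store, remaining path) for the deepest matching mount."""
        parts = self._split(path)
        # Try the longest prefix first, then progressively shorter ones.
        for depth in range(len(parts), -1, -1):
            store = self._mounts.get(parts[:depth])
            if store is not None:
                return store, "/".join(parts[depth:])
        raise KeyError("nothing mounted above %r" % path)


table = MountTable()
table.mount("/foo", "/home/users/foo")
table.mount("/foo/ftp", "/home/ftp/users/foo")

print(table.resolve("/foo/index.html"))
print(table.resolve("/foo/ftp/pub/file.txt"))
```

The lookup itself is the easy part. The hard part Jim points to is what happens when an object in one mounted database holds a reference to an object in another, which this sketch does not attempt.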
As I said, the first idea has been percolating in my head for a few weeks now, had to get it out somewhere :-)
Whoooo. I bet that feels better. :)
... The second, well, it just came up ... just random musings ...
It is on the long-term list of things to do. Jim -- Jim Fulton mailto:jim@digicool.com Technical Director (888) 344-4332 Python Powered! Digital Creations http://www.digicool.com http://www.python.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats.
Jim Fulton wrote:
So there are two issues.
1. Concern over the approach of using a single file for the entire database. [...]
2. Separating data into multiple databases for organizational reasons and combining the multiple databases into a single logical object space. [...]
I'm interested in issues 1. and 2. I'm intending to become a zope ISP here in the very short future, and I've been looking at the issues. It would be nice to sell the user x amount of space, then logically partition my drive so that each user gets that x. However, this would require me giving each of them a data.fs file.
I think that there are already XML-RPC methods of some sort. Wouldn't this, and multiple Zopes give you what you want?
I was thinking about doing this. How much overhead do I take for allowing, say, 20 instantiations of zope on the same machine? And it's the python interpreter that has the lock problem, so I'd need 20 pythons, too, nyet?

As to the "one zope, one data.fs" approach:
I have to have some fs directory and ZFolder size comparator, to see that they're not using more than their allotted space.
I have to have all sorts of strange rewrite rules (same as multiple zopes, I guess)
I have to be the one installing Products.

If I could have several data.fs files (and _zero_ linkages amongst data.fs'es, they should behave like different zopes), then I could eliminate all the problems except the rewrite rules for apache.

-- Ethan "mindlace" Fremen
you cannot abdicate responsibility for your ideology.
Ethan Fremen wrote:
Jim Fulton wrote:
So there are two issues.
1. Concern over the approach of using a single file for the entire database. [...]
2. Separating data into multiple databases for organizational reasons and combining the multiple databases into a single logical object space. [...]
I'm interested in issues 1. and 2.
I'm intending to become a zope ISP here in the very short future,
Yee ha. :)
and I've been looking at the issues. It would be nice to sell the user x amount of space,
Yes
then logically partition my drive so that
I assume that this is just a means to an end:
each user gets that x.
So you want to be able to control how much space a customer gets. That's the bottom line, right?
However, this would require me giving each of them a data.fs file.
Not necessarily. A better model, IMO, is to implement a storage with an accounting model. Extra meta-data was added to ZODB transactions to support this sort of thing. For example, suppose each user had an account number. When a transaction is committed, the storage updates the space used by the account. It can implement quotas, failing to commit transactions if the quota is exceeded. Other models are possible too, like charging for usage without setting a quota. Finally, you can make the quota orthogonal to location in the object system, so you don't have to limit a customer to one location and you can account for things like catalog space. This sort of scheme could be achieved with relatively minor extensions to user databases and storages. For example you could probably use an accounting storage that wrapped another storage (much like DemoStorage or Ty Sarna's CompressedStorage do).
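To make the wrapping idea concrete, here is a minimal Python sketch, assuming a much simplified storage. It is not the real ZODB storage interface: `DictStorage`, `AccountingStorage`, `QuotaExceeded`, and the `commit(account, changes)` signature are all hypothetical names made up for this illustration, and the account is passed in directly rather than read from transaction meta-data.

```python
class QuotaExceeded(Exception):
    """Raised when a commit would push an account over its quota."""


class DictStorage:
    """Toy stand-in for a real storage: keeps object records in a dict."""

    def __init__(self):
        self.records = {}

    def commit(self, changes):
        self.records.update(changes)


class AccountingStorage:
    """Hypothetical wrapper that charges committed bytes to an account."""

    def __init__(self, base, quotas):
        self.base = base
        self.quotas = dict(quotas)  # account -> byte limit; missing = no limit
        self.used = {}              # account -> bytes charged so far

    def commit(self, account, changes):
        # Append-only accounting: every committed record is charged, even
        # when it replaces an older revision of the same object.
        size = sum(len(data) for data in changes.values())
        new_total = self.used.get(account, 0) + size
        limit = self.quotas.get(account)
        if limit is not None and new_total > limit:
            raise QuotaExceeded("%s: %d > %d bytes" % (account, new_total, limit))
        self.base.commit(changes)
        self.used[account] = new_total


storage = AccountingStorage(DictStorage(), {"acct-1": 20})
storage.commit("acct-1", {"oid-1": b"0123456789"})
try:
    storage.commit("acct-1", {"oid-2": b"x" * 15})
except QuotaExceeded as exc:
    print("rejected:", exc)
print(storage.used)
```

An account with no entry in `quotas` is never rejected but still accumulates a usage figure, which corresponds to the "charge for usage without setting a quota" model. Because the charge is keyed on the account and not on where the objects live, it stays independent of location in the object system.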
I think that there are already XML-RPC methods of some sort. Wouldn't this, and multiple Zopes give you what you want?
I was thinking about doing this. How much overhead do I take for allowing, say, 20 instantiations of zope on the same machine?
It depends on what they're doing and what kind of machine. Each will probably require a few megs to idle.
And it's the python interpreter that has the lock problem,
What lock problem?
so I'd need 20 pythons, too, nyet?
You'd have two Python processes for each Zope (or 1, if you disabled the process manager, or extended the process manager to manage multiple processes.) Actually, come to think of it, it is probably feasible to hack ZServer to serve multiple object spaces in the same process.
as to the "one zope, one data.fs" approach: I have to have some fs directory and ZFolder size comparator, to see that they're not using more than their allotted space.
Right, see my suggestion above.
I have to have all sorts of strange rewrite rules (same as multiple zopes, I guess)
The forthcoming site objects will take care of the Zope side of this, as will Evan Simpson's site objects.
I have to be the one installing Products.
Right, and you may want to limit use of external methods.
if I could have several data.fs files (and _zero_ linkages amongst data.fs'es, they should behave like different zopes), then I could eliminate all the problems except the rewrite rules for apache.
and except for the need to control products and external methods. Really, all you are gaining with multiple fs files (in this context) is the ability to set quotas. I think there is a better way to do this, however.

Jim
On Wed, 8 Dec 1999, Jim Fulton wrote:
This sort of scheme could be achieved with relatively minor extensions to user databases and storages. For example you could probably use an accounting storage that wrapped another storage (much like DemoStorage or Ty Sarna's CompressedStorage do).
IMO an easy way to hook up multiple storages is still desirable. For one you can mix non-versioning storages with versioning ones or even more fancy things like read only (cd-rom based ones) with the regular FileStorage.
You'd have two Python processes for each Zope (or 1, if you disabled the process manager, or extended the process manager to manage multiple processes.)
ZSupervisor already has such functionality, including named-pipe communication between the manager and an external control program, so you can restart, monitor or shut down individual processes.

Pavlos
Pavlos Christoforou wrote:
On Wed, 8 Dec 1999, Jim Fulton wrote:
This sort of scheme could be achieved with relatively minor extensions to user databases and storages. For example you could probably use an accounting storage that wrapped another storage (much like DemoStorage or Ty Sarna's CompressedStorage do).
IMO an easy way to hook up multiple storages is still desirable. For one you can mix non-versioning storages with versioning ones or even more fancy things like read only (cd-rom based ones) with the regular FileStorage.
Absolutely, but having multiple DBs is harder than implementing an accounting model, so the best way to achieve *this* goal is with an accounting model. Furthermore, I think that an accounting-based approach has advantages over the multi-db approach.

Jim
Jim Fulton wrote:
Pavlos Christoforou wrote:
On Wed, 8 Dec 1999, Jim Fulton wrote:
This sort of scheme could be achieved with relatively minor extensions to user databases and storages. For example you could probably use an accounting storage that wrapped another storage (much like DemoStorage or Ty Sarna's CompressedStorage do).
IMO an easy way to hook up multiple storages is still desirable. For one you can mix non-versioning storages with versioning ones or even more fancy things like read only (cd-rom based ones) with the regular FileStorage.
That could be used in any number of cool ideas, personal storage, offline editing etc..
Absolutely, but having multiple DBs is harder than implementing an accounting model, so the best way to achieve *this* goal is with an accounting model. Furthermore, I think that an accounting-based approach has advantages over the multi-db approach.
Jim
Having multiple DBs would allow much better control over disk space than the accounting-based approach. Two main points: non-"packed" ZODBs can consume all of a user's quota, and packing one large ZODB removes "date"-oriented undo features for all the users.

my two bits,
David
Hi,

We provide Zope services to support our sites. Now we have four sites on the same machine, and we had to have one installation per site. Each site uses its own ZServer, Zope.cgi file, and var dir. We tried to get all of them to share the same library but we were unable to do that.

So we decided to provide Zope hosting based on the size of the Data.fs used. For example, our hosting package allowed for 50 megs; now we can apply the same scheme to this without having to worry about who is doing what.

It is also easier when configuring the Apache rewrite rules that way. No need to use the SiteRoot/SiteAccess, which gave us too many headaches.

Adonis

On Wed, 8 Dec 1999, David Kankiewicz wrote:
Jim Fulton wrote:
Pavlos Christoforou wrote:
On Wed, 8 Dec 1999, Jim Fulton wrote:
This sort of scheme could be achieved with relatively minor extensions to user databases and storages. For example you could probably use an accounting storage that wrapped another storage (much like DemoStorage or Ty Sarna's CompressedStorage do).
IMO an easy way to hook up multiple storages is still desirable. For one you can mix non-versioning storages with versioning ones or even more fancy things like read only (cd-rom based ones) with the regular FileStorage.
That could be used in any number of cool ideas, personal storage, offline editing etc..
Absolutely, but having multiple DBs is harder than implementing an accounting model, so the best way to achieve *this* goal is with an accounting model. Furthermore, I think that an accounting-based approach has advantages over the multi-db approach.
Jim
Having multiple DBs would allow much better control over disk space than the accounting-based approach. Two main points: non-"packed" ZODBs can consume all of a user's quota, and packing one large ZODB removes "date"-oriented undo features for all the users.
my two bits, David
Jim Fulton wrote:
I'm intending to become a zope ISP here in the very short future,
Yee ha. :)
Since I have to do it for me, why not let other people in on it?
So you want to be able to control how much space a customer gets. That's the bottom line, right?
yes, that's my bottom line. That, and I'm using linux w/32bit CPU's, so I'm also a touch concerned about the size of the data.fs... there seems to be a patch that raises my limit to 4TB, which should be sufficient. I'm just worried about its integrity... I might end up with reiserfs. I tried to price Alphas, but they're a little over my price range...
However, this would require me giving each of them a data.fs file.
Not necessarily. A better model, IMO, is to implement a storage with an accounting model. Extra meta-data was added to ZODB transactions to support this sort of thing. For example, suppose each user had an account number. When a transaction is committed, the storage updates the space used by the account. It can implement quotas, failing to commit transactions if the quota is exceeded. Other models are possible too, like charging for usage without setting a quota.
This would be nicer than hard partitioning limits- I can just send a bigger bill at the end of the month instead of having transactions or other things fail.
Finally, you can make the quota orthogonal to location in the object system, so you don't have to limit a customer to one location and you can account for things like catalog space.
This is nice. I also want them to be able to have non-ZODB files, since I'm also going to be offering icecasting of mp3 files (http://www.icecast.org). This is why I'm worried about accounting models: I need to consider not only ZODB but the underlying fs. It's not sufficient to set quotas for folders/users, because the limit would vary according to their ZODB usage. Now, I could just have a fixed fs limit and a fixed ZODB limit, but that would seem to penalize people who were mostly zope / mostly fs.
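One way to picture a single allowance shared between ZODB usage and plain files is sketched below in Python. It assumes the ZODB figure is already available from some accounting mechanism and simply measures the customer's directory on disk. The function names `directory_bytes` and `remaining_budget` are hypothetical, and the temporary directory only stands in for a customer's file area.

```python
import os
import tempfile


def directory_bytes(root):
    """Sum the sizes of regular files under root (symlinks are skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def remaining_budget(total_quota, zodb_bytes, fs_root):
    """Bytes left in one shared allowance after ZODB and file usage."""
    return total_quota - zodb_bytes - directory_bytes(fs_root)


# Demo: a throwaway directory plays the role of a customer's file area.
customer_dir = tempfile.mkdtemp()
with open(os.path.join(customer_dir, "track.mp3"), "wb") as f:
    f.write(b"\0" * 3000)

# Assumed numbers: a 10000-byte allowance with 2500 bytes charged in the ZODB.
print(remaining_budget(10000, 2500, customer_dir))
```

With one shared figure, a customer who is mostly files or mostly Zope draws on the same allowance, so neither kind of usage is penalized by a fixed split.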
This sort of scheme could be achieved with relatively minor extensions to user databases and storages. For example you could probably use an accounting storage that wrapped another storage (much like DemoStorage or Ty Sarna's CompressedStorage do).
Now, I haven't the foggiest idea how to go about implementing this, but that's ok.
I was thinking about doing this. How much overhead do I take for allowing, say, 20 instantiations of zope on the same machine?
It depends on what they're doing and what kind of machine. Each will probably require a few megs to idle.
Well, basically, I'm going to allow 20 or so clients to a machine. Right now, I'm thinking 400mhz k6III with 384mb ram. A zope per client, while it's a very simple solution, seems overkill.
And it's the python interpreter that has the lock problem,
What lock problem?
sorry, I'm referring to the "global interpreter lock", which is apparently only a problem with >1 processors.
as to the "one zope, one data.fs" approach: I have to have some fs directory and ZFolder size comparator, to see that they're not using more than their allotted space.
Right, see my suggestion above.
I guess I don't understand how your suggestion allows me to track file-system usage and regulate it relative to ZODB usage.
I have to have all sorts of strange rewrite rules (same as multiple zopes, I guess)
The forthcoming site objects will take care of the Zope side of this, as will Evan Simpson's site objects.
The site objects sound interesting... for Evan, do you mean SiteAccess?
I have to be the one installing Products.
Right, and you may want to limit use of external methods.
I can understand this, and don't even mind it. PythonMethods should do for mundane stuff, and I don't mind installing other products. (and as an aside, if I could do SSL with zope, I'd have a "pure Zope" solution instead of apache/zope) -- Ethan "mindlace" Fremen you cannot abdicate responsibility for your ideology.
[Jim Fulton, on Wed, 08 Dec 1999]
:: So there are two issues.
::
:: 1. Concern over the approach of using a single file for the entire
:: database. One of the benefits of ZODB is an open storage interface.
:: This makes it possible to explore and implement the physical storage
:: for the Zope database in a number of ways. So, if someone felt that
:: the database should be stored in multiple files, or on top of
:: some other storage manager, it is certainly possible and, if using
:: an existing storage manager (e.g. RDBMS, BDB), even straightforward.
::
:: 2. Separating data into multiple databases for organizational reasons
:: and combining the multiple databases into a single logical object
:: space. This is doable too and something we plan to do, however,
:: it's harder than one might expect if one allows (non-trivial)
:: cross-database object references. If someone is interested in
:: working on this, possibly with simpler assumptions (like only
:: supporting trivial cross-database references), contact me and
:: I'll be happy to provide advice.

Jim,

<BLUE SKY>
Have you ever given thought to implementing a Linux-only version of Zope optimized for use with the reiserfs? This is probably just hooey, but I wonder if the balanced-tree approach in the Reiser filesystem might not map well into ZODB.
</BLUE SKY>
Patrick Phalen wrote:
(snip)
Have you ever given thought to implementing a Linux-only version of Zope optimized for use with the reiserfs?
Not until you mentioned it. I did spend some time looking at: http://devlinux.com/projects/reiserfs/
This is probably just hooey, but I wonder if the balanced-tree approach in the Reiser filesystem might not map well into ZODB.
How so? Would you imagine using the reiserfs as a ZODB storage? If one was going to use a file system as an object store, then the reiserfs would be attractive due to its effective support for small files. Is this what you had in mind? The ZODB has a well-defined "storage interface" that should make implementation of a reiserfs- (or just an fs-) based storage reasonably straightforward, although getting transactional semantics right might be a tad tricky.

Jim
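As a rough illustration of a file-per-object store and of where the transactional difficulty starts, here is a small Python sketch. It is not a ZODB storage and does not implement the storage interface: `FileObjectStore` is a hypothetical class, object ids are assumed to be safe file names, and atomicity is per object only, obtained by writing a temporary file and renaming it over the target.

```python
import os
import tempfile


class FileObjectStore:
    """Hypothetical one-file-per-object store (illustration only)."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def _path(self, oid):
        # Assumes oid is already a safe file name.
        return os.path.join(self.root, oid)

    def store(self, oid, data):
        # Write to a temporary file in the same directory, then rename it
        # over the target so readers see either the old or the new record.
        fd, tmp = tempfile.mkstemp(dir=self.root, prefix=".tmp-")
        try:
            with os.fdopen(fd, "wb") as f:
                f.write(data)
                f.flush()
                os.fsync(f.fileno())
            os.replace(tmp, self._path(oid))
        except BaseException:
            if os.path.exists(tmp):
                os.remove(tmp)
            raise

    def load(self, oid):
        with open(self._path(oid), "rb") as f:
            return f.read()


store = FileObjectStore(tempfile.mkdtemp())
store.store("0001", b"first revision")
store.store("0001", b"second revision")
print(store.load("0001"))
```

The rename makes a single object's update all-or-nothing, but a transaction that touches several objects would need more than this (for example a commit log), which is the tricky part mentioned above.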
[Jim Fulton, on Tue, 14 Dec 1999]
:: Would you imagine using the reiserfs as a ZODB storage?
:: If one was going to use a file system as an object store,
:: then the reiserfs would be attractive due to its effective
:: support for small files. Is this what you had in mind?

exactly; and its support for journalling

:: The ZODB has a well-defined "storage interface" that should
:: make implementation of a reiserfs- (or just an fs-)
:: based storage reasonably straightforward, although getting
:: transactional semantics right might be a tad tricky.

yes

I posted to the list about this later, but to repeat in this context -- I asked Hans Reiser about this and he had this to say:

"""
Patrick, I think what you say has a lot of merit, and the areas in which we are extending our functionality will make this easier and easier over time. Features in our next version like item handlers, inheritance, lightweight files/objects, stem compression of filenames, etc., are all aimed at making this more feasible for folks like you to use us as a bottom layer. We would be more than happy to do work on the FS to accommodate your needs for this. That said, don't underestimate the scope of the task, it is a big one, and it is hard for me to promise specific features you will need on a short term basis, I can only say that what you want is in the scope of our long-term vision. I can also say that we welcome all good patches.
"""
participants (7)
- Bill Anderson
- David Kankiewicz
- Ethan Fremen
- Jim Fulton
- Patrick Phalen
- Pavlos Christoforou
- technews@egsx.com