I am working through the design of my website as I study and learn Zope. I've started to work on calculating storage requirements to determine server needs, as I am also building my server.

My app can easily be divided into multiple datasets or databases. One of the datasets I am looking at has a potential of 4+ million objects, with each object requiring 15-50 KB minimum. This dataset can be subdivided. Initially I will not populate the database with all of the items in their full form, but will populate it as requests for data come in. However, I need to design as if, and plan for, complete population. This makes for a very large database, and one that spans more than one hard drive.

Can Zope create and use multiple ZODBs on multiple hard drives? If so, how will such a large dataset affect packing and the creation of the backup file? Will I need to use other database backends such as MySQL or possibly MetaKit?

Thanks for any help or thoughts.

Jimmie Houchin
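[Editor's note: a back-of-the-envelope check of the figures in the message above. The object count and per-object sizes come from the message; everything else is simple arithmetic.]

```python
# Rough sizing for the dataset described above.
objects = 4_000_000           # "4+ million objects"
low_kb, high_kb = 15, 50      # "15-50 KB minimum" per object

low_gb = objects * low_kb / (1024 * 1024)    # KB -> GB
high_gb = objects * high_kb / (1024 * 1024)

print(f"{low_gb:.0f} GB to {high_gb:.0f} GB")
```

At roughly 57-191 GB, the dataset is one to two orders of magnitude past the 2 GB single-file limit discussed below, which is why the question of spanning drives matters.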
On Sat, 22 May 1999, Jimmie Houchin wrote:
I am working through the design of my website as I study and learn Zope.
I've started to work on calculating storage requirements to determine server needs as I am also working on building my server.
My app can easily be divided into multiple datasets or databases. One of the datasets I am looking at has a potential of 4+ million objects, with each object requiring 15-50 KB minimum. This dataset can be subdivided. Initially I will not populate the database with all of the items in their full form, but will populate it as requests for data come in. However, I need to design as if, and plan for, complete population.
This makes for a very large database and one that spans more than one hard drive.
Can Zope create and use multiple ZODBs on multiple hard drives? If so, how will such a large dataset affect packing and the creation of the backup file?
Will I need to use multiple other database backends such as MySQL or possibly MetaKit?
Thanks for any help or thoughts.
Jimmie Houchin
Currently, Zope 1 uses a single file to store its data. That means it is subject to the ext2 filesystem limit of 2 GB on 32-bit systems. On an Alpha there is no such limit (ok, there is, but it's on the order of 8 million terabytes), but in either case, one file. Zope 2 will use ZODB 3, which should permit multiple files to be used for one Zope site. Either way, MySQL or similar techniques can certainly be used to extend the holding capacity of either database. Indeed, ZODB 3 will be able to store its data in an RDBMS. I'm sure the DC Gurus could tell you more, as they have worked with large datasets, but they are currently otherwise occupied.
_______________________________________________
Zope maillist - Zope@zope.org
http://www.zope.org/mailman/listinfo/zope
(For developer-specific issues, use the companion list, zope-dev@zope.org - http://www.zope.org/mailman/listinfo/zope-dev )
--
Howard Clinton Shaw III - Grum
St. Thomas High School
#include "disclaimer.h"
At 07:49 AM 5/24/99 -0500, Howard Clinton Shaw III wrote:
On Sat, 22 May 1999, Jimmie Houchin wrote: [snip] Currently, Zope 1 uses a single file to store its data. That means it is subject to the ext2 filesystem limit of 2 GB on 32-bit systems. On an Alpha there is no such limit (ok, there is, but it's on the order of 8 million terabytes), but in either case, one file. Zope 2 will use ZODB 3, which should permit multiple files to be used for one Zope site. Either way, MySQL or similar techniques can certainly be used to extend the holding capacity of either database. Indeed, ZODB 3 will be able to store its data in an RDBMS.
I'm sure the DC Gurus could tell you more, as they have worked with large datasets, but they are currently otherwise occupied. -- Howard Clinton Shaw III - Grum St. Thomas High School #include "disclaimer.h"
Thanks for your reply. I had forgotten about the 2 GB limit. That definitely affects program design. To think I was only concerned with it outgrowing a single hard drive. :) This will require further thought. I knew the DC folks were at LinuxExpo, so I wasn't worried about a delayed reply. I imagine they are pretty swamped now, considering the well-deserved response. :) Thanks again. Jimmie Houchin
Hi, first I'm sorry for my English :-I I want to use Zope for a big commercial use case; I must program a trading system: 1.) Do you think Zope is ready for big use cases? 2.) I need big databases too! Thanks a lot. Steph
Jimmie Houchin wrote:
I am working through the design of my website as I study and learn Zope.
I've started to work on calculating storage requirements to determine server needs as I am also working on building my server.
My app can easily be divided into multiple datasets or databases. One of the datasets I am looking at has a potential of 4+ million objects, with each object requiring 15-50 KB minimum. This dataset can be subdivided. Initially I will not populate the database with all of the items in their full form, but will populate it as requests for data come in. However, I need to design as if, and plan for, complete population.
This makes for a very large database and one that spans more than one hard drive.
Can Zope create and use multiple ZODBs on multiple hard drives?
ZODB 3 (in Zope 2) will be able to do this eventually. The file format used by the ZODB 3 FileStorage will also support very large files on systems with large file support (as described in http://www.python.org/doc/current/lib/posix-large-files.html#l2h-1441). So, perhaps your system will let you create large files split over multiple drives, in which case, you'll be able to use large FileStorage-based databases.
If so, how will such a large dataset affect packing and the creation of the backup file?
If you are using an OS with large file support, then this should not be a problem, as long as you have enough disk space. If you use multiple databases (when Zope supports them) you'll have to pack them individually.
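[Editor's note: for readers unfamiliar with packing — a ZODB FileStorage is append-only, so every change adds a new revision of an object, and packing rewrites the file keeping only the current revisions. A toy sketch of the idea (this is not the real FileStorage format; the names and structure are purely illustrative):]

```python
# Toy append-only storage: a list of (oid, data) records, where later
# records supersede earlier ones. "Packing" rewrites the storage,
# keeping only the most recent revision of each object.

def pack(records):
    """Return only the latest record per oid, in first-seen oid order."""
    latest = {}
    for oid, data in records:   # later writes overwrite earlier revisions
        latest[oid] = data
    return list(latest.items())

history = [("obj1", "v1"), ("obj2", "v1"), ("obj1", "v2"), ("obj1", "v3")]
packed = pack(history)
print(packed)  # [('obj1', 'v3'), ('obj2', 'v1')]
```

This is why a database with heavy churn can shrink dramatically on pack, and why each independent storage file has to be packed on its own, as Jim notes.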
Will I need to use multiple other database backends such as MySQL or possibly MetaKit?
For what?

Jim

--
Jim Fulton    mailto:jim@digicool.com    Python Powered!
Technical Director    (888) 344-4332    http://www.python.org
Digital Creations    http://www.digicool.com    http://www.zope.org

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) this email address may not be added to any commercial mail list without my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats.
At 10:48 AM 5/24/99 -0400, Jim Fulton wrote:
Jimmie Houchin wrote:
I am working through the design of my website as I study and learn Zope.
I've started to work on calculating storage requirements to determine server needs as I am also working on building my server.
My app can easily be divided into multiple datasets or databases. One of the datasets I am looking at has a potential of 4+ million objects, with each object requiring 15-50 KB minimum. This dataset can be subdivided. Initially I will not populate the database with all of the items in their full form, but will populate it as requests for data come in. However, I need to design as if, and plan for, complete population.
This makes for a very large database and one that spans more than one hard drive.
Can Zope create and use multiple ZODBs on multiple hard drives?
ZODB 3 (in Zope 2) will be able to do this eventually.
Will this be available in Zope 2.0 final, or in a subsequent 2.x release?
The file format used by the ZODB 3 FileStorage will also support very large files on systems with large file support (as described in http://www.python.org/doc/current/lib/posix-large-files.html#l2h-1441).
So, perhaps your system will let you create large files split over multiple drives, in which case, you'll be able to use large FileStorage-based databases.
Will this be available in Zope 2.0 final or even in the betas, or in a subsequent 2.x release?

At the above URL it mentions these OSes/systems as being capable: AIX, HP-UX, Irix and Solaris. Do you know if Red Hat Linux running on an AlphaServer such as a DS10 would be capable?
If so, how will such a large dataset affect packing and the creation of the backup file?
If you are using an OS with large file support, then this should not be a problem, as long as you have enough disk space. If you use multiple databases (when Zope supports them) you'll have to pack them individually.
Will I need to use multiple other database backends such as MySQL or possibly MetaKit?
For what?
Maybe I wasn't clear here. If Zope didn't do multiple ZODBs, then I could resort to using MySQL with multiple databases. Not my preference. It kind of looks like I meant multiple MySQLs or MetaKits. I would like to stay within the Z System as much as possible. My Zope Zen may have much growing to do, but I'm sold on Zope. :) Y'all are doing great work. Thanks. Jimmie Houchin
Jim
-- Jim Fulton mailto:jim@digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org
Jimmie Houchin wrote:
At 10:48 AM 5/24/99 -0400, Jim Fulton wrote:
Jimmie Houchin wrote:
I am working through the design of my website as I study and learn Zope.
I've started to work on calculating storage requirements to determine server needs as I am also working on building my server.
My app can easily be divided into multiple datasets or databases. One of the datasets I am looking at has a potential of 4+ million objects, with each object requiring 15-50 KB minimum. This dataset can be subdivided. Initially I will not populate the database with all of the items in their full form, but will populate it as requests for data come in. However, I need to design as if, and plan for, complete population.
This makes for a very large database and one that spans more than one hard drive.
Can Zope create and use multiple ZODBs on multiple hard drives?
ZODB 3 (in Zope 2) will be able to do this eventually.
Will this be available in Zope 2.0 final, or in a subsequent 2.x release?
Probably in a subsequent release.
The file format used by the ZODB 3 FileStorage will also support very large files on systems with large file support (as described in http://www.python.org/doc/current/lib/posix-large-files.html#l2h-1441).
So, perhaps your system will let you create large files split over multiple drives, in which case, you'll be able to use large FileStorage-based databases.
Will this be available in Zope 2.0 final or even in the betas, or in a subsequent 2.x release?
It's available now. (Zope 2.0 alpha 1.)
At the above URL it mentions these OSes/systems as being capable: AIX, HP-UX, Irix and Solaris.
Do you know if Red Hat Linux running on an AlphaServer such as a DS10 would be capable?
I have no idea.
If so, how will such a large dataset affect packing and the creation of the backup file?
If you are using an OS with large file support, then this should not be a problem, as long as you have enough disk space. If you use multiple databases (when Zope supports them) you'll have to pack them individually.
Will I need to use multiple other database backends such as MySQL or possibly MetaKit?
For what?
Maybe I wasn't clear here. If Zope didn't do multiple ZODBs, then I could resort to using MySQL with multiple databases. Not my preference. It kind of looks like I meant multiple MySQLs or MetaKits.
Hm. You could also try to extend the ZODB 3 FileStorage to split a storage over multiple files. This *might* not be that hard. Maybe a MultiFileStorage. I think this would be pretty worthwhile, and I would be willing to give a lot of advice to make it happen.
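[Editor's note: a minimal sketch of the splitting idea behind the "MultiFileStorage" Jim proposes. The class name and scheme here are hypothetical, not part of ZODB: a global append offset is mapped to (segment file, local offset), so one logical storage can span several files, each of which could live on a different drive. The real ZODB storage interface involves much more, e.g. transactions and an oid index.]

```python
import os
import tempfile

SEGMENT_SIZE = 1024  # bytes per segment file (tiny, for illustration)

class MultiFileWriter:
    """Append data across multiple fixed-size segment files, so one
    logical storage can span several disks or partitions."""

    def __init__(self, directory):
        self.directory = directory
        self.segments = []   # open segment file objects
        self.offset = 0      # global logical append offset

    def _segment_for(self, offset):
        """Map a global offset to (segment file, local offset),
        creating new segment files on demand."""
        index = offset // SEGMENT_SIZE
        while index >= len(self.segments):
            path = os.path.join(self.directory,
                                f"data.fs.{len(self.segments)}")
            self.segments.append(open(path, "wb"))
        return self.segments[index], offset % SEGMENT_SIZE

    def append(self, data):
        """Append data, spilling across segment boundaries as needed.
        Returns the global offset where the data starts."""
        start = self.offset
        remaining = data
        while remaining:
            seg, local = self._segment_for(self.offset)
            chunk = remaining[:SEGMENT_SIZE - local]
            seg.seek(local)
            seg.write(chunk)
            self.offset += len(chunk)
            remaining = remaining[len(chunk):]
        return start

    def close(self):
        for seg in self.segments:
            seg.close()

# A 1500-byte write spills from data.fs.0 into data.fs.1.
d = tempfile.mkdtemp()
w = MultiFileWriter(d)
offset = w.append(b"x" * 1500)
w.close()
```

With a realistic segment size (say, just under the 2 GB ext2 limit), each segment stays within what the filesystem allows while the logical storage grows without bound.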
I would like to stay within the Z System as much as possible.
You should. :)
My Zope Zen may have much growing to do, but I'm sold on Zope. :)
Yee ha! :)

Jim

--
Jim Fulton    mailto:jim@digicool.com    Python Powered!
Technical Director    (888) 344-4332    http://www.python.org
Digital Creations    http://www.digicool.com    http://www.zope.org
At 12:25 PM 5/24/99 -0400, Jim Fulton wrote:
Jimmie Houchin wrote:
[snip]
Hm. You could also try to extend the ZODB 3 FileStorage to split a storage over multiple files. This *might* not be that hard. Maybe a MultiFileStorage. I think this would be pretty worthwhile and would be willing to give alot of advice to make it happen.
This is an interesting thought. I'll have to up my ZODB Zen substantially. Not a bad thing. This could be beneficial for large databases regardless of the OSes capacity for very large files or hard disk size. I might have to get CVS going on my system so I can look at this. :)
My Zope Zen may have much growing to do, but I'm sold on Zope. :)
Yee ha! :)
Golly, y'all say 'yee ha!' in Virginia too! :) Thanks. Jimmie Houchin [snip]
Jim
-- Jim Fulton mailto:jim@digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org
In article <3.0.5.32.19990524102307.00822100@texoma.net>, Jimmie Houchin <jhouchin@texoma.net> wrote:
At 10:48 AM 5/24/99 -0400, Jim Fulton wrote:
The file format used by the ZODB 3 FileStorage will also support very large files on systems with large file support (as described in http://www.python.org/doc/current/lib/posix-large-files.html#l2h-1441).
At the above URL it mentions these OSes/systems as being capable: AIX, HPUX, Irix and Solaris
Do you know if RedHat Linux running on an AlphaServer such as DS10 would be capable?
If you're looking for an open source OS for a server with bigtime storage requirements and want to avoid silly limits like 2 GB files and 127 MB swap partitions (though maybe that one was fixed in 2.2?), you might want to look at NetBSD. NetBSD has had 64-bit off_t's since 1.0 (1994) on both 32- and 64-bit platforms. I know of a NetBSD box at NASA Ames with a ~155 GB filesystem. I think they had one at ~600 GB for a while, before splitting it up into multiple smaller ones for testing. You can store a single 155 GB file on it, if you want (though at the 30 MB/s that the storage it's on gets, it would take you an hour and a half to write it!). That doesn't even hit the triple-indirect-block code (at ~256 TB), which has been tested (with sparse files) to work fine. While I was checking the details on this, someone else chimed in that he'd played with 18 GB files on his machine, and someone made a 21 GB file on the spot just for fun :-) The other BSDs should at least theoretically support this as well, but I know it's actually been used on NetBSD.
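[Editor's note: the sparse-file testing Ty mentions is easy to try yourself. On Unix filesystems that support holes, seeking past end-of-file and writing one byte creates a file whose logical size far exceeds the blocks actually allocated, so you can probe a filesystem's size limits without the disk space. A small sketch:]

```python
import os
import tempfile

# Create a ~3 GB sparse file: logical size past the 32-bit 2 GB limit,
# but almost no disk blocks allocated (on filesystems with hole support).
path = os.path.join(tempfile.mkdtemp(), "sparse.dat")
with open(path, "wb") as f:
    f.seek(3 * 1024**3)   # seek ~3 GB past the start of the file
    f.write(b"\0")        # one real byte at the end

logical = os.path.getsize(path)          # apparent size: ~3 GB
blocks = os.stat(path).st_blocks * 512   # bytes actually allocated
print(logical, blocks)
```

On a 32-bit system without large file support (the ext2 situation discussed in this thread), the seek itself fails once it crosses 2 GB, which makes this a quick way to check whether a given OS/filesystem combination can hold a large FileStorage.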
At 08:04 PM 5/24/99 GMT, Ty Sarna wrote:
In article <3.0.5.32.19990524102307.00822100@texoma.net>, Jimmie Houchin <jhouchin@texoma.net> wrote:
At 10:48 AM 5/24/99 -0400, Jim Fulton wrote:
The file format used by the ZODB 3 FileStorage will also support very large files on systems with large file support (as described in http://www.python.org/doc/current/lib/posix-large-files.html#l2h-1441).
At the above URL it mentions these OSes/systems as being capable: AIX, HPUX, Irix and Solaris
Do you know if RedHat Linux running on an AlphaServer such as DS10 would be capable?
If you're looking for an open source OS for a server with bigtime storage requirements and want to avoid silly limits like 2 GB files and 127 MB swap partitions (though maybe that one was fixed in 2.2?), you might want to look at NetBSD.
NetBSD has had 64-bit off_t's since 1.0 (1994) on both 32- and 64-bit platforms. I know of a NetBSD box at NASA Ames with a ~155 GB filesystem. I think they had one at ~600 GB for a while, before splitting it up into multiple smaller ones for testing. You can store a single 155 GB file on it, if you want (though at the 30 MB/s that the storage it's on gets, it would take you an hour and a half to write it!). That doesn't even hit the triple-indirect-block code (at ~256 TB), which has been tested (with sparse files) to work fine.
While I was checking the details on this, someone else chimed in that he'd played with 18GB files on his machine, and someone made a 21G file on the spot just for fun :-)
The other BSDs should at least theoretically support this as well, but I know it's actually been used on NetBSD.
Thanks, that's great information. I was really hoping to build my server out of reasonably common hardware so as to keep costs down. I also was hoping to avoid proprietary hardware and software. I will look into the various BSDs. BSD is not nearly as well supported on the Mac outside of OS X. I use the Mac, LinuxPPC, for development. It would be nice if Linux would patch this tho'. Jimmie Houchin
participants (5)
- Howard Clinton Shaw III
- Jim Fulton
- Jimmie Houchin
- Riedel
- tsarna@endicor.com