any limits on object number?
We are building a large portal using Zope. We need to create a large number of objects. The data component of each object is small, but each object carries a lot of metadata. My question is: is there any limit on the number of objects in a given folder? I am not planning to use any external RDBMS. Are there any known performance issues when the number of objects increases, particularly when we store them in the same folder? Thanks in advance. Nagarjuna
On Wed, 14 Jul 2004 13:46:57 +0530 "Nagarjuna G." <nagarjun@gnowledge.org> wrote:
We are building a large portal using Zope. We need to create a large number of objects. The data component of each object is small, but each object carries a lot of metadata. My question is: is there any limit on the number of objects in a given folder? I am not planning to use any external RDBMS. Are there any known performance issues when the number of objects increases, particularly when we store them in the same folder?
Normal Zope folders should probably not be used to hold more than a few dozen items. They store the list of their children in a single ZODB record, so as the number of children increases, so does the size of every transaction that changes the folder. Normal Zope folders also do not handle concurrent updates, and will thus not perform well when multiple users are adding items to the same folder.

Shane Hathaway's BTreeFolder2 product solves these problems. It is the thing to use when you want to store large numbers of objects in a single folder. It also handles concurrency much better.

hth,

-Casey
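To make the point about transaction size concrete, here is a rough sketch in plain Python. It is a toy model, not ZODB or BTreeFolder2 code: the class names are made up, pickle stands in for the stored record, and the buckets are chosen by a CRC of the id, whereas real BTree buckets are split by key order. It only illustrates why rewriting one big record costs more per change than rewriting one small bucket.

```python
import pickle
import zlib


class SingleRecordFolder:
    """Toy model: all children live in one mapping, i.e. one stored record."""

    def __init__(self):
        self.children = {}

    def add(self, object_id, value):
        self.children[object_id] = value
        # The whole mapping is re-serialized on every change.
        return len(pickle.dumps(self.children))


class BucketedFolder:
    """Toy model: children are spread over many small records (buckets)."""

    def __init__(self, bucket_count=64):
        self.buckets = [{} for _ in range(bucket_count)]

    def add(self, object_id, value):
        index = zlib.crc32(object_id.encode("utf-8")) % len(self.buckets)
        bucket = self.buckets[index]
        bucket[object_id] = value
        # Only the touched bucket is re-serialized.
        return len(pickle.dumps(bucket))


def bytes_written_for_last_add(folder, count):
    """Add `count` children and return the bytes written by the final add."""
    written = 0
    for i in range(count):
        written = folder.add("obj%06d" % i, {"title": "item %d" % i})
    return written


if __name__ == "__main__":
    for count in (10, 100, 1000):
        single = bytes_written_for_last_add(SingleRecordFolder(), count)
        bucketed = bytes_written_for_last_add(BucketedFolder(), count)
        print(count, single, bucketed)
```

In this model the bytes written per add grow with the folder size for the single-record case, while the bucketed case stays roughly proportional to the size of one bucket.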
On Wed, 2004-07-14 at 18:55, Casey Duncan wrote:
Normal Zope folders should probably not be used to hold more than a few dozen items. They store the list of their children in a single ZODB record, so as the number of children increases, so does the size of every transaction that changes the folder. Normal Zope folders also do not handle concurrent updates, and will thus not perform well when multiple users are adding items to the same folder.
Shane Hathaway's BTreeFolder2 product solves these problems. It is the thing to use when you want to store large numbers of objects in a single folder. It also handles concurrency much better.
hth,
Thanks for all the responses. I will test with BTreeFolder2, and also with subfolders named after the first character of each id, so that working with the ZMI will still be possible without further scripting or customized views. I will report the results back to the list. Nagarjuna
Hello all,
Just some additional info: I ran into this problem recently...

I've got around 40k objects to insert into the ZODB.

I started doing some tests with normal Folders, but I discarded them completely in favor of BTreeFolders. I was told that BTreeFolder had already been tested with more than 400k objects... and that made me happy! :-)

But (there's always a 'but'!)...

Even using a BTreeFolder to store all the objects, I was getting a >300s delay to show a single object (OK, it's an Archetypes-based one, containing >50 fields, split into 7 schemata, with lots of fancy stuff...). So I made a directory hash structure based on the UID of each object (an AT UID is an md5 hash, so we have a hex base).

Using this hash structure, with 16 divisions, I got better performance (but still far from acceptable): 'only' ~100s to show a single object... :-(

So I decided to create another level in the hash structure: now each folder has another 16 folders inside it. This time I was getting a delay of ~20s...

As you're already thinking... it's time for another hash level. Now, with 16^3 additional BTreeFolders to split all my objects, I got acceptable performance: ~3s...

The path is ugly as hell, f/f0/f0a/f0a5aac38aeff101b3168f2592dd879b, but at least the system is usable...

To sum up what I've learned: don't abuse BTreeFolder, hash your content and live happily ever after... ;-)

PS: The server is a modest PIII 1.2GHz with 1GB RAM and a 160MB/s SCSI controller, running only one instance of Zope 2.7.1 without ZEO.

HTH,

--
Dorneles Treméa
Caxias do Sul - RS - Brasil
X3ng Web Technology <http://www.x3ng.com.br>
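The three-level layout described in the message above can be sketched as follows. This is only an illustration of the path scheme (f/f0/f0a/<uid>), not the code actually used; the function names are made up for this example, and it assumes the UID is a hex string such as an md5 digest.

```python
def hashed_path(uid, levels=3):
    """Return a 'f/f0/f0a/<uid>' style path for a hex UID.

    Each level uses one more leading character of the UID than the
    previous one, so level k can hold up to 16**k folders.
    """
    if len(uid) < levels:
        raise ValueError("uid is shorter than the number of levels")
    prefixes = [uid[:depth] for depth in range(1, levels + 1)]
    return "/".join(prefixes + [uid])


def average_objects_per_folder(total_objects, levels):
    """Average number of objects per deepest-level folder, assuming an even spread."""
    return total_objects / (16 ** levels)


if __name__ == "__main__":
    print(hashed_path("f0a5aac38aeff101b3168f2592dd879b"))
    print(average_objects_per_folder(40000, 3))
```

With three levels there are 16^3 = 4096 leaf folders, so roughly 40k objects come to about ten per folder if the UIDs are evenly distributed.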
On Mon, 2004-07-19 at 11:38, Dorneles Treméa wrote:
To sum up what I've learned: don't abuse BTreeFolder, hash your content and live happily ever after... ;-)
PS: The server is a modest PIII 1.2GHz, 1GB RAM with a 160MB/s SCSI controller running only one instance of Zope 2.7.1 without ZEO.
Thanks a lot. Your report that hashing improves performance supports the decision we took. In our case we have already begun using BTreeFolder2. We haven't tried creating the large object base yet (we will this week), but we are doing something like what you did: we are creating alphabetized subfolders keyed on the first character of each id. Are there any good, popular hashing algorithms? With such an algorithm, further subfoldering could be made automatic once the number of objects in a folder passes a certain threshold. This way large, scalable databases could be created in Zope. Nagarjuna
Hi Nagarjuna,
Well, a hash is a... hash! :-) You can use any deterministic function to populate your buckets... Just choose one that fits your needs.

The current code that I'm using is available [1]; just keep in mind that it was written after 2 nights of bad sleep... ;-)

[1] http://cvs.x3ng.com.br/cgi-bin/viewcvs.cgi/Recria/utils.py?rev=HEAD&view=aut...

Regards,

--
Dorneles Treméa
Caxias do Sul - RS - Brasil
X3ng Web Technology <http://www.x3ng.com.br>
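As one possible reading of "any deterministic function", the sketch below buckets arbitrary ids by the leading hex digits of their md5 digest and compares that with the first-character scheme mentioned earlier in the thread. This is not the code behind link [1]; the function names and the sample ids are assumptions made for illustration.

```python
import hashlib
from collections import Counter


def bucket_for(object_id, width=1):
    """Return the first `width` hex digits of the md5 digest of the id."""
    digest = hashlib.md5(object_id.encode("utf-8")).hexdigest()
    return digest[:width]


def first_char_bucket(object_id):
    """Bucket by the first character of the id (the alphabetized scheme)."""
    return object_id[0].lower()


def distribution(ids, bucket_func):
    """Count how many ids fall into each bucket."""
    return Counter(bucket_func(object_id) for object_id in ids)


if __name__ == "__main__":
    # Hypothetical ids that all share a prefix, as generated ids often do.
    ids = ["page%05d" % i for i in range(10000)]
    print(distribution(ids, first_char_bucket))
    print(sorted(distribution(ids, bucket_for).items()))
```

When ids share a common prefix, the first-character scheme puts everything into one subfolder, while the digest-based buckets stay close to evenly filled. Widening `width` gives more buckets per level, which is one way to grow the structure as the object count increases.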
Even using a BTreeFolder to store all the objects, I was getting a >300s delay to show a single object (OK, it's an Archetypes-based one, containing >50 fields, split into 7 schemata, with lots of fancy stuff...). So I made a directory hash structure based on the UID of each object (an AT UID is an md5 hash, so we have a hex base).
My own experience with BTreeFolder2 does not support your diagnosis. I do not believe BTreeFolder2 is the problem here.

On a CMF-based CMS that I helped develop, the largest BTreeFolder2-based containers held ca. 60,000 items at the top level, meaning we did not use a directory structure to restrict the number of items per folder. There was zero delay retrieving single items, and even stepping into the ZMI, where it shows 1000 ids at a time, gave sub-2-second response times.

jens
I very much doubt you solved the problem you think you solved. Access time to a single object in the hundreds of seconds is not a BTreeFolder problem. BTreeFolder is designed not to be a bottleneck for concurrent access or for large numbers of objects.

You should have benchmarked (using ZopeProfiler for instance) to find out where the time is really spent. Maybe some of your/AT's code does a stupid loop on folder.objectIds() or something.

Florent

--
Florent Guillaume, Nuxeo (Paris, France)
+33 1 40 33 79 87 http://nuxeo.com mailto:fg@nuxeo.com
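The kind of check suggested above can be sketched with the standard library's cProfile as a stand-in for ZopeProfiler. Everything here is hypothetical: FakeFolder only mimics the shape of a folder with an objectIds() method and is not the Zope API, and slow_view is an invented example of the "loop over all ids" pattern, not code from Archetypes.

```python
import cProfile
import io
import pstats


class FakeFolder:
    """Hypothetical stand-in for a folder; not the Zope API."""

    def __init__(self, count):
        self._items = {"obj%06d" % i: i for i in range(count)}

    def objectIds(self):
        # Builds the full list of ids on every call.
        return list(self._items)

    def get(self, object_id):
        return self._items.get(object_id)


def slow_view(folder, wanted):
    """Anti-pattern: scan every id to find one object."""
    for object_id in folder.objectIds():
        if object_id == wanted:
            return folder.get(object_id)
    return None


def fast_view(folder, wanted):
    """Direct lookup by id."""
    return folder.get(wanted)


def profile(func, *args):
    """Run func under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(5)
    return result, out.getvalue()


if __name__ == "__main__":
    folder = FakeFolder(100000)
    _, report = profile(slow_view, folder, "obj099999")
    print(report)
```

If a report like this shows objectIds() (or a similar full scan) near the top of the cumulative times, the cost comes from the calling code and not from the container, which is the distinction being drawn here.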
Hi All,

Not sure if this is quite the correct list to post to, but I'm having trouble with Hotfix20040714. We're running Zope 2.7, Plone 2.0.3, and ZWiki 0.32.0.

The specific error appears to be the same one that cropped up with the previous hotfix (20040713):

METALError: macro "python:here.wikipage_macros().macros['quickaccesskeys']" has incompatible version '1.4', at line 11, column 5 (Also, an error occurred while attempting to render the standard error message.)

Thanks for the help!

Cliff

The complete traceback follows:

Site Error

An error was encountered while publishing this resource.

METALError

Sorry, a site error occurred.

Traceback (innermost last):

* Module ZPublisher.Publish, line 163, in publish_module_standard
* Module Products.PlacelessTranslationService.PatchStringIO, line 51, in new_publish
* Module ZPublisher.Publish, line 127, in publish
* Module Zope.App.startup, line 203, in zpublisher_exception_hook
* Module ZPublisher.Publish, line 100, in publish
* Module ZPublisher.mapply, line 88, in mapply
* Module ZPublisher.Publish, line 40, in call_object
* Module Products.ZWiki.ZWikiPage, line 220, in __call__
* Module Products.ZWiki.ZWikiPage, line 233, in render
* Module Products.ZWiki.pagetypes.stx, line 94, in render
* Module Products.ZWiki.UI, line 209, in addSkinTo
* Module Shared.DC.Scripts.Bindings, line 306, in __call__
* Module Shared.DC.Scripts.Bindings, line 343, in _bindAndExec
* Module Products.CMFCore.FSPageTemplate, line 191, in _exec
* Module Products.CMFCore.FSPageTemplate, line 124, in pt_render
* Module Products.PageTemplates.PageTemplate, line 96, in pt_render
  <FSPageTemplate at /lab/wikipage used for /lab/wiki/FrontPage>
* Module TAL.TALInterpreter, line 189, in __call__
* Module TAL.TALInterpreter, line 233, in interpret
* Module TAL.TALInterpreter, line 663, in do_useMacro
* Module TAL.TALInterpreter, line 233, in interpret
* Module TAL.TALInterpreter, line 408, in do_optTag_tal
* Module TAL.TALInterpreter, line 393, in do_optTag
* Module TAL.TALInterpreter, line 388, in no_tag
* Module TAL.TALInterpreter, line 233, in interpret
* Module TAL.TALInterpreter, line 694, in do_defineSlot
* Module TAL.TALInterpreter, line 233, in interpret
* Module TAL.TALInterpreter, line 408, in do_optTag_tal
* Module TAL.TALInterpreter, line 393, in do_optTag
* Module TAL.TALInterpreter, line 388, in no_tag
* Module TAL.TALInterpreter, line 233, in interpret
* Module TAL.TALInterpreter, line 642, in do_defineMacro
* Module TAL.TALInterpreter, line 233, in interpret
* Module TAL.TALInterpreter, line 686, in do_defineSlot
* Module TAL.TALInterpreter, line 233, in interpret
* Module TAL.TALInterpreter, line 656, in do_useMacro

METALError: macro "python:here.wikipage_macros().macros['quickaccesskeys']" has incompatible version '1.4', at line 11, column 5 (Also, an error occurred while attempting to render the standard error message.)

Troubleshooting Suggestions

* The URL may be incorrect.
* The parameters passed to this resource may be incorrect.
* A resource that this resource relies on may be encountering an error.

For more detailed information about the error, please refer to error log.

If the error persists please contact the site maintainer. Thank you for your patience.

On Mon, 19 Jul 2004, Florent Guillaume wrote:
I very much doubt you solved the problem you think you solved. Access time to a single object in the hundreds of seconds is not a BTreeFolder problem. BTreeFolder is designed not to be a bottleneck for concurrent access or for large numbers of objects.
You should have benchmarked (using ZopeProfiler for instance) to find out where the time is really spent. Maybe some of your/AT's code does a stupid loop on folder.objectIds() or something.
Florent
--
Florent Guillaume, Nuxeo (Paris, France)
+33 1 40 33 79 87 http://nuxeo.com mailto:fg@nuxeo.com
_______________________________________________
Zope-Dev maillist - Zope-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zope-dev
** No cross posts or HTML encoding! **
(Related lists -
http://mail.zope.org/mailman/listinfo/zope-announce
http://mail.zope.org/mailman/listinfo/zope )
On Mon, 19 Jul 2004 09:48:00 -0700 (PDT), C. Olmsted <cliffo@u.washington.edu> wrote:
Not sure if this is quite the correct list to post to, but I'm having trouble with Hotfix20040714. We're running zope 2.7, plone 2.0.3, and zwiki 0.32.0.
Can you test with the 2.7.2 release candidate? This sounds very similar, but all occurrences of TAL_VERSION in the Zope 2.7.0 and 2.7.1 sources are corrected in the 20040714 hotfix.

The only ways I can think of to trigger this problem with the hotfix installed are to import or compile templates containing macros before the hotfix is loaded during product initialization (reasonably possible), or for someone else to import TAL_VERSION before the hotfix is loaded and generate TAL bytecode themselves (highly unlikely!).

It would be interesting to know if the macros are from a PageTemplateFile instance.

-Fred

--
Fred L. Drake, Jr. <fdrake at gmail.com>
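The load-order scenario described above can be illustrated with a toy model. None of this is TAL code: ToyCompiler, use_macro and apply_hotfix are invented names, and "1.5" is an assumed post-hotfix value chosen only so that it differs from the '1.4' in the error message. The point is just that a macro compiled before the version constant is patched keeps the old version stamped into it.

```python
class VersionMismatch(Exception):
    """Raised when a macro and the page using it disagree on version."""


class ToyCompiler:
    # Hypothetical stand-in for the TAL_VERSION constant.
    version = "1.4"

    @classmethod
    def compile(cls, name):
        # The version current at compile time is baked into the result.
        return {"name": name, "version": cls.version}


def use_macro(page, macro):
    """Refuse to combine compiled objects that carry different versions."""
    if macro["version"] != page["version"]:
        raise VersionMismatch(
            "macro %r has incompatible version %r"
            % (macro["name"], macro["version"])
        )
    return "%s uses %s" % (page["name"], macro["name"])


def apply_hotfix():
    # Stand-in for the hotfix patching the constant; "1.5" is an assumption.
    ToyCompiler.version = "1.5"


if __name__ == "__main__":
    early_macro = ToyCompiler.compile("quickaccesskeys")  # before the hotfix
    apply_hotfix()
    page = ToyCompiler.compile("wikipage")  # after the hotfix
    try:
        use_macro(page, early_macro)
    except VersionMismatch as exc:
        print("error:", exc)
```

In this model, recompiling the macro after the patch is applied removes the mismatch, which is why the order in which products are initialized matters.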
participants (7)
- C. Olmsted
- Casey Duncan
- Dorneles Treméa
- Florent Guillaume
- Fred Drake
- Jens Vagelpohl
- Nagarjuna G.