Re: [Zope] need advice on mass data processing
Sorry Andreas..

My product is rewritten in Python and placed in the product folder. So yes, it is an instance of a class.

In the future, I will be storing more data in one instance from other dBase files (the total number of columns may vary).

I'm still in the development stage, and this is just a test run to find out how much processing time I'm looking at. I want each entry to be an instance because I'm planning to create other interactive functions (edit, query (ZCatalog maybe, I'm not sure yet), and more) for particular info.

I've never thought of using a BTree because I don't know enough about it. I'll look into it, but will a BTree still be a better choice than making instances if I'm going to make interactive functions?

Any other suggestions?

----- Original Message ----
From: Andreas Jung <lists@zopyx.com>
To: Allen Huang <swapp0@yahoo.com>; Zope <zope@zope.org>
Sent: Tuesday, January 9, 2007 1:55:34 PM
Subject: Re: [Zope] need advice on mass data processing

--On 8 January 2007 19:28:32 -0800 Allen Huang <swapp0@yahoo.com> wrote:
I have a data file that has over 110000 entries of 3-column data (string, float, float). Currently I have written my program so that it does entry-by-entry processing with Zope. The operation goes like this:
1. read data (the data file)
2. create product (a Python product that stores three fields: one string and two floats)
3. update product (update the three field entries)
Please name things the right way. A "Product" is basically a Zope/Python package that contains definitions of classes, scripts, templates etc. You mean instances of a particular class?
When I first tried it out with the first 1000 entries it took about 30 seconds. That means it's going to take 50 ~ 60 minutes for 110000 entries.
You're creating 110k instances for storing a string and two floats? If yes, that's a stupid idea. You can persist large amounts of data within a single instance by using Zope BTrees.
It's not every day that you have to process over 110000 data entries, but processing for over 60 minutes is still kind of long.
What kind of processing? -aj
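The suggestion above to keep all rows inside a single instance backed by a BTree could look roughly like the following. This is only a minimal sketch: the comma-separated line format, the `load_rows` helper and the `station-...` keys are made up for illustration, and the real data file in this thread is a dBase file. If the `BTrees` package is not installed, the sketch falls back to a plain `dict` so that it still runs, in which case nothing about ZODB storage applies.

```python
# Sketch only: one mapping holds all rows instead of 110k separate objects.
try:
    from BTrees.OOBTree import OOBTree  # ships with ZODB / Zope
except ImportError:
    OOBTree = dict  # assumption: fallback so the sketch runs without ZODB


def load_rows(lines):
    """Parse hypothetical 'name,x,y' lines into a mapping name -> (x, y)."""
    table = OOBTree()
    for line in lines:
        name, x, y = line.strip().split(",")
        # Store the bare tuple; no per-row object is created.
        table[name] = (float(x), float(y))
    return table


# Generated stand-in data, not the real file from the thread.
sample = ["station-%05d,%.1f,%.1f" % (i, i * 0.5, i * 2.0) for i in range(1000)]
table = load_rows(sample)
```

In a real Zope setup the mapping would be an attribute of one persistent object, so the whole load would typically happen inside a single transaction rather than one per row.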
--On 9 January 2007 01:05:54 -0800 Allen Huang <swapp0@yahoo.com> wrote:
Sorry Andreas..
My product is rewritten in Python and placed in the product folder. So yes, it is an instance of a class.
In the future, I will be storing more data in one instance from other dBase files (the total number of columns may vary).
I'm still in the development stage, and this is just a test run to find out how much processing time I'm looking at. I want each entry to be an instance because I'm planning to create other interactive functions (edit, query (ZCatalog maybe, I'm not sure yet), and more) for particular info.
I've never thought of using a BTree because I don't know enough about it. I'll look into it, but will a BTree still be a better choice than making instances if I'm going to make interactive functions?
Sorry, but you are misusing Zope. Put your data into a RDBMS and be the happiest man in the world. The ZODB is not a data toilet. Nothing more to add from my side on this particular issue. -aj
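For comparison, loading the same kind of data into a relational table might be sketched as below. This uses Python's built-in `sqlite3` module purely as a stand-in for whatever RDBMS would actually be chosen; the table name `entries`, the column names and the generated rows are all hypothetical.

```python
import sqlite3

# Hypothetical schema: one text column and two float columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (name TEXT PRIMARY KEY, x REAL, y REAL)")

# Generated stand-in data with the same shape as the 110000-entry file.
rows = [("entry-%06d" % i, i * 0.5, i * 2.0) for i in range(110000)]

# One transaction and one executemany() call instead of per-row commits.
with conn:
    conn.executemany("INSERT INTO entries (name, x, y) VALUES (?, ?, ?)", rows)

count = conn.execute("SELECT COUNT(*) FROM entries").fetchone()[0]
row = conn.execute(
    "SELECT x, y FROM entries WHERE name = ?", ("entry-000010",)
).fetchone()
```

Batching the inserts into a single transaction is the main design choice here; committing after every row is usually what makes bulk loads slow.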
+-------[ Andreas Jung ]----------------------
|
| Sorry, but you are misusing Zope. Put your data into a RDBMS and be the
| happiest man in the world. The ZODB is not a data toilet. Nothing more
| to add from my side on this particular issue.

I concur.

--
Andrew Milton
akm@theinternet.com.au
----- Original Message -----
From: "Andrew Milton" <akm@theinternet.com.au>
To: "Andreas Jung" <lists@zopyx.com>
Cc: "Zope" <zope@zope.org>
Sent: Tuesday, January 09, 2007 4:21 AM
Subject: Re: [Zope] need advice on mass data processing

+-------[ Andreas Jung ]----------------------
|
| Sorry, but you are misusing Zope. Put your data into a RDBMS and be the
| happiest man in the world. The ZODB is not a data toilet. Nothing more
| to add from my side on this particular issue.
I concur.
One caveat though... if you want to be able to do a text search on the string portion of your data, then ZCatalog would be an appropriate tool. Jonathan
--On 9 January 2007 08:21:18 -0500 Jonathan <dev101@magma.ca> wrote:
One caveat though... if you want to be able to do a text search on the string portion of your data, then ZCatalog would be an appropriate tool.
Almost all RDBMSes meanwhile provide some kind of full-text support out of the box - even MySQL.

Andreas
Aloha,

Sounds like your data case/data model is better suited to a SQL database (relational DB instead of object DB).

Zope works with MySQL, PostgreSQL etc. via ZSQLMethods. Look into using those, with your existing data going into a SQL DB (or maybe it is already?), if you want/need to access and manipulate the data from within Zope applications. Then you also have the option of accessing that data via any app/platform that can talk to your SQL DB.

I have a project I am working on where a portion of the data is very RDB-type data, while the rest is more Zope/object content-management stuff. So the RDB-type data will be in a MySQL DB and accessible through the same Zope-based platform that also provides the CMS for document/object management. Meanwhile, if other stakeholders now or later want access to the RDB-type data, they don't have to use Zope; they can use whatever they want that will talk to MySQL.

cheers,
John S.

Allen Huang wrote:
Sorry Andreas..
My product is rewritten in Python and placed in the product folder. So yes, it is an instance of a class.
In the future, I will be storing more data in one instance from other dBase files (the total number of columns may vary).
I'm still in the development stage, and this is just a test run to find out how much processing time I'm looking at. I want each entry to be an instance because I'm planning to create other interactive functions (edit, query (ZCatalog maybe, I'm not sure yet), and more) for particular info.
I've never thought of using a BTree because I don't know enough about it. I'll look into it, but will a BTree still be a better choice than making instances if I'm going to make interactive functions?
Any other suggestions?
_______________________________________________ Zope maillist - Zope@zope.org http://mail.zope.org/mailman/listinfo/zope ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope-dev )
--
John Schinnerer - MA, Whole Systems Design
------------------------------------------
- Eco-Living -
Whole Systems Design Services
People - Place - Learning - Integration
john@eco-living.net
http://eco-living.net
Allen Huang wrote at 2007-1-9 01:05 -0800:
... I've never thought of using a BTree because I don't know enough about it. I'll look into it, but will a BTree still be a better choice than making instances if I'm going to make interactive functions?
BTrees provide an efficient means to store mappings

    key --> anything

In another message, I strongly recommended that you use a "BTreeFolder2" (rather than a standard "Folder") to contain your mass data. As the name suggests, it internally uses a "BTree" (in fact, two of them).

If you have huge amounts of simple data, then it can be wise to store the bare data and wrap it with an intelligent class only when accessed. In this case, the most efficient way would be for "anything" above to be a tuple of just your data. Instead of a raw BTree, you would use your own container type that uses a "BTree" to store the mass data but wraps the data with your intelligent wrapper class on access.

You can look at "BTreeFolder2" to see how this may work. It does not wrap the data in a new (intelligent) class (which is trivial), but it wraps it in the acquisition context.

--
Dieter
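The "store bare tuples, wrap on access" idea described above might be sketched as follows. The class names `Entry` and `EntryContainer` and their methods are hypothetical and are not taken from BTreeFolder2. In a real Zope product the container would presumably also subclass `Persistent` and take part in acquisition, which is left out here. As an assumption to keep the sketch runnable anywhere, it falls back to a plain `dict` when the `BTrees` package is missing.

```python
try:
    from BTrees.OOBTree import OOBTree  # ships with ZODB / Zope
except ImportError:
    OOBTree = dict  # assumption: fallback so the sketch runs without ZODB


class Entry(object):
    """Hypothetical 'intelligent' wrapper, created only when a row is accessed."""

    def __init__(self, key, x, y):
        self.key = key
        self.x = x
        self.y = y

    def total(self):
        # Stand-in for whatever interactive behaviour an entry needs.
        return self.x + self.y


class EntryContainer(object):
    """Stores bare (x, y) tuples; hands out Entry wrappers on lookup."""

    def __init__(self):
        self._data = OOBTree()

    def add(self, key, x, y):
        self._data[key] = (x, y)

    def update(self, entry):
        # Wrappers are throwaway objects, so edits are written back explicitly.
        self._data[entry.key] = (entry.x, entry.y)

    def __getitem__(self, key):
        x, y = self._data[key]
        return Entry(key, x, y)

    def __len__(self):
        return len(self._data)


container = EntryContainer()
container.add("a", 1.5, 2.5)
entry = container["a"]
entry.x = 10.0
container.update(entry)
```

Because each wrapper is built fresh on lookup, only the tuples are stored, and the wrapper class can grow new methods without touching the stored data.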
participants (6)
- Allen Huang
- Andreas Jung
- Andrew Milton
- Dieter Maurer
- John Schinnerer
- Jonathan