Re: [Zope-dev] Memory problems with ParsedXML ?
Godefroid Chapelle <gotcha@swing.be> writes:
Hi,
I have imported about 2500 XML files into ParsedXML objects using:
manage_addProduct['ParsedXML'].manage_addParsedXML(id, '', xmlString)
This works perfectly.
But when I try to browse the folder containing the ParsedXML instances, Zope uses so much memory that it took my machine down.
Have I done something wrong, or does this info help you debug the product?
Hm, it's been pointed out to me that the inefficiency of the get_size method of ParsedXML is compounded with many instances because the standard management interface uses it.

It's possible that some stuff isn't getting garbage collected until the transaction is done, and calling get_size on many ParsedXML instances in one transaction is the problem. I'll look into it.

I didn't think that anyone would want to see a management screen with 2500 ParsedXML instances - or 2500 instances of anything - at one time. Have you tried using a BTree folder?

Another thing to keep in mind is that the ParsedXML *product* is a DOM tree and a management interface around it. The management interface only gives you a convenient UI and a place to store a few bits that the DOM doesn't know about (content type, namespace usage, etc.). It's always more efficient to just use the DOM tree and related utilities (parsing, printing), although it is less convenient - you have to know a little bit more about what you're doing. Do you need each of those 2500 product instances? See the createDOMDocument method to create a DOM document without the management interface.

It's easy to wrap a ParsedXML product around an existing DOM tree - see the initFromDOMDocument method. Currently, you have to create the persistent ParsedXML product which wraps the DOM Document node, and install that product in a Zope ObjectManager (or subclass like Folder) somewhere, to avoid a few warts. Someday I'd like to be able to just instantiate a management wrapper when I want a management interface, and then just throw it away, without bothering to add it to a folder.

-- Karl Anderson karl@digicool.com
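To make the point about many instances concrete, here is a minimal sketch. It does not use ParsedXML at all: the standard library's xml.dom.minidom stands in for the ParsedXML DOM, and the helper names parse_document and serialized_size are hypothetical. How ParsedXML actually computes get_size is not shown in this thread; the sketch simply assumes a size that is obtained by serializing the whole tree, so that listing 2500 documents means 2500 full serializations in one request.

```python
from xml.dom import minidom  # stand-in for the ParsedXML DOM (assumption)


def parse_document(xml_string):
    """Parse XML text into a bare DOM Document, with no management wrapper."""
    return minidom.parseString(xml_string)


def serialized_size(document):
    """Hypothetical size helper: serializes the whole tree on every call."""
    return len(document.toxml())


# Build 2500 small documents, roughly the situation described above.
sources = ['<item n="%d"><title>doc %d</title></item>' % (i, i)
           for i in range(2500)]
docs = [parse_document(source) for source in sources]

# A folder listing that shows a size column would do this for every object.
total_size = sum(serialized_size(doc) for doc in docs)

# Working with the bare DOM tree: read one value straight from the document.
first_title = docs[0].getElementsByTagName("title")[0].firstChild.data
```

The cost of the listing grows with both the number of documents and the size of each one, which is why a folder that does not render every object at once can help.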
Karl Anderson wrote:
Hm, it's been pointed out to me that the inefficiency of the get_size method of ParsedXML is compounded with many instances because the standard management interface uses it.
It's possible that some stuff isn't getting garbage collected until the transaction is done, and calling get_size on many ParsedXML instances in one transaction is the problem. I'll look into it.
I didn't think that anyone would want to see a management screen with 2500 ParsedXML instances - or 2500 instances of anything - at one time. Have you tried using a BTree folder?
Another thing to keep in mind is that the ParsedXML *product* is a DOM tree and a management interface around it. The management interface only gives you a convenient UI and a place to store a few bits that the DOM doesn't know about (content type, namespace usage, etc.). It's always more efficient to just use the DOM tree and related utilities (parsing, printing), although it is less convenient - you have to know a little bit more about what you're doing. Do you need each of those 2500 product instances? See the createDOMDocument method to create a DOM document without the management interface.
It's easy to wrap a ParsedXML product around an existing DOM tree - see the initFromDOMDocument method. Currently, you have to create the persistent ParsedXML product which wraps the DOM Document node, and install that product in a Zope ObjectManager (or subclass like Folder) somewhere, to avoid a few warts. Someday I'd like to be able to just instantiate a management wrapper when I want a management interface, and then just throw it away, without bothering to add it to a folder.
-- Karl Anderson karl@digicool.com
Thanks for answering so quickly... I am still a newbie and will check out BTree Folder and the API you pointed me to.
--
Godefroid Chapelle
BubbleNet sprl
rue Victor Horta, 30
1348 Louvain-la-Neuve
Belgium
Tel 010 457490
Mob 0477 363942
TVA 467 093 008
RC Niv 49849
Karl Anderson wrote:
I didn't think that anyone would want to see a management screen with 2500 ParsedXML instances - or 2500 instances of anything - at one time. Have you tried using a BTree folder?
BTree Folder is working very nicely...
Another thing to keep in mind is that the ParsedXML *product* is a DOM tree and a management interface around it. The management interface only gives you a convenient UI and a place to store a few bits that the DOM doesn't know about (content type, namespace usage, etc.). It's always more efficient to just use the DOM tree and related utilities (parsing, printing), although it is less convenient - you have to know a little bit more about what you're doing. Do you need each of those 2500 product instances? See the createDOMDocument method to create a DOM document without the management interface.
I am trying to use createDOMDocument, calling it from an external method:

    from Products.ParsedXML import ParsedXML

    def accessXml(document):
        return ParsedXML.createDOMDocument(document)

When I use the returned document in a Python script, it seems that Zope security prevents me from accessing any DOM attributes. Am I doing something wrong, or am I forced to do everything through external methods?
-- Karl Anderson karl@digicool.com
Thanks
-- Godefroid Chapelle
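The question above is not answered in this thread. One possible workaround, offered only as a sketch, is to do all of the DOM traversal inside the unrestricted external method and hand plain strings or lists back to the Python script, so the restricted code never touches a DOM node. The example below uses the standard library's xml.dom.minidom as a stand-in for the ParsedXML DOM, and the function name titles_from_xml and the "title" tag are hypothetical.

```python
from xml.dom import minidom  # stand-in for the ParsedXML DOM (assumption)


def titles_from_xml(xml_string, tag_name="title"):
    """Traverse the DOM here and return only plain strings to the caller."""
    document = minidom.parseString(xml_string)
    texts = []
    for element in document.getElementsByTagName(tag_name):
        # Join the direct text children of each matching element.
        text = "".join(node.data for node in element.childNodes
                       if node.nodeType == node.TEXT_NODE)
        texts.append(text)
    return texts


sample = "<root><title>first</title><title>second</title></root>"
titles = titles_from_xml(sample)
```

Returning simple values keeps the security question out of the picture, at the price of writing one external method per kind of query.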