I would think ParsedXML+DOM in Python Scripts+DTML/ZPT would work here??? Contrary to popular belief, I don't think ParsedXML is "dead" - it works quite fine, thank you, for all the DOM L2 stuff I do, which is likely why there hasn't been a lot of development... and I've seen that Martijn Faassen has been doing some small improvements in CVS in the last few days. All this assumes small documents like forms and news-articles, not book-sized docs, since nodes in a ParsedXML doc are not manageable, so each edit requires the whole DOM from the document object downward...

I've used this type of setup for interactive form applications as well as XML-stored content applications and been quite happy, but I've been doing a lot of DOM-in-python-scripts coding; it is a bit tedious to code for, there is a lot of recursion to think about, etc.... but once you get it right, it just works...

Alternately, I think you could put any Python XSLT on top of this with a few external methods, though I think it's a bit hackish since you would be printing XML, outputting it to a string, and having it parsed again (but, hey, if it works...) - as long as you are not indexing the results of XSLT in a Catalog index for a bunch of docs!

As an aside, it sounds like you need to make a lot of revisions to your documents, and if you store them in the ODB it might create bloat in a versioned ODB unless you pack frequently. A brief thought for you: create something that will do application-level versioning and use non-undo mounted storage for a repository for your documents...

Consider creating a folderish type whose instances contain a few specialized objects: one is a ParsedXML DOM object that contains the current XML for your document; your folderish base object could optionally be coded to act as a proxy to this contained DOM. Also, create a simple type that contains previous revisions of the XML as string-based properties, rather than storing a full DOM for each revision.
Alternately, use compression methods to compress the XML of these archived revisions and use file objects containing the compressed text. Then, keep track of things with a state table of some kind. The downside to this is that you would have overhead in converting XML to DOM every time you loaded up a previous revision, but from a storage perspective, this should work quite well.

A caveat: I haven't done exactly this before, but, after serious thought, this is the approach I plan to take with a project I am starting work on for XML-based content types in our content management system, and my hunch is that this will work quite well.

Cheers,
Sean

-----Original Message-----
From: Dan Shafer [mailto:pydan@danshafer.com]
Sent: Wednesday, May 15, 2002 9:45 AM
To: Chris Withers
Cc: Dan Shafer; zope@zope.org
Subject: Re: [Zope] XML in Zope

OK, I'll be happy to describe the problem. I'll try to be brief but clear.

In the briefest terms I can come up with:

I need a way of storing a document-like object in the ZODB which will be sufficiently structured to allow the selective display and replacement of its elements while allowing for the immediate display of its contents as a nicely formatted document in a browser.

Now for the longer-winded version for those with the patience or curiosity to read more.

I am building an application which is quite document-centric. When the client interacts with the application, he is building a "record" of an interaction with a patient. This interaction can extend over a period of days, so it needs to be able to be resumed.

The application now consists entirely of HTML forms which trigger Python scripts that initially create and then update a DTML Method object. I embed formatting into the DTML Method so that when the user wants to see a report of the patient interaction, all I have to do is give him a way to view the document. The client *loves* the application as it has developed so far, in part because as the code updates the DTML Method during the patient interaction, a supervising clinician can see the progress of the interaction as it develops. So I prefer not to lose the ability to allow the supervisor to see the document the code is generating, but I may have to do so.

When the interaction with the patient proceeds linearly and in one session - which is most often the case - this system works like a charm. But if for any reason the interaction gets interrupted, my scripts, which always merely *append* information to the end of the document as each step proceeds, are not now capable of a smooth resumption in place. This would require, e.g., that information now stored in the DTML Method be parsed out to supply initial values to fields on the interactive form that have already been completed. That is the minimum I need, but as I have been thinking about the design, I've decided I actually need much more than that. I believe the best way to accomplish that objective of maximum flexibility for the user is to use a more structured approach to the document.

As I see it, I had three alternatives: (1) parse the text in the DTML Method to pull out and rewrite pieces as needed; (2) use properties of the DTML Method in addition to text content and link the two; (3) go to XML to take advantage of readily available Python techniques for managing the structured data while being able to use XSLT to retain the ability to show the document in process immediately on demand.

I spent some time looking at the first option and decided it was going to be terribly inefficient. I would in essence have to define my own start/end tags for each element of the document (which I could use comments for) and then manually parse them. Ugly. The second approach also felt inefficient because I would be storing all of the information in a document twice, once as a document (so that it could be displayed quickly on demand) and once as properties (so they could be retrieved and replaced as needed). I have not completely discarded that approach but XML seems more promising. I am fully aware of the horrendous overhead associated with XML documents but these documents are quite small (3-4 pages on average, printed) and always identically structured. With Python's excellent XML support, I have already gotten some very good routines written to parse such documents. My problem is that when I try to translate that code to Zope External Methods, they don't work and debugging them in Zope is a nightmare.

I am learning, and I am open to any suggestions or corrections to my thinking. I appreciate the group's patience with me and any ideas you have.

At 09:18 AM 5/15/2002 +0100, Chris Withers wrote:
Dan Shafer wrote:
I had in mind to use DTML documents as the storage mechanism for the main documents at the core of an application I'm building for a client, but it turns out they are going to need to do some things that would be pretty cumbersome in a DTML document or method. So I'm investigating using XML for these documents.
Searching zope.org turns up a lot of stuff about XML but, as far as I can tell, only one product: XMLDocument. But it describes itself as out of date and replaced by ParsedXML, which, as far as I can tell, hasn't had a product release yet.
I can't tell from the ZQR which XML implementation it is documenting.
XMLDocument is out of date. The ZQR is no longer maintained. ParsedXML is dead as it currently has no maintainer (well, bar Martijn, who's busy on the Zope 3 effort and not subscribed to this list...)
To be honest, if you explained the problem we might be able to suggest an alternate solution...
cheers,
Chris - XML: the world's most inefficient data transfer format
_______________________________________________ Zope maillist - Zope@zope.org http://lists.zope.org/mailman/listinfo/zope ** No cross posts or HTML encoding! ** (Related lists - http://lists.zope.org/mailman/listinfo/zope-announce http://lists.zope.org/mailman/listinfo/zope-dev )
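Sean's idea of keeping old revisions as compressed XML text, and paying the re-parse cost only when a revision is actually loaded, can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses the standard library's zlib and xml.dom.minidom as a stand-in for ParsedXML, it is written for a current Python rather than the Python that shipped with Zope in 2002, and the names archive_revision, load_revision and the revisions dictionary (standing in for the "state table") are hypothetical.

```python
import zlib
from xml.dom import minidom


def archive_revision(xml_text):
    """Compress one revision's XML text for storage (e.g. in a File object)."""
    return zlib.compress(xml_text.encode("utf-8"))


def load_revision(blob):
    """Decompress an archived revision and re-parse it into a DOM.

    This re-parse is the overhead mentioned above: it happens every
    time an old revision is loaded.
    """
    xml_text = zlib.decompress(blob).decode("utf-8")
    return minidom.parseString(xml_text)


# Hypothetical state table: revision number -> compressed XML text.
revisions = {}
current_xml = "<record><step name='intake'>done</step></record>"
revisions[1] = archive_revision(current_xml)

# Loading revision 1 again yields a fresh DOM for that revision.
dom = load_revision(revisions[1])
step = dom.getElementsByTagName("step")[0]
print(step.getAttribute("name"), step.firstChild.data)
```

Only the current document would live as a full DOM; the archive holds plain byte strings, which is where the storage saving would come from.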
Sean....

Thanks for the in-depth answer. My comments are inline.

At 03:45 PM 5/18/2002 -0700, sean.upton@uniontrib.com wrote:
I would think ParsedXML+DOM in Python Scripts+DTML/ZPT would work here??? Contrary to popular belief, I don't think ParsedXML is "dead" - it works quite fine, thank you, for all the DOM L2 stuff I do, which is likely why there hasn't been a lot of development... and I've seen that Martijn Faassen has been doing some small improvements in CVS in the last few days. All this assumes small documents like forms and news-articles, not book-sized docs, since nodes in a ParsedXML doc are not manageable, so each edit requires the whole DOM from the document object downward...
These documents are in fact quite small. Never more than 3-4 pages of fairly sparse text when displayed in HTML and printed.
I've used this type of setup for interactive form applications as well as XML-stored content applications and been quite happy, but I've been doing a lot of DOM-in-python-scripts coding; it is a bit tedious to code for, there is a lot of recursion to think about, etc.... but once you get it right, it just works...
It will undoubtedly take me some time to come up to speed on all this syntactical stuff, but I'm sure I can grok it. The biggest problem I have trying to code Python scripts in Zope is the virtual absence of debugging tools in the environment.
Alternately, I think you could put any Python XSLT on top of this with a few external methods, though I think it's a bit hackish since you would be printing XML, outputting it to a string, and having it parsed again (but, hey, if it works...) - as long as you are not indexing the results of XSLT in a Catalog index for a bunch of docs!
Yeah, probably not essential. If I store the document objects in XML form, I'll have to display them in HTML but there's nothing fancy in them.
As an aside, it sounds like you need to make a lot of revisions to your documents, and if you store them in the ODB it might create bloat in a versioned ODB unless you pack frequently. A brief thought for you: create something that will do application-level versioning and use non-undo mounted storage for a repository for your documents...
Actually, there are not a lot of revisions made to the documents. Typically, there are none! About 10% of the documents involve processes that are for one reason or another disrupted and need to be resumed. Only rarely is editing of previously captured content necessary or even appropriate. So I think the versioning won't be an issue.

I've discovered and downloaded XMLKit. I'm just starting to explore it but if anyone has a viewpoint about it I'd love to hear it.

I may *still* end up just deciding to use a folderish object (actually, a Zope folder as it turns out) with DTML Methods inside it for the contents of the process record rather than trying to stuff it all into one document. Then I just have the complexity of figuring out how to create a nicely formatted HTML document containing the data in the folder's documents. I suspect that, at least, is manageable.

All of this because ZClasses aren't what I'd hoped they'd be. I started in Zope because I really love creating OO designs and apps. Folders as containers and documents as objects containing properties will work OK for this app but in the long run I either have to switch platforms or figure out how to get around whatever the problems and limitations are with ZClasses.

I remain teachable!
-----Original Message-----
From: Dan Shafer [mailto:pydan@danshafer.com]
Sent: Wednesday, May 15, 2002 9:45 AM
To: Chris Withers
Cc: Dan Shafer; zope@zope.org
Subject: Re: [Zope] XML in Zope
OK, I'll be happy to describe the problem. I'll try to be brief but clear.
In the briefest terms I can come up with:
I need a way of storing a document-like object in the ZODB which will be sufficiently structured to allow the selective display and replacement of its elements while allowing for the immediate display of its contents as a nicely formatted document in a browser.
Now for the longer-winded version for those with the patience or curiosity to read more.
I am building an application which is quite document-centric. When the client interacts with the application, he is building a "record" of an interaction with a patient. This interaction can extend over a period of days, so it needs to be able to be resumed.
The application now consists entirely of HTML forms which trigger Python scripts that initially create and then update a DTML Method object. I embed formatting into the DTML Method so that when the user wants to see a report of the patient interaction, all I have to do is give him a way to view the document. The client *loves* the application as it has developed so far, in part because as the code updates the DTML Method during the patient interaction, a supervising clinician can see the progress of the interaction as it develops. So I prefer not to lose the ability to allow the supervisor to see the document the code is generating, but I may have to do so.
When the interaction with the patient proceeds linearly and in one session - which is most often the case - this system works like a charm. But if for any reason the interaction gets interrupted, my scripts, which always merely *append* information to the end of the document as each step proceeds, are not now capable of a smooth resumption in place. This would require, e.g., that information now stored in the DTML Method be parsed out to supply initial values to fields on the interactive form that have already been completed. That is the minimum I need, but as I have been thinking about the design, I've decided I actually need much more than that. I believe the best way to accomplish that objective of maximum flexibility for the user is to use a more structured approach to the document.
As I see it, I had three alternatives: (1) parse the text in the DTML Method to pull out and rewrite pieces as needed; (2) use properties of the DTML Method in addition to text content and link the two; (3) go to XML to take advantage of readily available Python techniques for managing the structured data while being able to use XSLT to retain the ability to show the document in process immediately on demand.
I spent some time looking at the first option and decided it was going to be terribly inefficient. I would in essence have to define my own start/end tags for each element of the document (which I could use comments for) and then manually parse them. Ugly. The second approach also felt inefficient because I would be storing all of the information in a document twice, once as a document (so that it could be displayed quickly on demand) and once as properties (so they could be retrieved and replaced as needed). I have not completely discarded that approach but XML seems more promising. I am fully aware of the horrendous overhead associated with XML documents but these documents are quite small (3-4 pages on average, printed) and always identically structured. With Python's excellent XML support, I have already gotten some very good routines written to parse such documents. My problem is that when I try to translate that code to Zope External Methods, they don't work and debugging them in Zope is a nightmare.
I am learning, and I am open to any suggestions or corrections to my thinking. I appreciate the group's patience with me and any ideas you have.
At 09:18 AM 5/15/2002 +0100, Chris Withers wrote:
Dan Shafer wrote:
I had in mind to use DTML documents as the storage mechanism for the main documents at the core of an application I'm building for a client, but it turns out they are going to need to do some things that would be pretty cumbersome in a DTML document or method. So I'm investigating using XML for these documents.
Searching zope.org turns up a lot of stuff about XML but, as far as I can tell, only one product: XMLDocument. But it describes itself as out of date and replaced by ParsedXML, which, as far as I can tell, hasn't had a product release yet.
I can't tell from the ZQR which XML implementation it is documenting.
XMLDocument is out of date. The ZQR is no longer maintained. ParsedXML is dead as it currently has no maintainer (well, bar Martijn, who's busy on the Zope 3 effort and not subscribed to this list...)
To be honest, if you explained the problem we might be able to suggest an alternate solution...
cheers,
Chris - XML: the world's most inefficient data transfer format
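The resumption requirement described in this exchange (pulling stored values back out to pre-fill form fields, and replacing a single element in place) is the kind of recursive DOM work Sean mentions. A minimal sketch follows, assuming a made-up record layout: the element names record, section and field, the name attribute, and the helper names collect_fields and set_field are all hypothetical, and the standard library's xml.dom.minidom stands in for ParsedXML in a current Python.

```python
from xml.dom import minidom

# Hypothetical record layout; element and attribute names are invented.
RECORD = """<record>
  <section name="history">
    <field name="complaint">headache</field>
    <field name="duration">3 days</field>
  </section>
  <section name="exam">
    <field name="bp">120/80</field>
  </section>
</record>"""


def collect_fields(node, found=None):
    """Recursively gather {field name: text} from every <field> under node."""
    if found is None:
        found = {}
    for child in node.childNodes:
        if child.nodeType != child.ELEMENT_NODE:
            continue  # skip whitespace text nodes between elements
        if child.tagName == "field":
            text = "".join(t.data for t in child.childNodes
                           if t.nodeType == t.TEXT_NODE)
            found[child.getAttribute("name")] = text
        else:
            collect_fields(child, found)  # descend into sections
    return found


def set_field(doc, name, value):
    """Replace the text of the named <field>; return True if it was found."""
    for field in doc.getElementsByTagName("field"):
        if field.getAttribute("name") == name:
            while field.firstChild:
                field.removeChild(field.firstChild)
            field.appendChild(doc.createTextNode(value))
            return True
    return False


doc = minidom.parseString(RECORD)
initial_values = collect_fields(doc.documentElement)  # values to pre-fill a form
set_field(doc, "duration", "4 days")                  # selective replacement
```

The dictionary returned by collect_fields could feed the initial values of an HTML form, while set_field changes one element without appending to the end of the document.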
Dan, <snip>
It will undoubtedly take me some time to come up to speed on all this syntactical stuff, but I'm sure I can grok it. The biggest problem I have trying to code Python scripts in Zope is the virtual absence of debugging tools in the environment.
don't know if this is quite what you meant, but I had some trouble a while back trying to debug python scripts in zope. The problem I found was not being able to get hold of tracebacks at certain points. I made a minimal product with just the following in the __init__.py

----
from Products.PythonScripts.Utility import allow_module, allow_class
from AccessControl import ModuleSecurityInfo, ClassSecurityInfo
from Globals import InitializeClass

ModuleSecurityInfo('sys').declarePublic('exc_info')
ModuleSecurityInfo('traceback').declarePublic('extract_tb')
----

Now, I can import the relevant stuff into python scripts and get hold of tracebacks. I've got things set up in the various scripts so the import only occurs if I've got debugging turned on. Something like

----
if debug == 1:
    from sys import exc_info
    import traceback
----

Just thought that might help.

tim
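For anyone unfamiliar with the two functions Tim exposes, here is a small stand-alone sketch of what sys.exc_info and traceback.extract_tb give you once they are importable. It runs in plain, current Python outside Zope; the helper name describe_current_exception and the deliberately failing risky function are hypothetical, and whether a given Zope version needs further security declarations is not something this sketch addresses.

```python
import sys
import traceback


def describe_current_exception():
    """Summarise the exception currently being handled.

    Must be called from inside an except block. Returns the error type
    name, the error value as text, and one 'file:line in function'
    string per traceback frame (outermost first).
    """
    exc_type, exc_value, tb = sys.exc_info()
    frames = traceback.extract_tb(tb)
    where = ["%s:%s in %s" % (f.filename, f.lineno, f.name) for f in frames]
    return exc_type.__name__, str(exc_value), where


def risky():
    # list.append() returns None, so the second .append raises AttributeError
    return [].append(1).append(2)


try:
    risky()
except Exception:
    name, value, where = describe_current_exception()
    print(name, value)
    for line in where:
        print("  " + line)
```

In a Python Script the same calls would sit behind the debug flag Tim describes, so the imports only happen when debugging is switched on.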
Very helpful, Tim. Thanks!

At 12:39 AM 5/19/2002 +0100, Tim Hicks wrote:
Dan,
<snip>
It will undoubtedly take me some time to come up to speed on all this syntactical stuff, but I'm sure I can grok it. The biggest problem I have trying to code Python scripts in Zope is the virtual absence of debugging tools in the environment.
don't know if this is quite what you meant, but I had some trouble a while back trying to debug python scripts in zope. The problem I found was not being able to get hold of tracebacks at certain points. I made minimal product with just the following in the __init__.py
----
from Products.PythonScripts.Utility import allow_module, allow_class
from AccessControl import ModuleSecurityInfo, ClassSecurityInfo
from Globals import InitializeClass

ModuleSecurityInfo('sys').declarePublic('exc_info')
ModuleSecurityInfo('traceback').declarePublic('extract_tb')
----
Now, I can import the relevant stuff into python scripts and get hold of tracebacks. I've got things setup in the various scripts so the import only occurs if I've got debugging turned on. Something like
----
if debug == 1:
    from sys import exc_info
    import traceback
----
Just thought that might help.
tim
I have a Python method as follows:

results=[]
for object in context.objectValues('DTML Document'):
    results=results.append(object)
return results

Calling this (from the test tab) results in:

Error Type: AttributeError
Error Value: 'None' object has no attribute 'append'

And yet this is the same as the example on page 170/171 of the Zope book! Have I gone mad, or stupid? I'm using the latest version of Zope

Traceback (innermost last):
  File C:\PROGRA~1\JulianK\lib\python\ZPublisher\Publish.py, line 150, in publish_module
  File C:\PROGRA~1\JulianK\lib\python\ZPublisher\Publish.py, line 114, in publish
  File C:\PROGRA~1\JulianK\lib\python\Zope\__init__.py, line 158, in zpublisher_exception_hook
    (Object: list)
  File C:\PROGRA~1\JulianK\lib\python\ZPublisher\Publish.py, line 98, in publish
  File C:\PROGRA~1\JulianK\lib\python\ZPublisher\mapply.py, line 88, in mapply
    (Object: entryListEntries)
  File C:\PROGRA~1\JulianK\lib\python\ZPublisher\Publish.py, line 39, in call_object
    (Object: entryListEntries)
  File C:\PROGRA~1\JulianK\lib\python\Shared\DC\Scripts\Bindings.py, line 252, in __call__
    (Object: entryListEntries)
  File C:\PROGRA~1\JulianK\lib\python\Shared\DC\Scripts\Bindings.py, line 283, in _bindAndExec
    (Object: entryListEntries)
  File C:\PROGRA~1\JulianK\lib\python\Products\PythonScripts\PythonScript.py, line 291, in _exec
    (Object: entryListEntries)
    (Info: ({'script': <PythonScript instance at 015545B0>, 'context': <Folder instance at 016B8360>, 'container': <Folder instance at 016B8360>, 'traverse_subpath': []}, (), {}, None))
  File Script (Python), line 4, in entryListEntries
  File C:\PROGRA~1\JulianK\lib\python\AccessControl\ZopeGuards.py, line 47, in guarded_getattr
AttributeError: (see above)

--
Julian Knight, Senior Consultant
WRDC Ltd, Leeds, UK. +44 (0) 113 245 4788
http://www.wrdc.com
Julian Knight wrote:
I have a Python method as follows:
results=[]
for object in context.objectValues('DTML Document'):
    results=results.append(object)
return results
wouldn't that be:

results=[]
for object in context.objectValues('DTML Document'):
    results.append(object)
return results

or:

results=[]
for object in context.objectValues('DTML Document'):
    results += [object]
return results

or even with list comprehensions:

return [object for object in context.objectValues('DTML Document')]

regards Max M
Hi,

On Wed, Jun 05, 2002 at 11:12:47AM +0200, Max M wrote:
Julian Knight wrote:
I have a Python method as follows:
results=[]
for object in context.objectValues('DTML Document'):
    results=results.append(object)
return results
did I miss something, or is the following ok ?

--- CUT ---
return context.objectValues(['DTML Document'])
--- CUT ---

since objectValues already returns a list...

bye,
Jerome Alet
On Wed, 2002-06-05 at 10:39, Julian Knight wrote:
I have a Python method as follows:
results=[]
for object in context.objectValues('DTML Document'):
    results=results.append(object)
return results
Calling this (from the test tab) results in:
Error Type: AttributeError
Error Value: 'None' object has no attribute 'append'
And yet this is the same as the example on page 170/171 of the Zope book!
hi,

you're wrong... ZopeBook is a little bit different:

results=[]
for object in context.objectValues('DTML Document'):
    results.append(object)    <= Change this line!!!
return results

reason: results.append(object) returns None, but appends an object to results!

greetings, maik

--
maik jablonski \ -;~~;- / www.dzug.org
universitaet bielefeld > (@@) < www.zfl.uni-bielefeld.de
zentrum fuer lehrerbildung / RUN <> GNU \ www.sachunterricht-online.de
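The point about append returning None can be checked outside Zope with a short sketch. Here a plain list stands in for context.objectValues('DTML Document'), and the function names broken and fixed are hypothetical; the behaviour of list.append itself is standard Python.

```python
def broken(items):
    """Julian's version: rebinding results to append()'s return value."""
    results = []
    for item in items:
        # append() mutates the list and returns None, so after the first
        # pass results is None and the second pass fails.
        results = results.append(item)
    return results


def fixed(items):
    """The Zope Book version: call append() and ignore its return value."""
    results = []
    for item in items:
        results.append(item)
    return results


print(fixed(["doc1", "doc2"]))
print(broken(["doc1"]))  # only one pass, so no error, but the list is lost

try:
    broken(["doc1", "doc2"])
except AttributeError as err:
    print("AttributeError:", err)
```

With a single item the broken version fails silently by returning None; the AttributeError only appears once there are at least two items, which matches what Julian saw in a folder holding several DTML Documents.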
participants (7)
- Dan Shafer
- Jerome Alet
- Julian Knight
- Maik Jablonski
- Max M
- sean.upton@uniontrib.com
- Tim Hicks