I would think ParsedXML + DOM in Python Scripts + DTML/ZPT would work here. Contrary to popular belief, I don't think ParsedXML is "dead" - it works quite fine, thank you, for all the DOM Level 2 stuff I do, which is likely why there hasn't been a lot of development... and I've seen that Martijn Faassen has been making some small improvements in CVS in the last few days.

All this assumes small documents like forms and news articles, not book-sized docs, since nodes in a ParsedXML doc are not manageable, so each edit requires the whole DOM from the document object downward...

I've used this type of setup for interactive form applications as well as XML-stored content applications and been quite happy, but I've been doing a lot of DOM-in-Python-Scripts coding; it is a bit tedious to code for, there is a lot of recursion to think about, etc... but once you get it right, it just works...

Alternately, I think you could put any Python XSLT processor on top of this with a few External Methods, though I think it's a bit hackish since you would be printing XML, outputting it to a string, and having it parsed again (but, hey, if it works...) - as long as you are not indexing the results of XSLT in a Catalog index for a bunch of docs!

As an aside, it sounds like you need to make a lot of revisions to your documents, and if you store them in the ODB it might create bloat in a versioned ODB unless you pack frequently. A brief thought for you: create something that will do application-level versioning and use non-undo mounted storage as a repository for your documents...

Consider creating a folderish type whose instances contain a few specialized objects: one is a ParsedXML DOM object that contains the current XML for your document; your folderish base object could optionally be coded to act as a proxy to this contained DOM. Also, create a simple type that contains previous revisions of the XML as string-based properties, rather than storing a full DOM for each revision.
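As a rough sketch of that layout, under some loud assumptions: the standard library's xml.dom.minidom stands in for the ParsedXML DOM, the class and method names are hypothetical, and nothing here touches Zope or the ZODB. It only shows the idea of one live DOM plus older revisions kept as plain strings.

```python
from xml.dom.minidom import parseString


class VersionedXMLDocument:
    """Hypothetical container: one live DOM plus string-based past revisions."""

    def __init__(self, xml_text):
        # Live DOM for the current revision (minidom stands in for ParsedXML).
        self.current = parseString(xml_text)
        # Older revisions, stored as serialized XML strings rather than DOMs.
        self.revisions = []

    def set_text(self, tag_name, new_text):
        """Archive the current XML, then replace the text of the first <tag_name>."""
        self.revisions.append(self.current.toxml())
        element = self.current.getElementsByTagName(tag_name)[0]
        while element.firstChild is not None:
            element.removeChild(element.firstChild)
        element.appendChild(self.current.createTextNode(new_text))

    def load_revision(self, index):
        """Re-parse an archived revision into a DOM; this costs a parse per load."""
        return parseString(self.revisions[index])


doc = VersionedXMLDocument("<record><status>open</status></record>")
doc.set_text("status", "closed")   # archives the "open" revision first
previous = doc.load_revision(0)    # DOM rebuilt from the stored string
```

In a real product the revisions list would live on a persistent object in the non-undo storage; a plain Python list is used here only to keep the sketch self-contained.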
Alternately, compress the XML of these archived revisions and use File objects containing the compressed text. Then, keep track of things with a state table of some kind. The downside to this is that you would have overhead in converting XML to DOM every time you loaded up a previous revision, but from a storage perspective, this should work quite well.

A caveat: I haven't done exactly this before, but, after serious thought, this is the approach I plan to take with a project I am starting work on for XML-based content types in our content management system, and my hunch is that this will work quite well.

Cheers,
Sean

-----Original Message-----
From: Dan Shafer [mailto:pydan@danshafer.com]
Sent: Wednesday, May 15, 2002 9:45 AM
To: Chris Withers
Cc: Dan Shafer; zope@zope.org
Subject: Re: [Zope] XML in Zope

OK, I'll be happy to describe the problem. I'll try to be brief but clear.

In the briefest terms I can come up with: I need a way of storing a document-like object in the ZODB which will be sufficiently structured to allow the selective display and replacement of its elements while allowing for the immediate display of its contents as a nicely formatted document in a browser.

Now for the longer-winded version for those with the patience or curiosity to read more.

I am building an application which is quite document-centric. When the client interacts with the application, he is building a "record" of an interaction with a patient. This interaction can extend over a period of days, so it needs to be able to be resumed. The application now consists entirely of HTML forms which trigger Python scripts that initially create and then update a DTML Method object. I embed formatting into the DTML Method so that when the user wants to see a report of the patient interaction, all I have to do is give him a way to view the document.
The client *loves* the application as it has developed so far, in part because as the code updates the DTML Method during the patient interaction, a supervising clinician can see the progress of the interaction as it develops. So I prefer not to lose the ability to allow the supervisor to see the document the code is generating, but I may have to do so.

When the interaction with the patient proceeds linearly and in one session - which is most often the case - this system works like a charm. But if for any reason the interaction gets interrupted, my scripts, which always merely *append* information to the end of the document as each step proceeds, are not now capable of a smooth resumption in place. This would require, e.g., that information now stored in the DTML Method be parsed out to supply initial values to fields on the interactive form that have already been completed.

That is the minimum I need, but as I have been thinking about the design, I've decided I actually need much more than that. I believe the best way to accomplish that objective of maximum flexibility for the user is to use a more structured approach to the document.

As I see it, I had three alternatives:

(1) parse the text in the DTML Method to pull out and rewrite pieces as needed;
(2) use properties of the DTML Method in addition to text content and link the two;
(3) go to XML to take advantage of readily available Python techniques for managing the structured data while being able to use XSLT to retain the ability to show the document in process immediately on demand.

I spent some time looking at the first option and decided it was going to be terribly inefficient. I would in essence have to define my own start/end tags for each element of the document (which I could use comments for) and then manually parse them. Ugly.
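For comparison, here is a minimal sketch of what the third option could look like for resuming an interrupted session. It assumes the standard library's xml.dom.minidom, and the record layout, element names, and function names are all invented for illustration; none of it comes from the actual application.

```python
from xml.dom.minidom import parseString

# Hypothetical record layout; the element names are invented for illustration.
RECORD_XML = """<interaction>
  <complaint>headache</complaint>
  <history></history>
</interaction>"""


def field_values(dom):
    """Return {element name: text} for each child element of the root."""
    values = {}
    for node in dom.documentElement.childNodes:
        if node.nodeType == node.ELEMENT_NODE:
            text = "".join(child.data for child in node.childNodes
                           if child.nodeType == child.TEXT_NODE)
            values[node.tagName] = text
    return values


def replace_field(dom, name, new_text):
    """Replace the text content of the first element called `name`."""
    element = dom.getElementsByTagName(name)[0]
    while element.firstChild is not None:
        element.removeChild(element.firstChild)
    element.appendChild(dom.createTextNode(new_text))


dom = parseString(RECORD_XML)
initial = field_values(dom)   # values to pre-fill the form on resumption
replace_field(dom, "history", "none reported")   # update one element in place
```

The dictionary from field_values could supply the initial values for form fields that were already completed, while replace_field rewrites a single element instead of appending to the end of the document.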
The second approach also felt inefficient because I would be storing all of the information in a document twice, once as a document (so that it could be displayed quickly on demand) and once as properties (so they could be retrieved and replaced as needed). I have not completely discarded that approach but XML seems more promising.

I am fully aware of the horrendous overhead associated with XML documents but these documents are quite small (3-4 pages on average, printed) and always identically structured. With Python's excellent XML support, I have already gotten some very good routines written to parse such documents. My problem is that when I try to translate that code to Zope External Methods, it doesn't work, and debugging it in Zope is a nightmare.

I am learning, and I am open to any suggestions or corrections to my thinking. I appreciate the group's patience with me and any ideas you have.

At 09:18 AM 5/15/2002 +0100, Chris Withers wrote:
Dan Shafer wrote:
I had in mind to use DTML documents as the storage mechanism for the main documents at the core of an application I'm building for a client, but it turns out they are going to need to do some things that would be pretty cumbersome in a DTML document or method. So I'm investigating using XML for these documents.
Searching zope.org turns up a lot of stuff about XML but, as far as I can tell, only one product: XMLDocument. But it describes itself as out of date and replaced by ParsedXML, which, as far as I can tell, hasn't had a product release yet.
I can't tell from the ZQR which XML implementation it is documenting.
XMLDocument is out of date. The ZQR is no longer maintained. ParsedXML is dead as it currently has no maintainer (well, bar Martijn, who's busy on the Zope 3 effort and not subscribed to this list...)
To be honest, if you explained the problem we might be able to suggest an alternate solution...
cheers,
Chris

- XML: the world's most inefficient data transfer format