RE: [Zope] XML in Zope

18 May 2002

I would think ParsedXML + DOM in Python Scripts + DTML/ZPT would work here.
Contrary to popular belief, I don't think ParsedXML is "dead" - it works
quite fine, thank you, for all the DOM L2 stuff I do, which is likely why
there hasn't been a lot of development... and I've seen that Martijn Faassen
has been making some small improvements in CVS in the last few days.

All this assumes small documents like forms and news articles, not book-sized
docs: nodes in a ParsedXML doc are not manageable objects, so each edit
requires the whole DOM, from the document object downward.  I've used this
type of setup for interactive form applications as well as XML-stored
content applications and been quite happy, but I've been doing a lot of
DOM-in-Python-Scripts coding; it is a bit tedious to code for, and there is
a lot of recursion to think about, but once you get it right, it just works.

Alternately, I think you could put any Python XSLT on top of this with a few
External Methods, though I think it's a bit hackish, since you would be
printing XML, outputting it to a string, and having it parsed again (but,
hey, if it works...) - as long as you are not indexing the results of
XSLT in a Catalog index for a bunch of docs!
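
To give a feel for the recursion involved in DOM coding, here is a minimal
sketch. It uses the standard library's xml.dom.minidom as a stand-in for
ParsedXML's DOM interface (an assumption; it has not been run against
ParsedXML itself), and the function name and sample document are made up for
illustration.

```python
from xml.dom import minidom


def collect_fields(node, found=None):
    """Recursively walk a DOM subtree, mapping element names to their text."""
    if found is None:
        found = {}
    for child in node.childNodes:
        if child.nodeType == child.ELEMENT_NODE:
            # Join only the text nodes sitting directly under this element.
            text = "".join(
                c.data for c in child.childNodes if c.nodeType == c.TEXT_NODE
            ).strip()
            if text:
                found[child.tagName] = text
            # Descend into the element to pick up nested fields.
            collect_fields(child, found)
    return found


# Hypothetical form document; the element names are illustrative only.
SAMPLE = (
    "<form><name>Pat</name>"
    "<visit><date>2002-05-18</date><notes>ok</notes></visit></form>"
)
doc = minidom.parseString(SAMPLE)
fields = collect_fields(doc.documentElement)
```

Container elements such as <visit> carry no text of their own, so only the
leaf fields end up in the resulting dictionary.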

As an aside, it sounds like you need to make a lot of revisions to your
documents, and if you store them in the ODB it might create bloat in a
versioned ODB unless you pack frequently.  A brief thought for you: create
something that will do application-level versioning and use non-undo mounted
storage for a repository for your documents...

Consider creating a folderish type whose instances contain a few specialized
objects: one is a ParsedXML DOM object that contains the current XML for
your document; your folderish base object could optionally be coded to act
as a proxy to this contained DOM.  Also, create a simple type that contains
previous revisions of the XML as string-based properties, rather than
storing a full DOM for each revision.  Alternately, compress the XML of
these archived revisions and store the compressed text in file objects.
Then, keep track of things with a state table of some
kind.  The downside to this is that you would have overhead in converting
XML to DOM every time you loaded up a previous revision, but from a storage
perspective, this should work quite well.  A caveat: I haven't done exactly
this before, but, after serious thought, this is the approach I plan to take
with a project I am starting work on for XML-based content types in our
content management system, and my hunch is that this will work quite well.
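
A rough sketch of the revision idea, in plain Python rather than as a Zope
product: the class name, its attributes, and the use of zlib are all
assumptions made for illustration, and the current revision is held as text
here rather than as a ParsedXML DOM object.

```python
import zlib
from xml.dom import minidom


class RevisionedXML:
    """Hypothetical container: current XML plus compressed older revisions."""

    def __init__(self, xml_text, note="initial"):
        self.current = xml_text
        self.archive = []    # zlib-compressed bytes, oldest revision first
        self.state = [note]  # state table: state[i] is the note for revision i + 1

    def revise(self, new_xml, note=""):
        # Parsing first means malformed XML raises before anything is replaced.
        minidom.parseString(new_xml)
        self.archive.append(zlib.compress(self.current.encode("utf-8")))
        self.state.append(note)
        self.current = new_xml

    def load_revision(self, number):
        """Return a DOM for revision `number` (1-based); re-parses every time."""
        latest = len(self.archive) + 1
        if not 1 <= number <= latest:
            raise IndexError("no such revision: %r" % number)
        if number == latest:
            text = self.current
        else:
            text = zlib.decompress(self.archive[number - 1]).decode("utf-8")
        return minidom.parseString(text)


store = RevisionedXML("<record><status>draft</status></record>")
store.revise("<record><status>final</status></record>", note="signed off")
old = store.load_revision(1)
```

Loading an old revision pays the XML-to-DOM conversion cost mentioned above,
while the archive itself stays small.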

Cheers,
Sean

-----Original Message-----
From: Dan Shafer [mailto:pydan@danshafer.com]
Sent: Wednesday, May 15, 2002 9:45 AM
To: Chris Withers
Cc: Dan Shafer; zope@zope.org
Subject: Re: [Zope] XML in Zope

OK, I'll be happy to describe the problem. I'll try to be brief but clear.

In the briefest terms I can come up with:

I need a way of storing a document-like object in the ZODB which will be 
sufficiently structured to allow the selective display and replacement of 
its elements while allowing for the immediate display of its contents as a 
nicely formatted document in a browser.

Now for the longer-winded version for those with the patience or curiosity 
to read more.

I am building an application which is quite document-centric. When the 
client interacts with the application, he is building a "record" of an 
interaction with a patient. This interaction can extend over a period of 
days, so it needs to be able to be resumed.

The application now consists entirely of HTML forms which trigger Python 
scripts that initially create and then update a DTML Method object. I embed 
formatting into the DTML Method so that when the user wants to see a report 
of the patient interaction, all I have to do is give him a way to view the 
document. The client *loves* the application as it has developed so far, in 
part because as the code updates the DTML Method during the patient 
interaction, a supervising clinician can see the progress of the 
interaction as it develops. So I prefer not to lose the ability to allow 
the supervisor to see the document the code is generating, but I may have 
to do so.

When the interaction with the patient proceeds linearly and in one session 
- which is most often the case - this system works like a charm. But if for 
any reason the interaction gets interrupted, my scripts, which always 
merely *append* information to the end of the document as each step 
proceeds, are not now capable of a smooth resumption in place. This would 
require, e.g., that information now stored in the DTML Method be parsed out 
to supply initial values to fields on the interactive form that have 
already been completed. That is the minimum I need, but as I have been 
thinking about the design, I've decided I actually need much more than 
that. I believe the best way to accomplish that objective of maximum 
flexibility for the user is to use a more structured approach to the
document.

As I see it, I had three alternatives: (1) parse the text in the DTML 
Method to pull out and rewrite pieces as needed; (2) use properties of the 
DTML Method in addition to text content and link the two; (3) go to XML to 
take advantage of readily available Python techniques for managing the 
structured data while being able to use XSLT to retain the ability to show 
the document in process immediately on demand.
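
As an illustration of the third alternative, the sketch below replaces one
element of a small record in place and reads values back out to pre-fill a
form. It uses the standard library's xml.dom.minidom outside Zope; the helper
names and the element names (interaction, complaint, history) are
hypothetical and not taken from the application described here.

```python
from xml.dom import minidom


def set_field(doc, name, value):
    """Replace the text of the first <name> element, or append a new one."""
    matches = doc.getElementsByTagName(name)
    if matches:
        element = matches[0]
        # Drop the old content so the new value replaces it rather than appends.
        while element.firstChild:
            element.removeChild(element.firstChild)
    else:
        element = doc.createElement(name)
        doc.documentElement.appendChild(element)
    element.appendChild(doc.createTextNode(value))


def get_field(doc, name, default=""):
    """Return the text of the first <name> element, e.g. to pre-fill a form."""
    matches = doc.getElementsByTagName(name)
    if not matches or matches[0].firstChild is None:
        return default
    return matches[0].firstChild.data


doc = minidom.parseString(
    "<interaction><complaint>headache</complaint></interaction>"
)
set_field(doc, "complaint", "migraine")      # replaces an existing step
set_field(doc, "history", "none reported")   # appends a step not yet recorded
xml_out = doc.documentElement.toxml()
```

Because updates go through element names rather than appending text, a
resumed session can look up what has already been entered with get_field.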

I spent some time looking at the first option and decided it was going to 
be terribly inefficient. I would in essence have to define my own start/end 
tags for each element of the document (which I could use comments for) and 
then manually parse them. Ugly. The second approach also felt inefficient 
because I would be storing all of the information in a document twice, once 
as a document (so that it could be displayed quickly on demand) and once as 
properties (so they could be retrieved and replaced as needed). I have not 
completely discarded that approach but XML seems more promising. I am fully 
aware of the horrendous overhead associated with XML documents but these 
documents are quite small (3-4 pages on average, printed) and always 
identically structured. With Python's excellent XML support, I have already 
gotten some very good routines written to parse such documents. My problem 
is that when I try to translate that code to Zope External Methods, they 
don't work and debugging them in Zope is a nightmare.

I am learning, and I am open to any suggestions or corrections to my 
thinking. I appreciate the group's patience with me and any ideas you have.

At 09:18 AM 5/15/2002 +0100, Chris Withers wrote:
Dan Shafer wrote:
I had in mind to use DTML documents as the storage mechanism for the main
documents at the core of an application I'm building for a client, but it
turns out they are going to need to do some things that would be pretty
cumbersome in a DTML document or method. So I'm investigating using XML for
these documents.
Searching zope.org turns up a lot of stuff about XML but, as far as I can
tell, only one product: XMLDocument. But it describes itself as out of date
and replaced by ParsedXML, which, as far as I can tell, hasn't had a
product release yet.
I can't tell from the ZQR which XML implementation it is documenting.

XMLDocument is out of date.
The ZQR is no longer maintained.
ParsedXML is dead as it currently has no maintainer (well, bar Martijn,
who's busy on the Zope 3 effort and not subscribed to this list...)
To be honest, if you explained the problem we might be able to suggest an
alternate solution...
cheers,
Chris - XML: the world's most inefficient data transfer format

sean.upton＠uniontrib.com