[Zope] doing a "diff" of two XML export files.
Robert Leftwich
robert@leftfieldcorp.com
Wed, 15 Sep 1999 18:02:20 +1000
If you are not fundamentally opposed to Java, see
http://www.alphaWorks.ibm.com/tech/xmltreediff
from which I obtained the following text:
XML TreeDiff is a package of beans that provide the ability to
efficiently differentiate and update DOM trees, just like diff and
patch differentiate and update data files.
XMLTreeDiff is a set of Java beans designed to perform fast
differentiation and update of DOM structures. XMLTreeDiff works in many
ways like diff and patch. However, rather than differentiating the file
representations of the documents (that is, the XML files), XMLTreeDiff
runs directly on the DOMs themselves. This way, the differences are
directly expressed in terms of native tree operations like change node,
delete node or insert node, rather than line mismatches. The advantages
of this approach are several: it avoids the need to convert the DOM
trees to file format prior to comparing them; with that, it eliminates
the 'false negative' reports caused by dissimilar file representations
of equivalent DOM structures; finally it avoids the need to infer the
tree structural meaning of a line difference report.
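To illustrate the point about dissimilar file representations, here is
a minimal Python sketch (it is not part of XMLTreeDiff and does not use
its API). It assumes the standard library's xml.etree.ElementTree is an
acceptable stand-in for a DOM; the helper name same_tree and the sample
documents A and B are made up for the example, and treating
whitespace-only text as insignificant is an assumption as well.

```python
import difflib
import xml.etree.ElementTree as ET

# Two serializations of the same tree: attribute order, quote style and
# the empty-element form differ, but the structure is identical.
A = '<doc a="1" b="2"><item/></doc>'
B = "<doc b='2' a='1'><item></item></doc>"


def same_tree(x, y):
    """Compare two elements by tag, attributes, text and ordered children."""
    if x.tag != y.tag or x.attrib != y.attrib:
        return False
    # Assumption: leading/trailing whitespace in text is not significant.
    if (x.text or "").strip() != (y.text or "").strip():
        return False
    if (x.tail or "").strip() != (y.tail or "").strip():
        return False
    if len(x) != len(y):
        return False
    return all(same_tree(cx, cy) for cx, cy in zip(x, y))


# A textual diff sees two different lines and reports a mismatch...
line_diff = list(difflib.unified_diff([A], [B], lineterm=""))

# ...while a comparison of the parsed trees finds them equal.
trees_equal = same_tree(ET.fromstring(A), ET.fromstring(B))
```

Here line_diff is non-empty although trees_equal is true, which is the
kind of spurious report a tree-level comparison avoids.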
It is well known that the process of differentiating two labeled tree
structures is an expensive one, with a cost (for ordered trees) at
least quadratic in the number of tree nodes. This has traditionally
held developers back from using direct tree to tree comparison tools.
XMLTreeDiff uses an optimal tree differentiating algorithm together
with a fast subtree matching procedure to make direct tree
differentiation a practical tool. XMLTreeDiff is particularly well
suited to do version management of XML documents and tree structured
data in general.
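To make "differences expressed as tree operations" concrete, here is a
deliberately naive Python sketch. It is not XMLTreeDiff's algorithm: it
pairs children by position only, so it is cheap but far from optimal
(an insertion in the middle of a child list shows up as a run of
changes plus an insert at the end). The function name tree_diff, the
path notation and the sample documents are all hypothetical.

```python
import xml.etree.ElementTree as ET


def tree_diff(old, new, here=None):
    """Return (operation, path) pairs describing how old differs from new.

    Naive positional pairing of children; an illustration only, not an
    optimal tree-differencing algorithm.
    """
    here = here or "/" + old.tag
    ops = []
    old_label = (old.tag, old.attrib, (old.text or "").strip())
    new_label = (new.tag, new.attrib, (new.text or "").strip())
    if old_label != new_label:
        ops.append(("change", here))
    old_kids, new_kids = list(old), list(new)
    # Children present on both sides are compared pairwise, in order.
    for i, (o, n) in enumerate(zip(old_kids, new_kids)):
        ops.extend(tree_diff(o, n, f"{here}/{o.tag}[{i}]"))
    # Surplus children on the old side were deleted...
    for i in range(len(new_kids), len(old_kids)):
        ops.append(("delete", f"{here}/{old_kids[i].tag}[{i}]"))
    # ...and surplus children on the new side were inserted.
    for i in range(len(old_kids), len(new_kids)):
        ops.append(("insert", f"{here}/{new_kids[i].tag}[{i}]"))
    return ops


OLD = "<site><page id='1'>home</page><page id='2'>news</page></site>"
NEW = ("<site><page id='1'>home</page><page id='2'>latest</page>"
       "<page id='3'>faq</page></site>")

ops = tree_diff(ET.fromstring(OLD), ET.fromstring(NEW))
# ops == [("change", "/site/page[1]"), ("insert", "/site/page[2]")]
```

A real tool would also need a matching step that recognises moved or
shifted subtrees, which is where the cost mentioned above comes from.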
XMLTreeDiff is packaged as a set of Java beans, and allows both command
line and programming access to the differentiation and update tools. It
includes a differentiating tool, an update tool, and a graphical user
interface to display the differences directly on the compared trees.
Difference reports are output in XML format as well.
Anthony Baxter wrote:
>
> It seems to me that it would be a Cool Thing to be able to take
> two Zope XML export files, say from different days, and feed them into
> something that would read the two and produce a "change file" which
> specifies what has changed (think "diff" for XML).
>
> This would be useful for:
>
>   * ZClass based packages, and making new versions of same.
>   * Tracking what's done on a site on a day-by-day basis (something
>     that would produce a list at the end of the day showing exactly
>     what changed, and where).
>
> Just doing a diff of the XML files is one extraordinarily yucky way
> to do this, but it's not exactly easy to then reapply that as a patch.
> Something that walked through the files would be more useful...
>
> thoughts? A little bird told me that people are already considering
> something for this task... any ideas?
>
> would the XML folks have already put something like this together?
>
> Anthony