I hadn't thought of the issues you raise. Thanks for mentioning them. "John D. Heintz" wrote in part:
If we standardize "properties" to an XML file, then optionally dump other files to expose specific aspects of an instance for serialized editing, it might not be as big a problem as I was thinking.
I think that is the shared vision. Some aspects of each object could be serialized into a format that is easy to edit. For those aspects we leave it up to the developer of the object to write a serialization method -- we don't try to guess what an "easy to use" format would look like. Other aspects of objects might be impossible to serialize into a meaningful format. For those we have a default like XML pickle -- essentially a black box.
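A minimal sketch of that shared vision, assuming nothing about Zope's real APIs (the class names and dump() method are illustrative, and plain pickle stands in for XML pickle as the black-box fallback):

```python
import pickle

class DTMLMethod:
    """Stand-in for an object whose developer wrote a dump() method."""
    def __init__(self, source):
        self.source = source

    def dump(self):
        # The developer chose plain source text as the easy-to-edit format.
        return self.source.encode("utf-8")

class OpaqueObject:
    """Stand-in for an object with no meaningful text representation."""
    def __init__(self, state):
        self.state = state

def serialize(obj):
    """Prefer the object's own dump(); otherwise pickle it as a black box."""
    if hasattr(obj, "dump"):
        return obj.dump()
    return pickle.dumps(obj)
```

The point of the dispatch is that we never guess an "easy to use" format on the object's behalf; either its author provided one, or it stays opaque.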
I guess I would suggest that the serialized form of a Zope instance by default would be a single XML file, but that arbitrary sections of that XML file could be custom dumped to separate serialized files with similar names. That way authors would have a pretty easy job of overriding sections of the dump process to spit out one or more simple files that have little parsing overhead.
Sounds reasonable.
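One way the "custom dumped to separate files" idea could look, sketched with the standard library (the external="..." attribute is purely a made-up convention, not an existing Zope dump format):

```python
import xml.etree.ElementTree as ET

def split_dump(xml_text):
    """Split sections marked external="name" out of a single XML dump.

    Returns {filename: bytes}; the remaining document is kept under
    "__main__.xml".
    """
    root = ET.fromstring(xml_text)
    files = {}
    for parent in list(root.iter()):
        for child in list(parent):
            name = child.get("external")
            if name:
                # Dump this section to its own similarly named file...
                files[name] = ET.tostring(child)
                # ...and drop it from the main document.
                parent.remove(child)
    files["__main__.xml"] = ET.tostring(root)
    return files
```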
2) A lesser problem is when trying to edit the serialized "files". Because objects are methods and state, how you modify an object can be guided, if not controlled. When we have serialized the objects in a Zope system to files, we have exported only the state of the objects in the ZODB. We then have to live with the ability to foul up invariants across many objects by changing some data in the serialized format. A good example would be ZCatalogs. [...]
Yup... it's probably easiest to make ZCatalogs a black box.
Black box doesn't solve this problem, only the first one. Imagine that I move a serialized version of a Zope object that is indexed by an instance of ZCatalog (or many for that matter). When I move it the ZCatalogs must be notified to handle the change, but only at import time because ZCatalogs are serialized as binary for lots of good reasons.
I see the problem. I think the example you give can be handled adequately at import time. But I can see other examples where allowing edits to the serialized representation could create problems that would be impossible to resolve at import. So it seems like we might want to make some things read only. That is, when you serialize the objects in the Zope ODB to a filesystem, some of those serialized files are read-only "black boxes". A comment in those files could let a developer know that to change the information in that file she needs to do an import, or edit the ODB directly.
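One possible way to enforce the read-only idea, sketched as a digest check at import time (the comment format, notice wording, and choice of SHA-256 are all my own assumptions):

```python
import hashlib

NOTICE = ("<!-- READ-ONLY black box. To change this object, edit it in the\n"
          "     ODB and re-export; hand edits will be rejected at import.\n"
          "     sha256: %s -->\n")

def dump_black_box(payload):
    """Prefix an opaque serialization with a notice and its digest."""
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return (NOTICE % digest) + payload

def verify_black_box(text):
    """At import time: True only if the payload still matches the digest."""
    header, _, payload = text.partition(" -->\n")
    digest = header.rsplit("sha256: ", 1)[1]
    return hashlib.sha256(payload.encode("utf-8")).hexdigest() == digest
```

The comment doubles as the developer notice and as the tamper check, so an edited black box fails loudly at import instead of silently corrupting invariants.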
When I import the object from the serialized format all I can know is that something changed, but without expensive processing (XML diffing is hard in the general case, though we might be able to limit the structures to a manageable scope) we can't know that the "foo" ZCatalog should be updated instead of the "bar" ZCatalog.
Seems like we will need to consider the import code very carefully. I don't know enough about how ZCatalog works to discuss the options intelligently. But in other indexing systems I have worked with, there have been solutions for reindexing when making updates to the corpus.
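The "which catalog?" question could be answered without any diffing if the export recorded a registry of which paths each catalog indexes, and the importer simply notified the registered catalogs. A toy sketch, with no claim about how ZCatalog actually works internally:

```python
class Catalog:
    """Toy stand-in for a ZCatalog: just a path -> text index."""
    def __init__(self):
        self.index = {}

    def catalog_object(self, path, text):
        self.index[path] = text

# path -> list of catalogs that index it, written out at export time
registry = {}

def register(path, catalog):
    registry.setdefault(path, []).append(catalog)

def on_import(path, text):
    """Notify only the catalogs recorded for this path; no diffing needed."""
    for catalog in registry.get(path, []):
        catalog.catalog_object(path, text)
```

With this, re-importing a document touches the "foo" catalog that indexed it and leaves "bar" alone.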
a) XML is structured enough that it can reliably hold the data from the ZODB. The current XML dump is not useful for this - it would need to create individual files and folders to represent containment.
This is pretty easy right now. Ten lines of recursive code can walk the whole tree if necessary and export only leaf objects.
Great. Maybe I am closer than I realize to the CVS management solution. I need to look more closely at ZCVSmixin to see what it does. But for our immediate need (which is to allow a distributed team of developers to share code and track changes via a central CVS repository), maybe it makes the most sense just to segment the existing XML export into directories and files and enhance the existing import to allow overwriting objects.
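The "ten lines of recursive code" might look roughly like this, with folders modeled as plain dicts rather than real Zope containers:

```python
def export_tree(obj, path, out):
    """Walk a containment tree; record only leaf objects, keyed by path."""
    if isinstance(obj, dict):                   # folder stand-in: recurse
        for name, child in sorted(obj.items()):
            export_tree(child, path + "/" + name, out)
    else:                                       # leaf: export it
        out[path] = obj
    return out
```

Each folder becomes a directory path and each leaf a file, which is exactly the shape CVS wants.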
b) A hybrid XML and custom dump solution. An Image for example could dump out as a binary image file with meta-data in a similarly named XML file.
Yes, each object should make its own policy regarding its body. Its metadata format should be standardized, however.
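The Image example might dump like this, where the filenames and the metadata schema are invented for illustration; only the idea of a binary file plus a similarly named XML sidecar comes from the discussion:

```python
def dump_image(name, data, title, content_type="image/png"):
    """Hybrid dump: raw bytes in one file, metadata XML in a sibling file."""
    meta = ("<metadata><title>%s</title>"
            "<content-type>%s</content-type></metadata>" % (title, content_type))
    return {
        name: data,                    # e.g. "logo.png", untouched binary
        name + ".xml": meta.encode("utf-8"),
    }
```

Keeping the binary body untouched means CVS can still store it (as a binary file), while the standardized metadata sidecar stays diffable.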
I like this idea. After I have the XML export/import working in a way that fits better with CVS (even if the serialized representation is essentially a black box), then I can tackle how each object represents its body in a "morally plain text" serialized format. In other words, first get the default XML representation and export/import working for all objects. Then start with the easiest types of objects to serialize (such as DTML Methods) and create an easy-to-use serialization representation. Then work on the import for that serialized format. I think this approach would be different than FSDump and ZCVSMixin, right? As far as I understand it, FSDump just goes one way (ZODB -> filesystem) and only for certain types of objects. I don't understand what ZCVSMixin does (will need to spend some time looking at it -- unlike FSDump, ZCVSMixin is not obvious from the documentation and a quick review).

Thanks for helping with this project!

Fred
--
Fred Wilson Horch  mailto:fhorch@ecoaccess.org
Executive Director, EcoAccess  http://ecoaccess.org/
P.O. Box 2823, Durham, NC 27715-2823  phone: 919.419-8354