Hi folks,
Is anyone working on the FTP support in Zope? For our project, we'd like to improve the FTP interface to Zope.
I noticed that ZServer/README.txt in Zope 2.3.1b1 states
Properties and FTP: The next step
The next phase of FTP support will allow you to edit properties of all Zope objects. Probably properties will be exposed via special files which will contain an XML representation of the object's properties. You could then download the file, edit the XML and upload it to change the object's properties.
We do not currently have a target date for FTP property support.
I have a proposal written up, if anyone is interested in providing feedback to me before I start work on it.
Along these lines, I also have a proposal to add an export / import feature that would dump objects in the ZODB to separate files in a directory tree. (Currently the XML export seems to create one big XML file.) The goal is to allow objects to be managed as flat files so you can edit them more easily and use CVS to track revisions.
Also, it sure is confusing to try to figure out where you accept patches. Is the Collector still the best place to send them?
Thanks, Fred
-- Fred Wilson Horch mailto:fhorch@ecoaccess.org Executive Director, EcoAccess http://ecoaccess.org/ P.O. Box 2823, Durham, NC 27715-2823 phone: 919.419-8354
Hi Fred,
The "export as files" paradigm is something we'd really like to see soon in Zope. We'd like it to be possible for objects to be asked for a "filesystem representation" of themselves and serialize their structures out to disk. This filesystem representation should be simple enough to edit naturally for all kinds of files, and it should play nicely with tools like CVS.
I have a proposal up somewhere about this, which basically claims that objects should decide on their own representations of themselves, and that two-way "sync" should be accomplishable. I'd be interested in seeing your proposal too. The best place for these sorts of things is at http://dev.zope.org (the "fishbowl")...
Thanks! - C
----- Original Message ----- From: "Fred Wilson Horch" <fhorch@ecoaccess.org> To: <zope-dev@zope.org> Sent: Sunday, March 11, 2001 5:57 PM Subject: [Zope-dev] FTP interface being worked on?
Hi folks,
Is anyone working on the FTP support in Zope? For our project, we'd like to improve the FTP interface to Zope.
I noticed that ZServer/README.txt in Zope 2.3.1b1 states
Properties and FTP: The next step
The next phase of FTP support will allow you to edit properties of all Zope objects. Probably properties will be exposed via special files which will contain an XML representation of the object's properties. You could then download the file, edit the XML and upload it to change the object's properties.
We do not currently have a target date for FTP property support.
I have a proposal written up, if anyone is interested in providing feedback to me before I start work on it.
Along these lines, I also have a proposal to add an export / import feature that would dump objects in the ZODB to separate files in a directory tree. (Currently the XML export seems to create one big XML file.) The goal is to allow objects to be managed as flat files so you can edit them more easily and use CVS to track revisions.
Also, it sure is confusing to try to figure out where you accept patches. Is the Collector still the best place to send them?
Thanks, Fred -- Fred Wilson Horch mailto:fhorch@ecoaccess.org Executive Director, EcoAccess http://ecoaccess.org/ P.O. Box 2823, Durham, NC 27715-2823 phone: 919.419-8354
_______________________________________________ Zope-Dev maillist - Zope-Dev@zope.org http://lists.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - http://lists.zope.org/mailman/listinfo/zope-announce http://lists.zope.org/mailman/listinfo/zope )
Hi Chris, You wrote in part:
The "export as files" paradigm is something we'd really like to see soon in Zope. [...] I'd be interested in seeing your proposal too.
Great to hear we're thinking alike. My proposals are available on our SourceForge site (sorry for the long URL -- I can send the proposals as an attachment if you'd prefer):
FTP proposal
http://cvs.sourceforge.net/cgi-bin/cvsweb.cgi/proposals/ftp_access/ftp_acces...
XML-RPC proposal (export as files)
http://cvs.sourceforge.net/cgi-bin/cvsweb.cgi/proposals/xml_rpc/xml_rpc_prop...
The best place for these sorts of things is at http://dev.zope.org (the "fishbowl")...
Okay, I'll take a look and submit my proposals through the Fishbowl process. Thanks, Fred -- Fred Wilson Horch mailto:fhorch@ecoaccess.org Executive Director, EcoAccess http://ecoaccess.org/ P.O. Box 2823, Durham, NC 27715-2823 phone: 919.419-8354
Hi Fred,
I read the serialization proposal... yours is somewhere in between what we have now and where I think we're heading. I really like the format of your proposals! These should definitely go up on the fishbowl.
One thing that I think is important for a FS serialization strategy is that the filesystem representation of objects should be as simple as possible to work with. This means that the filesystem representation of a PythonScript should look like a Python script, that the filesystem representation of a DTML method should look as much as possible like a normal HTML document, etc.
Instead of relying on a single monolithic format for fs reps of Zope objects (e.g. one expressed in XML), I think each object should be able to determine its own serialization to filesystem. Properties and security settings of each object could be expressed in the contents of a separate file (e.g. a "resource fork" in Mac terminology, I guess) which would be related to the actual content via a naming convention.
Tres Seaver has done some work on this with his FSDump product (http://www.zope.org/Members/tseaver/FSDump), although it only goes "one way" at the moment, and Steve Spicklemire has gone a slightly different route with his ZCVSMixin product (http://www.zope.org/Members/sspickle/ZCVSMixin/).
I have a proposal up on the Digital Creations intranet which makes the proposal to leave serialization format up to each object, and gives some info about possible implementation strategies. I need to clean it up and move it over to the fishbowl at some point, but I hope this email serves as a sort of overview about what we want to do about the problem at DC... it'd be great to be able to conserve resources and work on the same problem together.
----- Original Message ----- From: "Fred Wilson Horch" <fhorch@ecoaccess.org> To: "Chris McDonough" <chrism@digicool.com> Cc: <zope-dev@zope.org> Sent: Sunday, March 11, 2001 11:30 PM Subject: Re: [Zope-dev] FTP interface being worked on?
Hi Chris,
You wrote in part:
The "export as files" paradigm is something we'd really like to see soon in Zope. [...] I'd be interested in seeing your proposal too.
Great to hear we're thinking alike. My proposals are available on our SourceForge site (sorry for the long URL -- I can send the proposals as an attachment if you'd prefer):
FTP proposal
http://cvs.sourceforge.net/cgi-bin/cvsweb.cgi/proposals/ftp_access/ftp_access_prop.txt?rev=1.2&content-type=text/x-cvsweb-markup&cvsroot=ecoaccess
XML-RPC proposal (export as files)
http://cvs.sourceforge.net/cgi-bin/cvsweb.cgi/proposals/xml_rpc/xml_rpc_prop.txt?rev=1.1&content-type=text/x-cvsweb-markup&cvsroot=ecoaccess
The best place for these sorts of things is at http://dev.zope.org (the "fishbowl")...
Okay, I'll take a look and submit my proposals through the Fishbowl process.
Thanks, Fred
-- Fred Wilson Horch mailto:fhorch@ecoaccess.org Executive Director, EcoAccess http://ecoaccess.org/ P.O. Box 2823, Durham, NC 27715-2823 phone: 919.419-8354
Hi Chris, Thanks for the pointers to the work others have done. You wrote in part:
Tres Seaver has done some work on this with his FSDump product (http://www.zope.org/Members/tseaver/FSDump), although it only goes "one way" at the moment, and Steve Spicklemire has gone a slightly different route with his ZCVSMixin product (http://www.zope.org/Members/sspickle/ZCVSMixin/).
I will take a look at these. I see they are both Zope Products. I had not planned to write a Product, but maybe I should reconsider. For the FTP interface, I had planned to hack on the Zope internals directly. And for the XML-RPC interface, I had planned to write a separate client that could leverage the XML-RPC support already built into Zope.
I have a proposal up on the Digital Creations intranet which makes the proposal to leave serialization format up to each object, and gives some info about possible implementation strategies.
Get that proposal in the Fishbowl! ;-)
I wonder if yet another interface is really required. If you think about it, isn't the FTP interface basically a file system serialization format? All objects already support the FTP interface -- if we improve it, then conceivably we can use standard FTP mirroring tools for filesystem export and import.
Another serialized format that all Zope objects support is the XML interface, which exposes all the objects' guts. With XML-RPC I envisioned being able to improve on the FTP interface by adding things like md5 checksums to determine if the local and remote objects are in synch. I haven't looked too deeply, but presumably via XML you could support all of the management functionality that is currently provided by the HTML management interface. So you could build a client with a rich feature set for managing Zope objects.
I understand your point about having each object's serialization "look like" that kind of object, but isn't there also some value in the consistency of XML representing every kind of object? For automated tools, it seems like an XML representation is a great idea, and one that could be exploited with a good client-side tool that understands the Zope ODB DTD.
So I basically see three interfaces as necessary and sufficient:
1) XHTML - gets you started, can manage things with a browser
2) FTP - serialization to and from a filesystem
3) XML - the advanced management interface, easy to automate
I don't know much about WebDAV -- since we're a volunteer organization, we are using free software where possible and I haven't seen much free software that supports WebDAV. cadaver seems to work fine with Zope. But I can easily see the combination of FTP + CVS providing us everything we need. So in some ways WebDAV seems like an extra that will be nice if and when there are clients that support it.
I hope this email serves as a sort of overview about what we want to do about the problem at DC... it'd be great to be able to conserve resources and work on the same problem together.
Absolutely! We liked your Fishbowl process so much we are basing our own development process on it. (For details of our process, check out http://cvs.sourceforge.net/cgi-bin/cvsweb.cgi/docs/tech-process.txt?rev=1.2&... ) -- Fred Wilson Horch mailto:fhorch@ecoaccess.org Executive Director, EcoAccess http://ecoaccess.org/ P.O. Box 2823, Durham, NC 27715-2823 phone: 919.419-8354
On Mon, 12 Mar 2001, Fred Wilson Horch wrote:
Absolutely! We liked your Fishbowl process so much we are basing our own development process on it. (For details of our process, check out http://cvs.sourceforge.net/cgi-bin/cvsweb.cgi/docs/tech-process.txt?rev=1.2&... )
You may also find our documentation process interesting: http://www.zope.org/DocProjects/intro -Michel
You may also find our documentation process interesting:
Yes, very interesting! But I'm sorry to see that the Developer's Guide is only in the planning stages. Here is some info that should go into it (from our Zope notes at http://cvs.sourceforge.net/cgi-bin/cvsweb.cgi/docs/notes-zope.txt?rev=1.8&co...):
This section explains the directory structure you'll find when you unpack the Zope 2.3.1b1 source tar file.
Root directory:
  Extensions/   code for External Methods go in this directory
  ZServer/      python code for ZServer and medusa (see the README.txt)
  doc/          documentation (especially INSTALL.txt)
  import/       used by the running Zope process to import objects into the ZODB
  inst/         installation scripts
  lib/          python library (most of Zope's code is under here)
  pcgi/         C and python code for PCGI (see the README)
  utilities/    random utilities (see the README.txt)
  var/          contains the FileStorage for the ZODB (Data.fs) and various other files (logs, pids, etc.)
  LICENSE.txt   Zope Public License (ZPL) Version 1.0
  README.txt    general information about the Zope source release
  w_pcgi.py     scripts for setting up Zope with PCGI
  wo_pcgi.py    without PCGI
  z2.py         the start script for Zope
  zpasswd.py    create or change the Zope emergency account and password
lib:
  Components    python extension modules written in C - BTree, ExtensionClass, cPickle, zlib
  python        everything else
lib/python:
  Most of the Zope code is in here.
-- Fred Wilson Horch mailto:fhorch@ecoaccess.org Executive Director, EcoAccess http://ecoaccess.org/ P.O. Box 2823, Durham, NC 27715-2823 phone: 919.419-8354
On Mon, 12 Mar 2001, Fred Wilson Horch wrote:
You may also find our documentation process interesting:
Yes, very interesting!
But I'm sorry to see that the Developer's Guide is only in the planning stages. Here is some info that should go into it (from our Zope notes at http://cvs.sourceforge.net/cgi-bin/cvsweb.cgi/docs/notes-zope.txt?rev=1.8&co...):
This information should actually go in the administrator's guide: http://sourceforge.net/projects/zope-admin -Michel
I had not planned to write a Product, but maybe I should reconsider. For the FTP interface, I had planned to hack on the Zope internals directly. And for the XML-RPC interface, I had planned to write a separate client that could leverage the XML-RPC support already built into Zope.
It's possible to use Products for all sorts of stuff, even overriding methods and attributes of objects in the core (this is how Hotfixes work). It's hard to maintain a product that does this though.
I have a proposal up on the Digital Creations intranet which makes the proposal to leave serialization format up to each object, and gives some info about possible implementation strategies.
Get that proposal in the Fishbowl! ;-)
Yup.
I wonder if yet another interface is really required. If you think about it, isn't the FTP interface basically a file system serialization format? All objects already support the FTP interface -- if we improve it, then conceivably we can use standard FTP mirroring tools for filesystem export and import.
Yes! Take for example PythonScripts.. they already support such a serialization for their content... even though there's info in the bindings tab and the arguments list, it's placed into the serialization in a reasonable way. This is the sort of thing I'd like to see generalized for all basic Zope objects, but with better functionality for security settings and properties. It's probably not even fair to call this kind of serialization "filesystem serialization" because it's the sort of representation of objects that can be used by FTP, WebDAV, etc. It's just a human-readable representation of Zope objects that fits into a filesystem-like model that attempts to preserve most object information (although there's no guarantee that it won't be lossy).
Another serialized format that all Zope objects support is the XML interface, which exposes all the objects' guts. With XML-RPC I envisioned being able to improve on the FTP interface by adding things like md5 checksums to determine if the local and remote objects are in synch. I haven't looked too deeply, but presumably via XML you could support all of the management functionality that is currently provided by the HTML management interface. So you could build a client with a rich feature set for managing Zope objects.
I understand your point about having each object's serialization "look like" that kind of object, but isn't there also some value in the consistency of XML representing every kind of object? For automated tools, it seems like an XML representation is a great idea, and one that could be exploited with a good client-side tool that understands the Zope ODB DTD.
Yes, and this is great for XML export. But I see filesystem serialization and XML export as different things. Zope already has a little-known XML format for representing python objects ("ppml", Python Pickle Markup Language), which is the format which XML exports are done in. But when developers work with filesystem reps of objects, I'd hate for them to have to work with it.
So I basically see three interfaces as necessary and sufficient:
1) XHTML - gets you started, can manage things with a browser
2) FTP - serialization to and from a filesystem
3) XML - the advanced management interface, easy to automate
I don't know much about WebDAV -- since we're a volunteer organization, we are using free software where possible and I haven't seen much free software that supports WebDAV. cadaver seems to work fine with Zope. But I can easily see the combination of FTP + CVS providing us everything we need. So in some ways WebDAV seems like an extra that will be nice if and when there are clients that support it.
I hope this email serves as a sort of overview about what we want to do about the problem at DC... it'd be great to be able to conserve resources and work on the same problem together.
Absolutely! We liked your Fishbowl process so much we are basing our own development process on it. (For details of our process, check out
http://cvs.sourceforge.net/cgi-bin/cvsweb.cgi/docs/tech-process.txt?rev=1.2&content-type=text/x-cvsweb-markup&cvsroot=ecoaccess
)
Wow! Looks like you're planning ahead. I probably won't be available for a little while (I'm off this week), but hopefully I can get that proposal cleaned up and on the fishbowl and we can resume this discussion in that Wiki. Thanks! - C
I hope folks don't mind if I resume the object serialization thread on the mailing list. Chris McDonough wrote:
I wonder if yet another interface is really required. If you think about it, isn't the FTP interface basically a file system serialization format?
Yes! [...] It's probably not even fair to call this kind of serialization "filesystem serialization" because it's the sort of representation of objects that can be used by FTP, WebDAV, etc. It's just a human-readable representation of Zope objects that fits into a filesystem-like model that attempts to preserve most object information (although there's no guarantee that it won't be lossy).
The "no guarantee" lossy part bothers me. For our purposes, we'd like to see lossless serialization that provides full control over objects through FTP, WebDAV, etc. Lossy serialization will cause problems for round tripping objects, i.e. getting them out of the object database, updating them, then putting them back in. One of our goals is to be able to use CVS to track our updates and distribute our object database. We definitely do not want to be losing information through serialization.
I understand your point about having each object's serialization "look like" that kind of object, but isn't there also some value in the consistency of XML representing every kind of object? For automated tools, it seems like an XML representation is a great idea, and one that could be exploited with a good client-side tool that understands the Zope ODB DTD.
Yes, and this is great for XML export. But I see filesystem serialization and XML export as different things.
No disagreement here. I wouldn't want to have to deal with the XML representation when I'm using an FTP or WebDAV client.
Zope already has a little-known XML format for representing python objects ("ppml", Python Pickle Markup Language), which is the format which XML exports are done in. But when developers work with filesystem reps of objects, I'd hate for them to have to work with it.
Good point. So the XML format stays monolithic (i.e., an XML export of the root object creates one big file, not a directory full of sub-directories and files representing objects) and when you want to deal with files and directories you don't get the XML format, you get something else. That means that each object needs to support two serialization formats: XML and the "filesystem serialization" format.
So I basically see three interfaces as necessary and sufficient:
1) XHTML - gets you started, can manage things with a browser
2) FTP - serialization to and from a filesystem
3) XML - the advanced management interface, easy to automate
To elaborate, first, the existing FTP serialization format could be enhanced to be this "filesystem serialization" format. Second, the XML serialization format could be the basis for some sophisticated client side management tools based on XML-RPC. Unlike the existing HTML (or XHTML) client side management interface, an XML management interface could leverage XML libraries for parsing serialized object data, and for communicating with Zope via XML-RPC.
Wow! Looks like you're planning ahead. I probably won't be available for a little while (I'm off this week), but hopefully I can get that proposal cleaned up and on the fishbowl and we can resume this discussion in that Wiki.
Okay, I'll try to deal with the Wiki. But I have to admit that I find the Wiki interface painful. Is it okay to keep using the mailing list for discussions like this? I assume the keeper of the Wiki can copy and paste useful bits into the Wiki as the mood strikes them. Fred -- Fred Wilson Horch mailto:fhorch@ecoaccess.org Executive Director, EcoAccess http://ecoaccess.org/ P.O. Box 2823, Durham, NC 27715-2823 phone: 919.419-8354
* Fred Wilson Horch (fhorch@ecoaccess.org) [010312 10:27]:
Another serialized format that all Zope objects support is the XML interface, which exposes all the objects' guts. With XML-RPC I envisioned being able to improve on the FTP interface by adding things like md5 checksums to determine if the local and remote objects are in synch. I haven't looked too deeply, but presumably via XML you could support all of the management functionality that is currently provided by the HTML management interface. So you could build a client with a rich feature set for managing Zope objects.
An idea might be to just implement the rsync protocol, which does a lot of this already. Ciao! -- "Laugh while you can, monkey boy!" --Dr. Emilio Lizardo (Adventures of Buckaroo Banzai) The Doctor What: <fill in the blank> http://docwhat.gerf.org/ docwhat@gerf.org KF6VNC
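As a rough illustration of the md5 checksum idea quoted above, the sketch below hashes every file under a local directory and compares the result with a mapping of remote digests. How the remote digests would be obtained (XML-RPC, FTP, or something rsync-like) is deliberately left out; the function names and the shape of the `remote` mapping are assumptions made for this example, not part of Zope.

```python
import hashlib
from pathlib import Path


def md5_of_bytes(data):
    """Return the hex md5 digest of a bytes object."""
    return hashlib.md5(data).hexdigest()


def local_digests(root):
    """Map each file's path (relative to root, '/'-separated) to its md5."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): md5_of_bytes(path.read_bytes())
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def out_of_sync(local, remote):
    """List paths whose digests differ or that exist on one side only.

    `remote` is assumed to be a {path: md5} mapping fetched from the
    server by some means not shown here.
    """
    paths = set(local) | set(remote)
    return sorted(p for p in paths if local.get(p) != remote.get(p))
```

A client would only need to transfer the paths returned by `out_of_sync`, which is the saving the checksum comparison is meant to buy.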
Fred, I've put the proposal up at http://dev.zope.org/Wikis/DevSite/Proposals/RepresentingObjectsOnTheFilesystem. Let me know what you think! - C ----- Original Message ----- From: "Fred Wilson Horch" <fhorch@ecoaccess.org> To: "Chris McDonough" <chrism@digicool.com> Cc: <zope-dev@zope.org> Sent: Monday, March 12, 2001 11:32 AM Subject: Re: [Zope-dev] FTP interface being worked on?
Hi Chris,
Thanks for the pointers to the work others have done. You wrote in part:
Tres Seaver has done some work on this with his FSDump product (http://www.zope.org/Members/tseaver/FSDump), although it only goes "one way" at the moment, and Steve Spicklemire has gone a slightly different route with his ZCVSMixin product (http://www.zope.org/Members/sspickle/ZCVSMixin/).
I will take a look at these. I see they are both Zope Products.
I had not planned to write a Product, but maybe I should reconsider. For the FTP interface, I had planned to hack on the Zope internals directly. And for the XML-RPC interface, I had planned to write a separate client that could leverage the XML-RPC support already built into Zope.
I have a proposal up on the Digital Creations intranet which makes the proposal to leave serialization format up to each object, and gives some info about possible implementation strategies.
Get that proposal in the Fishbowl! ;-)
I wonder if yet another interface is really required. If you think about it, isn't the FTP interface basically a file system serialization format? All objects already support the FTP interface -- if we improve it, then conceivably we can use standard FTP mirroring tools for filesystem export and import.
Another serialized format that all Zope objects support is the XML interface, which exposes all the objects' guts. With XML-RPC I envisioned being able to improve on the FTP interface by adding things like md5 checksums to determine if the local and remote objects are in synch. I haven't looked too deeply, but presumably via XML you could support all of the management functionality that is currently provided by the HTML management interface. So you could build a client with a rich feature set for managing Zope objects.
I understand your point about having each object's serialization "look like" that kind of object, but isn't there also some value in the consistency of XML representing every kind of object? For automated tools, it seems like an XML representation is a great idea, and one that could be exploited with a good client-side tool that understands the Zope ODB DTD.
So I basically see three interfaces as necessary and sufficient:
1) XHTML - gets you started, can manage things with a browser
2) FTP - serialization to and from a filesystem
3) XML - the advanced management interface, easy to automate
I don't know much about WebDAV -- since we're a volunteer organization, we are using free software where possible and I haven't seen much free software that supports WebDAV. cadaver seems to work fine with Zope. But I can easily see the combination of FTP + CVS providing us everything we need. So in some ways WebDAV seems like an extra that will be nice if and when there are clients that support it.
I hope this email serves as a sort of overview about what we want to do about the problem at DC... it'd be great to be able to conserve resources and work on the same problem together.
Absolutely! We liked your Fishbowl process so much we are basing our own development process on it. (For details of our process, check out
http://cvs.sourceforge.net/cgi-bin/cvsweb.cgi/docs/tech-process.txt?rev=1.2&content-type=text/x-cvsweb-markup&cvsroot=ecoaccess
) -- Fred Wilson Horch mailto:fhorch@ecoaccess.org Executive Director, EcoAccess http://ecoaccess.org/ P.O. Box 2823, Durham, NC 27715-2823 phone: 919.419-8354
Hi again, I'm commenting by e-mail because the Wiki interface is too horrible for me to face on a Saturday night when I should be doing other things. ;-) Chris McDonough wrote:
I've put the proposal up at http://dev.zope.org/Wikis/DevSite/Proposals/RepresentingObjectsOnTheFilesystem.
Let me know what you think!
This is a great start. My major question is I don't understand the design decision to allow lossy representation. Seems to me this is a recipe for disaster. You aren't working with Microsoft on this one, are you? ;-) Will there be some undocumented API call that only Zope employees know about to get the serialized lossless representation? ;-) The proposal states in part:
"Lossless" general serialization is not an explicit goal. If a developer wishes to make his or her object serializable to a directory structure, he or she will need to implement methods of an API on the object instance which allow it to be represented adequately enough to be reconstructable into its original Python representation when it's "imported". If this API is not implemented by the developer, the result is undefined
I think lossless serialization should be an explicit goal. If a developer doesn't provide specific object serialization methods, then a default method (perhaps XML) should be invoked that is lossless even if hard to work with. The worst thing you can do is make some things hidden in the ZODB and only available through a certain interface. The whole point for us is to get full control of our objects through CVS. I need to get started on something for our project so we can manage our objects via CVS by the beginning of May. I have taken a look at http://www.zope.org/Members/tseaver/FSDump and http://www.zope.org/Members/sspickle/ZCVSMixin. Can anyone tell me where my effort would best be spent? Would it be best for me to start with FSDump or ZCVSMixin and corrupt them to serve my evil plans, or should I start from scratch based on the object serialization discussion we've been having (but with the explicit goal of lossless serialization, unlike Chris' proposal)? To understand what I want to do, you can read my two project proposals at http://cvs.sourceforge.net/cgi-bin/cvsweb.cgi/proposals/ftp_access/?cvsroot=... http://cvs.sourceforge.net/cgi-bin/cvsweb.cgi/proposals/xml_rpc/?cvsroot=eco... Thanks in advance for any advice! Fred -- Fred Wilson Horch mailto:fhorch@ecoaccess.org Executive Director, EcoAccess http://ecoaccess.org/ P.O. Box 2823, Durham, NC 27715-2823 phone: 919.419-8354
--On Saturday, March 17, 2001 08:46:26 PM -0500 Fred Wilson Horch <fhorch@ecoaccess.org> wrote:
I think lossless serialization should be an explicit goal. If a developer doesn't provide specific object serialization methods, then a default method (perhaps XML) should be invoked that is lossless even if hard to work with.
I'm not sure what the caveats were that led to the non-lossless guarantee. Think of the filesystem representation of a ZCatalog. What is lossless vs. non-lossless? If the filesystem representation dumps everything required to recreate a working copy of the catalog after a (perhaps lengthy) computation but doesn't actually dump the full current contents is that a lossless representation?
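To make the catalog question concrete, here is a toy sketch in which the dump records only which paths are cataloged, and the index itself is recomputed on load. The dict-based "catalog" and the `build_index`, `dump_catalog`, and `load_catalog` names are stand-ins invented for this illustration; they are not ZCatalog's API or data model.

```python
def build_index(docs):
    """Build a word -> set-of-paths inverted index (the 'lengthy computation')."""
    index = {}
    for path, text in docs.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(path)
    return index


def dump_catalog(catalog):
    """Write only what is needed to rebuild: the cataloged paths, one per line."""
    return "\n".join(sorted(catalog["paths"]))


def load_catalog(text, content):
    """Recreate the catalog by re-indexing the listed paths from `content`.

    `content` is assumed to map each path to the text of the object
    stored there.
    """
    paths = text.split("\n") if text else []
    docs = {path: content[path] for path in paths}
    return {"paths": set(paths), "index": build_index(docs)}
```

The dump never contains the index, yet the recreated catalog compares equal to the original as long as the underlying content is unchanged.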
The whole point for us is to get full control of our objects through CVS.
And grep and emacs, etc. At least for us. This is really the big issue. If all you need is CVS, a "morally binary" XML representation can do. Zope already provides one, though it is not ideal for CVS. If you want to be able to use other file system based tools (a.k.a. normal development tools) then you need a representation much closer to normal text. It's almost obvious what this should be for folders, DTML, ZSQL, PythonScripts, etc. It's much less obvious what this should be for ZCatalogs, Racks (yeah DC probably doesn't care but I do), ZClasses, etc. It may be hard to come up with something better than XML pickles, which I agree should probably be the default if nothing better is specified. Then there is metadata. That leads into your next question:
Can anyone tell me where my effort would best be spent? Would it be best for me to start with FSDump or ZCVSMixin and corrupt them to serve my evil plans, or should I start from scratch based on the object serialization discussion we've been having (but with the explicit goal of lossless serialization, unlike Chris' proposal)?
The difference is that ZCVSMixin reads and writes XML pickles because capturing all metadata was a major goal. We can't live with the extreme unfriendliness of XML pickles to other tools. FSDump tries to capture all metadata explicitly in ".props" files. I suspect that it is much closer to the eventual file system representation of Chris' proposal. FSDump has no read capability. At IPC9, someone from DC told me that Tres was worried that read capability would be a giant security hole. I can't remember if that someone was Tres or not. IMHO, the solution to this probably involves forcing read to be invoked only from outside of Zope (or maybe only from a local machine login?). I'm not sure how this would be done.
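The layout discussed in this thread, a content file that looks like the object plus a separate metadata file tied to it by a naming convention, might look roughly like the sketch below. The `key=value` line format, the `dump_object` and `load_object` names, and the assumption that property values are single-line strings are all invented for illustration; this is not FSDump's actual ".props" format.

```python
from pathlib import Path

PROPS_SUFFIX = ".props"  # sidecar naming convention assumed for this sketch


def dump_object(directory, name, body, props):
    """Write the body as a plain file and the properties as a sidecar file."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    (directory / name).write_text(body, encoding="utf-8")
    lines = [f"{key}={value}" for key, value in sorted(props.items())]
    (directory / (name + PROPS_SUFFIX)).write_text("\n".join(lines), encoding="utf-8")


def load_object(directory, name):
    """Read the body and sidecar back; property values come back as strings."""
    directory = Path(directory)
    body = (directory / name).read_text(encoding="utf-8")
    props = {}
    sidecar = (directory / (name + PROPS_SUFFIX)).read_text(encoding="utf-8")
    for line in sidecar.splitlines():
        key, _, value = line.partition("=")
        props[key] = value
    return body, props
```

Because the body file holds nothing but the content, grep, emacs, and CVS diffs work on it directly, while the sidecar carries whatever the content file cannot express.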
"Dan L. Pierson" wrote in part:
I think lossless serialization should be an explicit goal.
What is lossless vs. non-lossless? If the filesystem representation dumps everything required to recreate a working copy of the catalog after a (perhaps lengthy) computation but doesn't actually dump the full current contents is that a lossless representation?
Yes, in my book (as long as original and recreated copies of the catalog are functionally identical). I'm using lossless in the sense it is used in the compression field. If you can recreate the same objects from the representation (even if it requires several computational steps) then the representation is "lossless". A "lossy" representation would mean that you lose some piece of information that is essential to getting back to the original state of the object database. For images, JPEG is a lossy compression scheme, which means it is one way. If you convert a TIFF to JPEG, then you can't go back to the exact same TIFF. By contrast, PNG is lossless. You can convert from TIFF to PNG and back to TIFF and get the exact same TIFF. My concern is that if the plain text serialized format is lossy, it will be one way only. That is not good for us. To preserve the round trip ability, we need a lossless representation.
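The round-trip criterion described above can be stated as a small check: a representation is lossless if loading what was dumped gives back an equal object. The sketch below uses a plain dict as a stand-in for a Zope object and JSON as a stand-in lossless format; both choices, and all function names, are assumptions for illustration only.

```python
import json


def dump_lossless(obj):
    """Serialize every field, so nothing is dropped."""
    return json.dumps(obj, sort_keys=True)


def load_lossless(text):
    return json.loads(text)


def dump_lossy(obj):
    """Keep only the body: easy to edit, but title and permissions are gone."""
    return obj["body"]


def load_lossy(text):
    return {"body": text}


def round_trips(obj, dump, load):
    """True if dumping and then loading reproduces an equal object."""
    return load(dump(obj)) == obj
```

With an object such as `{"body": "<html></html>", "title": "Home", "permissions": ["View"]}`, the first pair passes `round_trips` and the second does not, which is the "one way only" problem with a lossy format.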
The whole point for us is to get full control of our objects through CVS.
And grep and emacs, etc.
Yes. CVS is the principal driver for us, but grep and emacs are important too.
If all you need is CVS, a "morally binary" XML representation can do. Zope already provides one, though it is not ideal for CVS. If you want to be able to use other file system based tools (a.k.a. normal development tools) then you need a representation much closer to normal text. It's almost obvious what this should be for folders, DTML, ZSQL, PythonScripts, etc. It's much less obvious what this should be for ZCatalogs, Racks (yeah DC probably doesn't care but I do), ZClasses, etc.
Good points. I'm eager to hear from anyone else with a perspective on this before I start coding something up. Thanks, Fred -- Fred Wilson Horch mailto:fhorch@ecoaccess.org Executive Director, EcoAccess http://ecoaccess.org/ P.O. Box 2823, Durham, NC 27715-2823 phone: 919.419-8354
On Sun, 18 Mar 2001, Dan L. Pierson wrote:
representation of Chris' proposal. FSDump has no read capability. At IPC9, someone from DC told me that Tres was worried that read capability would be a giant security hole. I can't remember if that someone was Tres or not. IMHO, the solution to this probably involves forcing read to be invoked only from outside of Zope (or maybe only from a local machine login?). I'm not sure how this would be done.
Presumably the issue here is the one that results in 'import' only working on files stored in the host file system (i.e., you need file system privileges in the Zope directory to import zexp pickles or XML pickles). A file-system-serialized representation has the additional advantage over XML pickles that it can be re-parsed and have the security rules applied on read. This, however, means that XML as the default for objects that don't explicitly implement the file-system-serialize API is probably not secure. For CVS, an XML default would be good. For round trip editing using "standard tools", an XML default would not be good. So I think XML should be the default for write, but there should be no default for read. --RDM
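The "default for write, no default for read" rule could look roughly like the sketch below. Everything here is hypothetical: `READERS`, `register_reader`, `read_object` and `parse_dtml` are names invented for illustration, not part of Zope.

```python
# Hypothetical registry: meta_type -> parser function. Not a Zope API.
READERS = {}

def register_reader(meta_type, parser):
    READERS[meta_type] = parser

def read_object(meta_type, text):
    """Parse text for a known type; refuse anything without a reader."""
    try:
        parser = READERS[meta_type]
    except KeyError:
        raise ValueError(
            "no reader registered for %r; refusing to import" % meta_type)
    return parser(text)

def parse_dtml(text):
    # A real parser would validate the text and apply security checks here.
    return {"meta_type": "DTML Method", "body": text}

register_reader("DTML Method", parse_dtml)
```

Because there is no fallback parser, a type that only has the opaque default representation cannot be read back through this path at all, which is the point of the rule.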
On Sat, 17 Mar 2001 20:46:26 -0500 Fred Wilson Horch <fhorch@ecoaccess.org> wrote:
This is a great start.
My major question is I don't understand the design decision to allow lossy representation.
Seems to me this is a recipe for disaster. You aren't working with Microsoft on this one, are you? ;-) Will there be some undocumented API call that only Zope employees know about to get the serialized lossless representation? ;-)
No less documented than it already is. ;-) Zope has no employees.
The proposal states in part:
"Lossless" general serialization is not an explicit goal. If a developer wishes to make his or her object serializable to a directory structure, he or she will need to implement methods of an API on the object instance which allow it to be represented adequately enough to be reconstructable into its original Python representation when it's "imported". If this API is not implemented by the developer, the result is undefined
I think lossless serialization should be an explicit goal. If a developer doesn't provide specific object serialization methods, then a default method (perhaps XML) should be invoked that is lossless even if hard to work with.
I think the proposal says something like this.
The whole point for us is to get full control of our objects through CVS.
That's one use, which is important to you. Another is to use Emacs or Dreamweaver on a representation of, for example, DTML methods on a filesystem, which is important to other folks. The point of having potentially "lossy" representation of objects is to make it easier to work with them in these kinds of tools. Nobody wants to edit XML, AFAICT. "Potentially lossy" also doesn't mean "leaky". It just means that folks who expose their objects to this sort of serialization can choose their own format, and if it represents the object adequately for their own use in both directions, it's good enough. If you want a lossless "morally binary" representation, it's probably best to use XML export, which is great for your purposes, because it already exists! ;-)
Chris McDonough wrote:
Fred Horch wrote:
My major question is I don't understand the design decision to allow lossy representation. [...]
I think lossless serialization should be an explicit goal. If a developer doesn't provide specific object serialization methods, then a default method (perhaps XML) should be invoked that is lossless even if hard to work with.
I think the proposal says something like this.
The proposal states in part: If this API is not implemented by the developer, the result is undefined (XML pickle representation if allowed by the object on a per-object basis?). I guess I'm voting to rewrite this sentence: If this API is not implemented by the developer, the result is a default serialized representation (perhaps XML pickle) on a per-object basis.
The whole point for us is to get full control of our objects through CVS.
That's one use, which is important to you. Another is to use Emacs or Dreamweaver on a representation of, for example, DTML methods on a filesystem, which is important to other folks.
The point of having potentially "lossy" representation of objects is to make it easier to work with them in these kinds of tools. Nobody wants to edit XML, AFAICT.
I see. I agree with the goal to have a representation that is easy to work with in emacs. Would it be possible to have a "lossless" representation that is also easy to work with? The current XML export format is "lossless" but hard to work with.
"Potentially lossy" also doesn't mean "leaky". It just means that folks who expose their objects to this sort of serialization can choose their own format, and if it represents the object adequately for their own use in both directions, it's good enough.
Maybe the issue is semantics. I think "potentially lossy" == "potentially leaky". Even a small leak would cause problems for us. Maybe it wouldn't cause problems for others. But it sure seems like it would be possible to create a solution that works for everyone. Namely, a lossless representation that is easy to work with. I completely agree with the point that we want to be able to work with representations of objects using tools like emacs and dreamweaver. In fact, we'd like to use emacs as our front end to CVS. The ideal situation would be to edit Zope objects in emacs, publish them to a Zope object database, test them (perhaps using a separate web browser like Netscape or Internet Explorer), and then once everything is working, commit the objects to a CVS repository (using emacs or from the command line).
If you want a lossless "morally binary" representation, it's probably best to use XML export, which is great for your purposes, because it already exists! ;-)
I've heard it said that all progress is due to the unreasonable man. So to do my part for progress, what I want is a lossless "morally plain text" representation. ;-) If the existing XML export really was great for our purposes, I'd be done! The problem is that everything comes out in one big monolithic file. That's not good for project management using CVS. (At least, as far as I can tell.) I think there is the potential for a really good solution that solves our need to manage our project via CVS, and to solve the greater need to enhance the Zope management interface to support tools that work with filesystem objects. I am about to jump into this project next week. I do want to stay in touch with anyone who is working on similar projects, so please keep in touch! I will post reports as I make progress in case anyone is interested. Thanks, Fred -- Fred Wilson Horch mailto:fhorch@ecoaccess.org Executive Director, EcoAccess http://ecoaccess.org/ P.O. Box 2823, Durham, NC 27715-2823 phone: 919.419-8354
Fred Wilson Horch wrote:
Chris McDonough wrote:
Fred Horch wrote:
My major question is I don't understand the design decision to allow lossy representation. [...]
I think lossless serialization should be an explicit goal. If a developer doesn't provide specific object serialization methods, then a default method (perhaps XML) should be invoked that is lossless even if hard to work with.
I think the proposal says something like this.
The proposal states in part:
If this API is not implemented by the developer, the result is undefined (XML pickle representation if allowed by the object on a per-object basis?).
I guess I'm voting to rewrite this sentence:
If this API is not implemented by the developer, the result is a default serialized representation (perhaps XML pickle) on a per-object basis
The whole point for us is to get full control of our objects through CVS.
That's one use, which is important to you. Another is to use Emacs or Dreamweaver on a representation of, for example, DTML methods on a filesystem, which is important to other folks.
The point of having potentially "lossy" representation of objects is to make it easier to work with them in these kinds of tools. Nobody wants to edit XML, AFAICT.
I see. I agree with the goal to have a representation that is easy to work with in emacs.
Would it be possible to have a "lossless" representation that is also easy to work with?
The current XML export format is "lossless" but hard to work with.
"Potentially lossy" also doesn't mean "leaky". It just means that folks who expose their objects to this sort of serialization can choose their own format, and if it represents the object adequately for their own use in both directions, it's good enough.
Maybe the issue is semantics. I think "potentially lossy" == "potentially leaky". Even a small leak would cause problems for us. Maybe it wouldn't cause problems for others. But it sure seems like it would be possible to create a solution that works for everyone. Namely, a lossless representation that is easy to work with.
I'd first like to say that I applaud the goal stated in the previous line! I think there are two key problems with achieving it.

1) Because everyone writing extensions for Zope can define their own data structures, it makes it very difficult to store them anywhere but an object database. I think this problem has nearly the same complexity as figuring out the RDBMS table structures necessary for all the Products and builtin objects in Zope...

2) A lesser problem is when trying to edit the serialized "files". Because objects are methods and state, how you modify an object can be guided if not controlled. When we have serialized the objects in a Zope system to files, we have exported only the state of the objects in the ZODB. We then have to live with the ability to foul up invariants across many objects by changing some data in the serialized format. A good example would be ZCatalogs. When some piece of data changes, the code can automatically call reindex(); if I'm editing a file I might not know that I need to change other files due to runtime dependencies. (I know that ZCatalog is a poor example because earlier in the thread cataloging was discussed wrt lossy/lossless behavior, but it was the easiest for me to make my point with.)

Having said that, I suspect that a few systems could solve the first problem. (I don't know how to solve the second one with serialized data...)

a) XML is structured enough that it can reliably hold the data from the ZODB. The current XML dump is not useful for this - it would need to create individual files and folders to represent containment.

b) A hybrid XML and custom dump solution. An Image for example could dump out as a binary image file with meta-data in a similarly named XML file.

c) A special file system - like ReiserFS might be that system. To my knowledge ReiserFS is eventually intended to be a melding of file system, relational db, and object db.
This is obviously serious paraphrasing, but I think it may have enough descriptive power to replace XML. I don't know how this would interact with CVS, or be edited, but I thought I'd mention it. None of these solve problem 2), but then again I don't think anything does. Thanks for listening, John
I completely agree with the point that we want to be able to work with representations of objects using tools like emacs and dreamweaver. In fact, we'd like to use emacs as our front end to CVS. The ideal situation would be to edit Zope objects in emacs, publish them to a Zope object database, test them (perhaps using a separate web browser like Netscape or Internet Explorer), and then once everything is working, commit the objects to a CVS repository (using emacs or from the command line).
If you want a lossless "morally binary" representation, it's probably best to use XML export, which is great for your purposes, because it already exists! ;-)
I've heard it said that all progress is due to the unreasonable man. So to do my part for progress, what I want is a lossless "morally plain text" representation. ;-)
If the existing XML export really was great for our purposes, I'd be done! The problem is that everything comes out in one big monolithic file. That's not good for project management using CVS. (At least, as far as I can tell.)
I think there is the potential for a really good solution that solves our need to manage our project via CVS, and to solve the greater need to enhance the Zope management interface to support tools that work with filesystem objects.
I am about to jump into this project next week. I do want to stay in touch with anyone who is working on similar projects, so please keep in touch! I will post reports as I make progress in case anyone is interested.
Thanks, Fred
-- . . . . . . . . . . . . . . . . . . . . . . . . John D. Heintz | Senior Engineer 1016 La Posada Dr. | Suite 240 | Austin TX 78752 T 512.633.1198 | jheintz@isogen.com w w w . d a t a c h a n n e l . c o m
Fred wrote:
I guess I'm voting to rewrite this sentence:
If this API is not implemented by the developer, the result is a default serialized representation (perhaps XML pickle) on a per-object basis
I think this makes sense.
Maybe the issue is semantics. I think "potentially lossy" == "potentially leaky". Even a small leak would cause problems for us. Maybe it wouldn't cause problems for others. But it sure seems like it would be possible to create a solution that works for everyone. Namely, a lossless representation that is easy to work with.
This is possible probably for many objects. DTML Methods, for instance, are basically just big bags of text with some security settings and other associated metadata. Recreating them losslessly from a filesystem representation is pretty easy. ZCatalogs, on the other hand, have lots of state, which is hard to adequately represent in anything but a morally binary representation. Maybe we won't even try to make it editable, and we'll choose to use XML for these. John wrote:
I'd first like to say that I applaud the goal stated in the previous line!
I think there are two key problems with achieving it. 1) Because everyone writing extensions for Zope can define their own data structures, it makes it very difficult to store them anywhere but an object database. I think this problem has nearly the same complexity as figuring out the RDBMS table structures necessary for all the Products and builtin objects in Zope...
Yes... luckily, because we have OO and polymorphism, we don't have to do this! ;-) I don't think it's reasonable or wise to impose any "master structure" for filesystem serialization of bodies of objects. Each instance (or perhaps each class) should define how best to serialize itself to disk. Representations between classes are likely to be radically different. A place for standardization is in the "properties" file(s) which accompany each object rep... this is likely to be XML or another structured variant.
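One way to picture "each class decides its own representation, with a default for everything else" is the sketch below. It is an illustration under assumptions: `Serializable` and `fs_serialize` are invented names, the two classes are plain stand-ins rather than the real Zope classes, and `pickle` stands in for the XML pickle default discussed in the thread.

```python
import pickle

class Serializable:
    """Hypothetical base class; the method name is invented for illustration."""

    def fs_serialize(self):
        # Default: an opaque, lossless "black box" (pickle stands in for
        # the XML pickle mentioned in the thread).
        return {"%s.pickle" % self.id: pickle.dumps(self.__dict__)}

class DTMLMethod(Serializable):
    def __init__(self, id, body, title=""):
        self.id, self.body, self.title = id, body, title

    def fs_serialize(self):
        # Editable body plus a separate, standardized properties file.
        props = "title: %s\n" % self.title
        return {"%s.dtml" % self.id: self.body.encode(),
                "%s.props" % self.id: props.encode()}

class Catalog(Serializable):
    def __init__(self, id, index):
        self.id, self.index = id, index

files = {}
for obj in (DTMLMethod("foo_page", "<dtml-var title>", "Foo"),
            Catalog("site_catalog", {"foo": [1, 2]})):
    files.update(obj.fs_serialize())
```

The caller never needs to know which classes override the method. The catalog stand-in falls through to the opaque default, while the DTML stand-in produces text that grep, emacs and CVS can work with.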
2) A lesser problem is when trying to edit the serialized "files". Because objects are methods and state how you modify an object can be guided if not controlled. When we have serialized the objects in a Zope system to files, we have exported only the state of the objects in the ZODB. We then have to live with the ability to foul up invariant across many objects by changing some data in the serialized format. A good example would be ZCatalogs. When some piece of data changes the code can automatically call reindex(), if I'm editing a file I might not know that I need to change other files due to runtime dependencies.
Yup... it's probably easiest to make ZCatalogs a black box.
a) XML is structured enough that it can reliably hold the data from the ZODB. The current XML dump is not useful for this - it would need to create individual files and folders to represent containment.
This is pretty easy right now. Ten lines of recursive code can walk the whole tree if necessary and export only leaf objects.
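A sketch of what such a recursive walk might look like. It does not use the Zope API: nested dicts stand in for folders, strings stand in for already-serialized leaf objects, and `export_tree` is a name invented here.

```python
import os
import tempfile

def export_tree(node, path):
    """Write folders as directories and leaf objects as files."""
    if isinstance(node, dict):          # a folder: recurse into children
        os.makedirs(path, exist_ok=True)
        for name, child in node.items():
            export_tree(child, os.path.join(path, name))
    else:                               # a leaf: write its serialized form
        with open(path, "w") as f:
            f.write(node)

# Nested dicts stand in for Zope folders; strings for serialized leaves.
site = {"index_html": "<dtml-var title>",
        "images": {"logo.xml": "<pickle/>"}}

root = os.path.join(tempfile.mkdtemp(), "site")
export_tree(site, root)
```

Containment maps onto directories and each leaf becomes one file, which is the layout CVS handles well.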
b) A hybrid XML and custom dump solution. An Image for example could dump out as a binary image file with meta-data in a similarly named XML file.
Yes, each object should make its own policy regarding its body. Its metadata format should be standardized, however.
Chris McDonough wrote:
Fred wrote:
I guess I'm voting to rewrite this sentence:
If this API is not implemented by the developer, the result is a default serialized representation (perhaps XML pickle) on a per-object basis
I think this makes sense.
Maybe the issue is semantics. I think "potentially lossy" == "potentially leaky". Even a small leak would cause problems for us. Maybe it wouldn't cause problems for others. But it sure seems like it would be possible to create a solution that works for everyone. Namely, a lossless representation that is easy to work with.
This is possible probably for many objects. DTML Methods, for instance, are basically just big bags of text with some security settings and other associated metadata. Recreating them losslessly from a filesystem representation is pretty easy. ZCatalogs, on the other hand, have lots of state, which is hard to adequately represent in anything but a morally binary representation. Maybe we won't even try to make it editable, and we'll choose to use XML for these.
John wrote:
I'd first like to say that I applaud the goal stated in the previous line!
I think there are two key problems with achieving it. 1) Because everyone writing extensions for Zope can define their own data structures, it makes it very difficult to store them anywhere but an object database. I think this problem has nearly the same complexity as figuring out the RDBMS table structures necessary for all the Products and builtin objects in Zope...
Yes... luckily, because we have OO and polymorphism, we don't have to do this! ;-)
I don't think it's reasonable or wise to impose any "master structure" for filesystem serialization of bodies of objects. Each instance (or perhaps each class) should define how best to serialize itself to disk. Representations between classes are likely to be radically different. A place for standardization is in the "properties" file(s) which accompany each object rep... this is likely to be XML or another structured variant.
My point was that serializing a Zope instance (class) in the general case is about as hard as mapping to tables. As an example look at the StructuredText format. This is a Zope class that has a very clear serialized format, but a lot of work went into defining it and being able to parse it. That amount of work may have to happen for each class to be useful in serial format. I think I might have missed some of your point though. If we standardize "properties" to an XML file, then optionally dump other files to expose specific aspects of an instance for serialized editing, it might not be as big a problem as I was thinking. I guess I would suggest that the serialized form of a Zope instance by default would be a single XML file, but that arbitrary sections of that XML file could be custom dumped to separate serialized files with similar names. That way authors would have a pretty easy job of overriding sections of the dump process to spit out one or more simple files that have little parsing overhead.
2) A lesser problem is when trying to edit the serialized "files". Because objects are methods and state how you modify an object can be guided if not controlled. When we have serialized the objects in a Zope system to files, we have exported only the state of the objects in the ZODB. We then have to live with the ability to foul up invariant across many objects by changing some data in the serialized format. A good example would be ZCatalogs. When some piece of data changes the code can automatically call reindex(), if I'm editing a file I might not know that I need to change other files due to runtime dependencies.
Yup... it's probably easiest to make ZCatalogs a black box.
Black box doesn't solve this problem, only the first one. Imagine that I move a serialized version of a Zope object that is indexed by an instance of ZCatalog (or many for that matter). When I move it, the ZCatalogs must be notified to handle the change, but only at import time because ZCatalogs are serialized as binary for lots of good reasons. Just listing that my example serialized file is used by some other objects doesn't help because ZCatalog may not refer directly to the object anyway. The editing and import process must work together to track changed files, moved files, and deleted files at a minimum. This may not be good enough, because the code written into a Zope Product may say that when property "x" is changed on these objects, the "foo" ZCatalog should be reindexed for that object. When I import the object from the serialized format all I can know is that something changed, but without expensive processing (XML diffing is hard in the general case, though we might be able to limit the structures to a manageable scope) we can't know that the "foo" ZCatalog should be updated instead of the "bar" ZCatalog.
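The minimum bookkeeping described above (changed, moved and deleted files) can be sketched by comparing two snapshots of the dump. This is an assumption-laden illustration: `manifest` and `diff` are invented names, content digests are one possible way to spot moves, and nothing here knows which catalog to reindex.

```python
import hashlib

def manifest(files):
    """Map each path to a digest of its content."""
    return {p: hashlib.sha1(data).hexdigest() for p, data in files.items()}

def diff(old, new):
    """Classify paths as added, deleted, changed, or moved."""
    added = set(new) - set(old)
    deleted = set(old) - set(new)
    changed = {p for p in set(old) & set(new) if old[p] != new[p]}
    # A deleted path whose digest reappears at an added path is a move.
    by_digest = {new[p]: p for p in added}
    moved = {p: by_digest[old[p]] for p in deleted if old[p] in by_digest}
    added -= set(moved.values())
    deleted -= set(moved)
    return added, deleted, changed, moved

before = manifest({"a.dtml": b"one", "b.dtml": b"two", "c.dtml": b"three"})
after = manifest({"a.dtml": b"ONE", "docs/b.dtml": b"two", "d.dtml": b"four"})
added, deleted, changed, moved = diff(before, after)
```

This tells the importer that something changed and where, which is exactly the limit noted above: it cannot say whether "foo" or "bar" needs updating without knowledge that lives in the Product's code.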
a) XML is structured enough that it can reliably hold the data from the ZODB. The current XML dump is not useful for this - it would need to create individual files and folders to represent containment.
This is pretty easy right now. Ten lines of recursive code can walk the whole tree if necessary and export only leaf objects.
b) A hybrid XML and custom dump solution. An Image for example could dump out as a binary image file with meta-data in a similarly named XML file.
Yes, each object should make its own policy regarding its body. Its metadata format should be standardized, however.
_______________________________________________ Zope-Dev maillist - Zope-Dev@zope.org http://lists.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - http://lists.zope.org/mailman/listinfo/zope-announce http://lists.zope.org/mailman/listinfo/zope )
-- . . . . . . . . . . . . . . . . . . . . . . . . John D. Heintz | Senior Engineer 1016 La Posada Dr. | Suite 240 | Austin TX 78752 T 512.633.1198 | jheintz@isogen.com w w w . d a t a c h a n n e l . c o m
I hadn't thought of the issues you raise. Thanks for mentioning them. "John D. Heintz" wrote in part:
If we standardize "properties" to an XML file, then optionally dump other files to expose specific aspects of an instance for serialized editing it might not be as big a problem as I was thinking.
I think that is the shared vision. Some aspects of each object could be serialized into a format that is easy to edit. For those aspects we leave it up to the developer of the object to write a serialization method -- we don't try to guess what an "easy to use" format would look like. Other aspects of objects might be impossible to serialize into a meaningful format. For those we have a default like XML pickle -- essentially a black box.
I guess I would suggest that the serialized form of a Zope instance by default would be a single XML file, but that arbitrary sections of that XML file could be custom dumped to separate serialized files with similar names. That way authors would have a pretty easy job of overriding sections of the dump process to spit out one or more simple files that have little parsing overhead.
Sounds reasonable.
2) A lesser problem is when trying to edit the serialized "files". Because objects are methods and state how you modify an object can be guided if not controlled. When we have serialized the objects in a Zope system to files, we have exported only the state of the objects in the ZODB. We then have to live with the ability to foul up invariant across many objects by changing some data in the serialized format. A good example would be ZCatalogs. [...]
Yup... it's probably easiest to make ZCatalogs a black box.
Black box doesn't solve this problem, only the first one. Imagine that I move a serialized version of a Zope object that is indexed by an instance of ZCatalog (or many for that matter). When I move it the ZCatalogs must be notified to handle the change, but only at import time because ZCatalogs are serialized as binary for lots of good reasons.
I see the problem. I think the example you give can be handled adequately at import time. But I can see other examples where allowing edits to the serialized representation could create problems that would be impossible to resolve at import. So it seems like we might want to make some things read only. That is, when you serialize the objects in the Zope ODB to a filesystem, some of those serialized files are read-only "black boxes". A comment in those files could let a developer know that to change the information in that file she needs to do an import, or edit the ODB directly.
When I import the object from the serialized format all I can know is that something changed, but without expensive processing (XML diffing is hard in the general case, though we might be able to limit the structures to a manageable scope) we can't know that the "foo" ZCatalog should be updated instead of the "bar" ZCatalog.
Seems like we will need to consider the import code very carefully. I don't know enough about how ZCatalog works to discuss the options intelligently. But in other indexing systems I have worked with, there have been solutions for reindexing when making updates to the corpus.
a) XML is structured enough that it can reliably hold the data from the ZODB. The current XML dump is not useful for this - it would need to create individual files and folders to represent containment.
This is pretty easy right now. Ten lines of recursive code can walk the whole tree if necessary and export only leaf objects.
Great. Maybe I am closer than I realize to the CVS management solution. I need to look more closely at ZCVSmixin to see what it does. But for our immediate need (which is to allow a distributed team of developers to share code and track changes via a central CVS repository), maybe it makes the most sense just to segment the existing XML export into directories and files and enhance the existing import to allow overwriting objects.
b) A hybrid XML and custom dump solution. An Image for example could dump out as a binary image file with meta-data in a similarly named XML file.
Yes, each object should make its own policy regarding its body. Its metadata format should be standardized, however.
I like this idea. After I have the XML export/import working in a way that fits better with CVS (even if the serialized representation is essentially a black box), then I can tackle how each object represents its body in a "morally plain text" serialized format. In other words, first get the default XML representation and export/import working for all objects. Then start with the easiest type of objects to serialize (such as DTML Methods) and create an easy to use serialization representation. Then work on the import for that serialized format. I think this approach would be different than FSDump and ZCVSMixin, right? As far as I understand it, FSDump just goes one way (ZODB -> filesystem) and only for certain types of objects. I don't understand what ZCVSMixin does (will need to spend some time looking at it -- unlike FSDump, ZCVSMixin is not obvious from the documentation and a quick review). Thanks for helping with this project! Fred -- Fred Wilson Horch mailto:fhorch@ecoaccess.org Executive Director, EcoAccess http://ecoaccess.org/ P.O. Box 2823, Durham, NC 27715-2823 phone: 919.419-8354
Fred Wilson Horch wrote:
I hadn't thought of the issues you raise. Thanks for mentioning them.
These are issues that may very well affect everyone and I'm happy to share my thoughts.
I guess I would suggest that the serialized form of a Zope instance by default would be a single XML file, but that arbitrary sections of that XML file could be custom dumped to separate serialized files with similar names. That way authors would have a pretty easy job of overriding sections of the dump process to spit out one or more simple files that have little parsing overhead.
Sounds reasonable.
2) A lesser problem is when trying to edit the serialized "files". Because objects are methods and state how you modify an object can be guided if not controlled. When we have serialized the objects in a Zope system to files, we have exported only the state of the objects in the ZODB. We then have to live with the ability to foul up invariant across many objects by changing some data in the serialized format. A good example would be ZCatalogs. [...]
Yup... it's probably easiest to make ZCatalogs a black box.
Black box doesn't solve this problem, only the first one. Imagine that I move a serialized version of a Zope object that is indexed by an instance of ZCatalog (or many for that matter). When I move it the ZCatalogs must be notified to handle the change, but only at import time because ZCatalogs are serialized as binary for lots of good reasons.
I see the problem. I think the example you give can be handled adequately at import time.
But I can see other examples where allowing edits to the serialized representation could create problems that would be impossible to resolve at import.
So it seems like we might want to make some things read only. That is, when you serialize the objects in the Zope ODB to a filesystem, some of those serialized files are read-only "black boxes". A comment in those files could let a developer know that to change the information in that file she needs to do an import, or edit the ODB directly.
I'm not sure that in the most general case this would solve the problem either. :-( How do we know when the value (or rather the change in value) of a property for some Zope object should trigger some method? It depends not only on the object itself, but possibly on many other objects. This is the general problem of separating an object's state from its methods. This is also equivalent to RDBMS triggers and referential integrity. A pretty good example of this would be a Zope Product that provided Lamps and Switches. Several lamp instances could be tied to a single switch instance. When the switch is on, the lamps need to be on also. If I dump this to CVS then I can change the lamp and switch data separately. Should all the property values for a lamp be read-only? Even the description property? I understand that the kinds of objects you are working on this for don't have many of these problems, and that a very useful system could be built given the 80/20 rule. I'm bringing this up to make sure we know what the other 20 means.
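To make the Lamps and Switches example concrete, here is a sketch of a cross-object check that an importer could run after parsing edited files. The data layout and the name `check_lamps` are assumptions made for illustration; a real Product would have to supply its own checks.

```python
def check_lamps(switches, lamps):
    """Return ids of lamps whose state disagrees with their switch."""
    return sorted(lamp_id for lamp_id, lamp in lamps.items()
                  if lamp["on"] != switches[lamp["switch"]]["on"])

# Stand-ins for state parsed back from edited files.
switches = {"hall": {"on": True}}
lamps = {
    "lamp1": {"switch": "hall", "on": True, "description": "by the door"},
    "lamp2": {"switch": "hall", "on": False, "description": "edited by hand"},
}

broken = check_lamps(switches, lamps)
```

Editing a description is harmless here, while editing "on" breaks the invariant. The serialized files alone cannot tell the two kinds of edit apart, so the rule has to come from code the Product author writes.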
When I import the object from the serialized format all I can know is that something changed, but without expensive processing (XML diffing is hard in the general case, though we might be able to limit the structures to a manageable scope) we can't know that the "foo" ZCatalog should be updated instead of the "bar" ZCatalog.
Seems like we will need to consider the import code very carefully.
I don't know enough about how ZCatalog works to discuss the options intelligently. But in other indexing systems I have worked with, there have been solutions for reindexing when making updates to the corpus.
As I understand it, the issue with ZCatalog is a good example because of the separation of concerns: a Catalog with indexes that contain Brains to get to the actual objects, a Controller that calls reindex/unindex, and the objects themselves, which don't know they are cataloged. When I'm editing the property "x" of some object "Y" it can be very hard to know that it is indexed in some Catalog. Because it is hard to know, I might have a difficult time deciding what should be read-only, or knowing when importing "Y" that I need to call update on some other controller object to ensure that the indexes get updated.
a) XML is structured enough that it can reliably hold the data from the ZODB. The current XML dump is not useful for this - it would need to create individual files and folders to represent containment.
This is pretty easy right now. Ten lines of recursive code can walk the whole tree if necessary and export only leaf objects.
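A minimal sketch of the kind of recursive walk described above. It does not use the Zope API: nested dicts stand in for folders and strings stand in for leaf objects, and the name `export_tree` is hypothetical. In a real Zope tree the loop would iterate over a folder's contained objects instead.

```python
import os
import tempfile


def export_tree(container, target_dir):
    """Mirror a nested mapping on disk: dicts become directories,
    everything else is written out as a leaf file."""
    os.makedirs(target_dir, exist_ok=True)
    written = []
    for name, obj in sorted(container.items()):
        path = os.path.join(target_dir, name)
        if isinstance(obj, dict):
            # A container: recurse so containment maps to directories.
            written.extend(export_tree(obj, path))
        else:
            # A leaf object: write its (assumed textual) body to a file.
            with open(path, "w", encoding="utf-8") as f:
                f.write(obj)
            written.append(path)
    return written


# Stand-in object tree (an assumption for illustration only).
tree = {"index_html": "<dtml-var title>", "docs": {"readme": "hello"}}
root = tempfile.mkdtemp()
paths = export_tree(tree, root)
```

Because each leaf ends up in its own file, a tool like CVS can track revisions per object rather than per export.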
Great. Maybe I am closer than I realize to the CVS management solution. I need to look more closely at ZCVSmixin to see what it does. But for our immediate need (which is to allow a distributed team of developers to share code and track changes via a central CVS repository), maybe it makes the most sense just to segment the existing XML export into directories and files and enhance the existing import to allow overwriting objects.
b) A hybrid XML and custom dump solution. An Image for example could dump out as a binary image file with meta-data in a similarly named XML file.
Yes, each object should make its own policy regarding its body. Its metadata format should be standardized, however.
I like this idea.
After I have the XML export/import working in a way that fits better with CVS (even if the serialized representation is essentially a black box), then I can tackle how each object represents its body in a "morally plain text" serialized format.
I want to add here that it may be very useful not to require that an object have only one serial format apart from the XML default. Specifically, it might make the problem easier if the author of the export code for a type of object can dump property "x" as <object-name>-x.<format> and property "y" as <object-name>-y.<format>.

For example, instead of just:

  |
  -- foo_page.xml
  -- foo_page.dtml

it might be useful to have arbitrary other files created when foo_page is dumped:

  |
  -- foo_page.xml
  -- foo_page.dtml
  -- foo_page-description.txt

This would imply that the description property is not captured in foo_page.xml, but instead in the easier-to-use text file.

I'm worried about the parsing complexity when trying to build a single "morally plain text" serialized format, and I think that "morally plain" can be applied at a sub-object level to make it easier to work with. The example that comes to mind is Image:

  |
  -- icon.xml
  -- icon.png
  -- icon-description.txt

The "morally plain" output here would be binary, not text. A single output file would be hard pressed to allow binary and text editing in the same file.

John

-- . . . . . . . . . . . . . . . . . . . . . . . . John D. Heintz | Senior Engineer 1016 La Posada Dr. | Suite 240 | Austin TX 78752 T 512.633.1198 | jheintz@isogen.com w w w . d a t a c h a n n e l . c o m
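The multi-file layout proposed above could be sketched roughly as follows. Everything here is an assumption for illustration: the function name `plan_dump`, the `<object>`/`<property>` element names, and the idea of passing a mapping that says which properties get their own side file. It only computes file names and contents; it does not touch Zope or the disk.

```python
import xml.etree.ElementTree as ET


def plan_dump(name, body, body_ext, properties, side_formats):
    """Return {filename: content} for one object.

    side_formats maps a property name to the extension of the separate
    file it should be written to; all other properties go into the
    <name>.xml metadata file.
    """
    files = {f"{name}.{body_ext}": body}
    meta = ET.Element("object", {"name": name})
    for key in sorted(properties):
        if key in side_formats:
            # Dumped as <object-name>-<property>.<format>, not in the XML.
            files[f"{name}-{key}.{side_formats[key]}"] = properties[key]
        else:
            prop = ET.SubElement(meta, "property", {"name": key})
            prop.text = str(properties[key])
    files[f"{name}.xml"] = ET.tostring(meta, encoding="unicode")
    return files


files = plan_dump(
    "foo_page",
    "<dtml-var title>",
    "dtml",
    {"title": "Foo", "description": "A longer description."},
    {"description": "txt"},
)
```

With these inputs the result has three entries, matching the foo_page example: the body, the metadata XML, and a plain-text description file that the XML no longer duplicates.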
On Mon, 19 Mar 2001, John D. Heintz wrote:
I'm not sure that in the most general case this would solve the problem either. :-( How do we know when the value (or rather the change in value) of a property for some Zope object should trigger some method?
This is a definite advantage of having the class of the object implement (have the opportunity to implement) the read/write interface. It can do whatever needs to be done at write time. This applies most clearly if you are talking about an edit cycle using an external editor. When the new version is committed back, whatever triggered actions are necessary can be done. But I think it applies even if you are serializing a collection of objects, editing them, and committing the collection back. The complication there is that the order in which the objects are committed can affect the outcome. However, the outcome is still self-consistent. Now, if you are talking about reconstructing a collection back to a previous (CVS) version, things get more complicated. I'm not sure that one is solvable in the general case, other than by the system that Zope itself uses (whole-db transaction based versions). --RDM
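As a rough illustration of letting the class own its read/write interface, here is a sketch using the lamps-and-switches example from earlier in the thread. The method names `read_serialized` and `write_serialized` and the "on"/"off" text format are hypothetical, not an existing Zope interface.

```python
class Lamp:
    def __init__(self):
        self.lit = False


class Switch:
    def __init__(self, lamps):
        self.on = False
        self.lamps = lamps

    def read_serialized(self):
        # The editable representation is a single word of text.
        return "on\n" if self.on else "off\n"

    def write_serialized(self, text):
        # Because the class handles the write itself, it can run whatever
        # triggered actions are needed to keep related objects consistent.
        self.on = text.strip() == "on"
        for lamp in self.lamps:
            lamp.lit = self.on


lamps = [Lamp(), Lamp()]
switch = Switch(lamps)
switch.write_serialized("on\n")
```

After the write, both lamps follow the switch, so an edit made in an external editor cannot leave the pair in an inconsistent state.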
On Sun, 18 Mar 2001, Chris McDonough wrote:
"Potentially lossy" also doesn't mean "leaky". It just means that folks who expose their objects to this sort of serialization can choose their own format, and if it represents the object adequately for their own use in both directions, it's good enough.
If you want a lossless "morally binary" representation, it's probably best to use XML export, which is great for your purposes, because it already exists! ;-)
So you are saying that the reconstructed object must be functionally identical, but may or may not be bytewise identical when reconstructed from the serialization. For round trip editing, this seems reasonable. For CVS, not getting a lossless result could result in spurious diffs. Somehow, though, I don't think this will be a problem in practice, and we can ensure that it won't be by requiring that a read of the reconstructed object produce a bytewise identical serialization to the one which was used to build the reconstructed object. This is likely to be the case in any reasonable serialization implementation, and so as I said should not be a problem in practice. --RDM
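The requirement above (reading the reconstructed object must reproduce the serialization it was built from) can be checked mechanically. The sketch below assumes a toy "key: value" line format and hypothetical helpers `serialize`, `parse`, and `is_stable`; it is not any format Zope actually uses.

```python
def serialize(props):
    # Canonical form: sorted keys, exactly one space after the colon.
    return "".join(f"{key}: {props[key]}\n" for key in sorted(props))


def parse(text):
    props = {}
    for line in text.splitlines():
        if line.strip():
            key, _, value = line.partition(":")
            props[key.strip()] = value.strip()
    return props


def is_stable(text):
    """True if rebuilding from text and serializing again gives the same bytes."""
    return serialize(parse(text)) == text


canonical = serialize({"title": "Foo", "description": "A page"})
hand_edited = "title:Foo\ndescription:   A page\n"
```

Here `canonical` survives the round trip unchanged, while `hand_edited` parses to the same properties but serializes differently. That difference is exactly the kind of spurious diff CVS would report, which is why the check is worth running at import time.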
Chris McDonough wrote:
That's one use, which is important to you. Another is to use Emacs or Dreamweaver on a representation of, for example, DTML methods on a filesystem, which is important to other folks.
I think there is really only one issue nobody has been able to sort out: do we want the objects to be actually stored on the filesystem or do we want to be able to mirror a ZODB? Or do we want both?
Nobody wants to edit XML, AFAICT.
Careful, that's a religious issue in some circles. :-) I know of many people who prefer that data be stored in XML just because humans can read and change it if the need arises. (This from a guy who has actually edited ZODB XML export files. :-) ) Shane
I think there is really only one issue nobody has been able to sort out: do we want the objects to be actually stored on the filesystem or do we want to be able to mirror a ZODB? Or do we want both?
My conception of it is that objects won't be served from the filesystem, but just put in an editable serialized format for version control and tool purposes.
Nobody wants to edit XML, AFAICT.
Careful, that's a religious issue in some circles. :-) I know of many people who prefer that data be stored in XML just because humans can read and change it if the need arises. (This from a guy who has actually edited ZODB XML export files. :-) )
Eek. ;-)
Chris McDonough wrote:
I think there is really only one issue nobody has been able to sort out: do we want the objects to be actually stored on the filesystem or do we want to be able to mirror a ZODB? Or do we want both?
My conception of it is that objects won't be served from the filesystem, but just put in an editable serialized format for version control and tool purposes.
How useful/difficult would it be to put the SMB protocol on top of a ZEO server? As serialized formats are built, would this be useful to anyone? Another use I think this would be great for is schema migration. Right now with ZODB we can have custom __setstate__() methods that do schema migration, but being able to perform text processing changes with common tools would be great. Just thinking out loud...
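For readers unfamiliar with the `__setstate__()` approach mentioned above, here is a minimal sketch. The `Lamp` class and its old and new attribute names are made up for illustration. The hook is called directly here to keep the example self-contained; with pickle-based storage such as ZODB, it is called with the stored state when an object is loaded.

```python
class Lamp:
    def __init__(self, name, lit):
        self.name = name
        self.lit = lit

    def __setstate__(self, state):
        # Hypothetical old schema: a string attribute 'state' ("on"/"off").
        # New schema: a boolean attribute 'lit'. Migrate on load.
        if "state" in state:
            state = dict(state)
            state["lit"] = state.pop("state") == "on"
        self.__dict__.update(state)


# Simulate loading an instance that was stored with the old schema.
old_state = {"name": "desk", "state": "on"}
lamp = Lamp.__new__(Lamp)
lamp.__setstate__(old_state)
```

The migration logic lives in code and runs lazily per object. The alternative raised in the message is to make the same change once, with ordinary text tools, on the serialized files.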
Nobody wants to edit XML, AFAICT.
Careful, that's a religious issue in some circles. :-) I know of many people who prefer that data be stored in XML just because humans can read and change it if the need arises. (This from a guy who has actually edited ZODB XML export files. :-)
Eek. ;-)
How useful/difficult would it be to put the SMB protocol on top of a ZEO server? As serialized formats are built, would this be useful to anyone?
I'm not sure how hard it would be, although currently ZEO servers need not know about anything but strings (they don't necessarily need the code for the objects they serve lying around anywhere)... It would be pretty neat to build some sort of SMB access into a Zope client, although I imagine it would be very difficult. I think that I may have lost sight of the fact that the representations we've been throwing around aren't tied to filesystem reps, really. They're generic filesystem-like reps that should probably be used for FTP, WebDAV, SMB, etc. They're a great case for an "adapter" ;-)
participants (8)
- Chris McDonough
- Dan L. Pierson
- Fred Wilson Horch
- John D. Heintz
- Michel Pelletier
- R. David Murray
- Shane Hathaway
- The Doctor What