Shane, AdaptableStorage is insane and beautiful - congratulations :-) It could fit a possible project we have coming up where a requirement is to store content in an XML format inside MS SQL Server. Do you have a TODO list? Are there any "particularly alpha" parts? I would need to get an idea of the risk I was taking on, but if the list is sufficiently small, I'd love to help mature and extend the product. Seb
seb bacon wrote:
Shane, AdaptableStorage is insane and beautiful - congratulations :-)
Thanks! I've been working on this for a long time. Two years ago a Digital Creations customer demanded proper object-relational mapping. The customer abandoned us for different reasons, but I feel like the root cause was our inability to fulfill that basic requirement. Ever since then I've focused on this. It took 99% perspiration and 1% inspiration, yet inspiration accounted for 99% of the implementation. :-)
It could fit a possible project we have coming up where a requirement is to store content in an XML format inside MS SQL Server.
It should fit that use quite well, I think.
Do you have a TODO list? Are there any "particularly alpha" parts? I would need to get an idea of the risk I was taking on, but if the list is sufficiently small, I'd love to help mature and extend the product.
I didn't release the alpha until all the gnarly details were dealt with, like moving objects on the filesystem and implementing complex mappers. The only things left to do before calling it a "beta" release are:
- Implement specific mappers for more Zope 2 object types, like DTML methods, BTreeFolders, etc.
- Come up with a good manual cache invalidation scheme. Currently, you can disable the cache (or turn on "volatile" mode, which is really the same thing) and you'll see immediate updates, but you'll lose a lot of performance. There needs to be a way for applications that modify the database to tell Zope about the modification, so Zope can reset its caches.
Shane
Shane Hathaway wrote:
performance. There needs to be a way for applications that modify the database to tell Zope about the modification, so Zope can reset its caches.
But, IIRC, the last time this was discussed on a mailing list you had some cool ideas to solve the problem, right? *grinz* Chris
Chris Withers wrote:
Shane Hathaway wrote:
performance. There needs to be a way for applications that modify the database to tell Zope about the modification, so Zope can reset its caches.
But, IIRC, the last time this was discussed on a mailing list you had some cool ideas to solve the problem, right?
Yes, but I want to hear other people's ideas first. What do you think? Shane
Shane Hathaway wrote:
Chris Withers wrote:
Shane Hathaway wrote:
performance. There needs to be a way for applications that modify the database to tell Zope about the modification, so Zope can reset its caches.
But, IIRC, the last time this was discussed on a mailing list you had some cool ideas to solve the problem, right?
Yes, but I want to hear other people's ideas first. What do you think?
Isn't this a different problem for each kind of storage, e.g. MD5 hash for ext2, transaction ID for foo...? Or are you referring to a different aspect of the problem? While reading the referenced thread on the subject, I found your description of the product design here: http://lists.zope.org/pipermail/zope-dev/2002-August/016981.html Could this go in the docs/ directory of the product? The design, while very clean, doesn't lend itself to immediate understanding on a cursory view of the source... seb
seb bacon wrote:
Shane Hathaway wrote:
Chris Withers wrote:
Shane Hathaway wrote:
performance. There needs to be a way for applications that modify the database to tell Zope about the modification, so Zope can reset its caches.
But, IIRC, the last time this was discussed on a mailing list you had some cool ideas to solve the problem, right?
Yes, but I want to hear other people's ideas first. What do you think?
Isn't this a different problem for each kind of storage, e.g. MD5 hash for ext2, transaction ID for foo...? Or are you referring to a different aspect of the problem?
I'm thinking about "real-time" updates. When the underlying data changes, you'd like Zope to see the change immediately. If indefinite delays are OK, then AdaptableStorage already does enough: it raises a ConflictError if you try to write changes based on old data. The idea I like the most for relational databases is to ask the RDBMS what the ID of the last transaction was. If Zope missed a transaction, it should flush all caches. This will work if the database is infrequently changed by external applications, or if Zope is accessed infrequently. If external applications make a lot of changes, however, and Zope needs good performance at the same time, then both Zope and the external applications need to update a per-object transaction ID. Then, at the beginning of transactions, Zope would invalidate only the recently updated objects. Hmm, perhaps smarter RDBMSs could make it easy to keep transaction IDs updated using triggers. (This solution could also replace both ZEO and ZRS, BTW. ;-) ) On the filesystem, the problem seems much more difficult, since there are no transactions. You'd like the kernel to send Zope a message anytime someone modifies a file in a certain hierarchy, but that would require kernel hacking. For that case, I'm thinking that requiring external apps to touch a special file somewhere might be the right thing. At the beginning of each transaction, if Zope sees a change to the file, it flushes its cache.
While reading the referenced thread on the subject, I found your description of the product design here:
http://lists.zope.org/pipermail/zope-dev/2002-August/016981.html
Could this go in the docs/ directory of the product? The design, while very clean, doesn't lend itself to immediate understanding on a cursory view of the source...
I'm hoping to present a complete tutorial on AdaptableStorage at PyCon DC 2003. I'll integrate those notes. Thanks for pointing them out--I'd forgotten about them. The names are changed somewhat, but the basic design is the same. Shane
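A minimal sketch of the "ask the RDBMS for the ID of the last transaction" idea described above, not AdaptableStorage's actual code. It uses the standard-library sqlite3 module as a stand-in for the real database; the `txn_log` table and the `TransactionAwareCache` class are assumptions made up for illustration. Note that in this simple form Zope's own commits would also trigger a flush unless it records the IDs it writes itself.

```python
import sqlite3


class TransactionAwareCache:
    """Object cache that is flushed whenever a missed transaction is detected."""

    def __init__(self, conn):
        self.conn = conn
        self.objects = {}
        self.last_seen_tid = self._current_tid()

    def _current_tid(self):
        # Hypothetical table: one row per committed transaction.
        row = self.conn.execute("SELECT MAX(tid) FROM txn_log").fetchone()
        return row[0] or 0  # MAX() is NULL when the table is empty

    def begin_transaction(self):
        """Call at the start of each transaction; returns True if flushed."""
        tid = self._current_tid()
        if tid != self.last_seen_tid:
            # We missed at least one transaction: drop everything.
            self.objects.clear()
            self.last_seen_tid = tid
            return True
        return False
```

As noted in the message, flushing the whole cache is only reasonable when external changes are infrequent or Zope is accessed infrequently.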
Shane Hathaway wrote:
On the filesystem, the problem seems much more difficult, since there are no transactions. You'd like the kernel to send Zope a message anytime someone modifies a file in a certain hierarchy, but that would require kernel hacking.
FWIW, since I had the same problem some time ago (which could be solved in another way), I dug out a URL which might be of interest - probably you already know about it: FAM, used by the two major open source desktop envs: http://oss.sgi.com/projects/fam/ It may at least help to make the whole problem more OS-independent. They have a lot of related pointers on their homepage. Btw., Windows (>= NT, IIRC) already has the capability to notify on directory alteration events, without polling. cheers, oliver
Oliver Bleutgen wrote:
Shane Hathaway wrote:
On the filesystem, the problem seems much more difficult, since there are no transactions. You'd like the kernel to send Zope a message anytime someone modifies a file in a certain hierarchy, but that would require kernel hacking.
FWIW, since I had the same problem some time ago (which could be solved in another way), I dug out a URL which might be of interest - probably you already know about it:
FAM, used by the two major open source desktop envs:
http://oss.sgi.com/projects/fam/
It may at least help to make the whole problem more OS-independent. They have a lot of related pointers on their homepage.
I've seen it before, but I don't think FAM is able to monitor an entire directory tree. It only monitors individual files. I'd really like to be wrong. :-)
Btw., Windows (>= NT, IIRC) already has the capability to notify on directory alteration events, without polling.
Do you know what API? That would sure help. Shane
Shane Hathaway wrote:
Oliver Bleutgen wrote:
Shane Hathaway wrote:
On the filesystem, the problem seems much more difficult, since there are no transactions. You'd like the kernel to send Zope a message anytime someone modifies a file in a certain hierarchy, but that would require kernel hacking.
FWIW, since I had the same problem some time ago (which could be solved in another way), I dug out a URL which might be of interest - probably you already know about it:
FAM, used by the two major open source desktop envs:
http://oss.sgi.com/projects/fam/
It may at least help to make the whole problem more OS-independent. They have a lot of related pointers on their homepage.
I've seen it before, but I don't think FAM is able to monitor an entire directory tree. It only monitors individual files. I'd really like to be wrong. :-)
I think you are wrong, because the manpage (for IRIX) says otherwise. Additionally, it wouldn't be of much use for KDE etc. if it could only monitor files. I think a file manager would mainly be interested in directory changes (files added/deleted). Then there's also dnotify (also referenced from the FAM site) - there's hope that the "d" isn't an acronym for "file" ;). I remember something about recent 2.4.x versions having the prerequisites to use that.
Btw., Windows (>= NT, IIRC) already has the capability to notify on directory alteration events, without polling.
Do you know what API? That would sure help.
I don't have any experience on win32, but just searched Google. There's Win32::ChangeNotify for Perl, described here http://www.xav.com/perl/site/lib/Win32/ChangeNotify.html and this seems to use ReadDirectoryChangesW, described here: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/fileio/base... cheers, oliver
Oliver Bleutgen wrote:
Shane Hathaway wrote:
I've seen it before, but I don't think FAM is able to monitor an entire directory tree. It only monitors individual files. I'd really like to be wrong. :-)
I think you are wrong, because the manpage (for IRIX) says otherwise. Additionally, it wouldn't be of much use for KDE etc. if it could only monitor files. I think a file manager would mainly be interested in directory changes (files added/deleted).
I checked again. It is still limited to 1000 files or directories at a time. It's not meant for entire subtrees, it has to run as root, and it requires portmap, making it less attractive.
Then there's also dnotify (also referenced from the FAM site) - there's hope that the "d" isn't an acronym for "file" ;). I remember something about recent 2.4.x versions having the prerequisites to use that.
Now this one is quite interesting. It requires at least kernel 2.4.19, so I guess I'm at the edge of kernel development. (!) It just might do the trick, and maybe even better than I hoped. Thanks.
I don't have any experience on win32, but just searched Google. There's Win32::ChangeNotify for Perl, described here http://www.xav.com/perl/site/lib/Win32/ChangeNotify.html
and this seems to use ReadDirectoryChangesW, described here: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/fileio/base...
Yep, that's it. With some "unicode update", it even works on Win95. But I'm going to leave the Windows work for someone else. Shane
Shane Hathaway wrote:
I'm thinking about "real-time" updates. When the underlying data changes, you'd like Zope to see the change immediately. If indefinite delays are OK, then AdaptableStorage already does enough: it raises a ConflictError if you try to write changes based on old data.
I think it depends on how you're using it, so I guess this wants to be configurable. Is that possible?
The idea I like the most for relational databases is to ask the RDBMS what the ID of the last transaction was. If Zope missed a transaction, it should flush all caches. This will work if the database is infrequently changed by external applications, or if Zope is accessed infrequently.
Indeed.
If external applications make a lot of changes, however, and Zope needs good performance at the same time, then both Zope and the external applications need to update a per-object transaction ID. Then, at the beginning of transactions, Zope would invalidate only the recently updated objects. Hmm, perhaps smarter RDBMSs could make it easy to keep transaction IDs updated using triggers. (This solution could also replace both ZEO and ZRS, BTW. ;-) )
This sounds cool, and the best option, when it's possible...
On the filesystem, the problem seems much more difficult, since there are no transactions. You'd like the kernel to send Zope a message anytime someone modifies a file in a certain hierarchy, but that would require kernel hacking.
How about having a separate process which just watched the files and notified Zope when they changed?
For that case, I'm thinking that requiring external apps to touch a special file somewhere might be the right thing. At the beginning of each transaction, if Zope sees a change to the file, it flushes its cache.
I don't think you can rely on this :-(
I'm hoping to present a complete tutorial on AdaptableStorage at PyCon DC 2003.
I hope you'll make this available for those of us who can't make it... cheers, Chris
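For reference, the "touch a special file" proposal quoted above could look roughly like the following. This is only a sketch under stated assumptions: the `SentinelWatcher` name and the choice of comparing modification time plus size are illustrative, not part of AdaptableStorage. It only works if every external application cooperates by touching the file, which is the reliability concern raised in the reply.

```python
import os


class SentinelWatcher:
    """Detect changes to a sentinel file that external apps agree to touch."""

    def __init__(self, path):
        self.path = path
        self.last_stamp = self._stamp()

    def _stamp(self):
        # A missing file is a valid state; represent it as None.
        try:
            st = os.stat(self.path)
        except FileNotFoundError:
            return None
        return (st.st_mtime_ns, st.st_size)

    def changed(self):
        """Call at the start of each transaction; True means flush the cache."""
        stamp = self._stamp()
        if stamp != self.last_stamp:
            self.last_stamp = stamp
            return True
        return False
```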
Chris Withers wrote:
Shane Hathaway wrote:
I'm thinking about "real-time" updates. When the underlying data changes, you'd like Zope to see the change immediately. If indefinite delays are OK, then AdaptableStorage already does enough: it raises a ConflictError if you try to write changes based on old data.
I think it depends on how you're using it, so I guess this wants to be configurable. Is that possible?
I think so.
How about having a separate process which just watched the files and notified Zope when they changed?
A definite possibility. It might even just poke an URL to send the notification.
I'm hoping to present a complete tutorial on AdaptableStorage at PyCon DC 2003.
I hope you'll make this available for those of us who can't make it...
Yes, I plan to, assuming they accept my proposal. Shane
How about having a separate process which just watched the files and notified Zope when they changed?
A definite possibility. It might even just poke an URL to send the notification.
Since every storage will have its own unique notification scheme, which may be more or less inefficient (worst case scenario, periodic polling of entire storage for recently modified items), it might make sense to have a "notification server." It would make it simpler for users to create custom storage transaction alert handlers. For different cache invalidation scenarios, Zope could poll the server as well as get poked. seb
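The worst case mentioned above, periodically polling the entire storage for recently modified items, might be sketched like this for a filesystem storage. The `snapshot` and `diff` helpers are hypothetical names for illustration; a real watcher would more likely use dnotify or a similar kernel facility where available.

```python
import os


def snapshot(root):
    """Map every path below `root` to its modification time (nanoseconds)."""
    stamps = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                stamps[path] = os.stat(path).st_mtime_ns
            except FileNotFoundError:
                pass  # removed between listing and stat; skip it
    return stamps


def diff(old, new):
    """Return the set of paths added, modified, or removed between snapshots."""
    changed = {path for path in new if old.get(path) != new[path]}
    removed = set(old) - set(new)
    return changed | removed
```

A polling process would keep the previous snapshot, take a new one on each pass, and report the difference to Zope.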
seb bacon wrote:
Since every storage will have its own unique notification scheme, which may be more or less inefficient (worst case scenario, periodic polling of entire storage for recently modified items), it might make sense to have a "notification server." It would make it simpler for users to create custom storage transaction alert handlers.
For different cache invalidation scenarios, Zope could poll the server as well as get poked.
I'd prefer just to have a method somewhere that, as Shane suggested, could be hit by URL, etc. A whole separate server seems like overkill... cheers, Chris
Chris Withers wrote:
seb bacon wrote:
Since every storage will have its own unique notification scheme, which may be more or less inefficient (worst case scenario, periodic polling of entire storage for recently modified items), it might make sense to have a "notification server." It would make it simpler for users to create custom storage transaction alert handlers.
For different cache invalidation scenarios, Zope could poll the server as well as get poked.
I'd prefer just to have a method somewhere that, as Shane suggested, could be hit by URL, etc.
A whole separate server seems like overkill...
But what about, for example, databases which don't have an efficient way to do callbacks to external applications? You may have to do a "SELECT id FROM tblObjects WHERE timestamp > some_time" or a similar kludge from a polling server. You may want this server to reside at the same location as the RDBMS, rather than as a thread in Zope. I'm worrying that if we are not to be restricted to Oracle or bleeding edge kernels, the notification part of the cache invalidation scheme may be (a) kludgy, (b) inefficient, and (c) utterly different in design between different storages. A server could offer a layer of indirection which could provide a single API for Zope to see, an opportunity to take the process load somewhere else, and a pluggable interface for writers of storages. On the other hand, I don't know much about RDBMS callbacks or filesystem accounting, so I could be inventing a problem to solve :-) seb
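The polling query mentioned above can be sketched as follows, using sqlite3 as a stand-in for the real RDBMS. The table and column names come from the query in the message; the `changed_ids_since` and `invalidate` helpers, and the dictionary used as a cache, are assumptions for illustration only.

```python
import sqlite3


def changed_ids_since(conn, since):
    """Return IDs of objects whose timestamp is newer than `since`."""
    rows = conn.execute(
        "SELECT id FROM tblObjects WHERE timestamp > ? ORDER BY id",
        (since,),
    ).fetchall()
    return [row[0] for row in rows]


def invalidate(cache, ids):
    """Drop only the recently changed objects from a dict-like cache."""
    for oid in ids:
        cache.pop(oid, None)
```

Whether this runs inside Zope, from cron, or in a separate polling server next to the RDBMS is exactly the design question under discussion; the query itself is the same in each case.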
seb bacon wrote:
I'd prefer just to have a method somewhere that, as Shane suggested, could be hit by URL, etc.
A whole separate server seems like overkill...
But what about, for example, databases which don't have an efficient way to do callbacks to external applications? You may have to do a "SELECT id FROM tblObjects WHERE timestamp > some_time" or a similar kludge from a polling server. You may want this server to reside at the same location as the RDBMS, rather than as a thread in Zope.
Well, if by server you could mean "script that gets run by cron every 1 minute and hits a URL in Zope if something has changed", then I might be in agreement ;-)
I'm worrying that if we are not to be restricted to Oracle or bleeding edge kernels, the notification part of the cache invalidation scheme may be (a) kludgy, (b) inefficient, and (c) utterly different in design between different storages.
I think this is a "such is life" problem. Provided AdaptableStorage provides some way (exposed URL?) for an external process to say that things have changed, I think that's the best form of flexibility we can provide. cheers, Chris
On Wed, 22 Jan 2003, Chris Withers wrote:
I think this is a "such is life" problem. Provided AdaptableStorage provides some way (exposed URL?) for an external process to say that things have changed, I think that's the best form of flexibility we can provide.
I think I'll provide such an URL, then. Thanks for going over this. Shane
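A rough sketch of the external side of that arrangement: a script, run from cron for instance, that hits a URL in Zope only when something has changed. The thread does not name the URL, so any address passed in (for example a made-up http://localhost:8080/invalidate_caches) is purely hypothetical, as are the `notify_if_changed` function and its `has_changed` callback.

```python
import urllib.request


def notify_if_changed(has_changed, url, opener=urllib.request.urlopen):
    """Hit `url` when `has_changed()` is true; return True if a request was sent.

    `has_changed` is any zero-argument callable supplied by the storage-specific
    watcher. `opener` is injectable so the function can be tried without a
    running server.
    """
    if not has_changed():
        return False
    with opener(url) as response:
        response.read()  # drain the reply; its content is not used here
    return True
```

The storage-specific part lives entirely in `has_changed`, which keeps the Zope side down to a single exposed method.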
Chris Withers wrote:
seb bacon wrote:
I'd prefer just to have a method somewhere that, as Shane suggested, could be hit by URL, etc.
A whole separate server seems like overkill...
Well, if by server you could mean "script that gets run by cron every 1 minute and hits a URL in Zope if something has changed", then I might be in agreement ;-)
Put it like this, I don't mean "managed blade cluster running custom distributed architecture" :-)
On Wed, Jan 15, 2003 at 05:30:58PM +0000, seb bacon wrote:
Shane, AdaptableStorage is insane and beautiful - congratulations :-)
It seems to inspire insanity :) kosh and i got into a discussion on #zope about using AdaptableStorage with reiserfs4, mapping zope properties to reiserfs4 properties... finally, a fully really transparent unix filesystem <-> ZODB solution that isn't a half-assed version of either! didn't get into detail but it seems like it should be doable. -- Paul Winkler http://www.slinkp.com Look! Up in the sky! It's THE BUTTERY THIGH! (courtesy of isometric.spaceninja.com)
Paul Winkler wrote:
On Wed, Jan 15, 2003 at 05:30:58PM +0000, seb bacon wrote:
Shane, AdaptableStorage is insane and beautiful - congratulations :-)
It seems to inspire insanity :) kosh and i got into a discussion on #zope about using AdaptableStorage with reiserfs4, mapping zope properties to reiserfs4 properties... finally, a fully really transparent unix filesystem <-> ZODB solution that isn't a half-assed version of either! didn't get into detail but it seems like it should be doable.
uhumm.. nice idea... i'm thinking also about using Subversion (subversiom.tigris.org), an interesting cvs alternative, as a storage for zope objects at some level. Subversion has the concept of versioned properties on a file and, best of all, it has a complete python swig interface. I'm still thinking about the level at which i should do the mapping, whether at Storage level or at "user" level, like CVSFolder does. After looking at the code, it seems that AdaptableStorage could help with this. azazel
Paul Winkler wrote:
On Wed, Jan 15, 2003 at 05:30:58PM +0000, seb bacon wrote:
Shane, AdaptableStorage is insane and beautiful - congratulations :-)
It seems to inspire insanity :) kosh and i got into a discussion on #zope about using AdaptableStorage with reiserfs4, mapping zope properties to reiserfs4 properties... finally, a fully really transparent unix filesystem <-> ZODB solution that isn't a half-assed version of either! didn't get into detail but it seems like it should be doable.
That's not insanity, it's just discovering that what was nasty and difficult before is now doable. :-) I really didn't appreciate the elegance of Mac resource forks until I created the ".properties" files. I understand NTFS has multiple named forks and other filesystems like ReiserFS are developing similar capabilities. Neat. Let's use them! Shane
participants (6)
- azazel@chiaroscuro.com
- Chris Withers
- Oliver Bleutgen
- Paul Winkler
- seb bacon
- Shane Hathaway