Announce - Audio Product
Odd that no one has made a product for audio storage, analogous to Image, so I took a crack at it. So far it parses MP3 tag data. Anyone have info on RealAudio tag data? Since it's based on File, it'll actually store anything. But the cool thing is, it sucks the ID3 tag data out of the file (if there is any) and puts it into properties, so it can be indexed. You can also edit the data in the properties page, and whenever the audio is rendered, the new tag data is inserted. It should be easy to add code that looks for WAV and AIFF header info and extracts that too, since Python comes with libraries to do it, but those aren't popular web formats so I haven't done it yet. It's not CatalogAware, just like Image... I figured you'd wrap it or whatever to do that, but I'm open to suggestions. It's the first release, so obviously don't depend on it to launch rockets :) Get it here: http://www.zope.org/Members/bowerymarc/Audio
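The product's own parser isn't shown here, but the tag format being extracted (ID3v1, the common MP3 tag format in 2001) is simple: the last 128 bytes of the file, starting with the marker `TAG`. A minimal sketch of pulling it into a dict; the function name and field handling are mine, not the product's:

```python
import struct

# ID3v1 layout: "TAG" + title(30) + artist(30) + album(30)
# + year(4) + comment(30) + genre(1) = 128 bytes at end of file.
ID3V1_STRUCT = "3s30s30s30s4s30sB"

def read_id3v1(data: bytes):
    """Parse an ID3v1 tag from raw MP3 bytes; return a dict or None."""
    if len(data) < 128:
        return None
    tag = data[-128:]
    if not tag.startswith(b"TAG"):
        return None
    _, title, artist, album, year, comment, genre = struct.unpack(ID3V1_STRUCT, tag)

    def clean(field: bytes) -> str:
        # Fields are NUL- or space-padded fixed-width byte strings.
        return field.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": clean(title),
        "artist": clean(artist),
        "album": clean(album),
        "year": clean(year),
        "comment": clean(comment),
        "genre": genre,  # numeric index into the ID3v1 genre table
    }
```

Once parsed into a plain dict like this, the values map naturally onto Zope properties, which is what makes them indexable.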
Cool beans! A jump-start on my Zapster site 8^) -- Casey Duncan | Kaivo, Inc. | cduncan@kaivo.com
I thought about it when I saw Apache-MP3, and I was on a mission to make Zope do everything Apache can (I gave up). My main problem is that serving back a large file through Zope sucks, as it puts the whole thing in memory. Has this been solved since last time I looked? -- Andy McKay.
_______________________________________________ Zope maillist - Zope@zope.org http://lists.zope.org/mailman/listinfo/zope ** No cross posts or HTML encoding! ** (Related lists - http://lists.zope.org/mailman/listinfo/zope-announce http://lists.zope.org/mailman/listinfo/zope-dev )
I don't think it ever did this. Looking at File, which Image and Audio are both based on, it has a data type called Pdata. It uses this whenever the data size is larger than 2^16 (an arbitrary number buried in the file). What it does is chop the file into chunks (of, I think, 2^16 bytes) and build a linked list of them. Then when it stores or retrieves the file, it only has one chunk in memory at a time. Same when it renders it: it writes it out one chunk at a time. Take a look in OFS/Image.py at the bottom of the file. Anyway, memory is cheap :)
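A rough sketch of the scheme described above: a linked list of fixed-size chunks, walked one node at a time. This is a simplification for illustration, not the actual OFS.Image.Pdata code (which makes each chunk a persistent object so only one lives in memory at a time):

```python
CHUNK_SIZE = 1 << 16  # 64 KB, the threshold mentioned in OFS/Image.py

class Pdata:
    """Simplified stand-in for OFS.Image.Pdata: one chunk plus a link."""
    def __init__(self, data):
        self.data = data
        self.next = None

def chunkify(data, chunk_size=CHUNK_SIZE):
    """Split a byte string into a linked list of Pdata chunks."""
    head = tail = None
    for i in range(0, len(data), chunk_size):
        node = Pdata(data[i:i + chunk_size])
        if head is None:
            head = tail = node
        else:
            tail.next = node
            tail = node
    return head

def iter_chunks(head):
    """Walk the list one chunk at a time, as rendering would."""
    while head is not None:
        yield head.data
        head = head.next
```

In the real storage, each node is loaded from the ZODB on demand, so streaming the file out never needs more than one chunk resident.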
From: "Andy McKay" <andym@ActiveState.com>
My main problem is that serving back a large file through Zope sucks as it puts the whole thing in memory. Has this been solved since last time I looked?
Thanks, I'll have a look at that.
Anyway, memory is cheap :)
Memory is cheap, but 20 clients x 20 meg downloads go through your memory pretty quick... Cheers. -- Andy McKay.
Thing is, any of the external file products will have to suck the whole file into a Python object first, or do what File does and mete it out in chunks, so it's really the same deal. Same if you used an SQL backend or whatever, as long as it's going through Zope. The only workaround would be to have objects that pass out absolute URLs, so a different webserver, like Apache, actually sends the file to the user.
Anyway, memory is cheap :)
Memory is cheap but 20 clients x 20 meg download goes through your memory pretty quick...
On 02 May 2001 15:07:29 -0400, marc lindahl wrote:
Odd that noone has made a product for audio storage, analogous to Image, so I took a crack at it. ... It's not CatalogAware, just as Image.... I figured you'd wrap it or whatever to do that, but I'm open to suggestions.
How about a version based on ExtImage, for storing the big data file outside of the ZODB? :) /me thinks many, many people would be up for that version! Bill
Well, someone would have to convince me that's a good idea. So far, I think ZODB is the place to keep 'em. Of course, it's open source, someone could adapt it to ExtFile or LocalFS...
Somehow keeping large static files in the ZODB just doesn't gel for me. I know it fits the lots-of-reads / few-writes model, but somehow making my Data.fs bloat on account of MP3s seems wrong. I would (in fact do: http://www.agmweb.ca/agmweb/dna/dan) put my MP3s on the file system through LocalFS. Makes my ZODB much smaller and happier. Plus I have them on the file system for other uses (like listening to them). Cheers. -- Andy McKay.
I don't see how a smaller ZODB is a happier ZODB...? Keeping them in ZODB makes your site a lot easier to back up! And they take up basically the same amount of disk space on the FS or in the ZODB. If you use BTreeFolders, you could store thousands of them in one folder (if that makes sense for you)... depending on your file system, that might be a problem on the FS.
I don't see how a smaller ZODB is a happier ZODB.... ?
Smaller = faster to start/stop, although a lot of that is down to building an index for each object. I've hit the 2 GB limit a few times, so I try to be careful. We'll just agree to disagree on the rest. Also, we're looking at different uses for Zope; mine is more flexible, personal use. Cheers. -- Andy McKay.
On 02 May 2001 21:53:21 -0400, marc lindahl wrote:
I don't see how a smaller ZODB is a happier ZODB.... ? Keeping them in ZODB makes your site alot easier to back up!
I back up the entire Zope tree, no difference in ease. A difference in reliability, yes.
And, they take up basically the same amount of disk space on the FS or in
Actually, from where I sit, it _appears_ they take up _more_ space in the ZODB, but that could just be appearances.
the ZODB. If you use BTreeFolders, you could store 1000's of them in one folder (if that makes sense for you)... depending on your file system, that might be a problem on the FS.
Right, but it can be a problem for the ZODB as well, regardless of BTreeFolder. Bill
On 02 May 2001 21:35:00 -0400, marc lindahl wrote:
Well, someone would have to convince me that's a good idea. So far, I think ZODB is the place to keep 'em.
o Well, consider the effect on the ZODB.
o Some people (dare I say many?) are still running on a system that has a 2 GB limit on file size.
o A large ZODB _can_ cause problems when you need to back it up or restore it.
o Consider the problems involved with a large ZODB when starting up and shutting down. I have an N-Class HPUX beastie that has a 362+ MB ZODB; it refuses to pack, and takes a while to start up. Memory is not an issue, believe me, I have several gigs of that untouched yet. Yet it still takes an unacceptable amount of time to start up.
o Consider the problem of someone uploading a different (perhaps improved) MP3 file. Now, until you pack past this change, you have _two_ copies of the data. Actually, I believe that even if only some metadata is changed, you get another copy. This. Gets. Ugly. Fast.
o Consider the ZSP that would much rather have the data stored in the FS where they can keep better track of it (and use quotas, perhaps).
o Consider the possibility of having the file sitting on the OS's file system, available for outbound FTP through non-Zope means, or local system use (for those who would use it on their local machine as an MP3 organizer), or for an audio streamer program, such as icecast or shoutcast, to stream them out.
Just some things to consider. ;^)~ Bill
This is a very necessary discussion, methinks... there have been bits of threads, but I don't think enough.
From: Bill Anderson <bill@libc.org>
o Some people (dare I say many?) are still running on a system that has a 2 GB limit on file size.
OK, they can try this: http://www.zope.org/Members/hathawsh/PartitionedFileStorage
o A large ZODB _can_ cause problems when you need it back it up, or restore it
how, more than a large bunch of files in a directory tree? Either way, typical backup software splits for CD burning, multiple tapes, etc.
o Consider the problems involved with a large ZODB when starting up and shutting down. I have an N-Class HPUX beastie that has a 362+ MB ZODB; it refuses to pack, and takes a while to start up.
What do you mean, it refuses to pack? It gives an error or something?
Memory is not an issue, believe me, I have several gigs of that untouched yet. Yet it still takes an unacceptable amount of time to start up.
Several gigs of RAM? Cool! Maybe someone from DC can detail what goes on at startup that might take a lot of time. I guess this is important for a desktop machine that you turn on every day -- for a server that's always on, no big deal, right?
o Consider the problem of someone uploading a different (perhaps improved) MP3 file. Now, until you pack past this change, you have _two_ copies of the data. Actually, I believe that even if only some metadata is changed, you get another copy. This. Gets. Ugly. Fast.
Actually, this could be a good thing. What if the 'improved' version is broken somehow? Or the metadata was in error? 'Undo'! Also, importantly, version control! If you're talking about a site like MP3.COM, where you have lots of people uploading their own stuff, then you might have a problem with packing, with MP3s, images, and other stuff. What to do about it? A couple of ideas come to mind: pack often, like every day, or use the packless database and forgo the feature, or...? I'm aiming at uploading of audio as something with a little more control (like a review process), so having 1000 versions of the same file unpacked isn't an issue, and the versioning is important.
o Consider the ZSP that would much rather have the data stored in the FS where they can keep better track of it (and use quotas perhaps)
If they're on Linux, they could put a quota on the Data.fs; no difference there... I don't know that they should be monitoring your files in either case, should they?
o Consider the possibility of having the file sitting on the OS's file system, available for outbound FTP through non-Zope means, or local system use (for those who would use it on their local machine as an MP3 organizer), or for an audio streamer program, such as icecast or shoutcast, to stream them out.
These are the important use cases I see for external storage. Though, as far as icecast goes, someone should beat that code into a Product; it would be great to have streaming built into Zope!
Memory is not an issue, believe me, I have several gigs of that untouched yet. Yet it still takes an unacceptable amount of time to start up.
Several gigs of RAM? Cool! Maybe someone from DC can detail what goes on at startup that might take alot of time.
When you start a ZODB up, an index is built of every object inside the ZODB so that it can be called up again later; effectively it reads through the entire DB. This also removes any corrupted or broken objects in the ZODB. It is directly related, then, to the number of objects and hence the size. When this starts getting significant depends on many factors. I've had 1 gig DBs and found them a pain to work with from this point of view. But I've never had a corrupted database I couldn't get into. That is good. Shane Hathaway's Refresh product really helped there, though :) Cheers. -- Andy McKay.
BTW, there's a product called RadioFreePete that has stuff to scan your disk for MP3s, make playlists, and control icecast...
On Wed, 2 May 2001 20:26:37 -0700, "Andy McKay" <andym@ActiveState.com> wrote:
Several gigs of RAM? Cool! Maybe someone from DC can detail what goes on at startup that might take alot of time.
When you start a ZODB up
Strictly, FileStorage. ZODB on BerkeleyStorage is very much better.
, an index is built of every object inside the ZODB so that it can be called up again later; effectively it reads through the entire DB.
With FileStorage, this only happens if Zope is not shut down cleanly. A clean shutdown will leave a var/Data.fs.index file. On the next startup it loads this index if it is present, or scans the whole Data.fs if it is not. Toby Dickenson tdickenson@geminidataloggers.com
From: Toby Dickenson <tdickenson@devmail.geminidataloggers.co.uk>
, an index is built of every object inside the ZODB so that it can be called up again later; effectively it reads through the entire DB.
With FileStorage, This only happens if zope is not shut down cleanly.
A clean shut down will leave a var/Data.fs.index file. On the next startup it loads this index if it is present, or scans the whole Data.fs if it is not.
The other thing that occurred to me... this indexing time would depend on the number of objects, not their size, so it should be the same for an internally stored MP3 and a hypothetical externally stored MP3 with internal metadata...
--On 05/02/01 20:26:37 -0700 Andy McKay chiseled:
Several gigs of RAM? Cool! Maybe someone from DC can detail what goes on at startup that might take alot of time.
When you start a ZODB up, an index is built of every object inside the ZODB so that it can be called up again later; effectively it reads through the entire DB. [...] I've had 1 gig DBs and found them a pain to work with from this point of view.
2 things: first, if you use ZEO, then this startup time is pushed to the storage server. Your ZEO clients can start up in seconds, depending on how large the ZEO cache is. For example, with the 2.8 GB Zope.org Data.fs, ZEO clients take about 13 seconds to start up with a 300 MB cache and 2 seconds to start up with no cache. If you are doing lots of product changes, make sure that ZEO_CLIENT is not set, so that your products reinitialize. As far as the memory usage goes, Berkeley Storage will help this tremendously. Hope that helps, -- -mindlace- zopatista community liaison
From: ethan mindlace fremen <mindlace@digicool.com>
2 things: first, if you use ZEO, then this startup time is pushed to the storage server. Your ZEO clients can start up in seconds, depending on how large the ZEO cache is. For example, with the 2.8 GB Zope.org Data.fs, ZEO clients take about 13 seconds to start up with a 300 MB cache and 2 seconds to start up with no cache. If you are doing lots of product changes, make sure that ZEO_CLIENT is not set, so that your products reinitialize.
How long does it take the storage server to start?
On 02 May 2001 23:13:01 -0400, marc lindahl wrote:
This is a very necessary discussion, methinks... there's been bits of threads, but I don't think enough.
From: Bill Anderson <bill@libc.org>
o Some people (dare I say many?) are still running on a system that has a 2 GB limit on file size.
OK, they can try this: http://www.zope.org/Members/hathawsh/PartitionedFileStorage
o A large ZODB _can_ cause problems when you need it back it up, or restore it
how, more than a large bunch of files in a directory tree? Either way, typical backup software splits for CD burning, multiple tapes, etc.
Which is worse: a corrupted Data.fs, or a corrupted .mp3 file? If you get a corruption in the backup process (yes, it DOES happen), or the restore, which would you rather have?
o Consider the problems involved with a large ZODB when starting up and shutting down. I have an N-Class HPUX beastie that has a 362+ MB ZODB; it refuses to pack, and takes a while to start up.
What do you mean, it refuses to pack? It gives an error or something?
No error, and no pack. I haven't had a lot of time to go into it ...
Memory is not an issue, believe me, I have several gigs of that untouched yet. Yet it still takes an unacceptable amount of time to start up.
Several gigs of RAM? Cool! Maybe someone from DC can detail what goes on at startup that might take alot of time.
Well, indexing the db is a starting point, I believe.
I guess, this is important for a desktop machine, that you turn on every day -- for a server that's always on, no big deal, right?
Wrong. Would you like to fsck that 20 GB filesystem each time you started up? It is also common for the developer/admin to have a duplicate install of their production site to test out new code on; these servers tend to be restarted frequently. And yes, on a desktop machine, it would indeed get nightmarish.
o Consider the problem of someone uploading a different (perhaps improved) MP3 file. Now, until you pack past this change, you have _two_ copies of the data. Actually, I believe that even if only some metadata is changed, you get another copy. This. Gets. Ugly. Fast.
Actually, this could be a good thing. What if the 'improved' version is broken somehow? Or the meta data was in error? 'Undo'!
Also, importantly, version control!
you can version control the metadata.
If you're talking about a site like MP3.COM where you have lots of people uploading their own stuff, then you might have a problem with packing, with mp3's, images, other stuff. What to do about it? A couple of ideas come to mind: pack often, like every day, or use the packless database and forgo the feature, or ...?
pack every day, and you may as well forget about an undo from yesterday.
I'm aiming at uploading of audio as something with a little more control (like a review process), so having 1000 versions of the same file unpacked isn't an issue, and the versioning is important.
Like I said, you can make it so the metadata is version-controllable. 1000 copies at 2 MB = 2 GB... for ONE lousy MP3? While _you_ may be fine with that, most people are not. Still, it is your choice. Multiply that by, say, a hundred MP3s...
o Consider the ZSP that would much rather have the data stored in the FS where they can keep better track of it (and use quotas perhaps)
If they're in linux, they could quota the data.fs, no difference there... I
Except that you are then using a quota on the _entire_ Data.fs. With an ExtFile you can have a separate mountpoint that has separate quota control. Say I only want up to X GB of MP3 files, but would also like up to X GB of other objects, such as ExtImage objects; by limiting the Data.fs, you can't do that very well. Additionally, maybe I don't want 1.5-2.5 GB of a single song stored. ;^)~
don't know that they should be monitoring your files in either case, should they?
For file size, to determine how much space is being used, why not? What they *should* be doing is not at issue here anyway.
o Consider the possibility of having the file sitting on the OS's file system, available for outbound FTP through non-Zope means, or local system use (for those who would use it on their local machine as an MP3 organizer), or for an audio streamer program, such as icecast or shoutcast, to stream them out.
These are the important use cases I see for external storage. Though, as far as icecast goes, someone should beat that code into a Product, it would be great to have streaming built in to zope!
If it could do it reliably and speedily, yes, it would be awesome. Anyway, I was just kicking out some real-world reasons why an ExtMP3-type file would be (more) useful. Feel free to disagree; just understand that there are many of us out here who have very valid reasons for not storing file objects that average over 1 MB in the ZODB, especially when we are talking about potentially thousands, as you indicated. I've gone through all these issues in developing a Zope Advertising Product. Bill
From: Bill Anderson <bill@libc.org>
Which is worse: a corrupted Data.fs, or a corrupted .mp3 file? If you get a corruption in the backup process (yes, it DOES happen), or the restore, which would you rather have?
Well, if it was something like the MP3.COM site, there wouldn't be another centralized source for replacements for the .mp3 files, so I don't see the issue. If the backup is corrupted, then you'd go to the next-oldest backup...
What do you mean, it refuses to pack? It gives an error or something?
No error, and no pack. I haven't had a lot of time to go into it ...
Anyone else out there have this problem??? Hope it isn't a serious bug...!
Also, importantly, version control!
you can version control the metadata.
That's another can of worms: synchronizing with the external data (e.g. what if the file moves/renames/changes? Do you automatically scan? Or have a maintenance procedure of some kind to check? And have error handling, etc.?). I haven't looked at the code of ExtFile or LocalFS, so I don't know how they deal with that.
pack every day, and you may as well forget about an undo from yesterday.
Right, which is bad for all the other stuff you might want to undo...
I'm aiming at uploading of audio as something with a little more control (like a review process), so having 1000 versions of the same file unpacked isn't an issue, and the versioning is important.
Like I said, you can make it so the metadata is version-controllable. 1000 copies at 2 MB = 2 GB... for ONE lousy MP3? While _you_ may be fine with that, most people are not. Still, it is your choice. Multiply that by, say, a hundred MP3s...
Well, I was overstating the case. But I find it typical that, for example, one might mistakenly put up the wrong mix of a song and have to pull it quickly... or that a mistake with some metadata, like publisher name or song title, is made once or twice before it's gotten right. I wonder if, to get super-tweaky, it would be possible to do a 'partial pack', where you could follow the 'thread' of a particular object or set of objects and pack just those?
Except that you are then using a quota on the _entire_ Data.fs. With an ExtFile you can have a separate mountpoint that has separate quota control. Say I only want up to X GB of MP3 files, but would also like up to X GB of other objects, such as ExtImage objects; by limiting the Data.fs, you can't do that very well. Additionally, maybe I don't want 1.5-2.5 GB of a single song stored. ;^)~
A related thing I was thinking about was how to limit the file size of individual files, and individual user quotas. My answer to that one was: 1. File size would be limited by Zope's built-in 30-minute timeout... you can do the math... as long as you have enough storage for one of those (for each Zope thread) you won't die... then you could, after upload, check the file size (either internal or external), and if it's too big, delete it and warn the user. 2. Again, you can have methods to scan the user's area and add up their file storage, and perhaps disable uploading if it's over.
Anyway, I was just kicking out some real-world reasons whyan ExtMP# type file would be (more) useful. Feel free to disagree, just understand that there are many of us out here here who have very valid reasons for not storing file objects that average over 1MB in the ZODb, especially when we are talking about potentially thousands, as you indicated. I've gone through all these issues in developing a Zope Advertising Product.
I'm looking at this issue keenly, and trying to dig down to the roots of these issues... I don't think there are easy answers, but overall I think the ZODB is underappreciated, and I'm trying to get a better grip on its limitations. I'm still not clear on how the size of a Python object is a factor (as opposed to the number of objects in Data.fs), though it's pretty clear to me that anything that passes through Zope (e.g. via ExtFile) is at some point inside Zope, and therefore the same, at that point, as something stored in Data.fs, as far as speed and performance go.
That's just a matter of trust to me. Ext2 has proven to be stable even when I did nasty things to it. Ext2 and the tools it comes with have had a lot more exposure, and a lot more time to mature (and besides, Data.fs sits on top of Ext2, so you have the vulnerabilities of both in a single point of failure). The added features of ZODB sure would be handy sometimes, and yes, the added bookkeeping is painful, but I simply wouldn't want big amounts of data in ZODB. Or in an RDBMS, for that matter. Use an RDBMS or ZODB for metadata; keep the bulk in low-tech storage... ru, peter.
-- _________________________________________________ peter sabaini, mailto: sabaini@niil.at -------------------------------------------------
participants (7)
- Andy McKay
- Bill Anderson
- Casey Duncan
- ethan mindlace fremen
- marc lindahl
- Peter Sabaini
- Toby Dickenson