Hi all,
I am trying to figure out how Zope can help manage a huge archive of audio files.
A database can't practically hold audio files, even as blobs; they are too large. What I have seen done with relational databases is to keep the audio files in a directory, scan the file headers as well as the directory contents, and keep them in sync with a database.
Somehow I got the feeling that with Zope I can store the actual audio files (ogg) in the Zope database. However, I am not sure how this helps me.
I have audio files and their metadata, as well as a parallel music notation file for each recording, all of which I need to keep in sync.
I need a frontend to add the audio. It would query the database to see whether a recording or notation file already exists and whether there is metadata for either one, and then permit copying the metadata from the notation to the recording or vice versa.
So my question is: how can I take advantage of existing Zope objects/products to manage my audio archive?
Thanks,
Aaron
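
The directory-scanning approach described above (files on disk, a database kept in sync with them) can be sketched roughly as follows. This is only an illustration under stated assumptions: the `recordings` table, the function names, and the `.ly` notation extension are hypothetical, since the thread names neither a schema nor a notation format. It uses SQLite from the Python standard library rather than any Zope API.

```python
import os
import sqlite3

AUDIO_EXT = ".ogg"
NOTATION_EXT = ".ly"  # hypothetical: the thread does not name a notation format


def scan_archive(root):
    """Return {stem: set of extensions} for audio/notation files under root.

    Files are paired by base name, so "take1.ogg" and "take1.ly" belong
    together. Identical names in different subdirectories would collide;
    a real tool would need a better key.
    """
    found = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            stem, ext = os.path.splitext(name)
            if ext.lower() in (AUDIO_EXT, NOTATION_EXT):
                found.setdefault(stem, set()).add(ext.lower())
    return found


def sync(conn, root):
    """Make the (assumed) 'recordings' table mirror the directory contents."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS recordings ("
        "stem TEXT PRIMARY KEY, has_audio INTEGER, has_notation INTEGER)"
    )
    found = scan_archive(root)
    # Drop rows whose files have disappeared from disk.
    known = [row[0] for row in conn.execute("SELECT stem FROM recordings")]
    for stem in known:
        if stem not in found:
            conn.execute("DELETE FROM recordings WHERE stem = ?", (stem,))
    # Insert or refresh a row for everything that is on disk.
    for stem, exts in found.items():
        conn.execute(
            "INSERT OR REPLACE INTO recordings VALUES (?, ?, ?)",
            (stem, int(AUDIO_EXT in exts), int(NOTATION_EXT in exts)),
        )
    conn.commit()


def missing_counterparts(conn):
    """List stems that have audio without notation, or the reverse."""
    rows = conn.execute(
        "SELECT stem FROM recordings "
        "WHERE has_audio != has_notation ORDER BY stem"
    )
    return [row[0] for row in rows]
```

A frontend could call `sync()` before showing an "add recording" form and use `missing_counterparts()` to flag recordings that still lack a notation file, or the reverse.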
Hi Aaron,
I wouldn't place the songs in Zope. Instead I would use LocalFS to get access to them.
Using LocalFS you can access files on the normal disk as if they were Zope objects, while they remain outside the Zope database.
This will avoid database bloat, which you would most definitely have to fight otherwise.
You should probably look into other "local file system" tools as well; some of them offer more integration with Zope than LocalFS does, which you might want.
Jerry
_______________________________________________ Zope maillist - Zope@zope.org http://mail.zope.org/mailman/listinfo/zope ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope-dev )
On Fri, 2004-05-21 at 00:24, Aaron wrote:
I am trying to figure out how Zope can help manage a huge archive of audio files. [...] Somehow I got the feeling that with Zope I can store the actual audio files (ogg) in the Zope database. However, I am not sure how this helps me.
On Thursday 20 May 2004 07:02 pm, Jerome R. Westrick wrote:
I wouldn't place the songs in Zope... Instead I would use LocalFS to get access to them...
I've tried this pretty successfully on my personal website. I pulled it because I just couldn't afford the bandwidth to stream audio files, and it was just a lark anyway. ;-)
You should probably look into other "local file system" tools as well; some of them offer more integration with Zope than LocalFS does, which you might want.
I specifically used the "Streaming" variant of LocalFS. I can't find it online now, although there is a product that seems to use it, and which may be based on it:
http://zope.org/Members/morphex/media_file_system_1_0
Cheers,
Terry
--
Terry Hancock ( hancock at anansispaceworks.com )
Anansi Spaceworks http://www.anansispaceworks.com
On Fri, 2004-05-21 at 03:02, Jerome R. Westrick wrote:
Hi Aaron...
I wouldn't place the songs in Zope... Instead I would use LocalFS to get access to them...
Would I use the Zope DB to keep track of the audio, or a relational DB?
My question is really more general: what advantage does Zope give me over another solution?
Streaming is a secondary issue for me, as is bandwidth at this time. I will have hundreds of recordings that will need to be maintained, searched, edited, listened to, etc.
I am trying to see how Zope will help me do this.
Aaron
Basically, the way I see it is as follows:
In Zope I implement my objects (if I'm using that model).
In Zope I implement the separation of "display code" and "application code".
In my last project I tried to keep the database mostly read-only, and stored all data in PostgreSQL. I had application-specific reasons for this, but it also worked out rather well, having code in Zope and data in PG.
In a previous project, I stored PDFs in the Zope DB. This was definitely a mistake: it causes problems with an overfull Zope DB holding a history of PDFs for nothing.
Hope this helps,
Jerry
On Fri, May 21, 2004 at 05:14:54PM +0200, Jerome R. Westrick wrote:
Basically, the way I see it is as follows:
In Zope I implement my objects (if I'm using that model).
In Zope I implement the separation of "display code" and "application code".
In my last project I tried to keep the database mostly read-only, and stored all data in PostgreSQL. I had application-specific reasons for this, but it also worked out rather well, having code in Zope and data in PG.
In a previous project, I stored PDFs in the Zope DB. This was definitely a mistake: it causes problems with an overfull Zope DB holding a history of PDFs for nothing.
*shrug* There are a lot of options there. It is now possible to "mount" multiple databases in one Zope instance, so all the audio files could go in a dedicated ZODB. That database would need to scale to many GB of data, so I'd probably choose DirectoryStorage. Activity in this database wouldn't affect any other Zope data, and you could pack it however often you like.
Re. some of the earlier comments:
On Fri, 2004-05-21 at 08:04, Aaron wrote:
On Fri, 2004-05-21 at 03:02, Jerome R. Westrick wrote:
Hi Aaron...
I wouldn't place the songs in Zope... Instead I would use LocalFS to get access to them...
At one time I would have agreed, to keep the ZODB from getting unwieldy; but nowadays, unless I had another reason that I wanted the files easily accessible as plain files on the filesystem, I would just use DirectoryStorage instead. It's robust, and given an appropriate filesystem it can handle really massive amounts of data, and it's easier to set up than fiddling around with LocalFS or ExternalFile or whatever.
Download speed of blobs is still an issue. Note that the commonly deployed versions of LocalFS and many other attempts at keeping Zope data on the filesystem actually perform MUCH worse than the normal ZODB-based File class.
I would handle the speed issue by throwing massive amounts of disk caching at the problem. If the content is anonymously downloadable I'd use Squid or Apache + caching; if not, I'd use the FileCacheManager that we hacked up at PyCon. And if using ZEO, I'd use an enormous ZEO client cache.
For more background on blobs in Zope, see: http://www.slinkp.com/code/zopestuff/blobnotes
Sure, you could end up eating several times the size of your data set for all these disk caches... but who cares? At roughly a dollar per gigabyte, buying more disk space is a much better investment than spending even an hour worrying about it.
Would I use the zope db to keep track of the audio or a relational db?
If it were me, regardless of where the data is stored, I would use Zope's Catalog to provide searching and browsing capabilities based on metadata. But I guess it depends on what kind of queries you need to be able to do. The Catalog is pretty flexible, but probably not as flexible as a full-blown query language (SQL).
To get an idea of what Zope's catalog can do, take a look at plone.org. The front page uses the catalog for all the cool stuff on the right:
* the Search feature
* the News portlet to the right
* the Upcoming Events portlet
* the Calendar at lower right
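
To make the idea of metadata-based searching concrete, here is a toy field index in plain Python. It is emphatically not the ZCatalog API; the class name, method names, and metadata fields are all made up for illustration. It only shows the general principle of keeping one index per metadata field and intersecting the matches, which is also why such an index is less flexible than arbitrary SQL.

```python
from collections import defaultdict


class TinyCatalog:
    """Toy metadata index. NOT the ZCatalog API, only the idea behind it."""

    def __init__(self, fields):
        # One inverted index per field: value -> set of object ids.
        self._indexes = {field: defaultdict(set) for field in fields}

    def add(self, obj_id, metadata):
        """Index obj_id under every known field present in metadata."""
        for field, index in self._indexes.items():
            if field in metadata:
                index[metadata[field]].add(obj_id)

    def search(self, **query):
        """Return the sorted ids matching every field=value pair given."""
        result = None
        for field, value in query.items():
            matches = self._indexes[field].get(value, set())
            result = set(matches) if result is None else result & matches
        return sorted(result or [])


# Hypothetical usage with made-up recordings:
catalog = TinyCatalog(["composer", "instrument"])
catalog.add("rec1", {"composer": "Bach", "instrument": "organ"})
catalog.add("rec2", {"composer": "Bach", "instrument": "violin"})
organ_works = catalog.search(composer="Bach", instrument="organ")
```

Only exact matches on pre-declared fields are supported here; anything like ranges, joins, or sorting by an unindexed field would need more machinery or a relational database.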
My question is really more general: what advantage does Zope give me over another solution?
It's just a different way of working / thinking. Once you have some understanding of Zope, you can work pretty darn fast. And it allows you to cleanly separate business logic from presentation code and from content (although many people do not do so).
--
Paul Winkler
http://www.slinkp.com
Thanks for everyone's advice.
I will indeed have a massive amount of data to store, but I don't envision massive bandwidth. I also don't envision letting users actually download audio files; rather, they would create playlists and listen to the streams.
After reading the replies I see I have a lot of learning to do before I get too far afield.
Thanks,
Aaron
I am now reading about
On Fri, 2004-05-21 at 19:24, Paul Winkler wrote:
*shrug* there are a lot of options there. It is now possible to "mount" multiple databases in one Zope instance. So all the audio files could go in a dedicated ZODB - something capable of scaling to many GB of data, so I'd probably choose DirectoryStorage. Activity in this database wouldn't affect any other zope data. And you could pack it however often you like.
So let me get this straight: I just set up directories and dump my files into them? And I can have one Zope DB for audio, another for notation, and another for site-related things?
Re. some of the earlier comments:
On Fri, 2004-05-21 at 08:04, Aaron wrote:
On Fri, 2004-05-21 at 03:02, Jerome R. Westrick wrote:
Hi Aaron...
I wouldn't place the songs in Zope... Instead I would use LocalFS to get access to them...
At one time I would have agreed, to keep the ZODB from getting unwieldy; but nowadays, unless I had another reason that I wanted the files easily accessible as plain files on the filesystem, I would just use DirectoryStorage instead. It's robust, and given an appropriate filesystem
an appropriate filesystem??
it can handle really massive amounts of data, and it's easier to set up than fiddling around with LocalFS or ExternalFile or whatever.
Download speed of blobs is still an issue.
I doubt I will allow direct downloads, and listening will be streaming.
Have you seen the audio products? In what way will they help me? Is it good enough to make a folder and dump the audio in there or will the audio products help me in some way?
Note that the commonly deployed versions of LocalFS and many other attempts at keeping Zope data on the filesystem actually perform MUCH worse than the normal ZODB-based File class.
I actually tried installing the patched LocalFS but didn't get too far into it.
I would handle the speed issue by throwing massive amounts of disk caching at the problem. If the content is anonymously downloadable I'd use Squid or apache + caching; if not, I'd use the FileCacheManager that we hacked up at Pycon. And if using ZEO, I'd use an enormous ZEO client cache.
Well, I think the largest ogg file will be 9 megs, certainly not 30-50 megs.
I am not planning to run Apache in addition to Zope. I haven't looked at ZEO, and I will do that soon, I guess.
If anyone is willing/interested in giving some feedback about how best to use Zope for my project, my wiki explains the project details and design issues. I would most welcome any comments, suggestions, etc.
Thanks again for the help. I am, as you can tell, very new to Zope.
Aaron
DirectoryStorage can handle really massive amounts of data, and it's easier to set up than fiddling around with LocalFS or ExternalFile or whatever.
Download speed of blobs is still an issue.
I doubt I will allow direct downloads and listening will be streaming.
The fact that the files are stored within Zope means that you will still have to serve them up via Zope (either download or streaming), and as has been previously mentioned, Zope is not very good at serving up large files.
Have you seen the audio products? In what way will they help me?
CMFAudio, written by Marc Bowery, is fine for small MP3 files, but it still stores the files in the ZODB, so it wouldn't be suitable for large MP3 files.
Is it good enough to make a folder and dump the audio in there or will the audio products help me in some way?
It depends on whether these audio files are going to be read-only and addable only by the site administrator, or whether you want to workflow, edit, and search them. If the former, you probably want a solution such as LocalFS (with the streaming support), although performance may not be good enough for your needs.
From your first post, it sounds like you need a frontend to add the audio and edit the metadata. You may be interested to hear that I'm working on ATAudio, an Archetypes-based content type which stores the metadata about the audio file (song title, artist, album, year, genre, etc.) in the ZODB, but stores the actual binary file on the file system.
The cool thing is that we've implemented an ID3Storage layer which acts as an interface for reading/writing the ID3 tags from/to the MP3 files. This was done using pyid3lib (http://pyid3lib.sourceforge.net).
This saves you from doing a lot of data entry, especially if your audio files already have metadata embedded in them. And it allows you to keep the metadata stored both in the MP3 file itself (so it's used when people stream your audio) and in your Zope database (so it is searchable within your website).
You upload an MP3 file to your Plone site (or drag-n-drop via WebDAV) and it will appear as an object in your site, complete with metadata.
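
As a rough illustration of reading embedded metadata so it can be copied into a searchable record, the sketch below parses the simple ID3v1 tag (the last 128 bytes of an MP3 file) using only the Python standard library. It is a simplified stand-in and not how ATAudio or pyid3lib work internally; ID3v2 tags are considerably more involved, and Ogg files carry Vorbis comments rather than ID3 tags. The function name is hypothetical.

```python
import io


def read_id3v1(stream):
    """Parse an ID3v1 tag from a seekable binary stream.

    Returns a dict of text fields, or None when no tag is present.
    Layout: "TAG", title (30 bytes), artist (30), album (30), year (4),
    comment (30), genre (1), for 128 bytes in total.
    """
    stream.seek(0, io.SEEK_END)
    if stream.tell() < 128:
        return None
    stream.seek(-128, io.SEEK_END)
    block = stream.read(128)
    if block[:3] != b"TAG":
        return None

    def text(raw):
        # Fields are padded with NUL bytes (or spaces) to their fixed width.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(block[3:33]),
        "artist": text(block[33:63]),
        "album": text(block[63:93]),
        "year": text(block[93:97]),
    }
```

The resulting dict could then be stored alongside the file reference in whatever database holds the searchable metadata.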
Note that the commonly deployed versions of LocalFS and many other attempts at keeping Zope data on the filesystem actually perform MUCH worse than the normal ZODB-based File class.
Unless you don't even use Zope to serve the file, but simply provide a pointer to the URL where another 3rd party tool (such as Apache, Icecast, Shoutcast, Edna, etc.) actually serves up the file. This is how we do it with ATAudio. When you click on the 'Play' icon, it auto-generates an .m3u file which points to the MP3 file being served from your Apache server.
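
Generating a playlist that points at files served elsewhere can be sketched in a few lines. This is not ATAudio's code; the function name and the example host are hypothetical. A plain .m3u file is simply a list of locations, one per line.

```python
from urllib.parse import quote


def make_m3u(base_url, filenames):
    """Return the text of a plain .m3u playlist.

    Each entry is base_url joined with a percent-encoded file name, so the
    player fetches the audio from the media server and not from Zope.
    """
    base = base_url.rstrip("/")
    lines = [base + "/" + quote(name) for name in filenames]
    return "\n".join(lines) + "\n"


# Hypothetical media server URL and file names:
playlist = make_m3u("http://media.example.org/audio/", ["take 1.ogg", "take2.ogg"])
```

The application server then only has to return this small text file (commonly with the `audio/x-mpegurl` content type), while the large audio files are served by Apache or a streaming server.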
Well, I think the largest ogg file will be 9 megs, certainly not 30-50 megs.
Jens Klein (jensens) is working on providing support for OGG files, using the pyOGG libraries. http://www.andrewchatham.com/pyogg/
I am not planning to run apache in addition to zope. I haven't looked at ZEO and I will do that soon I guess.
Why aren't you planning to run Apache? It's pretty foolish to run Zope 'in the wild' without Apache in front of it. Also, Apache (esp. with mod_mp3 or Apache::MP3) makes an excellent server for serving up those large MP3/OGG files.
If anyone is willing/interested in giving some feedback about how to best use zope for my project, My Wiki explains the project details and design issues.
What is the URL to your wiki? Mine is http://plone4artists.jazkarta.com/development/Audio
The ATAudio product is still a little rough around the edges, but you are welcome to join in the development efforts. The bugs/issues collector for ATAudio describes the known issues and features which are planned. http://tinyurl.com/37u7k
You can also read the TODO.txt if you grab the product from the collective CVS. http://sf.net/projects/collective
I'm looking forward to this talk in a few weeks at the EuroPython conference: "The Railroad Project: Managing Large files Around Zope" http://www.europython.org/conferences/epc2004/info/talks/zope/kitblake01
Nate
--
Nate Aune - natea@jazkarta.com
Plone4Artists - http://plone4artists.jazkarta.com
"Build your own artist community website!"
The fact that the files are stored within Zope means that you will still have to serve them up via Zope (either download or streaming), and as has been previously mentioned, Zope is not very good at serving up large files.
As I understood from the announcement, Zope 2.7.1 will contain some improvements in this area.
On Tue, May 25, 2004 at 11:16:00AM +1100, Sergey Volobuev wrote:
The fact that the files are stored within Zope means that you will still have to serve them up via Zope (either download or streaming), and as has been previously mentioned, Zope is not very good at serving up large files.
As I understood from the announcement, Zope 2.7.1 will contain some improvements in this area.
Sort of. 2.7.1 includes a feature which product developers can use to stream data directly from the filesystem faster than current solutions (and fast enough to saturate a T3, so there's probably no point in going faster :-).
But it depends on the product to take advantage of it, by returning a special iterator instead of a string; and, notably, it's unsafe to do database reads inside the iterator. So it's really only good for stuff that is stored or cached on the filesystem.
We did write a filesystem cache that uses this feature. That might be what you're thinking of.
More info: http://www.slinkp.com/code/zopestuff/blobnotes
--
Paul Winkler
http://www.slinkp.com
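
The general technique of returning an iterator instead of one big string can be illustrated with a plain Python generator. This is a generic sketch, not the Zope 2.7.1 feature itself; the function name and the chunk size are arbitrary choices made for the example.

```python
def iter_file_chunks(path, chunk_size=64 * 1024):
    """Yield the contents of a file as a sequence of byte chunks.

    Only one chunk is held in memory at a time, which is the point of
    handing a server an iterator rather than a single large string.
    The generator touches nothing but the filesystem, in line with the
    caveat above about not doing database reads while iterating.
    """
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            yield chunk
```

A server loop would write each chunk to the client as it arrives, so a 9 MB file never needs to be loaded whole.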
On Friday 21 May 2004 01:04 am, Aaron wrote:
Would I use the zope db to keep track of the audio or a relational db?
My question is really more general: what advantage does Zope give me over another solution?
Streaming is a secondary issue for me, as is bandwidth at this time. I will have hundreds of recordings that will need to be maintained, searched, edited, listened to, etc.
I am trying to see how zope will help me do this.
It really depends on how you want to use the data.
If it is mostly a table of read-mostly information that you want to refer to, and if the number of records is going to be very large (like the electronic card catalog at a library, for example), then I would go with the RDBMS approach. This has some overhead in Zope, but it's probably worth it for a couple of different reasons. There's also the point that an RDBMS is optimized for data with no intrinsic structure, with the structure applied at query time (i.e. searches which may be on any combination of fields).
OTOH, if the data is more object-like, you want to read/write more often, you need to have variable metadata associated with it, you need relatively few objects (a few hundred might be relatively few), or a natural hierarchical structure (e.g. like a filesystem) is a better model for you, then I think you might want to use the ZODB.
My first guess would be "use the ZODB until you prove you need the RDBMS". The only reason to go straight to the RDBMS is for cases where you know in advance that you're going to be dealing with scads of records and/or you want more separability from Zope.
You should think of Zope as primarily being the *application* framework, rather than focusing on storage of the data objects themselves. Using the ZODB is a much more convenient solution for modest-sized collections of data, but you may want a dedicated database if you have large amounts of non-hierarchical data.
Cheers,
Terry
--
Terry Hancock ( hancock at anansispaceworks.com )
Anansi Spaceworks http://www.anansispaceworks.com
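
The remark about searching on any combination of fields at query time can be made concrete with a small SQLite example from the Python standard library. The `recordings` table, its columns, and the function name are hypothetical; the sketch only shows how a relational query can be assembled for whichever fields the user happens to supply.

```python
import sqlite3

COLUMNS = ("title", "performer", "year")  # hypothetical schema


def find_recordings(conn, **criteria):
    """Return titles of rows matching every given column=value pair."""
    unknown = set(criteria) - set(COLUMNS)
    if unknown:
        raise ValueError("unknown field(s): %s" % ", ".join(sorted(unknown)))
    sql = "SELECT title FROM recordings"
    params = []
    if criteria:
        fields = sorted(criteria)
        # Column names come from the whitelist above; values are bound
        # as parameters and never pasted into the SQL string.
        sql += " WHERE " + " AND ".join("%s = ?" % f for f in fields)
        params = [criteria[f] for f in fields]
    sql += " ORDER BY title"
    return [row[0] for row in conn.execute(sql, params)]


def make_demo_db():
    """Build an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE recordings (title TEXT, performer TEXT, year INTEGER)")
    conn.executemany(
        "INSERT INTO recordings VALUES (?, ?, ?)",
        [("Air", "Ensemble A", 2003),
         ("Gigue", "Ensemble A", 2004),
         ("Prelude", "Ensemble B", 2004)],
    )
    return conn
```

For a few hundred objects with a natural hierarchy, this flexibility may not be needed, which is the trade-off described above.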
Participants (6): Aaron, Jerome R. Westrick, Nate Aune, Paul Winkler, Sergey Volobuev, Terry Hancock