RE: [Zope] searching and serving large textfiles ~120 Mb
We're handling something similar: most of our files are in the 60-80 MB range and in varying formats, from plain text to MS Word to PDF. We're just getting into the testing phase now, so I'm not certain whether performance will be noticeably different under load, but it's food for thought at least.

We're storing the files on the filesystem with an ExtFile-type Product and using TextIndexNG2 to index the full text of the documents into Zope. The theory is that we can minimize ZODB bloat by storing only the data needed for searches in the ZODB and keeping the actual file contents out on the filesystem.

Someone with a better working knowledge of Zope might be able to give you a theoretical guesstimate of the performance differences, if any, between storing/serving large objects internally vs. externally.

HTH
Lee
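[Editor's sketch] The split Lee describes (file contents on disk, only search data in the database) can be illustrated with a toy inverted index. This is a minimal sketch in modern Python 3, not the TextIndexNG2 API and not Lee's setup: `build_index` and `search` are hypothetical names, the index is a plain in-memory dict rather than a ZODB object, and it only handles plain text, not MS Word or PDF.

```python
import os
import re
from collections import defaultdict

WORD = re.compile(r"[a-z0-9]+")


def build_index(root):
    """Map each lowercased word to the set of relative paths containing it."""
    index = defaultdict(set)
    for dirpath, _dirs, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            # Read line by line so a large file is never held in memory whole.
            with open(path, "r", encoding="utf-8", errors="replace") as f:
                for line in f:
                    for word in WORD.findall(line.lower()):
                        index[word].add(rel)
    return index


def search(index, query):
    """Return the sorted paths that contain every word of the query."""
    words = WORD.findall(query.lower())
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)
```

Only the word-to-path mapping would need to be persisted; the documents themselves stay wherever they already live on the filesystem.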
From: "Sebastian Krollmann" <sebastian.krollmann@gmx.net>
To: <zope@zope.org>
Subject: [Zope] searching and serving large textfiles ~120 Mb
Date: Fri, 5 Dec 2003 10:31:08 +0100
Hi zopistas,
I need to access large text files (~120 MB) from Zope. I know about Python's large file support and that it is better to keep large files out of the ZODB. I have to do a full-text search on these files, which reside in a folder hierarchy on the server, show their content around the location of the found string, and allow scrolling through the file's source from Zope.
Has anybody done something similar with files this large and would be willing to share their experiences? Are there any dos and don'ts or best ways to do it?
Thanks for your answers,
SK
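[Editor's sketch] One way to approach the search-with-context and scrolling requirements in the question, assuming modern Python 3 and a plain substring search rather than an index. The function names `find_with_context` and `read_window` and the default sizes are hypothetical; nothing here is Zope-specific. The file is memory-mapped, so the operating system pages data in on demand instead of the process reading all ~120 MB up front.

```python
import mmap


def find_with_context(path, needle, context=200, max_hits=10):
    """Yield (offset, snippet) for each occurrence of needle (bytes) in path.

    Note: mmap cannot map an empty file, so callers should skip those.
    """
    if not needle:
        return
    with open(path, "rb") as f:
        # Length 0 maps the whole file; pages are loaded lazily by the OS.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            pos = mm.find(needle)
            hits = 0
            while pos != -1 and hits < max_hits:
                start = max(0, pos - context)
                end = min(len(mm), pos + len(needle) + context)
                yield pos, mm[start:end]
                hits += 1
                pos = mm.find(needle, pos + len(needle))


def read_window(path, offset, size=4096):
    """Return up to `size` bytes starting at `offset`, for paging a file."""
    with open(path, "rb") as f:
        f.seek(max(0, offset))
        return f.read(size)
```

A "scroll" request then becomes a call to `read_window` with the offset of the previous or next page, so each request touches only a few kilobytes of the file.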
Lee J. McAllister wrote:
We're handling something similar: most of our files are in the 60-80 MB range and in varying formats, from plain text to MS Word to PDF. We're just getting into the testing phase now, so I'm not certain whether performance will be noticeably different under load, but it's food for thought at least.
We're storing the files on the filesystem with an ExtFile-type Product and using TextIndexNG2 to index the full text of the documents into Zope. The theory is that we can minimize ZODB bloat by storing only the data needed for searches in the ZODB and keeping the actual file contents out on the filesystem.
Someone with a better working knowledge of Zope might be able to give you a theoretical guesstimate of the performance differences, if any, between storing/serving large objects internally vs. externally.
Hm, I think (downloading) performance should be roughly the same -- after all, the files would still have to be loaded by Zope, at least with Products like LocalFS and ExtFile that first load the whole file into memory before starting to serve it? I wrote myself a streaming method for this reason.

- peter.
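[Editor's sketch] Peter does not show his streaming method, so the following is only an illustration of the general idea in modern Python 3, not his code. `iter_chunks` and `stream_file` are hypothetical names, and `write` stands in for whatever incremental output call the server offers (in Zope 2 that would presumably be the response object's `write` method, which is an assumption here).

```python
def iter_chunks(path, chunk_size=64 * 1024):
    """Yield the file's contents as successive byte chunks."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk


def stream_file(path, write, chunk_size=64 * 1024):
    """Pass the file to `write` one chunk at a time; return bytes sent."""
    sent = 0
    for chunk in iter_chunks(path, chunk_size):
        write(chunk)
        sent += len(chunk)
    return sent
```

With this shape, memory use is bounded by `chunk_size` regardless of how large the file is, as long as `write` itself does not accumulate the data.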
FWIW, ExtFile only serves one chunk at a time and does not load the entire file into RAM. Don't know about LocalFS though.

Stefan

On Friday, Dec 5, 2003, at 19:28 Europe/Vienna, Peter Sabaini wrote:
Hm, I think (downloading) performance should be roughly the same -- after all, the files would still have to be loaded by Zope, at least with Products like LocalFS and ExtFile that first load the whole file into memory before starting to serve it? I wrote myself a streaming method for this reason.
-- The time has come to start talking about whether the emperor is as well dressed as we are supposed to think he is. /Pete McBreen/
On Mon, Dec 08, 2003 at 08:24:56PM +0100, Stefan H. Holek wrote:
FWIW, ExtFile only serves one chunk at a time and does not load the entire file into RAM. Don't know about LocalFS though.
It looks like every time you request a particular file from LocalFS, it instantiates a fresh OFS.Image.File from the filesystem data. For large files that seems likely to be expensive, but I haven't checked. Once the File is created it will serve chunks as usual.

--
Paul Winkler
http://www.slinkp.com
Look! Up in the sky! It's THE LOVEABLE CHANCELLOR! (random hero from isometric.spaceninja.com)
Stefan H. Holek wrote:
FWIW, ExtFile only serves one chunk at a time and does not load the entire file into RAM. Don't know about LocalFS though.
Stefan
Not true. ExtFile-1.1.3 *does* load the whole file into RAM prior to serving it. Take a look at ExtFile.index_html() -- the file gets loaded into a StringIO object and then the contents of this StringIO object are returned. That doesn't matter for smaller files, but serving e.g. videos like this could get expensive. Try it -- load a 500 MB AVI and watch your mem usage soar :-)

- peter.
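[Editor's sketch] The memory difference Peter describes can be measured on a small scale. This is an illustration in modern Python 3, not the ExtFile code: `serve_buffered` mimics the "load everything into an in-memory buffer, then return it" pattern using `io.BytesIO`, `serve_chunked` is the streaming alternative, and `peak_memory` uses the standard `tracemalloc` module. All three names are hypothetical.

```python
import io
import tracemalloc


def serve_buffered(path, write):
    """Load the whole file into a BytesIO first, then hand it over at once."""
    with open(path, "rb") as f:
        buf = io.BytesIO(f.read())
    write(buf.getvalue())


def serve_chunked(path, write, chunk_size=64 * 1024):
    """Hand the file over one chunk at a time."""
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            write(chunk)


def peak_memory(func, *args):
    """Return the peak number of bytes traced by tracemalloc while func runs."""
    tracemalloc.start()
    try:
        func(*args)
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak
```

Calling `peak_memory(serve_buffered, path, sink)` and `peak_memory(serve_chunked, path, sink)` on the same file, with a `sink` that discards its argument, should show the buffered peak growing with file size while the chunked peak stays near the chunk size.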
This is correct. I was, however, referring to ExtFile-1.2.0 <wink>. See <http://zope.org/Members/shh/ExtFile>. Unfortunately I have no powers to add a note about it to macgregor's page.

Stefan

--On Tuesday, 09 December 2003 09:41 +0100 Peter Sabaini <peter@sabaini.at> wrote:
Not true. ExtFile-1.1.3 *does* load the whole file into RAM prior to serving it. Take a look at ExtFile.index_html() -- the file gets loaded into a StringIO object and then the contents of this StringIO object are returned.
Oh, I see. Nicely done. I've been searching zope.org for ExtFile, and macgregor's product shows up first, sorry for the confusion... Did you try to contact macgregor about your changes? It certainly seems it would be an improvement for the ExtFile product.

cheers,
peter.

Stefan H. Holek wrote:
This is correct. I was however referring to ExtFile-1.2.0 <wink>.
See <http://zope.org/Members/shh/ExtFile>
Unfortunately I have no powers to add a note about it to macgregor's page.
Stefan
participants (4)
- Lee J. McAllister
- Paul Winkler
- Peter Sabaini
- Stefan H. Holek