Hello all.

I'm wondering if anyone has done something similar to what I'm thinking about doing. The company I work for does Business Data Analysis. Among the tools we provide to our analysts is a stream of processed news articles and events derived from them. The events part is fine, since that's not much data, but they also want access to the original articles and the ability to do full-text searches of them.

I'm working on rewriting the 'event' part of things in Zope. That part is fairly straightforward. It seems that I could also add each article to Zope as a 'News Article.' This is done quite a lot. Then we could arrange and sort them, search them, or do lots of other things with them.

The question: Over the past few months, we have accumulated around 2 gigs of articles. This trend is expected to continue. Can Zope and ZCatalog handle this kind of load? Will the searching capabilities be shot in the ass if I store the articles as external files? (Does ZCatalog even know about those things?) Or should I stick with the articles in the filesystem with ht://dig like we've got now, and just build the interface to ht://dig in Zope?

Any thoughts? Thanks.

Monty
The question: Over the past few months, we have accumulated around 2 GIGs of articles. This trend is expected to continue. Can Zope and ZCatalog handle this kind of load?
I don't see why not! I would imagine it depends on how much full text is indexed for each object and how many objects are full-text searchable. For indexes on object attributes, ZCatalog would be the best bet. For large amounts of free text, I'm not quite sure.
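To make the point about index size concrete, here is a minimal sketch of a full-text (inverted) index in plain Python. It is not ZCatalog's implementation, and the names `tokenize`, `build_index`, `search`, and the sample `articles` dict are all hypothetical. It only illustrates why the amount of text per object matters: every distinct word in every article becomes an index entry.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(articles):
    """Map each word to the set of article ids that contain it.

    `articles` is a hypothetical dict of {article_id: body_text}.
    """
    index = defaultdict(set)
    for article_id, body in articles.items():
        for word in tokenize(body):
            index[word].add(article_id)
    return index


def search(index, query):
    """Return the ids of articles containing every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Made-up sample data, for illustration only.
articles = {
    1: "Acme Corp announces merger with Widget Inc",
    2: "Widget Inc reports record quarterly earnings",
}
index = build_index(articles)
```

With this sketch, `search(index, "widget inc")` matches both sample articles, while `search(index, "merger")` matches only the first. The index grows with the vocabulary and length of the articles, not just with their count.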
Will the searching capabilities be shot in the ass if I stored the articles as external files? (Does ZCatalog even know about those things?)
No, ZCatalog does not know about external files (out of the box).
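Since the catalog does not see files on disk by itself, something has to read them and hand the text to whatever indexer is used. A rough sketch of that step, using only the Python standard library, might look like the following. The function name `iter_articles` is hypothetical, and the sketch assumes the articles are plain-text files readable as UTF-8. It does not call any Zope or ZCatalog API.

```python
import os


def iter_articles(root):
    """Yield (relative_path, text) for every file found under `root`.

    Assumes plain-text articles; undecodable bytes are replaced
    rather than raising an error.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                yield os.path.relpath(path, root), handle.read()
```

Each yielded pair could then be passed to an indexer of your choice, keyed by the file's relative path.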
OR, should I stick with the articles in the filesystem with ht://dig like we've got now, and just build the interface to ht://dig in Zope?
We use Ultraseek (http://ultraseek.com/products/ultraseek/ultratop.htm) at work (which is available for the low low price of $1000 for x files indexed), and there is a Zope Ultraseek DA (haven't used it yet)!

I've run across a potentially really kick-ass indexer. I haven't used it, but it has a really nice cover! It's called Udmsearch (http://mysearch.udm.net/). It's actively developed, searches all types of sources (news, FTP, HTTP, FS, databases!!), and it's OSS! <- I will be using this in the next 3 months

ht://dig, I believe, is in Perl (yuk!) <- resentful over doing a Perl project right now

~runyaga
Will the searching capabilities be shot in the ass if I stored the articles as external files? (Does ZCatalog even know about those things?)
no, ZCatalog does not know about external files (out of the box)
ZCatalog will work with the LocalFS product, with minor modifications. I'm still hesitant to make this modification part of the standard distribution for reasons I don't care to go into right at the moment. Try out LocalFS and let me know if you like it and I'll send you the patch to use it with ZCatalog. --jfarr
Hi! On Wed, May 03, 2000 at 07:41:17AM -0500, alan runyan wrote:
I've run across a potentially really kick ass indexer, havent used it but it has a really nice cover! its called Udmsearch - http://mysearch.udm.net/ - its actively developed, searches all types of sources (news, ftp, http, FS, databases!!) and its OSS! <- I will be using this in the next 3 months
I've been using it for our SunSITE. The only problem might be that you will get a very big MySQL database, since it uses MySQL as storage. On SunSITE this was a problem because MySQL was installed in the wrong place for this, but in general it shouldn't be a problem. Anyway, it was quite fast to install and configure (though I haven't checked the latest version).

best,
Christian
participants (4)
- alan runyan
- cs@comlounge.net
- Jonothan Farr
- Monty Taylor