Zope is a great application server, as is its soon-to-be-released Content Management Framework, thanks to its bet on Python; everybody says so. Nevertheless, after reading the Directions Roadmap from DC, I was surprised that a substantial improvement of Zope's search features wasn't mentioned as a major concern.

For a new Zope enthusiast like me, arranging and administering content while climbing the learning curve is a kind of addiction. Almost everybody on this list with a non-programming background might have experienced this. But when I got to the search features of ZCatalog, I had mixed feelings. (Right now I'm stuck on OR searches across indexes :) )

The fact is that, according to my strong belief, everybody uses Google more than Zope's own ZCatalog search engine to look for a Zope site's content. Moreover, everybody uses Google to look for everything, bypassing windows, doors, and portals! Why? Because it's terribly smart (not to mention its 6,000 Linux boxes, by the way), and because there's no need to follow the highly engineered information architecture of a web site if there's a trustworthy shortcut to the relevant content! So, if I had to name one big feature improvement for Zope, I wouldn't hesitate: "search engine".

I just wanted to raise this subject. I know Zope isn't about spidering and retrieving, but the roadmap should have "Greater Search Capabilities" as a heading. :)

Ausum

P.S. Right now I'm quite interested in the technology of searching and finding unstructured content in order to compose structured documents. For example, the guys at Vignette (StoryServer) say that their customers don't need to keyword anything in order to have a "related content" section. After the writer finishes a story (or possibly while writing it), a routine by Autonomy (www.autonomy.com) reads the document and finds out what it is about, and then triggers a search for related content within the site, with no intervention needed from the writer. (For the curious, Autonomy has published a personal version of its software. It's called Kenjin (www.kenjin.com).) On the other hand, Fast, from Norway, already has a nice multimedia search engine built from regular, unstructured, spidered web pages. Can we do that "structuring the unstructured" thing within Zope?
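To make the "related content without keywording" idea concrete, here is a minimal sketch in plain Python. It assumes simple TF-IDF term weighting and cosine similarity, which is a common textbook approach and not a description of how Autonomy or Vignette actually work. The function names (`tokenize`, `tfidf_vectors`, `related`) and the sample documents are made up for illustration, and nothing here uses a Zope or ZCatalog API.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Map each document id to a {term: weight} dict (hypothetical helper)."""
    counts = {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}
    n_docs = len(docs)
    doc_freq = Counter()
    for term_counts in counts.values():
        doc_freq.update(term_counts.keys())
    vectors = {}
    for doc_id, term_counts in counts.items():
        # Smoothed IDF: a term that appears in every document gets weight 0.
        vectors[doc_id] = {
            term: tf * math.log((1 + n_docs) / (1 + doc_freq[term]))
            for term, tf in term_counts.items()
        }
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def related(doc_id, docs, limit=3):
    """Return ids of the documents most similar to doc_id, best first."""
    vectors = tfidf_vectors(docs)
    target = vectors[doc_id]
    scored = [
        (cosine(target, vector), other)
        for other, vector in vectors.items()
        if other != doc_id
    ]
    scored.sort(reverse=True)
    return [other for score, other in scored[:limit] if score > 0]


docs = {
    "a": "zope catalog search index",
    "b": "search index ranking for zope catalog",
    "c": "cooking pasta with tomato sauce",
}
print(related("a", docs))
```

The writer never assigns keywords here; the "related" list falls out of the words the documents share. A real system would need stemming, stop words, and an index that avoids recomputing every vector on each request.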
P.S. Right now I'm quite interested in the technology of searching and finding unstructured content in order to compose structured documents. For example, the guys at Vignette (StoryServer) say that their customers don't need to keyword anything in order to have a "related content" section. After the writer finishes a story,
Take a look at http://beta.osdigger.com. It is a mailing list search engine I was working on about a year ago; unfortunately, I've not had the time to work on it since. It is designed to scale to full-text ranked searches over millions of email messages in under a couple of seconds on a single machine (currently a PII-300).

It has a 'two-step' search feature which brings back terms related to the ones you put in (usually :). E.g. type in 'scsi controller' and chances are it will return 'adaptec' among the list of other terms. The idea is to prompt the user to be more specific with their search.

I would love to take more time to work on it again, and would like to be able to access it from within Zope and use it to catalog arbitrary Zope objects like ZCatalog does.

-Matt

--
Matt Hamilton matth@netsight.co.uk
Netsight Internet Solutions, Ltd.
Business Vision on the Internet
http://www.netsight.co.uk +44 (0)117 9090901
Web Hosting | Web Design | Domain Names | Co-location | DB Integration
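The 'two-step' related-terms idea can be sketched roughly as follows. This is only an illustration that assumes plain per-message term co-occurrence counts; the message above does not say how osdigger computes its related terms, so the technique, the function names (`build_cooccurrence`, `suggest_terms`), and the sample messages are all assumptions.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Return the set of distinct lowercase words in a message."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_cooccurrence(messages):
    """Count, for each term, how many messages it shares with every other term."""
    cooc = defaultdict(Counter)
    for message in messages:
        terms = tokenize(message)
        for term in terms:
            for other in terms:
                if other != term:
                    cooc[term][other] += 1
    return cooc


def suggest_terms(query, cooc, limit=5):
    """Suggest terms that most often appear alongside the query terms."""
    query_terms = tokenize(query)
    totals = Counter()
    for term in query_terms:
        totals.update(cooc.get(term, {}))
    # Do not suggest words the user already typed.
    for term in query_terms:
        totals.pop(term, None)
    return [term for term, _ in totals.most_common(limit)]


messages = [
    "adaptec scsi controller hangs",
    "scsi controller from adaptec works",
    "new scsi controller adaptec",
    "ide disk is slow",
]
cooc = build_cooccurrence(messages)
print(suggest_terms("scsi controller", cooc, limit=1))
```

Counting every pair of terms is quadratic in message length, so this would not scale to millions of messages as written; it only shows how a query could be answered with suggested terms that prompt the user to narrow the search.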
after reading the Directions Roadmap from DC, I was surprised that a substantial improvement of Zope's search features wasn't mentioned as a major concern.
<snip>
Moreover, everybody uses Google to look for everything, bypassing windows, doors, and portals! Why? Because it's terribly smart (not to mention its 6,000 Linux boxes, by the way), and because there's no need to follow the highly engineered information architecture of a web site if there's a trustworthy shortcut to the relevant content! So, if I had to name one big feature improvement for Zope, I wouldn't hesitate: "search engine". <snip>
On the other hand, Fast, from Norway, already has a nice multimedia search engine built from regular, unstructured, spidered web pages. Can we do that "structuring the unstructured" thing within Zope?
You have posed an important question (and probably some answers) that I can hopefully clarify. One of the all-important points of the Zope directions document is that our number one goal is to make it wildly easier for _developers_ to create and deploy quality components.

Why is this so important? The questions in your email are exactly why. You are very interested in high-quality search capabilities, and others certainly are as well. Some other folks care more about e-commerce, or CORBA integration, or communication with Java components.

The problem, of course, is that even if DC devoted every single person here to creating the "best search engine" (which we couldn't do for very long - we'd soon be gone), we would still be hard pressed to come close to making everybody happy or being competitive with every other search engine vendor out there. And the reality is that it is not our goal in life to be a better Google than Google. Multiply that by the number of things people want (e-commerce, CORBA, et al.), and the problem is quite clear - *DC cannot possibly provide the best, most featureful and competitive component for every problem*.

The *solution* to this problem is what is outlined in the Zope directions document: dramatically lowering the bar for development, so that a thriving marketplace of robust components (that are *not* written by DC) lets interested parties write (or better yet, reuse) "the best x component" for their purposes.

In the future, Zope may come with "some batteries included", in that a Zope distribution may include the latest versions of the most popular and widely used components. But we hope that the idea of "The ZCatalog" (for instance) will fall by the wayside. Zope may still come with a search component such as ZCatalog that is useful for certain tasks and perhaps as a learning tool, but it will not be an infinitely scalable, infinitely featureful thing that everyone uses for every problem.
The hope is that when you outgrow ZCatalog you can move on to other search components particularly suited to your problem domain. If you scale beyond what ZCatalog can handle, maybe you move up to some VeritySearch component that makes use of existing software. Even now, with the current pain level of component development, building a VeritySearch component would probably take considerably less time than building and maintaining equivalent features in "the ZCatalog".

This is the future - the way that Zope will succeed is by being the best framework and component integration platform for the Web, not by trying to compete with verticals like search engine vendors on feature points. "Use the right tool for the job" is something we have always believed in, and providing a platform that will allow you to use and integrate the most appropriate tools will be our focus going forward.

That is why "substantial improvement of searching features" is not on the futures roadmap - we do not want to provide the best search engine for every task. We want to make it easy for you to build or integrate the "right" search solution for your task.

Brian Lloyd brian@digicool.com
Software Engineer 540.371.6909
Digital Creations http://www.digicool.com
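The "outgrow one search component and move to another" idea can be illustrated with a small sketch. It assumes a hypothetical `SearchBackend` interface; this is not Zope's component architecture or the ZCatalog API, and the class names (`SearchBackend`, `SimpleBackend`, `Site`) are invented for the example. The point is only that the site code depends on the interface, so a different backend can be swapped in later.

```python
from abc import ABC, abstractmethod


class SearchBackend(ABC):
    """Hypothetical interface that any search component would implement."""

    @abstractmethod
    def index(self, doc_id, text):
        """Make the text findable under doc_id."""

    @abstractmethod
    def search(self, query):
        """Return a sorted list of matching doc ids."""


class SimpleBackend(SearchBackend):
    """Tiny in-memory inverted index, good enough for small sites."""

    def __init__(self):
        self._postings = {}

    def index(self, doc_id, text):
        for term in text.lower().split():
            self._postings.setdefault(term, set()).add(doc_id)

    def search(self, query):
        terms = query.lower().split()
        if not terms:
            return []
        # AND semantics: a document must contain every query term.
        hits = set.intersection(*(self._postings.get(t, set()) for t in terms))
        return sorted(hits)


class Site:
    """Application code that only knows about the interface."""

    def __init__(self, backend):
        self.backend = backend

    def publish(self, doc_id, text):
        self.backend.index(doc_id, text)

    def find(self, query):
        return self.backend.search(query)


site = Site(SimpleBackend())
site.publish("doc1", "Zope search roadmap")
site.publish("doc2", "Zope component framework")
print(site.find("zope search"))
```

Replacing `SimpleBackend` with a wrapper around an external engine would leave `Site` unchanged, which is the kind of substitution the message describes.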
Please do NOT cross post.

-Michel

On Mon, 26 Feb 2001, Ausum wrote:
<snip>
_______________________________________________ Zope maillist - Zope@zope.org http://lists.zope.org/mailman/listinfo/zope ** No cross posts or HTML encoding! ** (Related lists - http://lists.zope.org/mailman/listinfo/zope-announce http://lists.zope.org/mailman/listinfo/zope-dev )
participants (4)
- Ausum
- Brian Lloyd
- Matt Hamilton
- Michel Pelletier