[Zope-Annce] [ANN] TextIndexNG 2.0 alpha 1 released
Andreas Jung
Andreas Jung <andreas@andreas-jung.com>
Sun, 04 May 2003 10:54:42 +0200
I am pleased to announce the release of TextIndexNG 2.0 alpha 1.
What's new in TextIndexNG 2.0?
- Relevance ranking of search results added. Searches are now ranked
using an extended cosine measure. The cosine measure is based on
a vector model and calculates the document "score" based on the
frequency of the query terms inside the document result set.
- Much faster phrase/near search: the old implementation of TextIndexNG
had to perform a very expensive job at query time when a phrase/near
search was performed. Re-using the WidCode module of ZCTextIndex made
this operation less expensive.
- Left-truncation added: TextIndexNG can be configured at creation
time to support left-truncation (meaning you can search for "*suffix")
- optional auto-expansion support: this optional feature helps to get
better search results when some of the query terms could not be found.
The index expands a query term "foo" to "foo*" if there was no hit
for "foo". This expansion is currently global for the index. This
feature will be available on a per-query basis in a later version.
(Auto-expansion will be extended in a later version to search for
similar terms.)
- improved HTML converter: now using Chris Withers' "Strip-o-Gram" module
instead of the Strip-Tag-Parser
- added converter for text/sgml
- Similarity search (soundex, metaphone, doublemetaphone) dropped
and replaced with a more general, language-independent
approach using the Levenshtein distance.
- internal code cleanup, more unittests
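
To illustrate three of the ideas named in the list above (cosine-based
ranking, auto-expansion and Levenshtein distance), here is a minimal Python
sketch. It is not TextIndexNG code: the function names cosine_score,
levenshtein and lookup are hypothetical, the vocabulary is a plain set of
words, and the ranking shown is the plain cosine measure rather than the
"extended" variant the index uses.

```python
import math
from collections import Counter


def cosine_score(query_terms, doc_terms):
    """Plain cosine similarity between two term-frequency vectors.

    Assumption: simple raw term counts; the extended measure in
    TextIndexNG is not reproduced here.
    """
    q = Counter(query_terms)
    d = Counter(doc_terms)
    dot = sum(q[t] * d[t] for t in q)
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in d.values())))
    return dot / norm if norm else 0.0


def levenshtein(a, b):
    """Edit distance (insert, delete, substitute) by dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def lookup(vocabulary, term, auto_expand=True):
    """Exact lookup; if there is no hit, retry as "term*" (prefix match)."""
    if term in vocabulary:
        return [term]
    if auto_expand:
        return sorted(w for w in vocabulary if w.startswith(term))
    return []
```

With a hypothetical vocabulary {"foo", "foobar", "bar"}, lookup(vocabulary, "fo")
finds no exact hit and falls back to the prefix matches "foo" and "foobar",
while lookup(vocabulary, "foo") returns only the exact match.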
Not implemented yet
- improved support for multilingual documents
- range searches like "Fi..Foo"
- substring searches "*substring*"
- optional improved ranking for terms based on their relative positions
inside a document
Installation notes
TextIndexNG 2.0 is *not compatible* with TextIndexNG 1.0 so there
is currently no migration path for existing indexes.
- **BACKUP YOUR Data.fs first!!!**
- remove any existing TextIndexNG from your ZCatalog indexes
- shutdown Zope
- remove the old Products/TextIndexNG directory
- untar the tarball in the Products folder
- recompile and install extension modules (see InstallationInstructions)
- restart Zope
- re-create the indexes and re-index
Download:
http://sourceforge.net/project/showfiles.php?group_id=50052
Project Wiki:
http://www.zope.org/Members/ajung/TextIndexNG/