[Zope-Annce] [ANN] TextIndexNG 2.0 final released
Andreas Jung <andreas@andreas-jung.com>
Sun, 20 Jul 2003 16:39:12 +0200
I am pleased to announce the final release of TextIndexNG 2.0.
What's new in TextIndexNG 2.0?
- Relevance ranking of search results added. Searches are now ranked
using an extended cosine measure. The cosine measure is based on
a vector model and calculates the document "score" based on the
frequency of the query terms inside the document result set.
- Much faster phrase/near search: the old implementation of TextIndexNG
had to perform a very expensive job at query time when a phrase/near
search was performed. Re-using the WidCode module of ZCTextIndex made
this operation less expensive.
- Left-truncation added: TextIndexNG can be configured at creation
time to support left-truncation (meaning you can search for "*suffix")
- optional auto-expansion support: this optional feature helps to get
better search results when some of the query terms could not be found.
The index expands a query term "foo" to "foo*" if there was no hit
for "foo". This expansion is currently global for the index. This
feature will be available on a per-query basis in a later version.
(Auto-expansion will be extended in a later version to search for
similar terms.)
- improved HTML converter: now using Chris Withers' "Strip-o-Gram" module
instead of the Strip-Tag-Parser
- added converter for text/sgml
- Similarity search (soundex, metaphone, doublemetaphone) dropped
and replaced with a more general, language-independent
approach using the Levenshtein distance.
- internal code cleanup, more unittests
- range searches like "Fi..Foo"
- substring searches "*substring*"
- reduced conflict errors caused by the lexicon/storage implementation
- no longer conflicts with TextIndex V 1 installations
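
To illustrate the relevance ranking item above, here is a minimal sketch of a
plain cosine measure over raw term frequencies. It is not the extended
measure TextIndexNG uses and not its API; the names cosine_score and rank
are hypothetical and chosen only for this example.

```python
import math
from collections import Counter

def cosine_score(query_terms, doc_terms):
    """Cosine of the angle between two raw term-frequency vectors."""
    q = Counter(query_terms)
    d = Counter(doc_terms)
    # Dot product over the query terms (missing terms count as 0).
    dot = sum(q[t] * d[t] for t in q)
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    norm_d = math.sqrt(sum(v * v for v in d.values()))
    if norm_q == 0 or norm_d == 0:
        return 0.0
    return dot / (norm_q * norm_d)

def rank(query_terms, docs):
    """Return (score, doc_id) pairs with a non-zero score, best first."""
    scored = [(cosine_score(query_terms, terms), doc_id)
              for doc_id, terms in docs.items()]
    return sorted([s for s in scored if s[0] > 0], reverse=True)

# Hypothetical toy corpus: document id -> list of indexed words.
docs = {
    "d1": ["zope", "index", "zope"],
    "d2": ["index"],
    "d3": ["python"],
}
ranking = rank(["zope", "index"], docs)
```

Documents that share no term with the query get a score of 0 and are left
out of the ranking.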
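
The auto-expansion behaviour described above (retrying "foo" as "foo*" when
"foo" has no hit) can be sketched roughly as follows. This assumes the
lexicon is a simple list of words; the function lookup and its auto_expand
flag are hypothetical and not part of TextIndexNG.

```python
def lookup(lexicon, term, auto_expand=True):
    """Return lexicon words matching term exactly.

    If there is no exact hit and auto_expand is set, retry the term
    as a right-truncated pattern ("term*"), i.e. a prefix match.
    """
    hits = [w for w in lexicon if w == term]
    if not hits and auto_expand:
        hits = [w for w in lexicon if w.startswith(term)]
    return hits

# Hypothetical lexicon for illustration only.
lexicon = ["foobar", "food", "bar"]
expanded = lookup(lexicon, "foo")    # no exact hit, falls back to "foo*"
exact = lookup(lexicon, "bar")       # exact hit, no expansion needed
```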
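
For readers unfamiliar with the Levenshtein distance mentioned above, the
following sketch shows the textbook dynamic-programming computation and one
way it could be used to pick similar terms. The helper similar_terms and its
max_distance parameter are assumptions made for this example, not the
interface of TextIndexNG.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = cur
    return prev[-1]

def similar_terms(lexicon, term, max_distance=1):
    """Return lexicon words within max_distance edits of term."""
    return [w for w in lexicon if levenshtein(w, term) <= max_distance]

# Hypothetical lexicon for illustration only.
matches = similar_terms(["zope", "zopes", "python"], "zope")
```

Because the distance works on characters only, it does not depend on the
pronunciation rules of a particular language, unlike soundex or metaphone.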
Download:
http://sourceforge.net/project/showfiles.php?group_id=50052
http://www.zope.org/Members/ajung/TextIndexNG/
Project Wiki:
http://www.zope.org/Members/ajung/TextIndexNG/
Note: there are currently no binary packages available
for the TextIndexNG extension modules. They will be provided
at a later time.