Re: [Zope] Search with partial words on own ZClass

4 Apr 2000

      ...
...
Is that just partial searching, or also wildcard and even regexp
searching?
I define 'partial' and 'wildcard' as the same thing, but I'm using my
own terminology so I could be wrong.  I define partial as 'finding part
or all of a word', which can be accomplished with wildcards: '*part*'.
Yes

[snip explanation]
...
Regular expressions are not feasible in any searching system.  Although
it may be possible, with the existing lexical analysis that globbing
lexicons do, to implement a larger subset of regexp than just * and ?,
it is not feasible to implement the entire regexp language.
No, of course not. I was more thinking along the lines of (to stick with
your example)
fl[e|a|u]c*, but it doesn't really matter.
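
For illustration, here is a minimal sketch of glob-style matching over a
word list.  It uses Python's standard fnmatch module, not Zope's globbing
lexicon, and the word list is made up.  Note that fnmatch writes a
character set as [eau]; inside the brackets a '|' would be taken as a
literal character, so this may differ from what the pattern above intends.

```python
from fnmatch import fnmatchcase

# Hypothetical word list standing in for a lexicon's vocabulary.
WORDS = ["flac", "fleck", "flock", "fluctuate", "partial", "department", "apart"]

def glob_words(pattern, words=WORDS):
    """Return the words matching a shell-style pattern (*, ?, [set])."""
    return [w for w in words if fnmatchcase(w, pattern)]

# 'Partial' search expressed as a wildcard pattern.
partial = glob_words("*part*")

# A character set gives a small step towards regexp without the full language.
char_set = glob_words("fl[eau]c*")
```

This scans every word, which is fine for a sketch; a real lexicon would
presumably use an index structure to avoid the linear scan.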
...
...
And since you keep locations of the words, is proximity searching also
possible?
The location in the document is not kept, just the score.  There are
TextIndex methods, however, for finding the positions of words in a
document; these are used to support the 'Near' operator, which is '...'
This operator exists in TextIndexes now (it always has, since I took
over the indexing realm).  I tested it a few months ago but couldn't
get it to work.  I suspect it's buggy; the code is a holdover from
ZTables.
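
As a rough sketch of the idea only (this is not the TextIndex code, and
the function names and the default distance are assumptions), a proximity
test can be built from word positions like this:

```python
def positions(text, word):
    """Return the token offsets at which word occurs in text."""
    return [i for i, tok in enumerate(text.lower().split()) if tok == word]

def near(text, a, b, max_distance=3):
    """True if some occurrence of a lies within max_distance tokens of b."""
    pos_a = positions(text, a)
    pos_b = positions(text, b)
    return any(abs(i - j) <= max_distance for i in pos_a for j in pos_b)

doc = "the quick brown fox jumps over the lazy dog"
quick_near_fox = near(doc, "quick", "fox")   # offsets 1 and 3
quick_near_dog = near(doc, "quick", "dog")   # offsets 1 and 8
```

The positions are recomputed from the document text on each call, which
matches the description above: the index keeps a score, not the locations.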
Ok, that's clear
...
...
Another question: how do I retrieve a list of unique words from a
full-text catalog?
In 2.1, you need to hack the lexicon from Python.  In 2.2, you call a
Vocabulary object's 'words' method, or you can call the Vocabulary with
a pattern '*' to match all words, or a more restrictive pattern if you
only want all the unique words that match a pattern, like '*ing', all
the words that end in ing.
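
To show the kind of result this gives, here is a small stand-alone sketch
in plain Python.  It does not use the Vocabulary object; unique_words and
matching are hypothetical helpers, and the tokenising regexp is an
assumption.

```python
import re
from fnmatch import fnmatchcase

def unique_words(documents):
    """Collect the sorted set of lower-cased words from all documents."""
    return sorted({w for doc in documents
                   for w in re.findall(r"[a-z]+", doc.lower())})

def matching(words, pattern="*"):
    """Filter words by a glob pattern; '*' returns every word."""
    return [w for w in words if fnmatchcase(w, pattern)]

docs = ["Searching and indexing text", "Indexing is fun"]
words = unique_words(docs)
ing_words = matching(words, "*ing")
```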
That's very nice, and opens up (even more) possibilities for an already
great product.
...
...
Now, I know there is no standard way, but is it possible at all?
In 2.2 it is standard (and documented in the Interfaces Wiki).
...
Can I use the items, keys etc interfaces of the text index (perhaps with
some python hacking)?
TextIndexes do not store the word, they store an integer that the
lexicon maps to a word.  This is so text indexes can be language
independent.
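
A minimal sketch of that arrangement, under the assumption of a simple
whitespace tokeniser (TinyLexicon and its method names are invented for
this example and are not the Zope classes):

```python
class TinyLexicon:
    """Maps each distinct word to a small integer and back."""

    def __init__(self):
        self._word_to_id = {}
        self._id_to_word = []

    def get_id(self, word):
        # Assign the next free integer the first time a word is seen.
        if word not in self._word_to_id:
            self._word_to_id[word] = len(self._id_to_word)
            self._id_to_word.append(word)
        return self._word_to_id[word]

    def get_word(self, word_id):
        return self._id_to_word[word_id]

lexicon = TinyLexicon()
index = {}  # word id -> set of document ids; no words are stored here

for doc_id, text in enumerate(["zope text index", "text search"]):
    for word in text.split():
        index.setdefault(lexicon.get_id(word), set()).add(doc_id)

# Recovering the words means going back through the lexicon.
indexed_words = sorted(lexicon.get_word(i) for i in index)
```

The keys of the index are only integers, so listing words from the index
alone is not possible without the lexicon.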
right

Rik

Rik Hoekstra
