[Ann] Zope 2.1.6 ZCatalog patches: "and not", "near"
Zope 2.1.6's ZCatalog contains a small bug in its handling of "and not" searches. URL:http://www.handshake.de/~dieter/pyprojects/zope/andnot.pat fixes this bug.

"near" searches do not work. As Michel pointed out, the relevant indexes have been removed for space considerations. The patch at URL:http://www.handshake.de/~dieter/pyprojects/zope/near.pat enables "near" searches. Instead of saving a word position index, word positions are determined dynamically during near searches (by reading the relevant document part). Of course, this is much slower than an index-based search. Moreover, it works only for persistent objects. It does not work e.g. for LocalFS objects (for them "near" is equivalent to "and"). For these reasons, you may not want to apply this patch.

The score function for near searches is currently very ad hoc and should probably be replaced by a more sensible function.

Dieter
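The dynamic lookup described above can be sketched roughly as follows. This is not the code from near.pat and it does not use the ZCatalog API; it is a minimal illustration in current Python, and every name in it (`word_positions`, `is_near`, `near_search`, `load_text`, `max_distance`) is hypothetical. It assumes the index has already produced the "and" hits, and that a document whose text cannot be read (the LocalFS case) is simply kept, so that "near" degrades to "and" for it.

```python
import re


def word_positions(text):
    """Map each lowercased word to the list of its word offsets in text."""
    positions = {}
    for offset, word in enumerate(re.findall(r"\w+", text.lower())):
        positions.setdefault(word, []).append(offset)
    return positions


def is_near(text, first, second, max_distance=5):
    """True if some occurrences of both words lie within max_distance words."""
    positions = word_positions(text)
    firsts = positions.get(first.lower(), [])
    seconds = positions.get(second.lower(), [])
    return any(abs(a - b) <= max_distance for a in firsts for b in seconds)


def near_search(and_hits, load_text, first, second, max_distance=5):
    """Filter the "and" hits down to "near" hits.

    load_text(doc_id) is an assumed callback returning the document text,
    or None when the text cannot be read. Unreadable documents are kept,
    which makes "near" behave like "and" for them.
    """
    result = []
    for doc_id in and_hits:
        text = load_text(doc_id)
        if text is None or is_near(text, first, second, max_distance):
            result.append(doc_id)
    return result
```

Because the text of every candidate document is re-read and re-split at query time, the cost grows with the size of the "and" result set, which is the slowdown mentioned above.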
Dieter Maurer wrote:
Zope 2.1.6's ZCatalog contains a small bug in its handling of "and not" searches.
URL:http://www.handshake.de/~dieter/pyprojects/zope/andnot.pat
fixes this bug.
This has already been fixed in CVS; sorry, Dieter. You came up with the exact fix, though. Good work. When fixing bugs in Zope it's always best to work against CVS, since this is the baseline we will apply patches against.
"near" searches do not work. As Michel pointed out, the relevant indexes have been removed for space considerations. The patch at
URL:http://www.handshake.de/~dieter/pyprojects/zope/near.pat
enables "near" searches. Instead of saving a word position index, word positions are determined dynamically during near searches (by reading the relevant document part). Of course, this is much slower than an index-based search. Moreover, it works only for persistent objects. It does not work e.g. for LocalFS objects (for them "near" is equivalent to "and"). For these reasons, you may not want to apply this patch.
This isn't why I don't want to apply this patch, but rather I think it would be better to specialize TextIndex instead of modifying the existing one and create a new type of index, like a PositionalTextIndex, that works like a text index but uses your patches. Once again, I recommend working off the CVS. If you can provide an index like this I will work it into the ZCatalog, probably post 2.2.

Thanks Dieter,

-- 
-Michel Pelletier http://www.zope.org/Members/michel/MyWiki
Visit WikiCentral for the latest Zen: http://www.zope.org/Members/WikiCentral
Michel Pelletier writes:
Dieter Maurer wrote:
"near" searches do not work. .... The patch at
URL:http://www.handshake.de/~dieter/pyprojects/zope/near.pat
enables "near" searches. ....
This isn't why I don't want to apply this patch, but rather I think it would be better to specialize TextIndex instead of modifying the existing one and create a new type of index, like a PositionalTextIndex, that works like a text index but uses your patches.

I think you should do something with the *existing* index. It fails badly for "near" searches (it returns no hits).
This is probably no problem for explicit use of the near operator "...", because it is unlikely that anyone will use it. However, phrase searches of the form "a b" are mapped to "near" searches, and phrase searches are quite common. Maybe the existing index should approximate "..." by "and".

For the time being, I am quite happy with my patch. At a later time, I may implement a true positional index, i.e. one where the word positions are compacted and stored in an IISBTree (Int x Int -> String), so that the document need not be accessed during search. Such a thing would rightfully get a special name such as "PositionalTextIndex".

Dieter
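A rough sketch of what such a positional index might look like, under stated assumptions: a plain dict stands in for the IISBTree, the key order (word id, document id) is a guess at what "Int x Int" means, and the "compacted" positions are simply packed into bytes with the standard `array` module. The class and method names are hypothetical, and the code is current Python rather than anything that would run inside Zope 2.1.6.

```python
from array import array


class PositionalIndexSketch:
    """Toy positional index: (word_id, doc_id) -> packed word offsets."""

    def __init__(self):
        self._word_ids = {}   # word -> integer id
        self._positions = {}  # (word_id, doc_id) -> bytes of packed offsets

    def index_document(self, doc_id, text):
        """Record the word offsets of every word in text for doc_id."""
        per_word = {}
        for offset, word in enumerate(text.lower().split()):
            per_word.setdefault(word, array("I")).append(offset)
        for word, offsets in per_word.items():
            word_id = self._word_ids.setdefault(word, len(self._word_ids))
            self._positions[(word_id, doc_id)] = offsets.tobytes()

    def positions(self, word, doc_id):
        """Return the stored offsets of word in doc_id, or an empty list."""
        word_id = self._word_ids.get(word.lower())
        packed = self._positions.get((word_id, doc_id))
        if packed is None:
            return []
        offsets = array("I")
        offsets.frombytes(packed)
        return list(offsets)

    def near(self, first, second, doc_id, max_distance=5):
        """True if both words occur within max_distance words in doc_id."""
        firsts = self.positions(first, doc_id)
        seconds = self.positions(second, doc_id)
        return any(abs(a - b) <= max_distance for a in firsts for b in seconds)
```

With this layout a "near" query touches only the index, at the price of storing one packed string per (word, document) pair, which is the space cost that led to the positional data being dropped in the first place.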
participants (2): Dieter Maurer, Michel Pelletier