[Zope-dev] ZCatalog and 'fuzzy logic'
Dieter Maurer
dieter@handshake.de
Tue, 9 Jan 2001 21:41:57 +0100 (CET)
Morten W. Petersen writes:
> Is there anyone who could try to give an estimate of how long it would
> take to add fuzzy logic (regexp-like) searching capability to the
> ZCatalog?
I do not think that "fuzzy logic" is strongly related to "regexp-like".
Anyway.
Fuzzy searching often means "finding matches with characters omitted,
replaced or inserted".
Zope's globbing vocabularies support wildcards '*' and '?'.
To implement wildcard-based searches efficiently, they
index words under their two-letter constituents.
When you get a pattern, you derive from it
which two-letter constituents the matching words must
have and retrieve the words indexed under them. This defines a
candidate word set.
Then you check whether the retrieved words really match
the expression.
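
The scheme just described can be sketched roughly as follows. This is
only an illustration in plain Python, not Zope's actual vocabulary code:
the names bigrams, build_index, pattern_bigrams and glob_search are
made up for the example, and the final check uses the standard library's
fnmatch.fnmatchcase (which also treats '[' specially, something this
sketch ignores).

```python
from collections import defaultdict
from fnmatch import fnmatchcase


def bigrams(text):
    """Return the set of adjacent two-letter pieces of text."""
    return {text[i:i + 2] for i in range(len(text) - 1)}


def build_index(words):
    """Map every two-letter piece to the set of words containing it."""
    index = defaultdict(set)
    for word in words:
        for gram in bigrams(word):
            index[gram].add(word)
    return index


def pattern_bigrams(pattern):
    """Two-letter pieces that every word matching the pattern must contain.

    Only pieces lying entirely inside a literal run (between wildcards)
    are required; a piece spanning '*' or '?' tells us nothing.
    """
    grams = set()
    literal = ""
    for ch in pattern:
        if ch in "*?":
            grams |= bigrams(literal)
            literal = ""
        else:
            literal += ch
    grams |= bigrams(literal)
    return grams


def glob_search(pattern, words, index):
    """Find the words matching a '*' / '?' pattern via the bigram index."""
    grams = pattern_bigrams(pattern)
    if grams:
        # Candidate set: words containing all required two-letter pieces.
        candidates = set.intersection(*(index.get(g, set()) for g in grams))
    else:
        # The pattern has no usable literal run, so every word is a candidate.
        candidates = set(words)
    # Verification step: check the candidates against the real pattern.
    return sorted(w for w in candidates if fnmatchcase(w, pattern))


words = ["catalog", "catalogue", "category", "dialog", "zope"]
index = build_index(words)
print(glob_search("cat*log", words, index))
```

For "cat*log" the required pieces are "ca", "at", "lo" and "og", so only
"catalog" and "catalogue" are retrieved as candidates, and the final
check drops "catalogue".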
You can extend this algorithm to get fuzzy searches.
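
One possible way to do that (an assumption for illustration, not
something ZCatalog provides) is to relax the candidate step: instead of
requiring all two-letter constituents of the query, require only most of
them, then verify the candidates with an edit distance. A single
omitted, replaced or inserted character can destroy at most two of the
query's two-letter pieces, so a word within max_edits edits must still
share at least (number of distinct pieces) - 2 * max_edits of them. The
names fuzzy_search, edit_distance and max_edits below are hypothetical.

```python
from collections import Counter, defaultdict


def bigrams(text):
    """Return the set of adjacent two-letter pieces of text."""
    return {text[i:i + 2] for i in range(len(text) - 1)}


def build_index(words):
    """Map every two-letter piece to the set of words containing it."""
    index = defaultdict(set)
    for word in words:
        for gram in bigrams(word):
            index[gram].add(word)
    return index


def edit_distance(a, b):
    """Levenshtein distance: insertions, deletions and replacements."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        current = [i]
        for j, cb in enumerate(b, 1):
            current.append(min(previous[j] + 1,                 # delete ca
                               current[j - 1] + 1,              # insert cb
                               previous[j - 1] + (ca != cb)))   # replace
        previous = current
    return previous[-1]


def fuzzy_search(query, words, index, max_edits=1):
    """Find the words within max_edits edits of query."""
    grams = bigrams(query)
    needed = len(grams) - 2 * max_edits
    if needed <= 0:
        # The filter cannot exclude anything, so check every word.
        candidates = set(words)
    else:
        # Count how many of the query's pieces each indexed word shares.
        shared = Counter()
        for gram in grams:
            for word in index.get(gram, ()):
                shared[word] += 1
        candidates = {w for w, n in shared.items() if n >= needed}
    # Verification step: keep only candidates that are really close enough.
    return sorted(w for w in candidates
                  if edit_distance(query, w) <= max_edits)


words = ["catalog", "catalogue", "category", "dialog", "zope"]
index = build_index(words)
print(fuzzy_search("catlog", words, index))
```

For short queries or a large max_edits the filter excludes nothing and
the search falls back to checking every word, so the index only pays off
when the query is reasonably long compared to the number of allowed
edits.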
Dieter