Re: [Zope-dev] Re: What catalog/index to use ...

9 Nov 2002

Hi!
...
Please note that former Zope versions already include a dedicated
unicode-aware splitter that is already usable with the old TextIndex and
maybe with ZCTextIndex.
TextIndexNG resolves all these issues by doing the complete internal
processing by converting the data into unicode. Every single processing
step only handles unicode data.
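
The "convert once, then only handle unicode" idea can be sketched roughly as
follows. This is not TextIndexNG code; the helper names to_text and normalize
are made up for illustration, and the sketch is written for present-day
Python 3 rather than the Python 2 that the Zope releases discussed here ran on.

```python
def to_text(data, encoding="utf-8"):
    """Decode byte input once, at the boundary (hypothetical helper)."""
    if isinstance(data, bytes):
        return data.decode(encoding)
    return data


def normalize(data, encoding="utf-8"):
    """Every step after to_text() sees only unicode text, never raw bytes."""
    text = to_text(data, encoding)
    # Lowercasing works for umlauts here because text is unicode.
    return text.lower().split()


# Lowercasing the raw Latin-1 bytes leaves the umlaut byte untouched,
# because bytes.lower() only maps ASCII letters.
raw = "MÜNCHEN Zope".encode("latin-1")
print(raw.lower())
print(normalize(raw, "latin-1"))
```

The point of decoding at the boundary is that no later step has to know which
encoding the document arrived in.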
...
Most older browsers should be able to handle at least UTF-8 as character
set. This is sufficient for most cases.
The problem seems to be that ZCTextIndex indeed does not do the splitting
"right" if German Umlauts are used. There is no option for "Unicode-aware
splitter". Instead of a Vocabulary it uses a Lexicon, which just offers two
options: "HTML aware splitter" and "Whitespace splitter". I haven't tested
the whitespace splitter yet, but the HTML aware splitter did not do the
Umlaut thing right without the patch, i.e. it used umlauts as splitting
characters ...
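
To make the symptom concrete, here is a minimal sketch of the difference
between a splitter that only knows ASCII word characters and one that is
unicode-aware. It is not the ZCTextIndex splitter code; the two regular
expressions and the function names are assumptions chosen for illustration,
and the sketch uses present-day Python 3.

```python
import re

# Assumed stand-in for a splitter that only treats ASCII letters and
# digits as word characters.
ASCII_WORD = re.compile(r"[A-Za-z0-9]+")

# For Python 3 str patterns, \w matches unicode letters, including
# umlauts and the sharp s.
UNICODE_WORD = re.compile(r"\w+")


def split_ascii(text):
    """Umlauts fall outside the character class, so they act as separators."""
    return ASCII_WORD.findall(text)


def split_unicode(text):
    """Umlauts count as word characters, so words stay intact."""
    return UNICODE_WORD.findall(text)


text = "Grüße aus München"
print(split_ascii(text))
print(split_unicode(text))
```

With the ASCII-only pattern, "München" comes out as the fragments "M" and
"nchen", which is the "umlauts as splitting characters" behaviour described
above.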

So there is a bug ...

Joachim

Joachim Werner