Re: [Zope-dev] Re: What catalog/index to use ...

9 Nov 2002


The problem seems to be that ZCTextIndex indeed does not split text
correctly when German umlauts are used. There is no option for a
"Unicode-aware splitter". Instead of a Vocabulary it uses a
Lexicon, which offers just two options: "HTML aware splitter" and
"Whitespace splitter". I haven't tested the whitespace splitter yet,
but the HTML aware splitter did not handle umlauts correctly
without the patch, i.e. it treated umlauts as splitting characters ...

That's just what the default ZMI interface for ZCTextIndex offers.
It's easy to add your own splitter by writing a few lines of Python
code.  RTSL.
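
To make the "few lines of Python" concrete, here is a minimal sketch of a splitter that keeps umlauts inside words. It is not the patch discussed in this thread and not ZCTextIndex's own code: the class name UmlautAwareSplitter is hypothetical, and the process() method taking and returning a list of strings is an assumption about what a Lexicon pipeline element looks like. It is written for current Python, where \w in a str pattern is Unicode-aware by default.

```python
import re


class UmlautAwareSplitter:
    """Hypothetical splitter; name and interface are assumptions, not Zope API."""

    # For str patterns, \w matches Unicode word characters, so letters
    # such as ä, ö, ü and ß stay inside a word instead of splitting it.
    _word = re.compile(r"\w+")

    def process(self, texts):
        # Take a list of strings, return a flat list of words.
        words = []
        for text in texts:
            words.extend(self._word.findall(text))
        return words


splitter = UmlautAwareSplitter()
print(splitter.process(["Die Größe der Tür"]))
# ['Die', 'Größe', 'der', 'Tür']

# For contrast: an ASCII-only word pattern breaks words at umlauts,
# which is the behaviour complained about above.
print(re.findall(r"\w+", "Größe", flags=re.ASCII))
# ['Gr', 'e']
```

Note that \w+ also accepts digits and underscores; a real splitter would probably want a narrower pattern plus case normalization.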

Of course everyone can write their own splitter... one for German, one
for French, etc. But what is the problem with the patch? Isn't
Python's regexp flag (?L) intended for exactly this simple way of
"localizing" software?
And think of the European market:
no one will "buy" Zope if it does not work with their native language
out of the box. And that's what the patch is for...

I must've missed the start of this thread (I only just signed up for
this list).  I didn't see any patch -- I thought it was just a gripe
about ZCTextIndex.  Of course patches are welcome -- where can I find
this particular patch?

--Guido van Rossum (home page: http://www.python.org/~guido/)