The problem seems to be that ZCTextIndex indeed does not split text correctly when German umlauts are used. There is no "Unicode-aware splitter" option. Instead of a Vocabulary it uses a Lexicon, which offers just two options: "HTML aware splitter" and "Whitespace splitter". I haven't tested the whitespace splitter yet, but without the patch the HTML aware splitter did not handle umlauts correctly, i.e. it treated umlauts as splitting characters ...
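The symptom can be reproduced outside Zope. The following is a minimal sketch in present-day Python 3, not ZCTextIndex's actual code: it assumes the splitter boils down to a `\w+` regular expression, and it uses `re.ASCII` to stand in for a splitter that knows nothing about non-ASCII letters. The sample sentence is made up.

```python
import re

text = "Die Bären mögen süße Äpfel"

# ASCII-only matching: ä, ö, ü, ß and Ä are not word characters here,
# so they behave like separators and words are cut apart.
ascii_words = re.findall(r"\w+", text, re.ASCII)

# Unicode-aware matching (the default for str patterns in Python 3):
# umlauts count as letters and the words stay intact.
unicode_words = re.findall(r"\w+", text)

print(ascii_words)    # ['Die', 'B', 'ren', 'm', 'gen', 's', 'e', 'pfel']
print(unicode_words)  # ['Die', 'Bären', 'mögen', 'süße', 'Äpfel']
```

A search for "Bären" can never match an index built from the first list, which is the behaviour described above.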
That's just what the default ZMI interface for ZCTextIndex offers. It's easy to add your own splitter by writing a few lines of Python code. RTSL.
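For illustration, a custom splitter could look roughly like the sketch below. It assumes that a lexicon pipeline element is a plain class with a `process` method that takes a list of strings and returns a list of terms; the class name `UnicodeSplitter` is hypothetical, and registering the class so that it shows up in the ZMI is not shown. Check the ZCTextIndex source for the actual conventions.

```python
import re


class UnicodeSplitter:
    """Hypothetical pipeline element that splits text into words.

    Assumption: the lexicon calls process() with a list of strings
    and expects a flat list of terms back.
    """

    # For str patterns in Python 3, \w already matches Unicode letters,
    # so umlauts are kept inside words.
    word_rx = re.compile(r"\w+")

    def process(self, lst):
        result = []
        for s in lst:
            # Collect every run of word characters from each input string.
            result.extend(self.word_rx.findall(s))
        return result


splitter = UnicodeSplitter()
terms = splitter.process(["Die Bären", "süße Äpfel"])
print(terms)  # ['Die', 'Bären', 'süße', 'Äpfel']
```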
Of course everyone can write their own splitter: one for German, one for French, and so on. But what is the problem with the patch? Isn't Python's regexp flag (?L) intended for exactly this simple way of "localizing" software?
And think of the European market:
No one will "buy" Zope if it does not work with their native language out of the box, and that is what the patch is for...
I must've missed the start of this thread (I only just signed up for this list). I didn't see any patch -- I thought it was just a gripe about ZCTextIndex. Of course patches are welcome -- where can I find this particular patch?
--Guido van Rossum (home page: http://www.python.org/~guido/)