[Zope] zope, latin-1 and accented words

14 Jun 2005

      How can I tell the Splitter of ZCTextIndex not to split words 
such as "aaaèbbb" into "aaa" and "bbb"?

 I would like to tell Zope that è, à and so on are alphanumeric 
letters... In Splitter.c I have:

class Splitter:

    import re
    rx = re.compile(r"(?L)\w+")

 (?L) means "match according to the current locale", but I have 
multilingual latin-1 content... With the default "C" locale, \w matches 
only [a-zA-Z0-9_]!
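
 To make the problem concrete, here is a minimal sketch of one possible 
workaround: a pattern that does not depend on the locale at all and lists 
the Latin-1 letter ranges explicitly. The names ascii_rx and latin1_rx are 
made up for this example and are not part of ZCTextIndex; it also assumes 
the text reaches the splitter as latin-1 encoded byte strings.

```python
import re

# ASCII-only behaviour: this is what \w matches on byte strings when
# no locale information is available.
ascii_rx = re.compile(br"\w+")

# Hypothetical locale-independent pattern (an assumption, not Zope code):
# \w plus the Latin-1 letter ranges 0xC0-0xD6, 0xD8-0xF6 and 0xF8-0xFF.
# 0xD7 and 0xF7 are skipped because they are the multiplication and
# division signs, not letters.
latin1_rx = re.compile(br"[\w\xc0-\xd6\xd8-\xf6\xf8-\xff]+")

# "aaaèbbb" encoded as latin-1; 0xE8 is the byte for è.
sample = u"aaa\xe8bbb".encode("latin-1")

ascii_words = ascii_rx.findall(sample)    # the word is cut at the è
latin1_words = latin1_rx.findall(sample)  # the word stays in one piece
```

 An explicit character class treats every Latin-1 letter as a word 
character regardless of which locale the server runs under, which is what 
multilingual content needs; the cost is that the ranges are hard-coded for 
one encoding.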

 TIA

 P.S. I've written a small class for the ZCTextIndex pipeline that 
converts all accented characters to unaccented ones, so I can index 
"perchè" as "perche". It will work only if I can solve this splitter 
problem...
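
 For readers who want to try the same idea, a pipeline element of that 
kind might look roughly like the sketch below. It is not the class 
mentioned above: the name AccentStripper is hypothetical, and it assumes 
the words arrive as already decoded (unicode) strings and that a pipeline 
element only needs a process() method that takes a list of words and 
returns a list of words.

```python
import unicodedata


class AccentStripper:
    """Hypothetical pipeline element: maps accented letters to plain ones."""

    def process(self, words):
        # Pipeline elements receive a list of words and return a list.
        return [self._strip(word) for word in words]

    def _strip(self, word):
        # NFKD splits "è" into "e" plus a combining grave accent;
        # dropping the combining marks leaves the bare letter.
        decomposed = unicodedata.normalize("NFKD", word)
        return u"".join(c for c in decomposed
                        if not unicodedata.combining(c))


stripped = AccentStripper().process([u"perch\xe8", u"aaa\xe8bbb"])
```

 Because this step runs after the splitter, it only ever sees whatever 
the splitter produced, so it cannot repair a word that was already cut in 
two at the accented letter.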


Yuri