[Zope] [REQ] Support for multi-lingual components of TextIndexNG wanted

Andreas Jung <andreas@andreas-jung.com>
Mon, 17 Jun 2002 09:08:40 -0400


--On Monday, June 17, 2002 17:03 +0400 Oleg Broytmann <phd@phd.pp.ru> wrote:

>    What about non-iso8859 languages? How can I create normalization rules
> if my language does not have any mapping to latin alphabet?
>

In the current implementation, normalizers can be specified through a text
file. Inside the file you can declare the language and the encoding used, e.g.

        # german normalizer
        # $Id: de.txt,v 1.2.2.1 2002/06/13 12:50:08 ajung Exp $

        # language = german
        # encoding = iso-8859-1

        Ä  Ae
        Ö  Oe
        Ü  Ue
        ä  ae
        ö  oe
        ü  ue
        ß  ss

When the file is parsed, every rule is translated to Unicode using the
specified encoding.
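
To illustrate the idea, here is a minimal Python sketch of how such a file
could be read. It is not the actual TextIndexNG parser: the function name
parse_normalizer, the header regular expression, and the "ascii" default are
assumptions made for this example, and it assumes an ASCII-compatible
single-byte or multi-byte encoding (so the "#" and whitespace bytes can be
recognized before decoding).

```python
import re

# Hypothetical helper for illustration; not TextIndexNG's real parser.
# Matches header comments such as "# encoding = iso-8859-1".
HEADER = re.compile(rb"^#\s*(language|encoding)\s*=\s*(\S+)")


def parse_normalizer(raw):
    """Return (language, encoding, rules) parsed from the raw bytes of a file."""
    language = None
    encoding = "ascii"  # assumed default until an encoding header is seen
    rules = {}
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(b"#"):
            match = HEADER.match(line)
            if match:
                value = match.group(2).decode("ascii")
                if match.group(1) == b"language":
                    language = value
                else:
                    encoding = value
            continue  # any other comment line is ignored
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip malformed rules in this sketch
        # Decode both sides of the rule to Unicode using the declared encoding.
        source, replacement = (part.decode(encoding) for part in parts)
        rules[source] = replacement
    return language, encoding, rules
```

In this sketch the encoding header has to appear before the rules it applies
to. A file for a language without an iso-8859 encoding would simply declare a
different codec, for example "# encoding = koi8-r" for Russian, and its rules
would be decoded the same way.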

-aj