[Zope] [REQ] Support for multi-lingual components of TextIndexNG wanted
Andreas Jung
Andreas Jung <andreas@andreas-jung.com>
Mon, 17 Jun 2002 09:08:40 -0400
--On Monday, June 17, 2002 17:03 +0400 Oleg Broytmann <phd@phd.pp.ru> wrote:
> What about non-iso8859 languages? How can I create normalization rules
> if my language does not have any mapping to latin alphabet?
>
In the current implementation, normalizers can be specified through a text
file. Inside the file you can declare the language and the encoding used, e.g.:
# german normalizer
# $Id: de.txt,v 1.2.2.1 2002/06/13 12:50:08 ajung Exp $
# language = german
# encoding = iso-8859-1
Ä Ae
Ö Oe
Ü Ue
ä ae
ö oe
ü ue
ß ss
When the file is parsed, every rule is translated to Unicode using the
specified encoding.
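
To illustrate the idea, here is a minimal sketch of how such a file could be
parsed. It is not the actual TextIndexNG parser: the names parse_normalizer
and normalize are made up for this example, and it assumes the declared
encoding is ASCII-compatible, so that the "# key = value" header lines can be
read before the rules are decoded.

```python
import re

# Matches header comments such as "# encoding = iso-8859-1".
# Hypothetical format handling, inferred from the sample file above.
HEADER = re.compile(rb"^#\s*(language|encoding)\s*=\s*(\S+)")


def parse_normalizer(raw):
    """Parse the raw bytes of a normalizer file.

    Returns (language, encoding, rules), where rules maps each source
    string to its replacement, both as Unicode strings.
    """
    meta = {"language": None, "encoding": "ascii"}
    rule_lines = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(b"#"):
            match = HEADER.match(line)
            if match:
                key = match.group(1).decode("ascii")
                meta[key] = match.group(2).decode("ascii")
            continue  # all other comment lines are ignored
        rule_lines.append(line)

    # Decode the rules only after the whole header has been seen,
    # so the declared encoding applies to every rule.
    rules = {}
    for line in rule_lines:
        source, replacement = line.decode(meta["encoding"]).split(None, 1)
        rules[source] = replacement
    return meta["language"], meta["encoding"], rules


def normalize(text, rules):
    """Apply every rule to a Unicode string by plain substring replacement."""
    for source, replacement in rules.items():
        text = text.replace(source, replacement)
    return text
```

Since the rules end up as Unicode, the same approach should work for any
language whose encoding is known to Python's codec machinery (koi8-r, for
instance); only the "# encoding" line in the file changes.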
-aj