ZCatalog in different languages
Hello! How can I use ZCatalog with languages other than English, for example Hungarian, Czech, Polish, or French? The cataloged words are cut at every non-ASCII character. /a', e', o".../ I also don't want to catalog the HTML and DTML tags. How can I configure this in ZCatalog? Thank you, Lajos Kerekes
The splitters of Zope 2.4 support Latin-1 (ISO-8859-1). I think the Eastern European languages are not covered by Latin-1. I suggest modifying the corresponding splitter sources. Zope 2.4 allows you to have multiple splitters and select a different splitter for every vocabulary. Andreas ----- Original Message ----- From: "Kerekes Lajos" <lkerekes@xperts.hu> To: <zope@zope.org> Sent: Wednesday, September 12, 2001 10:28 Subject: [Zope] ZCatalog in different languages
_______________________________________________ Zope maillist - Zope@zope.org http://lists.zope.org/mailman/listinfo/zope ** No cross posts or HTML encoding! ** (Related lists - http://lists.zope.org/mailman/listinfo/zope-announce http://lists.zope.org/mailman/listinfo/zope-dev )
Andreas Jung wrote:
The splitters of Zope 2.4 support Latin-1 (ISO-8859-1). I think the Eastern European languages are not covered by Latin-1. I suggest modifying the corresponding splitter sources. Zope 2.4 allows you to have multiple splitters and select a different splitter for every vocabulary.
...and it's completely impossible to register new splitters unless you hack the PluginIndexes source ;-) Chris
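The point about Latin-1 coverage can be checked with a few lines of Python. This is only an illustrative sketch in modern Python 3, not Zope 2.4 code; the helper name `encodable` is made up for the example. It shows that some Hungarian letters have no Latin-1 code point, and that the same byte stands for different letters in ISO-8859-1 and ISO-8859-2.

```python
def encodable(text, codec):
    """Return True if every character of text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


latin1_only = "tükörfúrógép"   # Hungarian word using only letters that Latin-1 has
needs_latin2 = "árvíztűrő"     # contains ű (U+0171) and ő (U+0151)

print(encodable(latin1_only, "iso-8859-1"))   # True
print(encodable(needs_latin2, "iso-8859-1"))  # False
print(encodable(needs_latin2, "iso-8859-2"))  # True

# The same byte value is a different letter in each character set.
print(b"\xf5".decode("iso-8859-1"))  # õ
print(b"\xf5".decode("iso-8859-2"))  # ő
```

A splitter that classifies bytes with Latin-1 tables will therefore treat some Latin-2 letters as something other than what the author typed.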
Andreas Jung wrote:
The splitters of Zope 2.4 support Latin-1 (ISO-8859-1). I think the Eastern European languages are not covered by Latin-1. I suggest modifying the corresponding splitter sources. Zope 2.4 allows you to have multiple splitters and select a different splitter for every vocabulary.
Thanks. And how can I modify the splitter? Where is the splitter? What is a splitter? Lajos
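As a rough illustration of what a splitter does: it turns a piece of text into the list of words that get indexed. The sketch below is not Zope's splitter (whose source is not shown in this thread); the names `split_words`, `TAG`, `ASCII_WORD` and `UNICODE_WORD` are assumptions made up for the example, written in modern Python 3. It shows how an ASCII-only word pattern cuts words at every accented letter, how a Unicode-aware pattern keeps them whole, and how tags can be dropped before splitting.

```python
import re

TAG = re.compile(r"<[^>]*>")              # crude matcher for HTML/DTML tags
ASCII_WORD = re.compile(r"[A-Za-z0-9]+")  # stops at every non-ASCII character
UNICODE_WORD = re.compile(r"\w+")         # \w is Unicode-aware for str patterns


def split_words(text, pattern=UNICODE_WORD, strip_tags=True):
    """Return the lowercased words of text, optionally ignoring tags."""
    if strip_tags:
        text = TAG.sub(" ", text)
    return [word.lower() for word in pattern.findall(text)]


sample = "<p>Árvíztűrő <dtml-var title> tükörfúrógép</p>"

print(split_words(sample, ASCII_WORD))
# ['rv', 'zt', 'r', 't', 'k', 'rf', 'r', 'g', 'p']
print(split_words(sample))
# ['árvíztűrő', 'tükörfúrógép']
```

The first result matches the symptom described in the original question: words broken at each accented character.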
Kerekes Lajos writes:
How can I use ZCatalog with languages other than English, for example Hungarian, Czech, Polish, or French? The cataloged words are cut at every non-ASCII character. /a', e', o".../ I also don't want to catalog the HTML and DTML tags. How can I configure this in ZCatalog?
Zope's splitter may be locale aware (if not, it may be quite easily made so). In this case, you would start Zope with the "-L" switch to tell it the locale to use and everything else may work automatically.
Of course, this will work only for one language per site (Zope installation). Dieter
On Wed, Sep 12, 2001 at 10:24:32PM +0200, Dieter Maurer wrote:
Zope's splitter may be locale aware (if not, it may be quite easily made so). In this case, you would start Zope with the "-L" switch to tell it the locale to use and everything else may work automatically.
Zope's Splitter is perfectly locale aware now - I spent so much time debugging and fixing and testing new releases.
Of course, this will work only for one language per site (Zope installation).
What is worse - one encoding per Zope installation (we the Russians use at least two different encodings - windows-1251 and koi8-r). Oleg. ---- Oleg Broytmann http://www.zope.org/Members/phd/ phd@phd.pp.ru Programmers don't die, they just GOSUB without RETURN.
Hi, --On Thursday, 13 September 2001 12:13 +0400 Oleg Broytmann <phd@phd.pp.ru> wrote:
On Wed, Sep 12, 2001 at 10:24:32PM +0200, Dieter Maurer wrote:
Zope's splitter may be locale aware (if not, it may be quite easily made so). In this case, you would start Zope with the "-L" switch to tell it the locale to use and everything else may work automatically.
Zope's Splitter is perfectly locale aware now - I spent so much time debugging and fixing and testing new releases.
What's the use of this? I, at least, want more than one language per site. For now, the usual OSes do not support more than one locale at a given time.
Of course, this will work only for one language per site (Zope installation).
What is worse - one encoding per Zope installation (we the Russians use at least two different encodings - windows-1251 and koi8-r).
What can we do to change this? Regards Tino
The current model of data processing, mostly created in a country with one language and a monoculture, is completely inadequate for multilingual, multi-encoding models. I don't know how hard it would be to change this. If you want to change only Zope, you have to untie it from the OS (stop using locales at all) and write your own splitters and other components. Perhaps it's not a task for a lazy Sunday :( Maybe Unicode will be of some help, but I am not sure. Certainly it is possible. Oracle, for example, allows you to create multilingual, multi-encoding databases. But it is not so simple.
On Thu, Sep 13, 2001 at 11:05:06AM +0200, Tino Wildenhain wrote:
What can we do to change this?
Oleg.
Hi, --On Thursday, 13 September 2001 13:16 +0400 Oleg Broytmann <phd@phd.pp.ru> wrote:
The current model of data processing, mostly created in a country with one language and a monoculture, is completely inadequate for multilingual, multi-encoding models. I don't know how hard it would be to change this. If you want to change only Zope, you have to untie it from the OS (stop using locales at all) and write your own splitters and other components. Perhaps it's not a task for a lazy Sunday :( Maybe Unicode will be of some help, but I am not sure. Certainly it is possible. Oracle, for example, allows you to create multilingual, multi-encoding databases. But it is not so simple.
Maybe it's possible to define the splitter per field? Regards Tino
On Thu, Sep 13, 2001 at 11:38:42AM +0200, Tino Wildenhain wrote:
Certainly it is possible. Oracle, for example, allows you to create multilingual, multi-encoding databases. But it is not so simple.
Maybe it's possible to define the splitter per field?
The problem is not in the Splitter itself - the problem is in the locale data. Locale data is the accumulated result of long research. Many people collected the data (how to represent national symbols, how to sort them, which are printable, which are alphabetic, etc.), organized it and put it into locales. If you are going to get rid of locales, where will you get this information? You need it for every language/encoding you are going to support. Oleg.
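One possible source for part of this information, offered here only as a sketch and not as something the thread's participants proposed in this form, is the Unicode character database. Modern Python exposes it through the standard `unicodedata` module. The helper `is_word_char` is a made-up name for the example. Note that this covers character classification only; language-specific sort order is not provided by this module.

```python
import unicodedata


def is_word_char(ch):
    """True if the Unicode database classifies ch as a letter or a number."""
    return unicodedata.category(ch)[0] in ("L", "N")


# General categories: Ll = lowercase letter, Lu = uppercase letter,
# Sm = math symbol, Zs = space separator.
for ch in ("ő", "Ж", "ř", "<", " "):
    print(repr(ch), unicodedata.category(ch), is_word_char(ch))
```

Because the classification is keyed by code point rather than by the process locale, it works the same way for Hungarian, Czech and Russian text in a single process.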
Hi Oleg, --On Thursday, 13 September 2001 13:45 +0400 Oleg Broytmann <phd@phd.pp.ru> wrote:
On Thu, Sep 13, 2001 at 11:38:42AM +0200, Tino Wildenhain wrote:
Certainly it is possible. Oracle, for example, allows you to create multilingual, multi-encoding databases. But it is not so simple.
Maybe it's possible to define the splitter per field?
The problem is not in the Splitter itself - the problem is in the locale data. Locale data is the accumulated result of long research. Many people collected the data (how to represent national symbols, how to sort them, which are printable, which are alphabetic, etc.), organized it and put it into locales. If you are going to get rid of locales, where will you get this information? You need it for every language/encoding you are going to support.
Specifically by defining the locale per field. The problem arises, however, if you have multiple representations of the same document and have to serve them according to availability and the reader's language preference. It is even harder if one document contains more than one language! The way to go is Unicode of course, but how do we proceed? Any ideas? I think this must be solvable. Regards Tino
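To make the Unicode idea concrete, here is a minimal sketch in modern Python 3. It is not ZCatalog code; `add_document`, `WORD` and the document ids are hypothetical names for the example, and it assumes each document's encoding is known when it is cataloged. Text stored as windows-1251 and text stored as koi8-r are both decoded to Unicode before splitting, so a single index can serve both.

```python
import re
from collections import defaultdict

WORD = re.compile(r"\w+")  # Unicode-aware word pattern


def add_document(index, doc_id, raw_bytes, encoding):
    """Decode raw_bytes with its declared encoding and index the words."""
    text = raw_bytes.decode(encoding)
    for word in WORD.findall(text.lower()):
        index[word].add(doc_id)


index = defaultdict(set)
add_document(index, "doc-koi8", "Привет, мир".encode("koi8-r"), "koi8-r")
add_document(index, "doc-1251", "привет всем".encode("windows-1251"), "windows-1251")

# The stored bytes differ between the two encodings...
print("привет".encode("koi8-r") != "привет".encode("windows-1251"))  # True
# ...but after decoding, one query finds both documents.
print(sorted(index["привет"]))  # ['doc-1251', 'doc-koi8']
```

The open question raised above remains: each document (or field) still has to carry its encoding and, for language-specific processing, its language.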
Oleg Broytmann writes:
On Thu, Sep 13, 2001 at 11:38:42AM +0200, Tino Wildenhain wrote:
Certainly it is possible. Oracle, for example, allows you to create multilingual, multi-encoding databases. But it is not so simple.
Maybe it's possible to define the splitter per field?
The problem is not in the Splitter itself - the problem is in the locale data. Locale data is the accumulated result of long research. Many people collected the data (how to represent national symbols, how to sort them, which are printable, which are alphabetic, etc.), organized it and put it into locales. If you are going to get rid of locales, where will you get this information? You need it for every language/encoding you are going to support.
Some time ago, I read the GNU locale implementation. They complained about the global nature of POSIX locale support, and I think they provided a more object-oriented implementation.
Maybe such an implementation could be used to switch locales per thread and thereby switch locales as necessary for the application at hand. However, one would probably need the GNU C library to make use of it. Dieter
On Wed, 12 Sep 2001, Dieter Maurer wrote:
Kerekes Lajos writes:
How can I use ZCatalog with languages other than English, for example Hungarian, Czech, Polish, or French? The cataloged words are cut at every non-ASCII character. /a', e', o".../ I also don't want to catalog the HTML and DTML tags. How can I configure this in ZCatalog?
Zope's splitter may be locale aware (if not, it may be quite easily made so). In this case, you would start Zope with the "-L" switch to tell it the locale to use and everything else may work automatically.
Of course, this will work only for one language per site (Zope installation).
How difficult would it be to make it SiteRoot dependent? Each SiteRoot object would have a "locale" field to fill in on the creation form, with an empty field meaning: take the locale from the -L option. Any idea how to implement this? My 0.000000000000000002 euros, Jerome Alet
On Thu, Sep 13, 2001 at 11:55:55AM +0200, Jerome Alet wrote:
Of course, this will work only for one language per site (Zope installation).
How difficult would it be to make it SiteRoot dependent?
Each SiteRoot object would have a "locale" field to fill in on the creation form, with an empty field meaning: take the locale from the -L option.
Near to impossible. Locale code in "modern" OSes is not thread safe, so you cannot incorporate it into Zope. You would need to write your own thread-safe locales. Oleg.
Jerome Alet writes:
... locale controlling splitter ... How difficult would it be to make it SiteRoot dependent?
Each SiteRoot object would have a "locale" field to fill in on the creation form, with an empty field meaning: take the locale from the -L option.
Some time ago, I read serious criticism of the POSIX locale API. If that criticism still applies, the locale setting is global and not thread specific. This would mean you cannot switch locales dynamically in a multi-threaded environment (such as Zope) and use different locales in different threads. Dieter
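The global nature of the locale setting described above can be observed from Python, whose `locale.setlocale` wraps the C library call. The following is a small sketch, not Zope code; the function names and the list of candidate locale names are assumptions for the example, and which locales exist depends on the operating system. A worker thread changes the collation locale, and the main thread, which never asked for that locale, sees the change.

```python
import locale
import threading


def first_available(candidates):
    """Return the first locale name the C library accepts, else 'C'."""
    saved = locale.setlocale(locale.LC_COLLATE)
    try:
        for name in candidates:
            try:
                locale.setlocale(locale.LC_COLLATE, name)
                return name
            except locale.Error:
                continue
        return "C"
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)  # restore previous setting


def locale_seen_from_main(name):
    """Set the locale in a worker thread, then query it from the main thread."""
    saved = locale.setlocale(locale.LC_COLLATE)
    seen = {}

    def worker():
        seen["worker"] = locale.setlocale(locale.LC_COLLATE, name)

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    seen["main"] = locale.setlocale(locale.LC_COLLATE)  # query only, no change
    locale.setlocale(locale.LC_COLLATE, saved)          # restore previous setting
    return seen


# Hypothetical candidate names; availability varies by system.
chosen = first_available(["C.UTF-8", "en_US.UTF-8", "POSIX"])
result = locale_seen_from_main(chosen)
print(result["worker"] == result["main"])  # True: the setting is process-wide
```

Two request threads that each need a different locale would therefore overwrite each other's setting, which is the obstacle to a per-SiteRoot locale discussed in this thread.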
participants (7)
- Andreas Jung
- Chris Withers
- Dieter Maurer
- Jerome Alet
- Kerekes Lajos
- Oleg Broytmann
- Tino Wildenhain