[Zope-dev] fmt=structured-text doesn't work with accented chars

Leonardo Rochael Almeida leo@hiper.com.br
Wed, 12 Sep 2001 19:09:22 -0300


Hi,

I tried using structured text for some documentation I wrote in Portuguese,
but accented characters like é, á, ã, or ç break the parsing of markup
like *emphasis*, 'code' or _underline_, and in Portuguese we get a bunch of
those in every sentence, you know... :-)

The result is the literal appearance of some characters that should be markup,
and mismatched opening and closing markup when there is more than one
occurrence of the same markup, like when you try to *emphasize* twice in the
*same* paragraph.

This is probably due to parsing with Python's string.letters, which doesn't
include accented letters unless you set the locale. Now the funny thing is
that even when I set -L to, say, pt_BR at Zope start (and my glibc has
correctly configured locales for pt_BR), structured-text still doesn't parse
accented letters correctly.
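
To illustrate the suspected failure mode, here is a minimal sketch in current
Python 3. It is not the actual StructuredText pattern: `make_emphasis_re` is a
hypothetical helper, and `string.ascii_letters` stands in for an unlocalized
`string.letters` (which no longer exists in Python 3). It only shows how a
character class built from ASCII letters skips emphasized words that contain
accents.

```python
import re
import string


def make_emphasis_re(letters):
    """Build a hypothetical *emphasis* pattern whose words use only `letters`."""
    word = "[" + re.escape(letters) + "]+"
    return re.compile(r"\*(" + word + r")\*")


# ASCII-only class, analogous to string.letters without a locale set.
ascii_only = make_emphasis_re(string.ascii_letters)

# Same pattern with a few accented letters added by hand (illustrative subset).
with_accents = make_emphasis_re(string.ascii_letters + "áãçé")

text = "isto é *fácil* e *simples*"

print(ascii_only.findall(text))    # ['simples']
print(with_accents.findall(text))  # ['fácil', 'simples']
```

With the ASCII-only class, the asterisks around "fácil" are left as literal
characters, which matches the symptom described above.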

Investigating this, I found out that if you put the following in a
'Script (Python)'::

  import string
  return string.lowercase

it will print 'abcdefghijklmnopqrstuvwxyz' instead of
'abcdefghijklmnopqrstuvwxyzµßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ'

The latter is what you get in a Python interpreter if you type::

  import locale
  locale.setlocale(locale.LC_ALL, "pt_BR")
  import string
  print string.lowercase

This assumes you have a correctly configured pt_BR locale on a Linux machine.
The exercise gives the same results with a correctly configured en_US
locale instead of pt_BR.

Has anyone else stumbled onto this? Is there a patch somewhere?

Speaking of locales, how do I set the locale for Zope in a WinNT/2000
environment? It seems like POSIX locale strings aren't valid strings for
locale.setlocale() in Windows.
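
One way to find out which names a given platform accepts is to probe them.
The following is only a sketch in current Python 3: `first_accepted_locale` is
a hypothetical helper, and the candidate names in the usage note are
assumptions, not names known to work on any particular system.

```python
import locale


def first_accepted_locale(candidates, category=locale.LC_CTYPE):
    """Return the first name setlocale() accepts, or None if none is accepted.

    The previous setting for `category` is restored before returning.
    """
    previous = locale.setlocale(category)  # query the current setting
    try:
        for name in candidates:
            try:
                locale.setlocale(category, name)
                return name
            except locale.Error:
                continue  # this name is not valid on this platform
        return None
    finally:
        locale.setlocale(category, previous)
```

A call such as `first_accepted_locale(["pt_BR", "Portuguese_Brazil"])` would
then report which spelling, if any, the local C library understands; both
names here are guesses to be replaced with whatever the platform documents.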

  Cheers, Leo