[Zope] Dealing with non-ASCII strings?
Toby Dickenson
tdickenson@geminidataloggers.com
Wed, 2 Oct 2002 09:32:49 +0100
On Wednesday 02 Oct 2002 9:06 am, Jean Jordaan wrote:
> Hi there
>
> In the data that we have to work with, there are names in French,
> Turkish, German, Greek, etc. A sample string, when printed from Python,
> is: 'Rabia-r\xddza Bi\xe7en \xf6grenci Yurdu.G\xf6r\xfckle'
> We'd like to store this data in LDAP and in Zope.
>
> Questions:
>
> - How do we find out what the current encoding of the strings are?
> Guess?
Guessing is your only option if you can't ask the person who supplied you
with your data.
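One way to make the guessing less painful is to decode the same bytes with a few candidate encodings and look at which result reads like real names. The sketch below is only an illustration: the candidate list and the helper name decode_candidates are assumptions, not anything Zope or Python provides, and it is written with a bytes literal and .decode() so that it also runs on current Python.

```python
# Sample bytes taken from the question above.
raw = b'Rabia-r\xddza Bi\xe7en \xf6grenci Yurdu.G\xf6r\xfckle'

# Candidate encodings to try. This list is an assumption; adjust it
# to the languages you expect in your data.
candidates = ['iso-8859-1', 'iso-8859-9', 'iso-8859-7', 'cp1254']

def decode_candidates(data, encodings):
    """Return {encoding name: decoded text, or None if decoding failed}."""
    results = {}
    for name in encodings:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            # These bytes are not valid in this encoding at all.
            results[name] = None
    return results

# Print every candidate so a human can judge which one looks right.
for name, text in decode_candidates(raw, candidates).items():
    print(name, repr(text))
```

A decode that succeeds is not proof that the encoding is right: most single-byte encodings accept almost any byte, so someone who can read the language still has to look at the output.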
> - Say we decide it's Latin-7. How do we convert from the current
> string to Unicode, taking into account the fact that the source is
> taken to be Latin-7?
unicode_string = unicode(encoded_8bit_string, 'data character encoding')
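As a concrete instance of that call, here is a minimal sketch that assumes, purely for illustration, that the data turns out to be ISO-8859-9 (the Turkish Latin alphabet). It uses the .decode() method on a bytes literal, which does the same job as unicode(string, encoding) and also runs on current Python.

```python
# Part of the sample string from the question, as raw 8-bit data.
encoded_8bit_string = b'Rabia-r\xddza Bi\xe7en'

# Assumed source encoding; substitute whatever your data really uses.
data_encoding = 'iso-8859-9'

# Equivalent to unicode(encoded_8bit_string, data_encoding).
unicode_string = encoded_8bit_string.decode(data_encoding)

# Byte 0xDD maps to U+0130 (capital I with dot above) in ISO-8859-9,
# and byte 0xE7 maps to U+00E7 (c with cedilla).
print(repr(unicode_string))
```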
> - Do we need to move to Zope 2.6 in order to cope with such strings?
It depends on what you want to do with them. You need 2.6 if you want to
use them in property pages or in DTML, or to allow them to be edited in
forms.
(You could get patches for Zope 2.4 from
http://www.zope.org/Members/htrd/wstring. They don't apply cleanly to 2.5,
but are known to work after a little manual merging. Overall I think the
2.6 upgrade will be less pain.)
If you want to continue using an unpatched 2.5.x then you will need to
manually call the unicode string's encode method every time you use it:
unicode_string.encode('page character encoding')
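To avoid scattering that call everywhere, it could be wrapped in one small helper. The sketch below is an assumption-laden illustration: the name to_page_bytes and the choice of UTF-8 as the default page encoding are hypothetical, not part of Zope. It also shows what happens when the page encoding cannot represent a character.

```python
def to_page_bytes(text, page_encoding='utf-8', errors='strict'):
    """Encode a unicode string into the 8-bit form used on the page.

    Hypothetical helper; page_encoding must match the charset the
    page declares to the browser.
    """
    return text.encode(page_encoding, errors)

name = u'Rabia-r\u0130za Bi\xe7en'

# UTF-8 can represent every character in the string.
utf8_bytes = to_page_bytes(name)

# ISO-8859-1 has no U+0130, so a strict encode would raise
# UnicodeEncodeError; 'replace' substitutes '?' instead.
latin1_bytes = to_page_bytes(name, 'iso-8859-1', 'replace')

print(repr(utf8_bytes))
print(repr(latin1_bytes))
```

Picking a page encoding that covers all the languages in the data, such as UTF-8, avoids the lossy replacement in the second call.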