Plagued by UnicodeEncodeError and UnicodeDecodeError
Hi,

We have a ton of AT objects imported from our databases, some of which contain Japanese characters imported as unicode. Some objects could not be imported because of UnicodeEncodeErrors or UnicodeDecodeErrors raised while loading the object from our DB or loading it into the ZODB. It would have been really nice to keep these entries, but we just wrapped the import in a try:/except: block to get past them.

Our current problem is that the data is now in our DB and it all displays fine. But when we edit an object through base_edit and click Save, we're back to this:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 5: ordinal not in range(128)

The traceback shows the field which is causing the error, and I don't see any unicode chars there!

Zope 2.7.3-0, Python 2.3.4, Linux, CMF 1.4.7, Plone 2.0.4, AT 1.3.1-final + MySqlSqlstorage
davelists2@peoplemerge.com wrote at 2005-1-21 16:58 -0800:
We have a ton of AT objects ... ... UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 5: ordinal not in range(128).
The traceback shows the field which is causing the error, and I don't see any unicode chars there!
The current Archetypes has a strange handling of text data: internally, it stores it as unicode (good!) but it tries hard to deliver the content encoded. I am not even sure there is a way to tell it to deliver unicode, but at least you can pass in the encoding you want ("encoding" keyword argument).

AT 1.3.1 uses "instance.getCharset()" as "encoding" if not specified explicitly. This is better than previous versions, which used the "original encoding" (and so could mix different encodings in a single page).

By the way: the Archetypes mailing lists are probably better suited to discussing Archetypes-related questions.

-- Dieter
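For readers hitting the same traceback, here is a minimal sketch of how an encoded value produces exactly this message. It is not Archetypes code: the field value "Touché" and the helper to_unicode are hypothetical, chosen only for illustration. Byte 0xc3 is the UTF-8 lead byte for accented Latin letters (U+00C0 to U+00FF), so the offending field may hold something like an "é" rather than Japanese text. Python 2 performs the ASCII decode implicitly whenever a byte string is combined with a unicode string; the sketch does it explicitly so that it also runs on current Python.

```python
def to_unicode(value, encoding="utf-8"):
    """Hypothetical helper: decode byte strings explicitly, pass text through."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Assumed field value: "Touché" delivered as UTF-8 encoded bytes.
stored = u"Touch\u00e9".encode("utf-8")

# The decode that Python 2 attempts implicitly when such a byte string
# meets a unicode string (for example in u"..." + stored).
try:
    stored.decode("ascii")
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)

print(error_message)

# Decoding with the real encoding first avoids the implicit ASCII step.
combined = to_unicode(stored) + u" saved"
print(repr(combined))
```

The general idea is to decode once, with a known encoding, at the point where encoded data enters the code, and to work with unicode from there on. Which encoding applies in a given site is an assumption that has to be checked against the site's charset setting.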
participants (2): davelists2@peoplemerge.com, Dieter Maurer