Hi!

I would like to resurrect this thread with my own recent experience.

My problem was to spread a set of ZODB objects over several Zope installations. Each installation required slight modifications to the original objects (such as usernames and passwords). So I decided to export the original data as an XML dump and write a small application that modifies the dump as necessary before import. But to my distress, I could import neither a modified dump nor even the original one. The process failed with a UnicodeDecodeError exception.

After some investigation I realized what the problem was (at least in my case). The XML parser returns all text as unicode strings. Among them are base64-encoded strings, which are decoded into non-unicode strings containing binary data. Of course, this data can't be decoded by any codec. The code in ppml.py sometimes concatenates raw and unicode strings, and this raises UnicodeDecodeError. I worked around this by converting the unicode strings into non-unicode ones.

Please look at the attached patch. Zope version 2.9.4.

-- 
Best regards, Alexei

On 25. March 2006 21:40:48 +0100 Yoshinori Okuji <yo at nexedi.com> wrote:
On Saturday 25 March 2006 15:56, Andreas Jung wrote:
Zope 2.7 throws a BadPickleGet, 12 exception, Zope 2.8 throws BadPickleGet, 13 and Zope 2.9 raises the described UnicodeDecodeError. I don't expect the import functionality to work for more complex objects either. So I consider the whole functionality totally broken. The generated XML might be useful for processing outside Zope, but re-importing it into another Zope system definitely does _not_ work. So if the functionality should remain in Zope, then it should be fixed for Zope 2.10 at the latest.
Here is a quick patch for this problem (against 2.9.1). There were two different problems:
- the id attributes were not generated, because the conditional was reversed.
- unlike xmllib, expat always returns Unicode data, so simply concatenating binary values generates Unicode objects with non-ascii characters.
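
For illustration only, here is a minimal sketch of the second problem and of the kind of workaround described above. It is not the attached patch and does not use the real ppml.py code; the XML fragment, the payload bytes, and the helper to_bytes are made up for the example. It is written for Python 3 so it can be run today: there, mixing text and bytes raises TypeError, while on the Python 2 versions used by Zope 2.9 the same mix triggers an implicit ASCII decode and fails with UnicodeDecodeError.

```python
import base64
import xml.parsers.expat

# Hypothetical stand-in for one element of an export dump: binary data
# stored as base64 text (not the real ppml.py format).
payload = bytes([0x80, 0x02, 0xFF, 0x00])
document = '<string encoding="base64">%s</string>' % (
    base64.b64encode(payload).decode("ascii")
)

chunks = []

def collect_text(data):
    # expat hands character data to the handler as text (unicode).
    chunks.append(data)

parser = xml.parsers.expat.ParserCreate()
parser.CharacterDataHandler = collect_text
parser.Parse(document, True)

text = "".join(chunks)           # unicode text from the parser
decoded = base64.b64decode(text)  # raw binary data, not text

def to_bytes(value, encoding="utf-8"):
    # Hypothetical helper: convert parser text to a byte string so it can
    # be joined with decoded binary data.
    if isinstance(value, str):
        return value.encode(encoding)
    return value

# Mixing the two kinds of string is what breaks the import.
try:
    text + decoded
except TypeError:
    mixed_concat_failed = True
else:
    mixed_concat_failed = False

# Converting the text first keeps everything as bytes.
joined = to_bytes(text) + decoded
```

The point of the sketch is only that every piece of text coming from expat has to be converted before it meets decoded binary data; where exactly that conversion belongs in ppml.py is what the patches in this thread address.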