[Zope3-dev] Re: unicode problems !?
Martijn Faassen
faassen at infrae.com
Tue Oct 12 09:27:24 EDT 2004
Hey,
I'm not sure I understood the entire debate, but I'll summarize what I
think should be happening:
* if a user edits a textarea, then assume the encoding of the form submit is
that of the presented form, or alternatively generate some explicit
encoding setting in the form, as we previously discussed on this list.
The default for this encoding in Zope should be UTF-8. Content that is
saved is decoded from UTF-8 and stored as unicode. In my experience
browsers, including IE, do submit form data in the same encoding in
which the form was presented; we rely on this heavily in Silva, for
instance. Silva uses unicode internally throughout.
* if a user uploads a file in some way, and the file is intended to be
textual data, then the encoding of this file is assumed to be UTF-8 by
default. However, the user can specify an encoding to override this. The
textual data is decoded using this encoding, and stored as unicode. If
the decoding fails, then the user needs to be presented with an error.
We have some experience implementing something like this in Silva, where
we provide a Comma Separated Value object (in the SilvaExternalSources
extension). Users explicitly specify the encoding of the uploaded CSV
data here, and data is stored as unicode.
* if a user uploads a file and this file is *not* intended to be textual
data but binary data, then Zope doesn't do a thing, and just stores the
bytes. If the developer still uses this data as text at any stage, they
should be aware of encoding issues and decode in whatever encoding they
see fit. Of course the developer is better off using a stored text file
in that case, where unicode is already guaranteed.
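
To make the three cases above concrete, here is a minimal sketch in
modern Python. It is not Zope or Silva code: the function names, the
TextDecodeError class and DEFAULT_ENCODING are made up for illustration.
Only the behaviour (UTF-8 by default, an optional user override, an error
when decoding fails, and binary data left alone) is taken from the points
above.

```python
DEFAULT_ENCODING = "utf-8"  # the default proposed in this mail


class TextDecodeError(ValueError):
    """Hypothetical error type, raised so the UI can report a bad encoding."""


def decode_form_field(raw, form_encoding=DEFAULT_ENCODING):
    # Form submit: assume the browser used the encoding the form was
    # presented in, and store the result as unicode (str).
    return raw.decode(form_encoding)


def decode_text_upload(raw, user_encoding=None):
    # Text upload: UTF-8 unless the user explicitly named another encoding.
    encoding = user_encoding or DEFAULT_ENCODING
    try:
        return raw.decode(encoding)
    except (UnicodeDecodeError, LookupError) as exc:
        # Decoding failed (or the encoding name is unknown): surface an
        # error to the user instead of storing damaged text.
        raise TextDecodeError(
            "could not decode upload as %s: %s" % (encoding, exc)
        ) from exc


def store_binary_upload(raw):
    # Binary upload: no decoding at all, the bytes are kept untouched.
    return bytes(raw)
```

For example, decode_text_upload(data, "latin-1") would honour the user's
choice, while decode_text_upload(data) falls back to UTF-8 and raises
TextDecodeError if the bytes are not valid UTF-8.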
Regards,
Martijn