On Mon, 2002-09-30 at 09:17, Toby Dickenson wrote:
On Saturday 28 Sep 2002 4:38 pm, Florent Guillaume wrote:
<form action="foo" ... accept-charset="UTF-8"> ... </form>
This tells the browser to send the content of the form in the listed charset.
Yes, accept-charset could be part of a full solution to this problem, but I don't think it is the whole solution...
Are you suggesting that a method could assume its form submissions would always be made in utf-8? That would cause problems if a submission was made from:
* some other form that didn't have an accept-charset
* some non-browser code that synthesizes http requests
Yes, there is no good way.
A further problem is that we want this decoding to be performed in ZPublisher, but at that point in the publishing process it doesn't know which method is going to be called. That means the utf8 assumption can't be made independently for each method.
One answer to this problem is when browsers include the charset attribute in "multipart/form-data" POST requests. In that case ZPublisher knows unambiguously what encoding was used by the browser.
This really sucks, you'd think that by 2002 all recent browsers would send a content-type:text/plain;charset=foobar in multipart/form-data, as the spec (from 1998) recommends... But even Mozilla 1.1 doesn't do it.
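For illustration only, here is roughly what the decoding step could look like if a browser did send the charset on each part. This is a minimal sketch in present-day Python, not ZPublisher code: the function name decode_part and the iso-8859-1 fallback are assumptions made for the example.

```python
from email.message import Message


def decode_part(content_type, raw, default="iso-8859-1"):
    """Decode one form part using the charset named in its Content-Type.

    `content_type` is the header value sent with the part, for example
    "text/plain;charset=utf-8". `default` is an assumed fallback for
    parts that carry no charset parameter.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns None when no charset parameter exists.
    charset = msg.get_content_charset() or default
    return raw.decode(charset)
```

A charset name that Python does not recognise raises LookupError here, so a real publisher would have to decide what to do in that case.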
Sadly I can't see a nice way to do the same for GET requests.
As an aside, one interesting tidbit about Mozilla: if you paste Unicode into a field of a form without an explicit encoding (accept-charset or document-charset), it encodes Unicode characters into &#xxx; and sends that on the wire.

Anyway, in the near future I see no alternative to putting :utf8: into field names, and using accept-charset="UTF-8" or a UTF-8 encoding for the document.

Florent

--
Florent Guillaume, Nuxeo (Paris, France)
+33 1 40 33 79 87  http://nuxeo.com  mailto:fg@nuxeo.com
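
P.S. A rough sketch of the :utf8: idea, to make it concrete. This is a hypothetical helper in present-day Python, not how ZPublisher does or will do it: the name decode_marked_fields, the dict-of-bytes input and the way the marker is stripped from the name are all assumptions for the example.

```python
MARKER = ":utf8:"


def decode_marked_fields(form):
    """Return a copy of `form` with :utf8:-marked fields decoded.

    `form` maps raw field names to raw byte values. A field whose name
    contains the marker is decoded as UTF-8 and stored under the name
    with the marker removed; every other field is passed through as is.
    """
    decoded = {}
    for name, value in form.items():
        if MARKER in name:
            # "title:utf8:" -> "title", "title:utf8:x" -> "title:x"
            clean = name.replace(MARKER, ":").rstrip(":")
            decoded[clean] = value.decode("utf-8")
        else:
            decoded[name] = value
    return decoded
```

The point of the marker is that the publisher can decide per field, from the request alone, without knowing which method will be called.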