Matt writes:
... browser does not send "charset" parameter for "form" data ...
POST /hi HTTP/1.0 ...
Content-type: multipart/form-data; boundary=---------------------------17670043309955870831526446972
Content-Length: 180

You should not expect a "charset" parameter on the "multipart/form-data" content type. The parameter can appear on each individual part (when applicable), not on the multipart wrapper.
HTML 4.0 specifies: As with all multipart MIME types, each part has an optional "Content-Type" header that defaults to "text/plain". User agents should supply the "Content-Type" header, accompanied by a "charset" parameter.
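To illustrate where the "charset" parameter lives, here is a minimal sketch that reads it from each part rather than from the multipart wrapper. It uses Python's standard "email" package. The function name decode_form_fields and the fallback argument are my own inventions for illustration, and the fallback value is only an assumption about what the server wants to use when a part declares no charset.

```python
from email.parser import BytesParser
from email.policy import default


def decode_form_fields(content_type, body, fallback="iso-8859-1"):
    """Decode text fields of a multipart/form-data request body.

    content_type: value of the request's Content-Type header
                  (carries the boundary, but normally no charset).
    body:         raw request body as bytes.
    fallback:     hypothetical default used when a part has no charset.
    """
    # Prepend the Content-Type header so the parser sees a complete MIME message.
    raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    message = BytesParser(policy=default).parsebytes(raw)

    fields = {}
    for part in message.iter_parts():
        # The field name is a parameter of the part's Content-Disposition header.
        name = part.get_param("name", header="content-disposition")
        # The charset, if any, is a parameter of the part's own Content-Type.
        charset = part.get_content_charset() or fallback
        fields[name] = part.get_payload(decode=True).decode(charset)
    return fields
```

A part that arrives without a "Content-Type" header yields no charset here, so the caller has to decide what the fallback should be.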
... detecting used charsets ... We use UTF-8 and ISO-8859-1 encodings.
Our experience is that browsers encode form posts in the same encoding they used to display the form itself. Of course, the browser must have been told explicitly which encoding to use for rendering the form. Otherwise, it uses the default encoding (defined by the user).

To be precise: if we send a page (containing a form) to a browser with a "Content-Type: text/html; charset=UTF-8" HTTP header, then we get the form data back in UTF-8. If the page instead has a "Content-Type: text/html; charset=ISO-8859-1" HTTP header, the delivered form data is encoded in ISO-8859-1. This is as I would expect it to be.

Dieter