Hi!

Laurence Rowe wrote:
On 2 March 2011 11:29, yuppie <y.2011@wcm-solutions.de> wrote:
Laurence Rowe wrote:
On 2 March 2011 10:00, yuppie <y.2011@wcm-solutions.de> wrote:
Martin Aspeli wrote:
I don't know what setPageEncoding() does, though.
It sets a response Content-Type header with the first charset processInputs tries for decoding.
Is the charset of the request necessarily the right choice for the response? In Plone we always serve UTF-8 encoded.
getPreferredCharsets()[0] always returns 'utf-8' **if** UTF-8 is accepted.
If 'utf-8' is not in getPreferredCharsets(), it is not very likely that the browser speaks UTF-8 and processInputs will not even try to decode with UTF-8. In that case it might be better to respond with an accepted encoding.
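For illustration, the "UTF-8 first, but only if accepted" behaviour described above can be sketched in a few lines. This is a simplified stand-alone sketch, not the zope.publisher implementation: the function name preferred_charsets is hypothetical, and details such as the implicit ISO-8859-1 default from RFC 2616 are left out.

```python
def preferred_charsets(accept_charset):
    """Return the accepted charsets, best first; 'utf-8' leads if accepted.

    Hypothetical helper for illustration only, not zope.publisher code.
    """
    if not accept_charset.strip():
        # No header: any charset is acceptable, so pick UTF-8.
        return ['utf-8']

    accepted = []
    saw_star = False
    for item in accept_charset.split(','):
        name, _, params = item.strip().partition(';')
        name = name.strip().lower()
        if not name:
            continue
        quality = 1.0
        params = params.strip()
        if params.startswith('q='):
            try:
                quality = float(params[2:])
            except ValueError:
                continue  # skip malformed entries
        if quality == 0:
            continue  # explicitly refused
        if name == '*':
            saw_star = True
            continue
        accepted.append((quality, name))

    # Stable sort: highest quality first, header order kept for ties.
    names = [name for quality, name in sorted(accepted, key=lambda p: -p[0])]
    if 'utf-8' in names or saw_star:
        names = ['utf-8'] + [n for n in names if n != 'utf-8']
    return names
```

With this sketch, a browser sending only 'iso-8859-15' gets ['iso-8859-15'] back, so UTF-8 would never be tried for that request.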
If you serve differently encoded pages then you should set Vary: Accept-Charset.
That seems to be correct. So you found a bug in zope.publisher and five.formlib. If they do charset negotiation, they have to set Vary.
But then without normalization you'd get an explosion of different page variations.
AFAICS that normalization can't be done by the server and we can't prevent ineffective caching.
Without the Vary header, a visitor can poison the cache by supplying (only) a weird charset in Accept-Charset. The page would then be served in this encoding, cached downstream, and if other visitors' browsers don't support that charset then they have a problem.
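Both problems, the poisoning without Vary and the variant explosion with it, can be shown with a toy model of a downstream cache. Everything here is an assumption made for illustration: serve() is a hypothetical function, the cache is a plain dict, and the "origin" simply answers in the first charset the client lists.

```python
def serve(cache, url, accept_charset, use_vary):
    """Return the charset of the response a client receives via the cache.

    Toy model: a cache that honours 'Vary: Accept-Charset' keys its
    entries on the raw header value as well as the URL.
    """
    key = (url, accept_charset) if use_vary else (url,)
    if key not in cache:
        # Hypothetical origin server: encode in the first listed charset.
        cache[key] = accept_charset.split(',')[0].strip()
    return cache[key]


# Without Vary: the first visitor decides what everybody else gets.
poisoned = {}
serve(poisoned, '/page', 'utf-16', use_vary=False)
second_visitor = serve(poisoned, '/page', 'utf-8', use_vary=False)

# With Vary: no poisoning, but equivalent headers are cached separately.
varied = {}
serve(varied, '/page', 'utf-8, iso-8859-1', use_vary=True)
serve(varied, '/page', 'utf-8,iso-8859-1', use_vary=True)
```

Here second_visitor ends up as 'utf-16' even though that browser asked for UTF-8, while the varied cache holds two entries for what is effectively the same preference.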
That sounds like charset negotiation isn't a good idea and neither zope.publisher nor five.formlib should do it.

If we don't negotiate the charset, we should still have a setPageEncoding method that overrides the ZPublisher default_encoding with UTF-8.

But what does all that mean for the processInputs methods in Five (used by five.formlib) and in plone.z3cform? If we always send UTF-8, their current implementation doesn't make much sense to me. I don't know if we really should try to fall back to all the charsets mentioned in Accept-Charset. But at least we should *always* try UTF-8 decoding first.

Cheers,

Yuppie
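PS: A rough sketch of what "always try UTF-8 decoding first" could look like. This is not the Five or plone.z3cform code; decode_input is a hypothetical helper, and replacing undecodable bytes in the last step is just one possible choice (leaving the value undecoded would be another).

```python
def decode_input(raw, fallback_charsets=()):
    """Decode a raw form value, trying UTF-8 before any other charset.

    Hypothetical helper: fallback_charsets would typically come from
    the request's Accept-Charset preferences, if they are used at all.
    """
    candidates = ['utf-8'] + [
        c for c in fallback_charsets if c.lower() != 'utf-8'
    ]
    for charset in candidates:
        try:
            return raw.decode(charset)
        except (UnicodeDecodeError, LookupError):
            # Wrong or unknown charset: try the next candidate.
            continue
    # Nothing matched: keep what we can instead of failing.
    return raw.decode('utf-8', 'replace')
```

Valid UTF-8 input is then decoded correctly whatever the browser claims to prefer, and the other charsets only come into play for bytes that are not valid UTF-8.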