[Zope-Checkins] CVS: Zope3/lib/python/Zope/Publisher/HTTP - HTTPCharsets.py:1.4
Guido van Rossum
guido@python.org
Fri, 14 Jun 2002 15:33:57 -0400
> + # UTF-8 is **always** preferred over anything else.
> + # XXX Please give more details as to why!
I'm guessing that is because all UTF-8 strings are legal Latin-1
strings (and probably also legal in most other "mode-less" 8-bit
encodings), but in practice *most* Latin-1 strings that contain
non-ASCII characters aren't valid UTF-8. So if you see a string
that's legal UTF-8 and also legal
Latin-1, it's more likely that the UTF-8 interpretation is what was
intended, because it's statistically very unlikely that you'd arrive
at a legal UTF-8 string by typing something meaningful in Latin-1.
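
A minimal Python sketch of the asymmetry described above. It is only an
illustration, not the code in HTTPCharsets.py; the helper names
decodes_as and guess_charset are hypothetical, and it assumes Python 3
bytes/str semantics.

```python
def decodes_as(data: bytes, encoding: str) -> bool:
    """Return True if data is a legal byte sequence in the given encoding."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


def guess_charset(data: bytes) -> str:
    """Hypothetical heuristic: prefer UTF-8 whenever the bytes are legal UTF-8."""
    return "utf-8" if decodes_as(data, "utf-8") else "latin-1"


text = "caf\u00e9"                     # "cafe" with an acute accent on the e
utf8_bytes = text.encode("utf-8")      # b'caf\xc3\xa9'
latin1_bytes = text.encode("latin-1")  # b'caf\xe9'

# Every byte value is legal Latin-1, so UTF-8 output always decodes as Latin-1
# (though the result is mojibake rather than the intended text).
print(decodes_as(utf8_bytes, "latin-1"))   # True
print(ascii(utf8_bytes.decode("latin-1"))) # 'caf\xc3\xa9'

# The reverse fails: a lone 0xE9 byte is an incomplete UTF-8 sequence.
print(decodes_as(latin1_bytes, "utf-8"))   # False

print(guess_charset(utf8_bytes))    # utf-8
print(guess_charset(latin1_bytes))  # latin-1
```

Pure-ASCII input is legal in both encodings and decodes to the same text
either way, so the guess only matters once non-ASCII bytes show up.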
--Guido van Rossum (home page: http://www.python.org/~guido/)