----- Original Message -----
From: "Martijn Pieters" <mj@zopatista.com>
To: "Alexander Limi" <limi@plone.org>
Cc: <zope-dev@zope.org>
Sent: Monday, April 07, 2008 4:39 AM
Subject: Re: [Zope-dev] Non-ASCII characters in URLs

On Mon, Apr 7, 2008 at 1:37 AM, Alexander Limi <limi@plone.org> wrote:
Is there a good technical explanation for why Zope doesn't allow non-ASCII characters in URLs?
Because URLs don't allow non-ASCII characters?
I'd like to be able to let URLs work like this example from Wikipedia:
Your browser translates that into http://ja.wikipedia.org/wiki/%E3%83%A1%E3%82%A4%E3%83%B3%E3%83%9A%E3%83%BC%E...
Is there a fundamental reason (i.e. Python objects can only be ASCII), or are these simply bugs that need to be fixed?
RFC 1738 (http://www.ietf.org/rfc/rfc1738.txt) doesn't allow non-ASCII characters in URLs:

   No corresponding graphic US-ASCII:

      URLs are written only with the graphic printable characters of
      the US-ASCII coded character set. The octets 80-FF hexadecimal
      are not used in US-ASCII, and the octets 00-1F and 7F hexadecimal
      represent control characters; these must be encoded.

Now, Zope could well support UTF-8 ids, and translate URLs appropriately, but in the meantime you could use the same scheme?
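
A minimal sketch of that scheme in Python, for illustration only: encode the id as UTF-8 and percent-encode every octet that is not plain US-ASCII, then reverse the process on the way in. The helper names and the example id are hypothetical, and this is not how Zope itself handles ids; it only uses the standard library's urllib.parse.

```python
from urllib.parse import quote, unquote


def id_to_url_segment(object_id: str) -> str:
    """Percent-encode the UTF-8 octets of an id so the result is pure US-ASCII.

    safe="" means even "/" is escaped, so the id stays a single path segment.
    """
    return quote(object_id, safe="", encoding="utf-8")


def url_segment_to_id(segment: str) -> str:
    """Reverse the encoding; raises UnicodeDecodeError on invalid UTF-8."""
    return unquote(segment, encoding="utf-8", errors="strict")


# Hypothetical id, chosen only for illustration.
example_id = "日本語"
encoded = id_to_url_segment(example_id)

print(encoded)                     # %E6%97%A5%E6%9C%AC%E8%AA%9E
print(url_segment_to_id(encoded))  # 日本語
```

The encoded form contains only characters that RFC 1738 permits, so it can travel in a URL unchanged while the object keeps its non-ASCII id internally.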
IDNA (http://www.ietf.org/rfc/rfc3490.txt) and Punycode (http://www.faqs.org/rfcs/rfc3492.html) may be of some use.

Jonathan
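
For reference, a rough sketch of what those two encodings produce, using the codecs that ship with Python. The hostname is a made-up example. Note that IDNA is defined for the host part of a URL, so applying it to path segments or object ids would be a design choice rather than something the RFCs prescribe.

```python
# Hypothetical internationalized hostname, for illustration only.
hostname = "bücher.example"

# The built-in "idna" codec implements RFC 3490: each non-ASCII label is
# converted with Punycode and prefixed with "xn--".
ascii_hostname = hostname.encode("idna")
print(ascii_hostname)                 # b'xn--bcher-kva.example'

# The conversion is reversible.
print(ascii_hostname.decode("idna"))  # bücher.example

# The raw Punycode transformation (RFC 3492) of a single label,
# without the "xn--" prefix that IDNA adds.
punycode_label = "bücher".encode("punycode")
print(punycode_label)                 # b'bcher-kva'
```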