[Zope-dev] Non-ASCII characters in URLs
Jonathan
dev101 at magma.ca
Mon Apr 7 08:32:17 EDT 2008
----- Original Message -----
From: "Martijn Pieters" <mj at zopatista.com>
To: "Alexander Limi" <limi at plone.org>
Cc: <zope-dev at zope.org>
Sent: Monday, April 07, 2008 4:39 AM
Subject: Re: [Zope-dev] Non-ASCII characters in URLs
> On Mon, Apr 7, 2008 at 1:37 AM, Alexander Limi <limi at plone.org> wrote:
>> Is there a good technical explanation for why Zope doesn't allow
>> non-ASCII characters in URLs?
>
> Because URLs don't allow non-ASCII characters?
>
>> I'd like to be able to let URLs work like this example from Wikipedia:
>>
>> http://ja.wikipedia.org/wiki/メインページ
>
> Your browser translates that into
> http://ja.wikipedia.org/wiki/%E3%83%A1%E3%82%A4%E3%83%B3%E3%83%9A%E3%83%BC%E3%82%B8
>
>> Is there a fundamental reason (ie. Python objects can only be ASCII) or
>> is it simply bugs that need to be fixed?
>
> RFC 1738 (http://www.ietf.org/rfc/rfc1738.txt) doesn't allow non-ASCII
> characters in URLs.
>
> No corresponding graphic US-ASCII:
>
> URLs are written only with the graphic printable characters of the
> US-ASCII coded character set. The octets 80-FF hexadecimal are not
> used in US-ASCII, and the octets 00-1F and 7F hexadecimal represent
> control characters; these must be encoded.
>
> Now, Zope could well support UTF-8 ids, and translate URLs
> appropriately, but in the meantime you could use the same scheme?
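
For reference, the translation the browser performs on the Wikipedia URL
quoted above can be reproduced with the Python standard library. This is
only a sketch of the percent-encoding round trip using Python 3's
urllib.parse, assuming UTF-8 as the byte encoding; it is not how Zope
handles ids, and the variable names are illustrative.

```python
from urllib.parse import quote, unquote

# Page title from the Wikipedia example quoted above.
title = "メインページ"

# quote() encodes the text as UTF-8 by default and percent-escapes every
# byte outside the unreserved ASCII set; safe="" also escapes "/".
encoded = quote(title, safe="")

# unquote() reverses the process, decoding the escaped bytes as UTF-8.
decoded = unquote(encoded)

print("http://ja.wikipedia.org/wiki/" + encoded)
print(decoded)
```

A server that wanted to accept such URLs would apply the decoding step to
each path segment before looking up the corresponding id.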
IDNA (http://www.ietf.org/rfc/rfc3490.txt) and Punycode
(http://www.faqs.org/rfcs/rfc3492.html) may be of some use.
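
A small sketch of what these two look like in practice, using the "idna"
and "punycode" codecs that ship with Python. Note that IDNA applies to the
host name part of a URL, not to the path, so it complements rather than
replaces percent-encoding. The host name bücher.example is a made-up
illustration, not one from this thread.

```python
# Hypothetical internationalized host name, chosen for illustration only.
host = "bücher.example"

# The "idna" codec (RFC 3490) converts each non-ASCII label to its
# ASCII-compatible "xn--" form and leaves ASCII labels untouched.
ascii_host = host.encode("idna")

# The "punycode" codec (RFC 3492) is the raw label encoding that IDNA
# builds on; it works on a single label and adds no "xn--" prefix.
raw_label = "bücher".encode("punycode")

# Decoding restores the original Unicode host name.
round_trip = ascii_host.decode("idna")

print(ascii_host, raw_label, round_trip)
```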
Jonathan