[Zope-dev] Non-ASCII characters in URLs

Wichert Akkerman wichert at wiggy.net
Mon Apr 7 14:45:23 EDT 2008


Previously Dieter Maurer wrote:
> Martijn Pieters wrote at 2008-4-7 10:39 +0200:
> >On Mon, Apr 7, 2008 at 1:37 AM, Alexander Limi <limi at plone.org> wrote:
> >>  Is there a good technical explanation for why Zope doesn't allow non-ASCII
> >> characters in URLs?
> >
> >Because URLs don't allow non-ASCII characters?
> 
> Almost surely, Alexander wants to ask why Zope does not allow
> non-ASCII characters in ids.
> 
> And, in fact, there are only two reasons:
> 
>   *  laziness of the Zope developers:
> 
>      without the restriction to ASCII characters
>      careful quoting (and unquoting) is necessary
>      in order to adhere to RFC 2396 (the modern URI syntax specification)

This is becoming increasingly painful: it means we can't really use Active
Directory's ObjectGUID as a userid, and it breaks with LDAP DNs that contain
non-ASCII characters (all too common). I really wish Zope ids were
either binary strings or unicode strings.
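
To make the quoting step concrete, here is a minimal sketch in present-day
Python (urllib.parse, which postdates this thread). The helper names
id_to_path_segment and path_segment_to_id are hypothetical, and this is not
how Zope itself handles ids; it only shows the "encode to UTF-8, then
percent-escape" round trip for an id such as an LDAP DN.

```python
from urllib.parse import quote, unquote


def id_to_path_segment(object_id: str) -> str:
    """Percent-escape an id so it can be used as one URI path segment.

    Hypothetical helper: the id is encoded as UTF-8 first, then every byte
    outside the unreserved set is escaped. safe="" means "/" is escaped too,
    so an id can never be mistaken for two path segments.
    """
    return quote(object_id, safe="", encoding="utf-8")


def path_segment_to_id(segment: str) -> str:
    """Reverse of id_to_path_segment; raises UnicodeDecodeError on non-UTF-8 input."""
    return unquote(segment, encoding="utf-8", errors="strict")


if __name__ == "__main__":
    dn = "cn=José,dc=example"  # made-up DN, for illustration only
    segment = id_to_path_segment(dn)
    print(segment)
    print(path_segment_to_id(segment) == dn)
```

The sketch assumes both sides agree on UTF-8, which is exactly the
assumption that does not hold in practice for all clients.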

>   *  there is no way to specify the encoding used for non-ASCII characters.
> 
>      HTML 4 suggests converting non-ASCII characters to UTF-8 first
>      and then URL-escaping the result,
>      but most HTTP clients do not follow this suggestion.
>      Instead, they use the charset of the page
>      that caused them to construct the URI.
> 
>      I have observed that MS WebDAV transfers the URL as given
>      for some WebDAV commands and recodes it into UTF-8
>      for others.
> 
>      Thus, supporting non-ASCII ids may occasionally cause
>      surprises.

I suspect you mean non-ASCII URIs here, not non-ASCII ids.  Somehow I'm
not surprised those are painful :(
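
The encoding ambiguity can be shown in a few lines. This is only a sketch
in present-day Python: decode_segment and its candidate list are
hypothetical, and trying UTF-8 first with Latin-1 as a fallback is a
heuristic guess, not something any specification prescribes.

```python
from urllib.parse import quote, unquote

# The same character produces two different escaped forms, depending on
# which charset the client used before percent-escaping.
name = "é"
as_utf8 = quote(name, encoding="utf-8")      # what HTML 4 suggests
as_latin1 = quote(name, encoding="latin-1")  # what a Latin-1 page may produce


def decode_segment(segment: str, candidates=("utf-8", "latin-1")) -> str:
    """Guess the charset of a percent-escaped segment (hypothetical helper).

    Tries each candidate in order and returns the first strict decode that
    succeeds. Latin-1 accepts any byte sequence, so it acts as a catch-all
    and may silently return the wrong text.
    """
    for encoding in candidates:
        try:
            return unquote(segment, encoding=encoding, errors="strict")
        except UnicodeDecodeError:
            continue
    raise ValueError("could not decode %r" % segment)


if __name__ == "__main__":
    print(as_utf8, as_latin1)
    print(decode_segment(as_utf8), decode_segment(as_latin1))
```

Nothing in the request says which of the two forms was sent, so the server
can only guess.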

Wichert.

-- 
Wichert Akkerman <wichert at wiggy.net>    It is simple to make things.
http://www.wiggy.net/                   It is hard to make things simple.

