[Zope-dev] Non-ASCII characters in URLs
Dieter Maurer
dieter at handshake.de
Mon Apr 7 15:45:00 EDT 2008
Wichert Akkerman wrote at 2008-4-7 20:45 +0200:
> ...
>> Almost surely, Alexander wants to ask why Zope does not allow
>> non-ASCII characters in ids.
>>
>> And, in fact, there are only two reasons:
>>
>> * laziness of the Zope developers:
>>
>> without the restriction to ASCII characters
>> careful quoting (and unquoting) is necessary
>> in order to adhere to RFC 2396 (the modern uri syntax specification)
>
>This is becoming increasingly painful
I will soon have a patch against Zope 2.11b1
which gets rid of this restriction.
If there is consensus, I can add it to the Zope repository.
> ...
>> * there is no way to specify the encoding used for non ASCII characters.
>>
>> HTML 4 suggests converting non-ASCII characters first to
>> UTF-8 and then url escaping the result,
>> but most HTTP clients do not follow this suggestion.
>> Instead, they use the charset found on the page
>> that caused them to construct the uri.
>>
>> I have observed that MS WebDAV for some WebDAV commands
>> transfers the url as given and for some other
>> commands recodes it into utf-8.
>>
>> Thus, supporting non-ASCII ids may occasionally cause
>> surprises.
>
>You mean non ASCII URI's, not non ASCII ids here I suspect. Somehow I'm
>not surprised those are painful :(
No, I mean non-ASCII ids.
They lead to uris with some escaped characters. For some commands,
MS WebDAV unescapes the uris, interprets them in some default charset
("windows-1252" in our case), recodes them in utf-8,
escapes them again and then uses them in the commands.
Examples are the COPY and MOVE commands. If an object has
a non-ASCII character in its id, say "tüv", its url
may look like "http:.../t%FCv". Used in a "COPY" or "MOVE",
it is however represented as "http:.../t%C3%BCv".
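
The recoding can be reproduced with a few lines of Python. This is only
a sketch using the standard library's urllib.parse (Python 3), not code
from Zope or from the patch; the helper names "recode_escaped" and
"decode_id" are made up for illustration, and the fallback order in
"decode_id" is an assumption about what a tolerant server might do.

```python
from urllib.parse import quote, unquote, unquote_to_bytes


def recode_escaped(segment, assumed_charset="windows-1252"):
    """Mimic the client behaviour described above (hypothetical helper):
    unescape, interpret in the assumed charset, recode as UTF-8, escape again.
    """
    text = unquote(segment, encoding=assumed_charset)
    return quote(text, encoding="utf-8", safe="")


def decode_id(segment, fallback="windows-1252"):
    """Tolerant server-side decoding (an assumption, not Zope's behaviour):
    try UTF-8 first, then fall back to a single-byte charset.
    """
    raw = unquote_to_bytes(segment)
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode(fallback)


object_id = "tüv"

# Escaped with the page charset, as most clients do.
latin_form = quote(object_id, encoding="windows-1252")  # 't%FCv'

# Escaped the way HTML 4 suggests: UTF-8 first, then url escape.
utf8_form = quote(object_id, encoding="utf-8")  # 't%C3%BCv'

# What a recoding client sends when given the first form.
recoded = recode_escaped(latin_form)  # 't%C3%BCv'

print(latin_form, utf8_form, recoded)
```

Both escaped forms name the same id, so a server that compares the raw
path segment, or decodes it with a single fixed charset, will fail to
find the object for one of them. "decode_id" maps both forms back to
"tüv", but such a heuristic can guess wrong for byte sequences that are
valid in both encodings.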
--
Dieter