[Zope-dev] redirect burps on unicode URLs

Martin Aspeli optilude+lists at gmail.com
Mon Mar 1 08:28:51 EST 2010


Wichert Akkerman wrote:
> On 3/1/10 13:41 , Tres Seaver wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> Marius Gedminas wrote:
>>> On Sun, Feb 28, 2010 at 05:05:51PM +0100, Wichert Akkerman wrote:
>>>> On 2010-2-26 18:25, Tres Seaver wrote:
>>>>> Wichert Akkerman wrote:
>>>>>> I see this as naming confusion. In this day and age every URL is
>>>>>> effectively an IRI, and every modern browser treats them that way. If
>>>>>> you look at http://jp.wikipedia.org/ you can see how well that works. I
>>>>>> do not see why zope.publisher should not be able to support that
>>>>>> transparently. Other systems such as Routes and repoze.bfg do.
>>>>> Browsers *display* what looks like unicode to the user, but they *pass*
>>>>> URL-encoded ASCII bytes to the server.
>>>> But why can't zope.publisher do that conversion? I don't see the point
>>>> in requiring all the thousands of routines that call those functions to
>>>> do that conversion when zope.publisher can easily do so itself.
>>> +1
>>>
>>> Just like zope.publisher converts Unicode strings returned by views into
>>> UTF-8 (or whatever encoding negotiated via Accept-Charset),
>>> response.redirect() ought to Do The Right Thing with Unicode URLs or
>>> IRLs or whatever they're called.
>> - -1.
>
> --1 is the same as +1, but I suspect that is not what you meant.
>
>
>> Where is this "unicode URL" coming from?  URLs generated from code
>> should already be "correct".
>
> The only change is changing the point where 'correct' changes from
> unicode to an escaped UTF-8 encoded string. That change can be made without
> breaking any backwards compatibility.

I'm with Wichert here.

In most places, we tend to carry around unicode strings internally, and 
only encode on the boundaries, e.g. when the URL is "rendered". I don't 
see why redirect() can't have a sensible and predictable policy for 
unicode strings, making life easier for everyone.

If we think that non-ASCII URLs are illegal, then maybe we should 
validate for that and throw an error. However, I don't think that's the 
case (anymore?). If they are legal, passing a unicode object to the
function seems entirely consistent with other places, e.g. when we pass unicode
to the page template engine or return unicode from a view, which the 
publisher then encodes before it's pushed down to the client.
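
To make "sensible and predictable policy" a bit more concrete, here is a
rough sketch of what encoding at the boundary might look like. It is not
zope.publisher code: encode_redirect_url and _SAFE are hypothetical names,
it is written against Python 3's urllib.parse, and it assumes UTF-8 as the
encoding. It also does nothing special for non-ASCII host names (which
would need IDNA rather than percent-escapes), so treat it as an
illustration only.

```python
from urllib.parse import quote

# Characters that already carry meaning in a URL and are left untouched.
# "%" is included so that existing percent-escapes are not encoded twice.
_SAFE = ";/?:@&=+$,#%[]!*'()~-._"


def encode_redirect_url(url, encoding="utf-8"):
    """Hypothetical helper: return an ASCII-only str for a Location header.

    Text input is encoded with ``encoding`` and any byte outside the safe
    set is percent-escaped. Bytes input is assumed to be encoded already
    and must be pure ASCII; anything else raises UnicodeDecodeError.
    """
    if isinstance(url, bytes):
        # Already-encoded input: validate instead of guessing an encoding.
        return url.decode("ascii")
    return quote(url, safe=_SAFE, encoding=encoding)
```

With something like that, a caller could hand redirect() either form:
text such as "http://example.com/café" would be escaped once, a URL that
is already escaped would pass through unchanged, and non-ASCII bytes would
fail loudly rather than being guessed at.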

Martin

-- 
Author of `Professional Plone Development`, a book for developers who
want to work with Plone. See http://martinaspeli.net/plone-book


