Unicode ids and UTF-8 URL encoding
Hi. I think the last thread on this issue was: http://aspn.activestate.com/ASPN/Mail/Message/zope-Dev/1843637

From my reading of RFC2396 and RFC2277, the character set encoding for URLs is UTF-8. This is confirmed by http://www.w3.org/International/O-URL-code.html .

Does this settle the issue on how to handle non-ASCII strings in URLs? If so, is there anything stopping us allowing Unicode ids in Zope? This would involve patching OFS.ObjectManager.checkValidId to accept strings not in [0-9a-zA-Z\$\-_\.\+!\*'(),], and writing replacement urllib.quote and urllib.unquote methods for use by HTTPRequest.py.

If this is deemed a sane plan, I'd like to try getting this into Zope 2.7.

It would also be worth fixing Python's urllib.quote and urllib.unquote methods, but that is an issue for a separate mailing list...

--
Stuart Bishop <stuart@stuartbishop.net> ☞ http://www.stuartbishop.net/
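For readers following the proposal, the two pieces it names (a relaxed id check and UTF-8 aware quoting) might look roughly like the sketch below. It is written for current Python 3, where urllib.parse.quote and urllib.parse.unquote take an encoding argument, rather than the Python 2 urllib discussed in the thread. The names is_valid_id, quote_id, unquote_id and the FORBIDDEN set are hypothetical and chosen for illustration only; they are not Zope's checkValidId rules.

```python
import unicodedata
from urllib.parse import quote, unquote

# Hypothetical rule set for illustration; NOT Zope's actual checkValidId logic.
FORBIDDEN = set('/?#%<>&"')


def is_valid_id(candidate):
    """Accept non-ASCII ids; reject empty ids, URL delimiters and control characters."""
    if not candidate or candidate in (".", ".."):
        return False
    for ch in candidate:
        # Unicode general categories starting with "C" cover control,
        # format, surrogate, private-use and unassigned code points.
        if ch in FORBIDDEN or unicodedata.category(ch).startswith("C"):
            return False
    return True


def quote_id(candidate):
    """Percent-encode the UTF-8 bytes of an id; safe="" also escapes "/"."""
    return quote(candidate, safe="", encoding="utf-8")


def unquote_id(segment):
    """Decode a percent-encoded path segment, assuming it carries UTF-8 bytes."""
    # errors="strict" raises UnicodeDecodeError if the bytes are not valid UTF-8,
    # which is what happens when a client sends another encoding such as Latin-1.
    return unquote(segment, encoding="utf-8", errors="strict")
```

With these definitions, quote_id("\u2122") gives "%E2%84%A2", and unquote_id reverses it. The strict decoding is a design choice for the sketch: it surfaces clients that do not send UTF-8 instead of silently producing replacement characters.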
--On Tuesday, 11 November 2003 14:13 +1100 Stuart Bishop <stuart.b@commonground.com.au> wrote:
Does this settle the issue on how to handle non-ASCII strings in URLs? If so, is there anything stopping us allowing Unicode ids in Zope? This would involve patching OFS.ObjectManager.checkValidId to accept strings not in [0-9a-zA-Z\$\-_\.\+!\*'(),], and writing replacement urllib.quote and urllib.unquote methods for use by HTTPRequest.py.
If this is deemed a sane plan, I'd like to try getting this into Zope 2.7.
Too late for 2.7 in my opinion, since we are heading for a beta (which means no new features). -aj
On Mon, 2003-11-10 at 22:13, Stuart Bishop wrote:
Hi. I think the last thread on this issue was: http://aspn.activestate.com/ASPN/Mail/Message/zope-Dev/1843637
From my reading of RFC2396 and RFC2277, the character set encoding for URLs is UTF8. This is confirmed by http://www.w3.org/International/O-URL-code.html .
Does this settle the issue on how to handle non-ASCII strings in URLs? If so, is there anything stopping us allowing Unicode ids in Zope? This would involve patching OFS.ObjectManager.checkValidId to accept strings not in [0-9a-zA-Z\$\-_\.\+!\*'(),], and writing replacement urllib.quote and urllib.unquote methods for use by HTTPRequest.py.
If this is deemed a sane plan, I'd like to try getting this into Zope 2.7.
-1 for 2.7, as each introduction of new Unicode features has historically been accompanied by a whole raft of new bugs, which are mostly not reproducible (or at least hide) on the system where the feature was developed. We are trying hard to get to stability for 2.7, with a beta planned for this week. +1 for 2.7.1 and 2.8.
It would also be worth fixing Python's urllib.quote and urllib.unquote methods, but that is an issue for a separate mailing list...
ROIGHT!

Tres.
--
Tres Seaver tseaver@zope.com
Zope Corporation "Zope Dealers" http://www.zope.com
On Tuesday 11 November 2003 14:23, Tres Seaver wrote:
On Mon, 2003-11-10 at 22:13, Stuart Bishop wrote:
Hi. I think the last thread on this issue was: http://aspn.activestate.com/ASPN/Mail/Message/zope-Dev/1843637
From my reading of RFC2396 and RFC2277, the character set encoding for URLs is UTF8. This is confirmed by http://www.w3.org/International/O-URL-code.html .
Does this settle the issue on how to handle non-ascii strings in URLs?
If only browsers always worked that way :-(
If so, is there anything stopping us allowing Unicode ids in Zope? This would involve patching OFS.ObjectManager.checkValidId to accept strings not in [0-9a-zA-Z\$\-_\.\+!\*'(),], and writing replacement urllib.quote and urllib.unquote methods for use by HTTPRequest.py.
+1 on starting a prototype now, but I can't see this landing before 2.8. This is going to break every use equivalent to getattr(some_object, id). It's not obvious to me how this can be cleanly resolved in Zope 2.

--
Toby Dickenson
--On Tuesday, 11 November 2003 14:41 +0000 Toby Dickenson <tdickenson@geminidataloggers.com> wrote:
If only browsers always worked that way :-(
If so, is there anything stopping us allowing Unicode ids in Zope? This would involve patching OFS.ObjectManager.checkValidId to accept strings not in [0-9a-zA-Z\$\-_\.\+!\*'(),], and writing replacement urllib.quote and urllib.unquote methods for use by HTTPRequest.py.
+1 on starting a prototype now, but I can't see this landing before 2.8.
This is going to break every use equivalent to getattr(some_object, id). It's not obvious to me how this can be cleanly resolved in Zope 2.
Best to come up with a proposal that discusses all the related issues. Btw, I have heard some requests from the community to support the new internationalized naming scheme for domain names (nothing I care about very much, but it sounds closely related). -aj
On 12/11/2003, at 1:41 AM, Toby Dickenson wrote:
This is going to break every use equivalent to getattr(some_object, id). It's not obvious to me how this can be cleanly resolved in Zope 2.
You're right - this is the killer :-( I had naively assumed setattr(ob, u'\N{TRADEMARK SIGN}', val) would work. It will be quite possible to make ObjectManager.__getitem__ handle Unicode ids, but hacking ExtensionClass to accept Unicode ids is beyond my ken, and I'm unsure whether it is even possible.

--
Stuart Bishop <stuart@stuartbishop.net> ☞ http://www.stuartbishop.net/
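The item-lookup route mentioned above can be sketched as follows. This is a minimal illustration in current Python 3, not Zope code: Folder is a hypothetical stand-in for an ObjectManager-like container, and traverse is a hypothetical helper that decodes each percent-encoded path segment as UTF-8 before looking it up. Children live in a plain dict keyed by their ids, so the lookup never goes through getattr.

```python
from urllib.parse import unquote


class Folder:
    """Hypothetical container; a stand-in, not Zope's OFS.ObjectManager."""

    def __init__(self):
        # Children are stored in a dict keyed by id, so any string
        # (including non-ASCII text) can serve as an id.
        self._children = {}

    def set_object(self, object_id, obj):
        self._children[object_id] = obj

    def __getitem__(self, object_id):
        # Item access instead of attribute access; raises KeyError if missing.
        return self._children[object_id]


def traverse(root, path):
    """Walk a URL path, decoding each segment as percent-encoded UTF-8."""
    obj = root
    for segment in path.strip("/").split("/"):
        if segment:
            obj = obj[unquote(segment, encoding="utf-8", errors="strict")]
    return obj
```

For example, after root.set_object("\u2122", sub), the call traverse(root, "/%E2%84%A2") returns sub. The sketch sidesteps the getattr(some_object, id) problem raised earlier in the thread rather than solving it: code that still reaches children through attribute access would need separate treatment.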
Participants: Andreas Jung, Stuart Bishop, Toby Dickenson, Tres Seaver