DTML-SQLVAR and Unicode and PostgreSQL
Hi All, I am trying to set up a simple retrieval of data in my PostgreSQL database. The input (in Arabic) has to be in unicode. I can specify an argument of 'arabic_name:ustring' and that is accepted okay. However, something happens between the submission of the form and the execution of the query in the PostgreSQL backend, and no data is returned. If I cut the query, paste it into a terminal window, and excute it in psql, I get the desired data back. Anyone have any ideas? I notice that 'ustring' is not a type you can specify in the dtml-sqlvar tag. (I am running Redhat Linux 9.0, Zope 2.61, and the ZPsycopgDA adaptor). TIA, Glenn -- Glenn R Williams Ghoti Software ---------------------------------------------------------------------- Mathematical theories are the part of science you could continue to do if you woke up tomorrow and discovered the universe was gone.
Glenn R Williams wrote at 2003-5-20 16:42 -0400:
I am trying to set up a simple retrieval of data in my PostgreSQL database. The input (in Arabic) has to be in unicode. I can specify an argument of 'arabic_name:ustring' and that is accepted okay. However, something happens between the submission of the form and the execution of the query in the PostgreSQL backend, and no data is returned.
The unicode string will probably be somehow encoded before it is sent to Postgres. This encoding is probably the wrong one. I would increase the PostgreSQL log level sufficiently that it logs the queries. Then, I would compare the failing (from Zope) and succeeding (from "psql") queries and determine the correct encoding. Then, I would explicitely call for this encoding. Dieter
Thanks Dieter, I got one step further: the PostgreSQL log shows my query with converted HTML entites: LOG: query: select * from xp_name where original_name = 'عامررشيد' As you can see, this is not what I want! But how do I tell Zope to use utf8 encoding for the query? When building the query in Zope, I specified as an argumen "original_name:ustring". However, the actual query in the Z SQL method only allows a type of "string". Am I doing something wrong? Thanks, Glenn On Wed, 2003-05-21 at 14:59, Dieter Maurer wrote:
Glenn R Williams wrote at 2003-5-20 16:42 -0400:
I am trying to set up a simple retrieval of data in my PostgreSQL database. The input (in Arabic) has to be in unicode. I can specify an argument of 'arabic_name:ustring' and that is accepted okay. However, something happens between the submission of the form and the execution of the query in the PostgreSQL backend, and no data is returned.
The unicode string will probably be somehow encoded before it is sent to Postgres. This encoding is probably the wrong one.
I would increase the PostgreSQL log level sufficiently that it logs the queries. Then, I would compare the failing (from Zope) and succeeding (from "psql") queries and determine the correct encoding.
Then, I would explicitely call for this encoding.
Dieter -- Glenn R Williams Ghoti Software
Mathematical theories are the part of science you could continue to do if you woke up tomorrow and discovered the universe was gone.
Glenn R Williams wrote at 2003-5-21 17:58 -0400:
I got one step further: the PostgreSQL log shows my query with converted HTML entites:
LOG: query: select * from xp_name where original_name = 'عامررشيد'
As you can see, this is not what I want! But how do I tell Zope to use utf8 encoding for the query? When building the query in Zope, I specified as an argumen "original_name:ustring". However, the actual query in the Z SQL method only allows a type of "string".
Am I doing something wrong?
Not you, but someone does... I fear it is your browser. We must find out who decides to code the arabic characters as XML/SGML character references. You told us that the value came from an HTML form, right? I expect, it was a form "POST"ed to the server. Apparently, the browser used a Content-Type ("text/sgml" or "text/xml") which Zope does not expect and interpret wrongly. I see several options: * Use 'METHOD="GET"' in your form. This forces a standard compliant browser to use URL-encoding (with characters first encoded as UTF-8 and then url-escaped). * HTML allows the server to control the content-types for form variables. Try to use the corresponding attribute to forbid "text/sgml", "text/xml". * Convert the character references yourself into UTF-8. This is easy: first convert to unicode (with "unichr") then encode the resulting unicode string as UTF-8 (with the unicode object's "encode" method). The last option is probably the most robust one. Dieter
The query I am using was created by a adding a Z Sql method, and I enter the input into the test field. Looking at the Zope log, it is using a GET method and passing the server a string like this (splitting the line for readability): "http://localhost:8080/gloonie/xpersec/sqlFindNames/ manage_test?original_name%3Austring=%D8%B9%D8%A7%D9%85%D8%B1+%D8%B1%D8%B4%D9%8A%D8%AF&SUBMIT=Submit+Query" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3) Gecko/20030420" This seems correct so far. So the problem may be in the Databas adaptor (ZPsycopgDA). I'll let you know if I find anything. Thanks again for you help, Glenn On Thu, 2003-05-22 at 14:08, Dieter Maurer wrote:
Glenn R Williams wrote at 2003-5-21 17:58 -0400:
I got one step further: the PostgreSQL log shows my query with converted HTML entites:
LOG: query: select * from xp_name where original_name = 'عامررشيد'
As you can see, this is not what I want! But how do I tell Zope to use utf8 encoding for the query? When building the query in Zope, I specified as an argumen "original_name:ustring". However, the actual query in the Z SQL method only allows a type of "string".
Am I doing something wrong?
Not you, but someone does...
I fear it is your browser.
We must find out who decides to code the arabic characters as XML/SGML character references.
You told us that the value came from an HTML form, right? I expect, it was a form "POST"ed to the server. Apparently, the browser used a Content-Type ("text/sgml" or "text/xml") which Zope does not expect and interpret wrongly.
I see several options:
* Use 'METHOD="GET"' in your form. This forces a standard compliant browser to use URL-encoding (with characters first encoded as UTF-8 and then url-escaped).
* HTML allows the server to control the content-types for form variables. Try to use the corresponding attribute to forbid "text/sgml", "text/xml".
* Convert the character references yourself into UTF-8. This is easy: first convert to unicode (with "unichr") then encode the resulting unicode string as UTF-8 (with the unicode object's "encode" method).
The last option is probably the most robust one.
Dieter
_______________________________________________ Zope maillist - Zope@zope.org http://mail.zope.org/mailman/listinfo/zope ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope-dev ) -- Glenn R Williams Ghoti Software
Mathematical theories are the part of science you could continue to do if you woke up tomorrow and discovered the universe was gone.
participants (2)
-
Dieter Maurer -
Glenn R Williams