[ZCM] [ZC] 227/ 6 Comment "TextIndex: Can't index unicode strings"
Collector: Zope Bugs and Patches ...
zope-coders@zope.org
Sun, 17 Feb 2002 20:55:26 -0500
Issue #227 Update (Comment) "TextIndex: Can't index unicode strings"
Status Pending, Zope/bug medium
To followup, visit:
http://collector.zope.org/Zope/227
==============================================================
= Comment - Entry #6 by snej on Feb 17, 2002 8:55 pm
I created a folder containing
a catalog with
a Vocabulary with
a unicode splitter,
and a Script named PrincipiaSearchSource that returns a Unicode string
(u'\xfe \x2031 Huhu').
Indexing everything in that folder returns:
Error Type: UnicodeError
Error Value: ASCII encoding error: ordinal not in range(128)
Traceback (innermost last):
File /home/jens/work/tests/edbzope/Zope-2.5.0-src/lib/python/ZPublisher/Publish.py, line 150, in publish_module
File /home/jens/work/tests/edbzope/Zope-2.5.0-src/lib/python/ZPublisher/Publish.py, line 114, in publish
File /home/jens/work/tests/edbzope/Zope-2.5.0-src/lib/python/Zope/__init__.py, line 158, in zpublisher_exception_hook
(Object: ztest1)
File /home/jens/work/tests/edbzope/Zope-2.5.0-src/lib/python/ZPublisher/Publish.py, line 98, in publish
File /home/jens/work/tests/edbzope/Zope-2.5.0-src/lib/python/ZPublisher/mapply.py, line 88, in mapply
(Object: manage_catalogFoundItems)
File /home/jens/work/tests/edbzope/Zope-2.5.0-src/lib/python/ZPublisher/Publish.py, line 39, in call_object
(Object: manage_catalogFoundItems)
File /home/jens/work/tests/edbzope/Zope-2.5.0-src/lib/python/Products/ZCatalog/ZCatalog.py, line 330, in manage_catalogFoundItems
(Object: ztest1)
File /home/jens/work/tests/edbzope/Zope-2.5.0-src/lib/python/Products/ZCatalog/ZCatalog.py, line 697, in ZopeFindAndApply
(Object: ztest1)
File /home/jens/work/tests/edbzope/Zope-2.5.0-src/lib/python/Products/ZCatalog/ZCatalog.py, line 480, in catalog_object
(Object: ztest1)
File /home/jens/work/tests/edbzope/Zope-2.5.0-src/lib/python/Products/ZCatalog/Catalog.py, line 367, in catalogObject
File /home/jens/work/tests/edbzope/Zope-2.5.0-src/lib/python/Products/PluginIndexes/TextIndex/TextIndex.py, line 285, in index_object
(Object: PrincipiaSearchSource)
UnicodeError: (see above)
I know that it is possible to work around this using _encoding
and UTF-8, but I would prefer to pass my unicode strings without
encoding and decoding them.
________________________________________
= Comment - Entry #5 by ajung on Feb 17, 2002 6:57 pm
Please provide the traceback.
________________________________________
= Comment - Entry #4 by snej on Feb 17, 2002 5:01 pm
Shouldn't the UnicodeSplitter be able to index unicode strings, though? The workaround you describe works, as noted in the original request below.
________________________________________
= Comment - Entry #3 by ajung on Feb 17, 2002 3:14 pm
Are you using the UnicodeSplitter? If your text uses an encoding
other than ASCII, either change the default encoding in site.py or
set <index>_encoding to the encoding of the document.
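
(Editorial illustration, not part of the original comment.) The <index>_encoding convention mentioned above can be pictured roughly as follows. This is a minimal sketch, not the TextIndex source: the helper names get_source_and_encoding and split_words, the Doc class, and the "ascii" fallback are all assumptions made for the example. It is written for current Python 3, where an encoded string is a bytes object.

```python
def get_source_and_encoding(obj, index_id, default_encoding="ascii"):
    """Fetch the text to index and the encoding declared next to it.

    Hypothetical helper: looks up `<index_id>` and `<index_id>_encoding`
    on the object. The "ascii" fallback is an assumption for this sketch.
    """
    source = getattr(obj, index_id)
    if callable(source):
        source = source()
    encoding = getattr(obj, index_id + "_encoding", default_encoding)
    if callable(encoding):
        encoding = encoding()
    return source, encoding


def split_words(source, encoding):
    """Decode an encoded byte string with the declared encoding, then split."""
    if isinstance(source, bytes):
        source = source.decode(encoding)
    return source.split()


class Doc:
    """Hypothetical document that declares its own encoding."""
    PrincipiaSearchSource_encoding = "utf-8"

    def PrincipiaSearchSource(self):
        return "\xfe Huhu".encode("utf-8")


class UndeclaredDoc:
    """Same content, but without an `_encoding` attribute."""

    def PrincipiaSearchSource(self):
        return "\xfe Huhu".encode("utf-8")
```

With Doc, the declared UTF-8 encoding lets the text be decoded and split into words. With UndeclaredDoc, the sketch falls back to ASCII and decoding fails with a UnicodeError, which is the situation the two suggestions above are meant to avoid.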
________________________________________
= Comment - Entry #2 by snej on Feb 17, 2002 2:59 pm
Uploaded: "patsch"
- http://collector.zope.org/Zope/227/patsch/view
A test for tests/testTextIndex.py
________________________________________
= Request - Entry #1 by snej on Feb 17, 2002 2:40 pm
index_object() in TextIndex.py raises
UnicodeError: ASCII encoding error: ordinal not in range(128)
for unicode strings that actually contain non-ASCII characters,
because it applies str() to all input.
Workaround: use xxx_encoding to pass the unicode text
into the TextIndex as an encoded string.
==============================================================
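
(Editorial illustration, not part of the original thread.) The failure described in the request comes from the implicit ASCII encode that str() performed on unicode objects in the Python 2 releases of that era. The sketch below simulates that step explicitly so it runs on current Python 3, where str() no longer behaves this way. The function names are hypothetical stand-ins and do not reproduce the TextIndex code.

```python
def py2_style_str(value):
    """Simulate Python 2's str(unicode_obj): an implicit ASCII encode."""
    return value.encode("ascii")


def index_naive(value):
    """Hypothetical index step that coerces all input with str() first."""
    return py2_style_str(value).split()


def index_with_encoding(value, encoding="utf-8"):
    """Workaround from the report: pass an encoded string plus its encoding."""
    return value.encode(encoding).split(), encoding


def fails_with_unicode_error(value):
    """Return True if the naive path raises UnicodeError for this value."""
    try:
        index_naive(value)
    except UnicodeError:
        return True
    return False


# The string from the report; \xfe is outside the ASCII range.
sample = "\xfe \x2031 Huhu"
```

Pure ASCII input passes through the naive path unchanged, while the sample string fails there because \xfe cannot be encoded as ASCII. Encoding explicitly to UTF-8 avoids the error, at the cost of the extra encode and decode round trip that the reporter would prefer to avoid.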