[Zope] TextIndexNG3 question

Andreas Jung lists at zopyx.com
Mon Feb 18 03:10:10 EST 2008



--On 17 February 2008 22:51:47 -0800 Erik Myllymaki 
<erik.myllymaki at aviawest.com> wrote:

> I have an older Zope install and I want to enable searching of Page
> Templates and PDFs.
>
> Because it's an older Zope version (2.8.5) I have had to go back a few
> revisions of TextIndexNG3 (3.1.16) and Five (1.2.6)
>
> Install seems fine, including setup of the extension modules.
>
> I create a textIndexNG Index called PrincipiaSearchSource, Converters
> show that pdftotext is available and HTML to ASCII is 'always' available.
>
> I find all Page Templates and PDFs and catalog them. They do show up in
> the Catalog, but the Page Templates have all their HTML tags included in
> the catalog (I thought they would be stripped automagically)

Your expectations are wrong. If an object does not provide 
IIndexableContent, or if there is no adapter to it, then TXNG3 falls 
back to the "old" Zope 2 indexing behaviour and indexes the string 
representation of the content as it is, markup included.
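
To illustrate the kind of preprocessing such an adapter would have to do
for Page Templates, here is a minimal sketch of a tag stripper. It is not
TXNG3 code: the names _TextExtractor and strip_markup are made up for this
example, the adapter wiring for IIndexableContent is deliberately left out
because it depends on the TXNG3 release in use, and the code assumes a
current Python 3 standard library rather than the Python shipped with
Zope 2.8.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text nodes and ignore tags (hypothetical helper)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only text that is outside script/style blocks.
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self):
        # Join the chunks and collapse runs of whitespace.
        return " ".join(" ".join(self._chunks).split())


def strip_markup(source):
    """Return the plain text of an HTML or Page Template source string."""
    parser = _TextExtractor()
    parser.feed(source)
    parser.close()
    return parser.text()
```

The string returned by strip_markup() is what you would hand to the index
instead of the raw template source. TAL attributes such as tal:content are
dropped together with the tags, so only the literal text in the template
is indexed, not the rendered output.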

> and the PDFs
> have no words cataloged at all.

If you have the external converters installed, and if they are in the $PATH
and available to the Python interpreter process, then I have strong doubts 
about that. Triple-check that. If necessary, use the debugger to check
the calls to the external converters.
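
One way to do that check outside of TXNG3 is sketched below. It should be
run with the same user and environment as the Zope process. The function
names find_converter and pdf_to_text are made up for this example and are
not part of TXNG3; the sketch assumes a current Python 3 and a pdftotext
binary that writes to stdout when the output file is given as "-".

```python
import os
import shutil
import subprocess


def find_converter(name="pdftotext", path=None):
    """Return the full path of `name` as this process resolves it, or None.

    With path=None the PATH of the current process is searched.
    """
    return shutil.which(name, path=path)


def pdf_to_text(pdf_path, converter="pdftotext"):
    """Run the external converter on pdf_path and return the extracted text."""
    exe = find_converter(converter)
    if exe is None:
        raise RuntimeError(
            "%s not found, PATH is %r" % (converter, os.environ.get("PATH", ""))
        )
    # "-" as the output file asks pdftotext to write to stdout.
    result = subprocess.run(
        [exe, pdf_path, "-"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    if result.returncode != 0:
        raise RuntimeError(
            "%s failed: %s" % (converter, result.stderr.decode("utf-8", "replace"))
        )
    return result.stdout.decode("utf-8", "replace")
```

If find_converter() returns None here while pdftotext works in your login
shell, the Zope process is running with a different $PATH. If
pdf_to_text() returns an empty string for one of your PDFs, the file
itself may contain no extractable text (for example, scanned pages).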

-aj