All,

Pulling the text out via COM is *SLOW*, to say the least.

You'd probably be better off converting the documents to RTF and then using something like OmniMark to convert them to XML.

That way you'd have the best of both worlds, including something you can render to HTML when zDOM/zXSLT become a reality.

I already do this and the conversion is fast enough: a 500k doc takes about 2 seconds.

hth

Phil
phil.harris@zope.co.uk

----- Original Message -----
From: Dieter Maurer <dieter@handshake.de>
To: Simon Coles <simon@nipltd.com>
Cc: <zope@zope.org>
Sent: Friday, August 04, 2000 9:06 PM
Subject: Re: [Zope] ZCatalog attachments?

Simon Coles writes:
We have binary files stored in Zope, for example Word documents (but could be any of a variety of document types).
We would like to be able to index and search the contents of these files using ZCatalog. So if a Word file contains the word "Fred", then any search for "Fred" would include that file in the list of documents returned.

Someone else already told you that you must create a parameterless method (it need not necessarily be named "PrincipiaSearchSource") that returns the file's content.
You may not need to keep the rendered version around; you may be able to extract the plain text on demand. I think there is a "word.dll" that provides access to MS Word from applications. Alternatively, you could control Word via COM.
Dieter
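
For illustration, here is a minimal sketch of the parameterless-method idea, with the text extracted on demand rather than stored. It is plain Python, not Zope code: `IndexableFile` and `extract_text` are hypothetical names, and the extractor is only a crude stand-in (it keeps runs of printable ASCII) for whatever real converter you end up using, such as Word via COM or an RTF-based tool.

```python
import re


def extract_text(data: bytes) -> str:
    """Hypothetical placeholder for a real converter.

    Keeps runs of three or more printable ASCII characters and joins
    them with spaces. A real Word file needs a proper extractor.
    """
    runs = re.findall(rb"[\x20-\x7e]{3,}", data)
    return " ".join(run.decode("ascii") for run in runs)


class IndexableFile:
    """Illustrative wrapper around binary file data (not a Zope class)."""

    def __init__(self, data: bytes):
        self.data = data

    def PrincipiaSearchSource(self) -> str:
        # Takes no arguments besides self, so an indexer can call it
        # directly. The text is produced on demand, not kept around.
        return extract_text(self.data)


doc = IndexableFile(b"\x00\x01Hello Fred\x00\xff\x02see you soon\x00")
print(doc.PrincipiaSearchSource())
```

The method name only matters in that the index has to be told which attribute to call; any other name would do as long as the method needs no arguments.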