Consider using the CMF, perhaps Plone. http://plone.org

Zope could be a good document management system in this case, with the assistance of automation-capable OCR (such as OmniPage automated via its COM interface). You would have to be able to extract text metadata from a document, so you need a content type in Zope or the CMF that can take an image and/or PDF file containing raster data, store it, and pass it to an external filter to acquire the text metadata. You would just need to write a Zope Python product that, for example with the CMF, defines a method called SearchableText() for your 'document'/image and serves up (potentially cached) text extracted by an OCR system.

You would, of course, also have to write an application that feeds Zope the scanned images in order, or at least uploads them and sets metadata. There are many possible ways to do this (ZEO, XML-RPC, etc).

Combining the pages could be done with PDF libraries such as PageCatcher, a commercial, Python-based PDF aggregation library/utility. http://www.reportlab.com/pageCatcher/index.html There are likely open-source solutions to this as well (perhaps in the xpdf package?).

So, yes, Zope could do this job, but you would have to write the necessary code (feeding scans into Zope, the CMF content type) and pick the supporting software components (OCR, PDF aggregation).

Sean

-----Original Message-----
From: Stephen Liu [mailto:satimis@icare.com.hk]
Sent: Monday, March 10, 2003 8:22 AM
To: zope@zope.org
Subject: [Zope] Whether Zope can do the job

Hi all folks,

I joined this list recently and have no idea whether this is the right place to post the following question:

I am searching for an open-source alternative to PaperPort, Pagis Pro, OmniPage or PageKeeper, applications that run on Windows.

They provide a platform to keep all scanned images, which can then be stacked page after page, with stacked pages removed or re-inserted if required, and finally printed as a PDF file or another format.

One more important job is data searching on the scanned images. In PaperPort you can create a database of all scanned images and search it.

Kindly advise whether "zope" can do the same job. Thanks in advance.

B.Regards
Stephen Liu

_______________________________________________
Zope maillist - Zope@zope.org
http://mail.zope.org/mailman/listinfo/zope
** No cross posts or HTML encoding! **
(Related lists -
http://mail.zope.org/mailman/listinfo/zope-announce
http://mail.zope.org/mailman/listinfo/zope-dev )
Hi Sean,

Thanks for your advice. On Linux I can do all of these jobs separately with the help of several applications, but not in one go. I am looking for an out-of-the-box solution.

sean.upton@uniontrib.com wrote:
Consider using the CMF, perhaps Plone. http://plone.org
I have been looking for the plone.org mailing list on its website but could not find it.
Zope could be a good document management system, in this case, with the assistance of automation-capable OCR (like OmniPage automated via COM interface). You would have to be able to extract text metadata out of a document, so obviously you need a content type in Zope or the CMF that is capable of taking an image and/or PDF file containing raster data, storing it, and being able to pass it to an external filter to acquire the text metadata. You would just need to write a Zope Python product that, for example, with the CMF, defined a method called SearchableText() for your 'document'/image that served up (potentially cached) text extracted from an OCR system.
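The SearchableText() idea above can be sketched in plain Python. This is only an illustration under stated assumptions, not a real Zope product: the class name ScannedPage, the set_image() method and the ocr_filter callable are all hypothetical, and a real CMF content type would also need the usual product registration and persistence base classes. The only name taken from the message is SearchableText(), the method the CMF catalog is expected to call to obtain indexable text.

```python
class ScannedPage:
    """Hypothetical stand-in for a CMF content type holding one scanned image."""

    def __init__(self, image_bytes, ocr_filter):
        self.image_bytes = image_bytes
        # ocr_filter is an assumption: any callable taking raw image bytes
        # and returning the recognised text (e.g. a wrapper around an
        # external OCR program).
        self._ocr_filter = ocr_filter
        self._cached_text = None

    def set_image(self, image_bytes):
        """Replace the stored scan and drop the now-stale OCR text."""
        self.image_bytes = image_bytes
        self._cached_text = None

    def SearchableText(self):
        """Return the OCR text, running the external filter at most once per image."""
        if self._cached_text is None:
            self._cached_text = self._ocr_filter(self.image_bytes)
        return self._cached_text
```

Caching matters here because OCR is slow compared with a catalog reindex; the sketch simply invalidates the cache whenever a new image is stored.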
You would have to, of course, write an application that fed Zope the scanned images in order, or at least uploaded them and set metadata. There are many possible ways to do this (ZEO, XML-RPC, etc).
ZEO, XML-RPC, etc. are new to me. I will search for them on Google later.
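For readers who are also new to XML-RPC, here is a rough sketch of what "feeding Zope the scanned images in order" could look like from the client side. It uses the standard library module xmlrpc.client (called xmlrpclib in the Python 2 releases current in 2003). The function name upload_pages, the server-side method add_scanned_page and the URL in the comment are hypothetical; a real Zope site would have to expose such a method itself.

```python
import xmlrpc.client


def upload_pages(server, pages, title):
    """Send scanned pages to the server in order and return the ids it reports.

    `server` is expected to behave like xmlrpc.client.ServerProxy.
    `add_scanned_page` is a hypothetical server-side method, assumed to
    accept (title, page_number, binary_data) and return an id.
    """
    ids = []
    for number, data in enumerate(pages, start=1):
        # Binary wraps raw bytes so they travel as base64 in the XML payload.
        payload = xmlrpc.client.Binary(data)
        ids.append(server.add_scanned_page(title, number, payload))
    return ids


# Possible usage against a live site (URL is an assumption):
#   server = xmlrpc.client.ServerProxy("http://localhost:8080/scans")
#   with open("page1.tif", "rb") as f:
#       upload_pages(server, [f.read()], "invoice-2003-03")
```

Passing the proxy in as an argument keeps the upload logic independent of the transport, so the same loop could be tried against a stub object before any Zope code exists.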
Combining the pages could be done with commercial PDF libraries, including PageCatcher, which is a commercial, Python-based PDF aggregation library/utility. http://www.reportlab.com/pageCatcher/index.html There are likely open-source solutions to this as well (in the xpdf package?), perhaps.
I have already solved the page-combination problem. There are many solutions; even a word-processing application can do the job.
So, yes, Zope could do this job, but you would have to write the code (feed scans into Zope, CMF content type) needed and pick decorating software components (OCR, PDF aggregation) necessary to do it.
Is there any hint on where to start?

B.Regards
Stephen