What is modification, and why do we care? (was Re: [Zope3-dev] Missing ObjectContentModifiedEvent)
Dieter Maurer
dieter at handshake.de
Sat May 28 04:06:44 EDT 2005
Jim Fulton wrote at 2005-5-27 10:45 -0400:
> ...
>> You cannot make text extraction cheap (as it handles potentially large
>> data).
>
>You can't make it cheap in all applications. For most applications,
>text extraction and comparison is very cheap.
>
>I'm guessing that you are referring to indexing large (book size)
>documents. I would argue that this is pretty specialized.
No, I am speaking about a repository of office documents (letters,
reports, drafts, documentation, ...), which apparently is not too
rare, at least in a Plone-like context.
>And it is usually not the case that text extraction is expensive.
Last year I analysed text extraction from office documents.
For documents of about 1 MB, wvWare extraction took on the order
of seconds; OpenOffice text extraction took on the order of
10 seconds (after optimization; the standard setup took twice
as long).
I definitely do not want to pay this time for every change to a
metadatum or every workflow change. While a user accepts a delay
of some seconds when he uploads a large document, he finds it
unacceptable to wait for seconds when he performs, e.g., a
workflow action on such a document.
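
To make the point concrete, here is a minimal sketch in Python. It
assumes the modification event itself says whether the document body
changed, so that the catalog can skip the expensive extraction for
pure metadata or workflow changes. All names in it (Document,
Modified, Catalog, fake_extract) are hypothetical and are not
Zope 3 APIs.

```python
# Hypothetical sketch; none of these names are Zope 3 APIs.

class Document:
    def __init__(self, doc_id, data, state="draft"):
        self.doc_id = doc_id
        self.data = data      # raw bytes of the office document
        self.state = state    # workflow state (metadata)


class Modified:
    """Event that says whether the document body changed."""
    def __init__(self, obj, content_changed):
        self.obj = obj
        self.content_changed = content_changed


class Catalog:
    def __init__(self, extract_text):
        self._extract_text = extract_text  # the expensive converter
        self.text = {}
        self.state = {}
        self.extractions = 0               # how often we paid the cost

    def handle(self, event):
        doc = event.obj
        # Metadata is cheap to reindex, so always refresh it.
        self.state[doc.doc_id] = doc.state
        # Run the expensive extraction only when the body changed
        # or the document has never been indexed.
        if event.content_changed or doc.doc_id not in self.text:
            self.text[doc.doc_id] = self._extract_text(doc.data)
            self.extractions += 1


def fake_extract(data):
    # Stand-in for a slow external converter.
    return data.decode("utf-8", errors="replace")


catalog = Catalog(fake_extract)
doc = Document("letter-1", b"Dear Sir")
catalog.handle(Modified(doc, content_changed=True))   # upload: extract
doc.state = "published"
catalog.handle(Modified(doc, content_changed=False))  # workflow: skip
print(catalog.extractions, catalog.state["letter-1"])
```

With such a distinction, the workflow action updates only the cheap
metadata index, and the user waits for extraction only on upload.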
--
Dieter