What is modification, and why do we care? (was Re: [Zope3-dev]
Missing ObjectContentModifiedEvent)
Jim Fulton
jim at zope.com
Fri May 27 10:45:20 EDT 2005
Dieter Maurer wrote:
> Jim Fulton wrote at 2005-5-27 08:29 -0400:
>
>>...
>>
>>>Then, we probably do something wrong...
>>
>>That's always a possibility. I think what we are doing is
>>pretty reasonable. Perhaps you have other suggestions.
>
>
> I think we need more control over what modifications trigger
> what reindexing events.
>
> I am not yet sure about the best (or even a good) approach.
>
>
>>>>...
>>>
>>>Even computing the value for a text index (without any change
>>>to the index itself) can be very expensive: it may
>>>include expensive fetching of a large object,
>>>an expensive conversion (text extraction), expensive splitting
>>>and comparison to what is currently indexed.
>>
>>Perhaps. It depends a lot on the application.
>>
>>I suggest that, if this optimization is important, it might
>>be much easier and cleaner to make text extraction and comparison
>>cheap, rather than trying to solve the problem with a more complex
>>event model.
>
>
> You cannot make text extraction cheap (as it handles potentially large
> data).
You can't make it cheap in all applications. For most applications,
text extraction and comparison are very cheap.
I'm guessing that you are referring to indexing large (book-size)
documents. I would argue that this is pretty specialized.
> You could make comparison cheap -- e.g. by storing last modification
> dates and comparing them.
> But, I fear, you would just move the problem to when changing the
> modification date.
I think this is a nice solution for those special cases where text
extraction is expensive. The nice thing about this solution is that
it involves a contract between the content and the index without
complicating the event framework.
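
A minimal sketch of such a contract, in Python. All names here
(Document, TextIndex, text_modified, extract_text, index_doc) are
hypothetical and are not part of any Zope API. A simple counter stands
in for the modification date: the content bumps it only when its
searchable text changes, and the index skips extraction when the stamp
it recorded is still current.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical content object (not a Zope class)."""
    doc_id: int
    body: str
    text_modified: int = 0  # assumption: bumped only when the text changes
    extractions: int = 0    # counts how often the expensive step ran

    def set_body(self, body):
        self.body = body
        self.text_modified += 1

    def extract_text(self):
        # Stand-in for an expensive conversion of a large object.
        self.extractions += 1
        return self.body.lower().split()


@dataclass
class TextIndex:
    """Hypothetical index that remembers the stamp it last indexed."""
    words: dict = field(default_factory=dict)   # doc_id -> set of words
    stamps: dict = field(default_factory=dict)  # doc_id -> last seen stamp

    def index_doc(self, doc):
        """Reindex doc; return True if extraction ran, False if skipped."""
        if self.stamps.get(doc.doc_id) == doc.text_modified:
            return False  # text unchanged since last indexing
        self.words[doc.doc_id] = set(doc.extract_text())
        self.stamps[doc.doc_id] = doc.text_modified
        return True
```

With this arrangement, any modification event can still call
index_doc; the stamp comparison keeps the common "nothing relevant
changed" case cheap without the event framework needing to know which
changes affect the text.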
>
>>...
>>I think it would be very difficult to come up with rules
>>for deciding which events might affect a text value and which would not.
>>For example, I can easily imagine objects whose searchable text
>>depends on their workflow state.
>
>
> Indeed, such objects are easily imaginable.
> But usually, it is not the case.
And it is usually not the case that text extraction is expensive.
> The problem is obviously difficult -- not solvable with
> a trivial event model and trivial reindexing dispatching.
Agreed.
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org