[Zope-dev] catalog performance: query plan
Hedley Roos
hedleyroos at gmail.com
Mon Nov 10 11:38:10 EST 2008
>> object_implements |KeywordIndex |0.2172234| 4.6
>
> This is clearly not the same issue as the other KeywordIndexes: in
> fact, I am astonished that anybody would be using a KeywordIndex for
> this at all. I would suspect that the real problem here is in the
> appliation, rather than the index itself.
The app is Plone :)
>> portal_type |FieldIndex |0.0025984| 384.84
>
> This one is surprising: its performance should be pretty similar to the
> other FieldIndexes (e.g., 'review_state') which map a controlled
> vocabulary onto the entire corpus. Was the query different than
> 'review_state' (e.g., multi-valued vs. single-valued)?
portal_type queries are usually multivalued in Plone.
>> sourceUID |FieldIndex |0.0004886| 2046.31
>
> Probably bogus, but I don't know how it is used.
Plone's reference_catalog
>
>> UID |FieldIndex |0.0003070| 3257.1
>
> Note that this is the worst-case scenario for a FieldIndex: there is
> exactly one value for every key. This shouldn't be "indexed" at all, in
> fact, beyond a simple BTree (UID -> rid).
I've never even thought of that. Perhaps the catalog is used to
present a familiar API.
>
>> targetUID |FieldIndex |0.0002287| 4372.12
>
> I don't know what this one is used for, but it should probably be
> scrapped as well.
More reference_catalog.
>
>> exact_getUserId |FieldIndex |0.0001931| 5177.79
>> exact_getUserName |FieldIndex |0.0001816| 5504.39
>
> I don't know how the application uses either of those indexes, but they
> are almost certainly bogus in any normal catalog.
Membrane and remember. They're currently tied to Plone but efforts are
being made to make them work with CMF.
>> relationship |FieldIndex |0.0000822| 12153.1
>> id |FieldIndex |0.0000822| 12161.81
>> end |DateIndex |0.0000623| 16027.48
>> getGroups |FieldIndex |0.0000278| 35973.45
>
> This is almost certainly bogus: FieldIndex is not supposed to be used
> with multi-valued terms.
Plone stuff, but I am intrigued by your statement. Why can FieldIndex
not be used with multi-valued terms?
>> Subject |KeywordIndex |0.0000253| 39413.57
>
> This is the use-case for which KeywordIndex is designed. Was the query
> just a single term, by chance?
The simplest term is a list with only a single term (not counting the
trivial case). It should be worse with more terms right?
>> Title |ZCTextIndex |0.0000128| 77809.46
>
> This should be removed: there is no valid use case for doing a
> full-text search restricted only to the title.
Plone specific.
>> Description |ZCTextIndex |0.0000116| 86241.39
>
> Again, should be removed.
Again, Plone specific :)
>> getEmail |ZCTextIndex |0.0000113| 87849.05
>
> Should *definitely* be removed: how can you do full-text search on an
> e-mail address?
I think membrane is responsible for this, but you're right.
>> SearchableText |TextIndex |0.0000113| 88466.69
>
> Where did this one come from? The 'SearchableText' above is a ZCTextIndex.
Membrane!
Kinda pointless for me to continue since this is turning into a
Plone-specific discussion on zope-dev. But at least the whole exercise
has forced us to look in detail into how all these indexes affect
performance with a zodb with many many objects.
Roche investigated Tesdal's queryplan today end it seems to solve
nearly all our performance problems. He'll have to elaborate.
Hedley
More information about the Zope-Dev
mailing list