-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hedley Roos wrote:
object_implements |KeywordIndex |0.2172234| 4.6 This is clearly not the same issue as the other KeywordIndexes: in fact, I am astonished that anybody would be using a KeywordIndex for this at all. I would suspect that the real problem here is in the appliation, rather than the index itself.
The app is Plone :)
I don't know how Plone computes the values here (are they actual interface objects, or their monikers), nor how it is queried. I am suspicious that there is some query-time bogosity, however.
portal_type |FieldIndex |0.0025984| 384.84 This one is surprising: its performance should be pretty similar to the other FieldIndexes (e.g., 'review_state') which map a controlled vocabulary onto the entire corpus. Was the query different than 'review_state' (e.g., multi-valued vs. single-valued)?
portal_type queries are usually multivalued in Plone.
What are the use cases for that?
sourceUID |FieldIndex |0.0004886| 2046.31 Probably bogus, but I don't know how it is used.
Plone's reference_catalog
I'm betting that using a catalog at all is a flawed choice.
UID |FieldIndex |0.0003070| 3257.1 Note that this is the worst-case scenario for a FieldIndex: there is exactly one value for every key. This shouldn't be "indexed" at all, in fact, beyond a simple BTree (UID -> rid).
I've never even thought of that. Perhaps the catalog is used to present a familiar API.
targetUID |FieldIndex |0.0002287| 4372.12 I don't know what this one is used for, but it should probably be scrapped as well.
More reference_catalog.
exact_getUserId |FieldIndex |0.0001931| 5177.79 exact_getUserName |FieldIndex |0.0001816| 5504.39 I don't know how the application uses either of those indexes, but they are almost certainly bogus in any normal catalog.
Membrane and remember. They're currently tied to Plone but efforts are being made to make them work with CMF.
relationship |FieldIndex |0.0000822| 12153.1 id |FieldIndex |0.0000822| 12161.81 end |DateIndex |0.0000623| 16027.48 getGroups |FieldIndex |0.0000278| 35973.45 This is almost certainly bogus: FieldIndex is not supposed to be used with multi-valued terms.
Plone stuff, but I am intrigued by your statement. Why can FieldIndex not be used with multi-valued terms?
Because FieldIndex is designed for either exact-match queries or range queries: range queries are obviously impossible for a 'getGroups' method, and exact-match is nearly as dubious. I would have expected this to be a KeywordIndex, if it needed indexing at all.
Subject |KeywordIndex |0.0000253| 39413.57 This is the use-case for which KeywordIndex is designed. Was the query just a single term, by chance?
The simplest term is a list with only a single term (not counting the trivial case). It should be worse with more terms right?
I am reasonably confident that multi-valued queries against either FieldIndexes or KeywordIndexes will perform worse than single-valued queries against the same index.
Title |ZCTextIndex |0.0000128| 77809.46 This should be removed: there is no valid use case for doing a full-text search restricted only to the title.
Plone specific.
No, still bogus. This is an egregiously stupid choice, with *large* indexing-time downsides.
Description |ZCTextIndex |0.0000116| 86241.39 Again, should be removed.
Again, Plone specific :)
Again, egregiously bogus.
getEmail |ZCTextIndex |0.0000113| 87849.05 Should *definitely* be removed: how can you do full-text search on an e-mail address?
I think membrane is responsible for this, but you're right.
SearchableText |TextIndex |0.0000113| 88466.69 Where did this one come from? The 'SearchableText' above is a ZCTextIndex.
Membrane!
Kinda pointless for me to continue since this is turning into a Plone-specific discussion on zope-dev. But at least the whole exercise has forced us to look in detail into how all these indexes affect performance with a zodb with many many objects.
Roche investigated Tesdal's queryplan today end it seems to solve nearly all our performance problems. He'll have to elaborate.
Ripping out stupid indexes is one of the first things I do to optimize a client's badly-performing Plone site. Tres. - -- =================================================================== Tres Seaver +1 540-429-0999 tseaver@palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFJGHHu+gerLs4ltQ4RAoDgAJ9sjGoOG2KftnqFIbVxFy0sNk8EDwCfaSbH ouY7+FnSbSJvYSBtI//31ZY= =3SeB -----END PGP SIGNATURE-----