[Ann] Incremental filtering: a new way for catalog query optimization
Catalog searches are often slow. IncrementalSearch[2] can drastically reduce query time. However, there is a large class of queries that it cannot speed up significantly: queries that contain huge Or subqueries involving bushy indexes, usually resulting from a time-based subquery that checks effectiveness and expiration (or other similar use cases).

The newest versions of the companions "AdvancedQuery", "ManagableIndex" and "IncrementalSearch[2]" support incremental filtering to execute most such queries efficiently.

With incremental filtering, the index is not used in the usual way. Instead, the remaining query parts determine a set of candidate documents, which is then filtered by the filtering subqueries: documents not matched by these subqueries are dropped.

Let's look at an example. Suppose you search for news items containing 'AdvancedQuery' that are effective and not expired. The standard query would look like

    Eq('portal_type','News') & Eq('SearchableText', 'AdvancedQuery') & Le('effective', now) & Ge('expires', now)

Internally, the "Le('effective', now)" subquery is expanded into "Or(*[Eq('effective', t) for t in 'effective' and t<=now])", that is, one Eq term for each value t in the 'effective' index with t <= now. This expansion is usually huge. The "Ge" subquery is expanded similarly.

The filtering query has the form

    Eq('portal_type','News') & Eq('SearchableText', 'AdvancedQuery') & Ge('expires', now, filter=True) & Le('effective', now, filter=True)

When this query is executed, "Eq('portal_type','News') & Eq('SearchableText', 'AdvancedQuery')" determines a set of candidate objects. From this (probably small) set, objects not satisfying "expires >= now" and (then) "effective <= now" are filtered out. This way, we avoid the construction of huge Or subqueries.

Of course, filtering is only efficient in some circumstances: usually, when the other query parts already guarantee a small set of candidates and the filtering can avoid the construction of large intermediaries.
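
To make the difference between the two evaluation strategies concrete, here is a minimal toy sketch in plain Python. It does not use AdvancedQuery or ManagableIndex at all: the dicts, the sample data and the helper names (le_expanded, le_filtered) are invented for illustration, and the forward mapping used for the per-document test is an assumption, since this post does not describe how the real indexes perform that test.

```python
# Toy model only: plain dicts and sets stand in for catalog indexes.
# None of these names come from AdvancedQuery or ManagableIndex.

# Inverted index: value -> set of document ids (effective time -> docs).
effective_index = {1: {10, 11}, 2: {12}, 3: {13, 14}, 5: {15}, 9: {16}}

# Forward mapping: document id -> indexed value (assumed to be available).
effective_of = {doc: t for t, docs in effective_index.items() for doc in docs}


def le_expanded(index, bound):
    """Usual evaluation: union the document sets of all values <= bound."""
    result = set()
    for value, docs in index.items():
        if value <= bound:
            result |= docs  # corresponds to one Eq term of the big Or
    return result


def le_filtered(candidates, value_of, bound):
    """Filtering: test each candidate's own value; no big union is built."""
    return {doc for doc in candidates
            if doc in value_of and value_of[doc] <= bound}


now = 3
candidates = {11, 14, 16}  # stands in for the result of the Eq subqueries

via_or = candidates & le_expanded(effective_index, now)
via_filter = le_filtered(candidates, effective_of, now)
```

Both strategies select the same documents here. The expanded form visits every index value up to "now", while the filter only looks at the three candidates. That advantage depends on the candidate set being small.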
When that is not the case, filtering may not speed up the query but slow it down (maybe even drastically). Incremental filtering is a powerful optimization tool which needs careful use...

You need the complete bundle ("AdvancedQuery", "IncrementalSearch[2]" and "ManagableIndex") when you want to use incremental filtering.

More information and download: <http://www.dieter.handshake.de/pyprojects/zope>

-- Dieter
Dieter Maurer