Hello,

I am looking for a way to query a catalog like this:

    catalog({'meta_type':myobjecttype, [x for x in 'keyword-index' if x.find(mysubstring) != -1]})

Description: there is a metadata field (keyword index) for URLs in my catalog, and I need to find the catalog entries that have a matching substring in this keyword index.

Example:

    catalog-entry1 -> ['/abc/def/123', '/ghi/123']
    catalog-entry2 -> ['/ghi/123', '/abc']
    catalog-entry3 -> ['/xyz/def/123', '/ghi/123']

I search with '/abc' and want to get entry1 and entry2.

A simple iteration over all results (only filtered by meta_type) is much too slow!

Regards,
Jens Walte
--On Friday, 6 August 2004 11:58 +0000 jens.walte@kk.net wrote:
Hello, i search for a possibility to query a catalog like this:
catalog({'meta_type':myobjecttype, [x for x in 'keyword-index' if x.find(mysubstring) != -1]})
Description: there is a metadata (keyword index) for urls in my catalog and i have to know the catalogentries with a matching substring in this keyword-index
Keyword indexes cannot be searched for substrings. You would want to use a text index, but for performance reasons none of the existing text indexes provides a general substring search. The only choice is a post-query filter.

-aj
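For illustration, a post-query filter of the kind mentioned above might look like the following minimal sketch. The names `filter_by_substring`, `urls`, and the `SimpleNamespace` stand-ins for catalog results are assumptions made for this example; in Zope the input would be the result of a catalog query restricted by meta_type.

```python
from types import SimpleNamespace


def filter_by_substring(brains, attr, needle):
    """Yield results whose keyword metadata contains `needle` in any value."""
    for brain in brains:
        # Missing or empty metadata is treated as an empty keyword list.
        keywords = getattr(brain, attr, None) or ()
        if any(needle in kw for kw in keywords):
            yield brain


# Hypothetical stand-ins for catalog results (not real catalog brains).
results = [
    SimpleNamespace(id='entry1', urls=['/abc/def/123', '/ghi/123']),
    SimpleNamespace(id='entry2', urls=['/ghi/123', '/abc']),
    SimpleNamespace(id='entry3', urls=['/xyz/def/123', '/ghi/123']),
]

matches = [b.id for b in filter_by_substring(results, 'urls', '/abc')]
print(matches)
```

The cost grows linearly with the number of pre-filtered results, which is the slowness the original question complains about; the filter only helps when the indexed part of the query already narrows the result set considerably.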
----- Original Message ----- From: <jens.walte@kk.net>
i search for a possibility to query a catalog like this:
catalog({'meta_type':myobjecttype, [x for x in 'keyword-index' if x.find(mysubstring) != -1]})
Description: there is a metadata (keyword index) for urls in my catalog and i have to know the catalogentries with a matching substring in this keyword-index
Example: catalog-entry1 -> ['/abc/def/123', '/ghi/123'] catalog-entry2 -> ['/ghi/123', '/abc'] catalog-entry3 -> ['/xyz/def/123', '/ghi/123'] and i search with '/abc' and wanna get entry1 and entry2
The form of query you created above is not supported by ZCatalog. To get the results you seem to be looking for, I suggest that you create a ZCTextIndex and do a wildcard suffix search, e.g. (search in a Python script):

    searchterm = 'abc*'
    searchresults = Catalog.searchResults({'newtextindex': searchterm})
    for res in searchresults:
        pass  # do something with results...

where 'newtextindex' is a ZCTextIndex which was built using your metadata.

HTH

Jonathan
jens.walte@kk.net wrote at 2004-8-6 11:58 UT:
i search for a possibility to query a catalog like this:
catalog({'meta_type':myobjecttype, [x for x in 'keyword-index' if x.find(mysubstring) != -1]})
Description: there is a metadata (keyword index) for urls in my catalog and i have to know the catalogentries with a matching substring in this keyword-index
A few days ago, I released a new version of ManagableIndex. Its KeywordIndex supports glob and regular expression matching (this includes substring searches). However, when your index is really large, your search term should start with plain text (and not some kind of wildcard). Otherwise, searches may get too slow.

You can find "ManagableIndex" at <http://www.dieter.handshake.de/pyprojects/zope>

-- Dieter
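One plausible reason a plain-text start helps can be sketched with an ordinary sorted key list. This is a generic illustration, not a description of how ManagableIndex is implemented; `match_keys` and its parameters are hypothetical names. With a literal prefix, a binary search finds the first candidate key and the regular expression runs only over keys sharing that prefix; without one, every key has to be tested.

```python
import bisect
import re


def match_keys(sorted_keys, pattern, literal_prefix=''):
    """Return (matching keys, number of keys scanned).

    Only keys starting with `literal_prefix` are tested against `pattern`.
    An empty prefix means every key is scanned.
    """
    regex = re.compile(pattern)
    # In a sorted list, all keys sharing a prefix are contiguous and
    # begin at the first key that is >= the prefix.
    start = bisect.bisect_left(sorted_keys, literal_prefix)
    hits = []
    scanned = 0
    for i in range(start, len(sorted_keys)):
        key = sorted_keys[i]
        if not key.startswith(literal_prefix):
            break
        scanned += 1
        if regex.search(key):
            hits.append(key)
    return hits, scanned


keys = sorted(['/abc', '/abc/def/123', '/ghi/123', '/xyz/def/123'])

# Pattern anchored by the literal prefix '/abc': only two keys are scanned.
with_prefix = match_keys(keys, r'^/abc/.*123$', '/abc')

# Pure substring pattern: all four keys are scanned.
without_prefix = match_keys(keys, r'def')

print(with_prefix, without_prefix)
```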
If you are always searching for whole segments, then this is doable. If this is always a prefix search (which it is in your example), then you should be able to use a range search of the form::

    path = '/some/path'
    catalog({'url': {'query': [path, path + '/\x7F'], 'range': 'min:max'}})

If you want to be able to match any path segment, then you need to populate the index differently. So long as you want to match whole segments of the path, this should still perform reasonably. For the path '/my/obj/path' you would index the following values::

    '/my/obj/path'
    'my/obj/path'
    'obj/path'
    'path'

With those values indexed you should be able to use the same range search above to find objects that match a particular subpath.

hth,

-Casey

On 06 Aug 2004 11:58:06 UT jens.walte@kk.net wrote:
Hello, i search for a possibility to query a catalog like this:
catalog({'meta_type':myobjecttype, [x for x in 'keyword-index' if x.find(mysubstring) != -1]})
Description: there is a metadata (keyword index) for urls in my catalog and i have to know the catalogentries with a matching substring in this keyword-index
Example: catalog-entry1 -> ['/abc/def/123', '/ghi/123'] catalog-entry2 -> ['/ghi/123', '/abc'] catalog-entry3 -> ['/xyz/def/123', '/ghi/123'] and i search with '/abc' and wanna get entry1 and entry2
a simple iteration over all results (only filtered by meta_type) is much too slow!
regards jens walte
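Casey's suggestion (index every whole-segment suffix of each path, then do a min/max range search) can be tried out without Zope using a toy index. The sketch below is only an approximation under stated assumptions: `TinyKeywordIndex`, `path_suffixes`, and `find_by_subpath` are hypothetical names, paths are assumed to be absolute, and the sorted key list merely imitates the ordered keys of a real keyword index.

```python
import bisect
from collections import defaultdict


def path_suffixes(path):
    """Return the absolute path plus every suffix made of whole segments."""
    segments = path.strip('/').split('/')
    return [path] + ['/'.join(segments[i:]) for i in range(len(segments))]


class TinyKeywordIndex:
    """Toy stand-in for a keyword index: sorted keys mapped to document ids."""

    def __init__(self):
        self._docs = defaultdict(set)
        self._keys = []

    def index(self, doc_id, keywords):
        for kw in keywords:
            if kw not in self._docs:
                bisect.insort(self._keys, kw)
            self._docs[kw].add(doc_id)

    def range_search(self, low, high):
        """Return ids of documents with any key k where low <= k <= high."""
        lo = bisect.bisect_left(self._keys, low)
        hi = bisect.bisect_right(self._keys, high)
        found = set()
        for key in self._keys[lo:hi]:
            found |= self._docs[key]
        return found


def find_by_subpath(index, subpath):
    # Same bounds as the range search in the message above.
    return index.range_search(subpath, subpath + '/\x7f')


entries = {
    'entry1': ['/abc/def/123', '/ghi/123'],
    'entry2': ['/ghi/123', '/abc'],
    'entry3': ['/xyz/def/123', '/ghi/123'],
}

index = TinyKeywordIndex()
for doc_id, urls in entries.items():
    for url in urls:
        index.index(doc_id, path_suffixes(url))

print(sorted(find_by_subpath(index, '/abc')))  # prefix search
print(sorted(find_by_subpath(index, 'def')))   # inner segment
print(sorted(find_by_subpath(index, 'ef')))    # partial segment: no match
```

One caveat that follows from plain string ordering: a key such as '/abc-old' also falls inside the bounds for '/abc', because '-' sorts before '/'. If such names can occur, a final check that each hit equals the subpath or starts with the subpath plus '/' may be needed.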
participants (5)
- Andreas Jung
- Casey Duncan
- Dieter Maurer
- jens.walte@kk.net
- Jonathan Hobbs