[Zope3-checkins] SVN: Zope3/trunk/src/zope/ Merged from
/z3/jim-index-restructure-2004-12 branch:
Jim Fulton
jim at zope.com
Thu Dec 9 15:56:06 EST 2004
Log message for revision 28610:
Merged from /z3/jim-index-restructure-2004-12 branch:
r28574 | jim | 2004-12-06 10:04:19 -0500 (Mon, 06 Dec 2004) | 11 lines
Updated IInjection to emphasize indexing of values (for documents),
rather than documents.
Added IIndexSearch, which provides a search that returns integer sets
or mappings.
Updated field indexes to provide IIndexSearch as their only search
method.
Replaced the field-index tests with a doctest.
------------------------------------------------------------------------
r28576 | jim | 2004-12-06 16:52:05 -0500 (Mon, 06 Dec 2004) | 2 lines
Renamed apply_index to apply.
------------------------------------------------------------------------
r28577 | jim | 2004-12-07 13:15:57 -0500 (Tue, 07 Dec 2004) | 10 lines
r28578 | jim | 2004-12-07 13:17:30 -0500 (Tue, 07 Dec 2004) | 4 lines
- Removed the unused pipeline-element framework. WHUI
- Moved the nbest code out of text, as it should generally be
used by applications that call indexes, not by the indexes
themselves.
- Moved the text-indexing interfaces into text/interfaces.py.
- Converted the interfaces package into a module
------------------------------------------------------------------------
r28579 | jim | 2004-12-07 13:23:58 -0500 (Tue, 07 Dec 2004) | 2 lines
Moved IExtendedQuerying to text/interfaces.py
------------------------------------------------------------------------
r28580 | jim | 2004-12-07 17:36:18 -0500 (Tue, 07 Dec 2004) | 4 lines
Refactored the text index to implement IIndexSearch, rather than
IQuerying. Also renamed TextIndexWrapper to TextIndex. (Wrapper was
confusing.)
------------------------------------------------------------------------
r28581 | jim | 2004-12-07 17:54:00 -0500 (Tue, 07 Dec 2004) | 2 lines
Moved the topic- and keyword-index interfaces into their own interface modules
------------------------------------------------------------------------
r28591 | jim | 2004-12-08 18:14:23 -0500 (Wed, 08 Dec 2004) | 2 lines
Updated text indexes to use IF {integer->float} BTrees rather than II BTrees
------------------------------------------------------------------------
r28595 | jim | 2004-12-08 18:35:13 -0500 (Wed, 08 Dec 2004) | 4 lines
Updated catalog code to reflect indexing api changes.
Also removed keyword indexes, which were incomplete.
------------------------------------------------------------------------
r28602 | jim | 2004-12-09 14:41:55 -0500 (Thu, 09 Dec 2004) | 5 lines
Changed indexes using sets to use IFSets rather than IISets.
This is needed because we have switched to using floating-point
scores.
------------------------------------------------------------------------
r28604 | jim | 2004-12-09 15:10:58 -0500 (Thu, 09 Dec 2004) | 4 lines
r28607 | jim | 2004-12-09 15:28:41 -0500 (Thu, 09 Dec 2004) | 2 lines
Moved the functional tests to the browser package and created a new
README.txt aimed at Python programmers. This documents the new
IIndexSearch API.
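For readers unfamiliar with the calling convention described in the log above, here is a minimal sketch in plain Python. It assumes only what the log and the test stub in this diff show: values (not documents) are passed to index_doc, and apply returns a set of integer docids. SketchFieldIndex is a hypothetical name, not the zope.index field index, and the equality query is an illustrative simplification.

```python
class SketchFieldIndex:
    """Illustrative index: stores one value per docid, searches by equality."""

    def __init__(self):
        self._values = {}          # docid -> indexed value

    def index_doc(self, docid, value):
        # The caller passes the value to index, not the whole document.
        self._values[docid] = value

    def unindex_doc(self, docid):
        # Forget the docid; unknown docids are ignored in this sketch.
        self._values.pop(docid, None)

    def apply(self, query):
        # Return the set of integer docids whose value equals the query.
        return {docid for docid, value in self._values.items()
                if value == query}


index = SketchFieldIndex()
index.index_doc(1, "ape")
index.index_doc(2, "monkey")
index.index_doc(3, "ape")
index.unindex_doc(3)
matches = index.apply("ape")
```

The real indexes return BTrees sets or mappings rather than built-in sets; the built-in set is used here only to keep the sketch dependency-free.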
Changed:
D Zope3/trunk/src/zope/app/catalog/README.txt
A Zope3/trunk/src/zope/app/catalog/README.txt
A Zope3/trunk/src/zope/app/catalog/browser/README.txt
U Zope3/trunk/src/zope/app/catalog/browser/configure.zcml
A Zope3/trunk/src/zope/app/catalog/browser/ftests.py
U Zope3/trunk/src/zope/app/catalog/catalog.py
U Zope3/trunk/src/zope/app/catalog/configure.zcml
D Zope3/trunk/src/zope/app/catalog/ftests.py
U Zope3/trunk/src/zope/app/catalog/interfaces.py
D Zope3/trunk/src/zope/app/catalog/keyword.py
U Zope3/trunk/src/zope/app/catalog/tests.py
U Zope3/trunk/src/zope/app/catalog/text.py
U Zope3/trunk/src/zope/app/zptpage/textindex/configure.zcml
U Zope3/trunk/src/zope/app/zptpage/textindex/tests.py
U Zope3/trunk/src/zope/app/zptpage/textindex/zptpage.py
A Zope3/trunk/src/zope/index/field/README.txt
U Zope3/trunk/src/zope/index/field/index.py
D Zope3/trunk/src/zope/index/field/tests/
A Zope3/trunk/src/zope/index/field/tests.py
D Zope3/trunk/src/zope/index/interfaces/
A Zope3/trunk/src/zope/index/interfaces.py
U Zope3/trunk/src/zope/index/keyword/index.py
A Zope3/trunk/src/zope/index/keyword/interfaces.py
U Zope3/trunk/src/zope/index/keyword/tests.py
A Zope3/trunk/src/zope/index/nbest.py
A Zope3/trunk/src/zope/index/tests.py
U Zope3/trunk/src/zope/index/text/__init__.py
U Zope3/trunk/src/zope/index/text/baseindex.py
U Zope3/trunk/src/zope/index/text/cosineindex.py
U Zope3/trunk/src/zope/index/text/htmlsplitter.py
A Zope3/trunk/src/zope/index/text/interfaces.py
U Zope3/trunk/src/zope/index/text/lexicon.py
D Zope3/trunk/src/zope/index/text/nbest.py
U Zope3/trunk/src/zope/index/text/okapiindex.py
U Zope3/trunk/src/zope/index/text/parsetree.py
D Zope3/trunk/src/zope/index/text/pipelinefactory.py
U Zope3/trunk/src/zope/index/text/queryparser.py
U Zope3/trunk/src/zope/index/text/setops.py
U Zope3/trunk/src/zope/index/text/tests/queryhtml.py
D Zope3/trunk/src/zope/index/text/tests/test_nbest.py
D Zope3/trunk/src/zope/index/text/tests/test_pipelinefactory.py
U Zope3/trunk/src/zope/index/text/tests/test_queryengine.py
U Zope3/trunk/src/zope/index/text/tests/test_queryparser.py
U Zope3/trunk/src/zope/index/text/tests/test_setops.py
U Zope3/trunk/src/zope/index/text/tests/test_textindexwrapper.py
A Zope3/trunk/src/zope/index/text/textindex.py
A Zope3/trunk/src/zope/index/text/textindex.txt
D Zope3/trunk/src/zope/index/text/textindexwrapper.py
U Zope3/trunk/src/zope/index/topic/filter.py
U Zope3/trunk/src/zope/index/topic/index.py
A Zope3/trunk/src/zope/index/topic/interfaces.py
-=-
Deleted: Zope3/trunk/src/zope/app/catalog/README.txt
===================================================================
--- Zope3/trunk/src/zope/app/catalog/README.txt 2004-12-09 20:53:47 UTC (rev 28609)
+++ Zope3/trunk/src/zope/app/catalog/README.txt 2004-12-09 20:56:05 UTC (rev 28610)
@@ -1,391 +0,0 @@
-Catalogs
-========
-
-Catalogs are simple tools used to support searching. A catalog
-manages a collection of indexes, and arranges for objects to be indexed
-with its contained indexes.
-
-TODO: Filters
- Catalogs should provide the option to filter the objects they
- catalog. This would facilitate the use of separate catalogs for
- separate purposes. It should be possible to specify a
- collection of types (interfaces) to be cataloged and a filtering
- expression. Perhaps another option would be the ability
- to specify a named filter adapter.
-
-Catalogs use a unique-id tool to assign short (integer) ids to
-objects. Before creating a catalog, you must create an intid tool:
-
- >>> print http(r"""
- ... POST /++etc++site/default/AddUtility/action.html HTTP/1.1
- ... Authorization: Basic bWdyOm1ncnB3
- ... Content-Length: 78
- ... Content-Type: application/x-www-form-urlencoded
- ... Referer: http://localhost:8081/++etc++site/default/AddUtility
- ...
- ... type_name=BrowserAdd__zope.app.intid.IntIds&id=&add=+Add+""")
- HTTP/1.1 303 ...
-
-And register it:
-
- >>> print http(r"""
- ... POST /++etc++site/default/IntIds/addRegistration.html HTTP/1.1
- ... Authorization: Basic bWdyOm1ncnB3
- ... Content-Length: 864
- ... Content-Type: multipart/form-data; boundary=---------------------------68417209514430962931254091825
- ... Referer: http://localhost:8081/++etc++site/default/IntIds/addRegistration.html
- ...
- ... -----------------------------68417209514430962931254091825
- ... Content-Disposition: form-data; name="field.name"
- ...
- ...
- ... -----------------------------68417209514430962931254091825
- ... Content-Disposition: form-data; name="field.interface"
- ...
- ... zope.app.intid.interfaces.IIntIds
- ... -----------------------------68417209514430962931254091825
- ... Content-Disposition: form-data; name="field.interface-empty-marker"
- ...
- ... 1
- ... -----------------------------68417209514430962931254091825
- ... Content-Disposition: form-data; name="field.permission"
- ...
- ... zope.Public
- ... -----------------------------68417209514430962931254091825
- ... Content-Disposition: form-data; name="field.permission-empty-marker"
- ...
- ... 1
- ... -----------------------------68417209514430962931254091825
- ... Content-Disposition: form-data; name="UPDATE_SUBMIT"
- ...
- ... Add
- ... -----------------------------68417209514430962931254091825--
- ... """)
- HTTP/1.1 303 ...
-
-
-Moving short-id management outside of catalogs makes it possible to
-join searches across multiple catalogs and indexing tools
-(e.g. relationship indexes).
-
-TODO: Filters?
- Maybe unique-id tools should be filtered as well, however, this
- would limit the value of unique id tools for providing
- cross-catalog/cross-index merging. At least the domain for a
- unique id tool would be broader than the domain of a catalog.
- The value of filtering in the unique id tool is that it limits
- the amount of work that needs to be done by catalogs.
- One obvious application is to provide separate domains for
- ordinary and meta content. If we did this, then we'd need to be
- able to select, and, perhaps, alter, the unique-id tool used by
- a catalog.
-
-Once we have a unique-id tool, we can add a catalog:
-
- >>> print http(r"""
- ... POST /++etc++site/default/AddUtility/action.html HTTP/1.1
- ... Authorization: Basic bWdyOm1ncnB3
- ... Content-Length: 77
- ... Content-Type: application/x-www-form-urlencoded
- ... Referer: http://localhost:8081/++etc++site/default/AddUtility
- ...
- ... type_name=BrowserAdd__zope.app.catalog.catalog.Catalog&id=&add=+Add+""")
- HTTP/1.1 303 ...
-
-and register it:
-
-
- >>> print http(r"""
- ... POST /++etc++site/default/Catalog/addRegistration.html HTTP/1.1
- ... Authorization: Basic bWdyOm1ncnB3
- ... Content-Length: 855
- ... Content-Type: multipart/form-data; boundary=---------------------------17974048709381505781405189947
- ... Referer: http://localhost:8081/++etc++site/default/Catalog/addRegistration.html
- ...
- ... -----------------------------17974048709381505781405189947
- ... Content-Disposition: form-data; name="field.name"
- ...
- ...
- ... -----------------------------17974048709381505781405189947
- ... Content-Disposition: form-data; name="field.interface"
- ...
- ... zope.app.catalog.interfaces.ICatalog
- ... -----------------------------17974048709381505781405189947
- ... Content-Disposition: form-data; name="field.interface-empty-marker"
- ...
- ... 1
- ... -----------------------------17974048709381505781405189947
- ... Content-Disposition: form-data; name="field.permission"
- ...
- ... zope.Public
- ... -----------------------------17974048709381505781405189947
- ... Content-Disposition: form-data; name="field.permission-empty-marker"
- ...
- ... 1
- ... -----------------------------17974048709381505781405189947
- ... Content-Disposition: form-data; name="UPDATE_SUBMIT"
- ...
- ... Add
- ... -----------------------------17974048709381505781405189947--
- ... """)
- HTTP/1.1 303 ...
-
-
-Once we have a catalog, we can add indexes to it. Before we add an
-index, let's add a templated page. When adding indexes, existing
-objects are indexed, so the document we add now will appear in the
-index:
-
- >>> print http(r"""
- ... POST /+/zope.app.zptpage.ZPTPage%3D HTTP/1.1
- ... Authorization: Basic bWdyOm1ncnB3
- ... Content-Length: 780
- ... Content-Type: multipart/form-data; boundary=---------------------------1425445234777458421417366789
- ... Referer: http://localhost:8081/+/zope.app.zptpage.ZPTPage=
- ...
- ... -----------------------------1425445234777458421417366789
- ... Content-Disposition: form-data; name="field.source"
- ...
- ... <html>
- ... <body>
- ... Now is the time, for all good dudes to come to the aid of their country.
- ... </body>
- ... </html>
- ... -----------------------------1425445234777458421417366789
- ... Content-Disposition: form-data; name="field.expand.used"
- ...
- ...
- ... -----------------------------1425445234777458421417366789
- ... Content-Disposition: form-data; name="field.evaluateInlineCode.used"
- ...
- ...
- ... -----------------------------1425445234777458421417366789
- ... Content-Disposition: form-data; name="UPDATE_SUBMIT"
- ...
- ... Add
- ... -----------------------------1425445234777458421417366789
- ... Content-Disposition: form-data; name="add_input_name"
- ...
- ... dudes
- ... -----------------------------1425445234777458421417366789--
- ... """)
- HTTP/1.1 303 ...
-
-Perhaps the most common type of index to be added is a text index.
-Most indexes require the specification of an interface, an attribute,
-and an indication of whether the attribute must be called.
-
-TODO: Simplify the UI for selecting interfaces and attributes
- There are a number of ways the UI for this could be made more
- user friendly:
-
- - If the user selects an interface, we could then provide a
- select list of possible attributes and we could determine the
- callability. Perhaps selection of an interface should be
- required.
-
- - An index should have a way to specify default values. In
- particular, text indexes usually use ISearchableText and
- searchableText.
-
-For text indexes, one generally uses
-`zope.index.interfaces.searchabletext.ISearchableText`,
-`getSearchableText` and True.
-
- >>> print http(r"""
- ... POST /++etc++site/default/Catalog/+/AddTextIndex%3D HTTP/1.1
- ... Authorization: Basic bWdyOm1ncnB3
- ... Content-Length: 1003
- ... Content-Type: multipart/form-data; boundary=---------------------------12609588153518590761493918424
- ... Referer: http://localhost:8081/++etc++site/default/Catalog/+/AddTextIndex=
- ...
- ... -----------------------------12609588153518590761493918424
- ... Content-Disposition: form-data; name="field.interface"
- ...
- ... zope.index.interfaces.searchabletext.ISearchableText
- ... -----------------------------12609588153518590761493918424
- ... Content-Disposition: form-data; name="field.interface-empty-marker"
- ...
- ... 1
- ... -----------------------------12609588153518590761493918424
- ... Content-Disposition: form-data; name="field.field_name"
- ...
- ... getSearchableText
- ... -----------------------------12609588153518590761493918424
- ... Content-Disposition: form-data; name="field.field_callable.used"
- ...
- ...
- ... -----------------------------12609588153518590761493918424
- ... Content-Disposition: form-data; name="field.field_callable"
- ...
- ... on
- ... -----------------------------12609588153518590761493918424
- ... Content-Disposition: form-data; name="UPDATE_SUBMIT"
- ...
- ... Add
- ... -----------------------------12609588153518590761493918424
- ... Content-Disposition: form-data; name="add_input_name"
- ...
- ...
- ... -----------------------------12609588153518590761493918424--
- ... """, handle_errors=False)
- HTTP/1.1 303 ...
-
-
-We can visit the advanced tab of the catalog to get some index
-statistics. Doing so, we see that we have a single document and that
-the total word count is 8. The word count is only 8 because some stop
-words have been eliminated.
-
-
- >>> print http(r"""
- ... GET /++etc++site/default/Catalog/@@advanced.html HTTP/1.1
- ... Authorization: Basic bWdyOm1ncnB3
- ... Referer: http://localhost:8081/++etc++site/default/Catalog/@@contents.html
- ... """)
- HTTP/1.1 200 Ok
- ...
- <table border="0">
- <tr><th>Index</th>
- <th>Document Count</th>
- <th>Word Count</th>
- </tr>
- <tr>
- <td>TextIndex</td>
- <td>1</td>
- <td>8</td>
- </tr>
- </table>
- ...
-
-Now let's add some more pages:
-
- >>> print http(r"""
- ... POST /+/zope.app.zptpage.ZPTPage%3D HTTP/1.1
- ... Authorization: Basic bWdyOm1ncnB3
- ... Content-Length: 754
- ... Content-Type: multipart/form-data; boundary=---------------------------1213614620286666602740364725
- ... Referer: http://localhost:8081/+/zope.app.zptpage.ZPTPage=
- ...
- ... -----------------------------1213614620286666602740364725
- ... Content-Disposition: form-data; name="field.source"
- ...
- ... <html>
- ... <body>
- ... Dudes, we really need to switch to Zope 3 now.
- ... </body>
- ... </html>
- ... -----------------------------1213614620286666602740364725
- ... Content-Disposition: form-data; name="field.expand.used"
- ...
- ...
- ... -----------------------------1213614620286666602740364725
- ... Content-Disposition: form-data; name="field.evaluateInlineCode.used"
- ...
- ...
- ... -----------------------------1213614620286666602740364725
- ... Content-Disposition: form-data; name="UPDATE_SUBMIT"
- ...
- ... Add
- ... -----------------------------1213614620286666602740364725
- ... Content-Disposition: form-data; name="add_input_name"
- ...
- ... zope3
- ... -----------------------------1213614620286666602740364725--
- ... """)
- HTTP/1.1 303 ...
-
- >>> print http(r"""
- ... POST /+/zope.app.zptpage.ZPTPage%3D HTTP/1.1
- ... Authorization: Basic bWdyOm1ncnB3
- ... Content-Length: 838
- ... Content-Type: multipart/form-data; boundary=---------------------------491825988706308579952614349
- ... Referer: http://localhost:8081/+/zope.app.zptpage.ZPTPage=
- ...
- ... -----------------------------491825988706308579952614349
- ... Content-Disposition: form-data; name="field.source"
- ...
- ... <html>
- ... <body>
- ... <p>Writing tests as doctests makes them much more understandable.</p>
- ... <p>Python 2.4 has major enhancements to the doctest module.</p>
- ... </body>
- ... </html>
- ... -----------------------------491825988706308579952614349
- ... Content-Disposition: form-data; name="field.expand.used"
- ...
- ...
- ... -----------------------------491825988706308579952614349
- ... Content-Disposition: form-data; name="field.evaluateInlineCode.used"
- ...
- ...
- ... -----------------------------491825988706308579952614349
- ... Content-Disposition: form-data; name="UPDATE_SUBMIT"
- ...
- ... Add
- ... -----------------------------491825988706308579952614349
- ... Content-Disposition: form-data; name="add_input_name"
- ...
- ... doctest
- ... -----------------------------491825988706308579952614349--
- ... """)
- HTTP/1.1 303 ...
-
-Now, if we visit the catalog advanced tab, we can see that the 3
-documents have been indexed and that the word count has increased to 30:
-
- >>> print http(r"""
- ... GET /++etc++site/default/Catalog/@@advanced.html HTTP/1.1
- ... Authorization: Basic bWdyOm1ncnB3
- ... Referer: http://localhost:8081/++etc++site/default/Catalog/@@contents.html
- ... """)
- HTTP/1.1 200 Ok
- ...
- <table border="0">
- <tr><th>Index</th>
- <th>Document Count</th>
- <th>Word Count</th>
- </tr>
- <tr>
- <td>TextIndex</td>
- <td>3</td>
- <td>30</td>
- </tr>
- </table>
- ...
-
-
-Now that we have a catalog with some documents indexed, we can search
-it. The catalog is really meant to be used from Python:
-
- >>> root = getRootFolder()
-
-We'll make our root folder the site (this would normally be done by
-the publisher):
-
- >>> from zope.app.component.hooks import setSite
- >>> setSite(root)
-
-Now, we'll get the catalog:
-
- >>> from zope.app import zapi
- >>> from zope.app.catalog.interfaces import ICatalog
- >>> catalog = zapi.getUtility(ICatalog)
-
-And search it to find the names of all of the documents that contain
-the word 'now':
-
- >>> results = catalog.searchResults(TextIndex='now')
- >>> [result.__name__ for result in results]
- [u'dudes', u'zope3']
-
-TODO
- This stuff needs a lot of work. The indexing interfaces, despite
- being rather elaborate, are still a bit too simple. There really
- should be more provision for combining results. In particular,
- catalog should have a search interface that returns ranked docids,
- rather than documents.
-
-You don't have to use the search algorithm built into the catalog. You
-can implement your own search algorithms and use them with a catalog's
-indexes.
Copied: Zope3/trunk/src/zope/app/catalog/README.txt (from rev 28607, Zope3/branches/jim-index-restructure-2004-12/src/zope/app/catalog/README.txt)
Property changes on: Zope3/trunk/src/zope/app/catalog/README.txt
___________________________________________________________________
Name: svn:eol-style
+ native
Copied: Zope3/trunk/src/zope/app/catalog/browser/README.txt (from rev 28607, Zope3/branches/jim-index-restructure-2004-12/src/zope/app/catalog/browser/README.txt)
Property changes on: Zope3/trunk/src/zope/app/catalog/browser/README.txt
___________________________________________________________________
Name: cvs2svn:cvs-rev
+ 1.2
Name: svn:eol-style
+ native
Modified: Zope3/trunk/src/zope/app/catalog/browser/configure.zcml
===================================================================
--- Zope3/trunk/src/zope/app/catalog/browser/configure.zcml 2004-12-09 20:53:47 UTC (rev 28609)
+++ Zope3/trunk/src/zope/app/catalog/browser/configure.zcml 2004-12-09 20:56:05 UTC (rev 28610)
@@ -70,33 +70,6 @@
/>
<addform
- name="AddKeywordIndex"
- label="Add a keyword index"
- schema="..interfaces.IAttributeIndex"
- permission="zope.ManageServices"
- content_factory="..keyword.KeywordIndex"
- arguments="field_name"
- keyword_arguments="interface field_callable"
- />
-
-<addMenuItem
- title="Keyword Index"
- description="Index items based on multi-value fields with
- orderable values"
- class="..keyword.KeywordIndex"
- permission="zope.ManageServices"
- view="AddKeywordIndex"
- />
-
-<schemadisplay
- name="index.html"
- schema="..keyword.IKeywordIndex"
- label="Keyword Index"
- permission="zope.ManageServices"
- menu="zmi_views" title="Configuration"
- />
-
-<addform
name="AddTextIndex"
label="Add a text index"
schema="..interfaces.IAttributeIndex"
Copied: Zope3/trunk/src/zope/app/catalog/browser/ftests.py (from rev 28607, Zope3/branches/jim-index-restructure-2004-12/src/zope/app/catalog/browser/ftests.py)
Modified: Zope3/trunk/src/zope/app/catalog/catalog.py
===================================================================
--- Zope3/trunk/src/zope/app/catalog/catalog.py 2004-12-09 20:53:47 UTC (rev 28609)
+++ Zope3/trunk/src/zope/app/catalog/catalog.py 2004-12-09 20:56:05 UTC (rev 28610)
@@ -20,15 +20,15 @@
from zope.app.zapi import getUtility
from zope.security.proxy import removeSecurityProxy
from zope.app.container.btree import BTreeContainer
+import zope.index.interfaces
from zope.app import zapi
from zope.app.annotation.interfaces import IAttributeAnnotatable
from zope.app.container.interfaces import IContainer
from zope.app.catalog.interfaces import ICatalog
from zope.app.intid.interfaces import IIntIds
-from zope.index.interfaces import ISimpleQuery
+from BTrees.IFBTree import weightedIntersection
-
class ResultSet:
"""Lazily accessed set of objects."""
@@ -47,7 +47,11 @@
class Catalog(BTreeContainer):
- implements(ICatalog, IContainer, IAttributeAnnotatable)
+ implements(ICatalog,
+ IContainer,
+ IAttributeAnnotatable,
+ zope.index.interfaces.IIndexSearch,
+ )
def clear(self):
for index in self.values():
@@ -65,39 +69,46 @@
def updateIndex(self, index):
uidutil = zapi.getUtility(IIntIds)
- for uid, ref in uidutil.items():
- obj = ref()
+ for uid in uidutil:
+ obj = uidutil.getObject(uid)
index.index_doc(uid, obj)
def updateIndexes(self):
uidutil = zapi.getUtility(IIntIds)
- for uid, ref in uidutil.items():
- obj = ref()
+ for uid in uidutil:
+ obj = uidutil.getObject(uid)
for index in self.values():
index.index_doc(uid, obj)
+ def apply(self, query):
+ results = []
+ for index_name, index_query in query.items():
+ index = self[index_name]
+ r = index.apply(index_query)
+ if r is None:
+ continue
+ if not r:
+ # empty results
+ return r
+ results.append((len(r), r))
+
+ if not results:
+ # no applicable indexes, so catalog was not applicable
+ return None
+
+ results.sort() # order from smallest to largest
+
+ _, result = results.pop(0)
+ for _, r in results:
+ _, result = weightedIntersection(result, r)
+
+ return result
+
def searchResults(self, **searchterms):
- from BTrees.IIBTree import intersection
- pendingResults = None
- for key, value in searchterms.items():
- index = self.get(key)
- if not index:
- raise ValueError, "no such index %s" % (key, )
- index = ISimpleQuery(index)
- results = index.query(value)
- # Hm. As a result of calling getAdapter, I get back
- # security proxy wrapped results from anything that
- # needed to be adapted.
- results = removeSecurityProxy(results)
- if pendingResults is None:
- pendingResults = results
- else:
- pendingResults = intersection(pendingResults, results)
- if not pendingResults:
- break # nothing left, short-circuit
- # Next we turn the IISet of docids into a generator of objects
- uidutil = zapi.getUtility(IIntIds)
- results = ResultSet(pendingResults, uidutil)
+ results = self.apply(searchterms)
+ if results is not None:
+ uidutil = zapi.getUtility(IIntIds)
+ results = ResultSet(results, uidutil)
return results
def indexAdded(index, event):
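The control flow of the new Catalog.apply above can be sketched in plain Python without BTrees. This is an illustration under stated assumptions, not the committed code: weighted_intersection is a hypothetical dict-based stand-in for BTrees.IFBTree.weightedIntersection with both weights taken as 1, and each index is modelled as a callable returning a {docid: score} dict or None.

```python
def weighted_intersection(a, b):
    """Dict stand-in: keep docids present in both, summing their scores."""
    return {docid: a[docid] + b[docid] for docid in a if docid in b}


def apply_query(indexes, query):
    """Combine per-index results following the flow of Catalog.apply.

    indexes: hypothetical mapping of index name -> callable(query)
             returning {docid: score}, or None if not applicable.
    """
    results = []
    for name, index_query in query.items():
        r = indexes[name](index_query)    # KeyError for an unknown index
        if r is None:
            continue                      # this index was not applicable
        if not r:
            return r                      # empty result: short-circuit
        results.append((len(r), r))

    if not results:
        return None                       # no index was applicable

    results.sort(key=lambda pair: pair[0])   # smallest result set first
    _, result = results.pop(0)
    for _, r in results:
        result = weighted_intersection(result, r)
    return result


indexes = {
    "text": lambda q: {1: 0.5, 2: 0.25, 3: 0.75} if q == "now" else {},
    "type": lambda q: {2: 1.0, 3: 1.0},
    "skip": lambda q: None,
}
combined = apply_query(indexes, {"text": "now", "type": "page", "skip": "x"})
```

Starting from the smallest result set keeps the intermediate intersections small. The unknown-index case surfacing as KeyError matches the change in tests.py, where the expected exception moves from ValueError to KeyError.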
Modified: Zope3/trunk/src/zope/app/catalog/configure.zcml
===================================================================
--- Zope3/trunk/src/zope/app/catalog/configure.zcml 2004-12-09 20:53:47 UTC (rev 28609)
+++ Zope3/trunk/src/zope/app/catalog/configure.zcml 2004-12-09 20:56:05 UTC (rev 28610)
@@ -59,16 +59,6 @@
/>
</content>
-<content class=".keyword.KeywordIndex">
- <require
- permission="zope.ManageServices"
- interface=".interfaces.IAttributeIndex
- zope.index.interfaces.IStatistics
- "
- set_schema=".interfaces.IAttributeIndex"
- />
-</content>
-
<content class=".text.TextIndex">
<require
permission="zope.ManageServices"
Deleted: Zope3/trunk/src/zope/app/catalog/ftests.py
===================================================================
--- Zope3/trunk/src/zope/app/catalog/ftests.py 2004-12-09 20:53:47 UTC (rev 28609)
+++ Zope3/trunk/src/zope/app/catalog/ftests.py 2004-12-09 20:56:05 UTC (rev 28610)
@@ -1,25 +0,0 @@
-##############################################################################
-#
-# Copyright (c) 2004 Zope Corporation and Contributors.
-# All Rights Reserved.
-#
-# This software is subject to the provisions of the Zope Public License,
-# Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution.
-# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED
-# WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
-# WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS
-# FOR A PARTICULAR PURPOSE.
-#
-##############################################################################
-"""Functional tests for xmlrpc
-
-$Id: ftests.py 27323 2004-08-28 19:31:22Z jim $
-"""
-
-def test_suite():
- from zope.app.tests.functional import FunctionalDocFileSuite
- return FunctionalDocFileSuite('README.txt')
-
-if __name__ == '__main__':
- import unittest
- unittest.main(defaultTest='test_suite')
Modified: Zope3/trunk/src/zope/app/catalog/interfaces.py
===================================================================
--- Zope3/trunk/src/zope/app/catalog/interfaces.py 2004-12-09 20:53:47 UTC (rev 28609)
+++ Zope3/trunk/src/zope/app/catalog/interfaces.py 2004-12-09 20:56:05 UTC (rev 28610)
@@ -39,7 +39,7 @@
class ICatalogIndex(zope.index.interfaces.IInjection,
- zope.index.interfaces.ISimpleQuery,
+ zope.index.interfaces.IIndexSearch,
):
"""An index to be used in a catalog
"""
Deleted: Zope3/trunk/src/zope/app/catalog/keyword.py
===================================================================
--- Zope3/trunk/src/zope/app/catalog/keyword.py 2004-12-09 20:53:47 UTC (rev 28609)
+++ Zope3/trunk/src/zope/app/catalog/keyword.py 2004-12-09 20:56:05 UTC (rev 28610)
@@ -1,37 +0,0 @@
-##############################################################################
-#
-# Copyright (c) 2003 Zope Corporation and Contributors.
-# All Rights Reserved.
-#
-# This software is subject to the provisions of the Zope Public License,
-# Version 2.0 (ZPL). A copy of the ZPL should accompany this distribution.
-# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED
-# WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
-# WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS
-# FOR A PARTICULAR PURPOSE.
-#
-##############################################################################
-"""Keyword catalog indexes
-
-$Id$
-"""
-
-import zope.index.keyword
-import zope.interface
-
-import zope.app.container.contained
-import zope.app.catalog.attribute
-import zope.app.catalog.interfaces
-
-class IKeywordIndex(zope.app.catalog.interfaces.IAttributeIndex,
- zope.app.catalog.interfaces.ICatalogIndex,
- ):
- """Interface-based catalog keyword index
- """
-
-class KeywordIndex(zope.app.catalog.attribute.AttributeIndex,
- zope.index.keyword.KeywordIndex,
- zope.app.container.contained.Contained):
-
- zope.interface.implements(IKeywordIndex)
-
Modified: Zope3/trunk/src/zope/app/catalog/tests.py
===================================================================
--- Zope3/trunk/src/zope/app/catalog/tests.py 2004-12-09 20:53:47 UTC (rev 28609)
+++ Zope3/trunk/src/zope/app/catalog/tests.py 2004-12-09 20:56:05 UTC (rev 28610)
@@ -21,14 +21,16 @@
import unittest
import doctest
+import BTrees.IFBTree
+
from zope.interface import implements
from zope.interface.verify import verifyObject
from zope.app.tests import ztapi, setup
-from zope.app.tests.placelesssetup import PlacelessSetup
-from BTrees.IIBTree import IISet
+import zope.app.tests.placelesssetup
+from BTrees.IFBTree import IFSet
from zope.app.intid.interfaces import IIntIds
-from zope.index.interfaces import IInjection, ISimpleQuery
+from zope.index.interfaces import IInjection, IIndexSearch
from zope.app.catalog.interfaces import ICatalog
from zope.app.catalog.catalog import Catalog
from zope.app import zapi
@@ -81,14 +83,14 @@
def queryId(self, ob, default=None):
return self.ids.get(ob, default)
- def items(self):
- return [(id, ReferenceStub(obj)) for id, obj in self.objs.items()]
+ def __iter__(self):
+ return self.objs.iterkeys()
class StubIndex:
"""A stub for Index."""
- implements(ISimpleQuery, IInjection)
+ implements(IIndexSearch, IInjection)
def __init__(self, field_name, interface=None):
self._field_name = field_name
@@ -101,14 +103,14 @@
def unindex_doc(self, docid):
del self.doc[docid]
- def query(self, term, start=0, count=None):
+ def apply(self, term):
results = []
for docid in self.doc:
obj = self.doc[docid]
fieldname = getattr(obj, self._field_name, '')
if fieldname == term:
results.append(docid)
- return IISet(results)
+ return IFSet(results)
class stoopid:
@@ -116,7 +118,7 @@
self.__dict__ = kw
-class Test(PlacelessSetup, unittest.TestCase):
+class Test(zope.app.tests.placelesssetup.PlacelessSetup, unittest.TestCase):
def test_catalog_add_del_indexes(self):
catalog = Catalog()
@@ -202,7 +204,7 @@
res = catalog.searchResults(simiantype='ape', name='mwumi')
self.assertEqual(len(res), 0)
- self.assertRaises(ValueError, catalog.searchResults,
+ self.assertRaises(KeyError, catalog.searchResults,
simiantype='monkey', hat='beret')
@@ -289,33 +291,17 @@
self.assertEqual(self.cat.regs, [])
-def test_textindex_simple_query():
- """
- >>> class Doc:
- ... def __init__(self, text):
- ... self.text = text
- >>> from zope.app.catalog.text import TextIndex
- >>> index = TextIndex('text')
- >>> index.index_doc(1, Doc('now time for all good men to come to the aid'))
- >>> index.index_doc(2, Doc('we should use Zope3 now'))
- >>> index.index_doc(3, Doc('doctest makes tests more readable'))
- >>> r = index.query('now')
- >>> r.sort()
- >>> r
- [1, 2]
- >>> index.query('doctest')
- [3]
-
-
- """
-
def test_suite():
- from zope.testing.doctestunit import DocTestSuite
+ from zope.testing import doctest
suite = unittest.TestSuite()
suite.addTest(unittest.makeSuite(Test))
suite.addTest(unittest.makeSuite(TestEventSubscribers))
- suite.addTest(DocTestSuite('zope.app.catalog.attribute'))
- suite.addTest(DocTestSuite())
+ suite.addTest(doctest.DocTestSuite('zope.app.catalog.attribute'))
+ suite.addTest(doctest.DocFileSuite(
+ 'README.txt',
+ setUp=zope.app.tests.placelesssetup.setUp,
+ tearDown=zope.app.tests.placelesssetup.tearDown,
+ ))
return suite
Modified: Zope3/trunk/src/zope/app/catalog/text.py
===================================================================
--- Zope3/trunk/src/zope/app/catalog/text.py 2004-12-09 20:53:47 UTC (rev 28609)
+++ Zope3/trunk/src/zope/app/catalog/text.py 2004-12-09 20:56:05 UTC (rev 28610)
@@ -16,7 +16,7 @@
$Id$
"""
import zope.index.text
-import zope.index.interfaces.searchabletext
+import zope.index.text.interfaces
import zope.interface
import zope.app.catalog.attribute
@@ -34,7 +34,7 @@
description=_(u"Objects will be adapted to this interface"),
vocabulary=_("Interfaces"),
required=False,
- default=zope.index.interfaces.searchabletext.ISearchableText,
+ default=zope.index.text.interfaces.ISearchableText,
)
field_name = zope.schema.BytesLine(
@@ -55,13 +55,3 @@
zope.app.container.contained.Contained):
zope.interface.implements(ITextIndex)
-
- def query(self, text, start=0, count=None):
- """Return a list of ids matching the text
-
- This a dumbed-down implementation that matches ISimpleQuery.
-
- """
- result = super(TextIndex, self).query(text, start, count)
- return [id for (id, rank) in result[0]]
-
Modified: Zope3/trunk/src/zope/app/zptpage/textindex/configure.zcml
===================================================================
--- Zope3/trunk/src/zope/app/zptpage/textindex/configure.zcml 2004-12-09 20:53:47 UTC (rev 28609)
+++ Zope3/trunk/src/zope/app/zptpage/textindex/configure.zcml 2004-12-09 20:56:05 UTC (rev 28610)
@@ -2,7 +2,7 @@
<adapter
for="..interfaces.IZPTPage"
- provides="zope.index.interfaces.searchabletext.ISearchableText"
+ provides="zope.index.text.interfaces.ISearchableText"
factory=".zptpage.SearchableText"
/>
Modified: Zope3/trunk/src/zope/app/zptpage/textindex/tests.py
===================================================================
--- Zope3/trunk/src/zope/app/zptpage/textindex/tests.py 2004-12-09 20:53:47 UTC (rev 28609)
+++ Zope3/trunk/src/zope/app/zptpage/textindex/tests.py 2004-12-09 20:56:05 UTC (rev 28610)
@@ -16,7 +16,7 @@
$Id$
"""
-from zope.index.interfaces.searchabletext import ISearchableText
+from zope.index.text.interfaces import ISearchableText
from zope.app.tests import ztapi
from zope.app.tests.placelesssetup import PlacelessSetup
from zope.app.zptpage.interfaces import IZPTPage
Modified: Zope3/trunk/src/zope/app/zptpage/textindex/zptpage.py
===================================================================
--- Zope3/trunk/src/zope/app/zptpage/textindex/zptpage.py 2004-12-09 20:53:47 UTC (rev 28609)
+++ Zope3/trunk/src/zope/app/zptpage/textindex/zptpage.py 2004-12-09 20:56:05 UTC (rev 28610)
@@ -17,7 +17,7 @@
from zope.interface import implements
from zope.app.zptpage.interfaces import IZPTPage
-from zope.index.interfaces.searchabletext import ISearchableText
+from zope.index.text.interfaces import ISearchableText
import re
tag = re.compile(r"<[^>]+>")
Copied: Zope3/trunk/src/zope/index/field/README.txt (from rev 28607, Zope3/branches/jim-index-restructure-2004-12/src/zope/index/field/README.txt)
Modified: Zope3/trunk/src/zope/index/field/index.py
===================================================================
--- Zope3/trunk/src/zope/index/field/index.py 2004-12-09 20:53:47 UTC (rev 28609)
+++ Zope3/trunk/src/zope/index/field/index.py 2004-12-09 20:56:05 UTC (rev 28610)
@@ -15,24 +15,25 @@
$Id$
"""
-from persistent import Persistent
+import persistent
from BTrees.IOBTree import IOBTree
from BTrees.OOBTree import OOBTree
-from BTrees.IIBTree import IITreeSet, IISet, union
+from BTrees.IFBTree import IFTreeSet, IFSet, multiunion
from BTrees.Length import Length
-from types import ListType, TupleType
-from zope.interface import implements
+import zope.interface
-from zope.index.interfaces import IInjection, ISimpleQuery
-from zope.index.interfaces import IStatistics, IRangeQuerying
+from zope.index import interfaces
+class FieldIndex(persistent.Persistent):
-class FieldIndex(Persistent):
+ zope.interface.implements(
+ interfaces.IInjection,
+ interfaces.IStatistics,
+ interfaces.IIndexSearch,
+ )
- implements(IRangeQuerying, IInjection, ISimpleQuery, IStatistics)
-
def __init__(self):
self.clear()
@@ -52,71 +53,47 @@
"""See interface IStatistics"""
return len(self._fwd_index)
- def has_doc(self, docid):
- return bool(self._rev_index.has_key(docid))
-
def index_doc(self, docid, value):
"""See interface IInjection"""
- if self.has_doc(docid): # unindex doc if present
+ rev_index = self._rev_index
+ if docid in rev_index:
+ # unindex doc if present
self.unindex_doc(docid)
- self._insert_forward(docid, value)
- self._insert_reverse(docid, value)
+ # Insert into forward index.
+ set = self._fwd_index.get(value)
+ if set is None:
+ set = IFTreeSet()
+ self._fwd_index[value] = set
+ set.insert(docid)
+
+ # increment doc count
+ self._num_docs.change(1)
+
+ # Insert into reverse index.
+ rev_index[docid] = value
+
def unindex_doc(self, docid):
"""See interface IInjection"""
- try: # ignore non-existing docids, don't raise
- value = self._rev_index[docid]
- except KeyError:
- return
+ rev_index = self._rev_index
+ value = rev_index.get(docid)
+ if value is None:
+ return # not in index
- del self._rev_index[docid]
+ del rev_index[docid]
try:
- self._fwd_index[value].remove(docid)
- if len(self._fwd_index[value]) == 0:
- del self._fwd_index[value]
+ set = self._fwd_index[value]
+ set.remove(docid)
except KeyError:
+ # This is fishy, but we don't want to raise an error.
+ # We should probably log something.
pass
- self._num_docs.change(-1)
- def search(self, values):
- "See interface ISimpleQuerying"
- # values can either be a single value or a sequence of
- # values to be searched.
- if isinstance(values, (ListType, TupleType)):
- result = IISet()
- for value in values:
- try:
- r = IISet(self._fwd_index[value])
- except KeyError:
- continue
- # the results of all subsearches are combined using OR
- result = union(result, r)
- else:
- try:
- result = IISet(self._fwd_index[values])
- except KeyError:
- result = IISet()
+ if not set:
+ del self._fwd_index[value]
- return result
+ self._num_docs.change(-1)
- def query(self, querytext, start=0, count=None):
- """See interface IQuerying"""
- res = self.search(querytext)
- if start or count:
- res = res[start:start+count]
- return res
-
- def rangesearch(self, minvalue, maxvalue):
- return IISet(self._fwd_index.keys(minvalue, maxvalue))
-
- def _insert_forward(self, docid, value):
- """Insert into forward index."""
- if not self._fwd_index.has_key(value):
- self._fwd_index[value] = IITreeSet()
- self._fwd_index[value].insert(docid)
- self._num_docs.change(1)
-
- def _insert_reverse(self, docid, value):
- """Insert into reverse index."""
- self._rev_index[docid] = value
+ def apply(self, query):
+ return multiunion(self._fwd_index.values(*query))
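Note on the field index change above: the rewritten FieldIndex keeps a forward index (value -> set of docids) and a reverse index (docid -> value), and apply() takes a (min, max) pair and unions the docid sets for every value in that range. The sketch below illustrates that shape only. It assumes plain dicts and sets as stand-ins for OOBTree, IOBTree and IFTreeSet, and the class name SketchFieldIndex is hypothetical; it is not the code from this revision.

```python
from bisect import bisect_left, bisect_right


class SketchFieldIndex:
    """Illustrative field index; dicts and sets stand in for BTrees."""

    def __init__(self):
        self._fwd_index = {}   # value -> set of docids
        self._rev_index = {}   # docid -> value

    def index_doc(self, docid, value):
        if docid in self._rev_index:
            self.unindex_doc(docid)           # re-indexing replaces the old value
        self._fwd_index.setdefault(value, set()).add(docid)
        self._rev_index[docid] = value

    def unindex_doc(self, docid):
        if docid not in self._rev_index:
            return                            # unknown docids are ignored
        value = self._rev_index.pop(docid)
        docids = self._fwd_index.get(value)
        if docids is not None:
            docids.discard(docid)
            if not docids:
                del self._fwd_index[value]    # drop empty value entries

    def apply(self, query):
        """Return the docids whose value lies in the inclusive (min, max) range."""
        lo, hi = query
        keys = sorted(self._fwd_index)
        result = set()
        for key in keys[bisect_left(keys, lo):bisect_right(keys, hi)]:
            result |= self._fwd_index[key]
        return result
```

An exact-match search is the degenerate range, for example apply((20, 20)). Sorting the keys on every call is acceptable for a sketch; a BTree keeps its keys ordered, which is what makes the range lookup cheap in the real index.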
Copied: Zope3/trunk/src/zope/index/field/tests.py (from rev 28607, Zope3/branches/jim-index-restructure-2004-12/src/zope/index/field/tests.py)
Property changes on: Zope3/trunk/src/zope/index/field/tests.py
___________________________________________________________________
Name: cvs2svn:cvs-rev
+ 1.3
Name: svn:eol-style
+ native
Copied: Zope3/trunk/src/zope/index/interfaces.py (from rev 28607, Zope3/branches/jim-index-restructure-2004-12/src/zope/index/interfaces.py)
Property changes on: Zope3/trunk/src/zope/index/interfaces.py
___________________________________________________________________
Name: cvs2svn:cvs-rev
+ 1.8
Name: svn:keywords
+ Id
Name: svn:eol-style
+ native
Modified: Zope3/trunk/src/zope/index/keyword/index.py
===================================================================
--- Zope3/trunk/src/zope/index/keyword/index.py 2004-12-09 20:53:47 UTC (rev 28609)
+++ Zope3/trunk/src/zope/index/keyword/index.py 2004-12-09 20:56:05 UTC (rev 28610)
@@ -23,7 +23,8 @@
from BTrees.Length import Length
from types import ListType, TupleType, StringTypes
-from zope.index.interfaces import IInjection, IKeywordQuerying, IStatistics
+from zope.index.interfaces import IInjection, IStatistics
+from zope.index.keyword.interfaces import IKeywordQuerying
from zope.interface import implements
class KeywordIndex(Persistent):
Copied: Zope3/trunk/src/zope/index/keyword/interfaces.py (from rev 28607, Zope3/branches/jim-index-restructure-2004-12/src/zope/index/keyword/interfaces.py)
Property changes on: Zope3/trunk/src/zope/index/keyword/interfaces.py
___________________________________________________________________
Name: cvs2svn:cvs-rev
+ 1.8
Name: svn:keywords
+ Id
Name: svn:eol-style
+ native
Modified: Zope3/trunk/src/zope/index/keyword/tests.py
===================================================================
--- Zope3/trunk/src/zope/index/keyword/tests.py 2004-12-09 20:53:47 UTC (rev 28609)
+++ Zope3/trunk/src/zope/index/keyword/tests.py 2004-12-09 20:56:05 UTC (rev 28610)
@@ -16,7 +16,8 @@
from BTrees.IIBTree import IISet
from zope.index.keyword.index import KeywordIndex
-from zope.index.interfaces import IInjection, IStatistics, IKeywordQuerying
+from zope.index.interfaces import IInjection, IStatistics
+from zope.index.keyword.interfaces import IKeywordQuerying
from zope.interface.verify import verifyClass
class KeywordIndexTest(TestCase):
Copied: Zope3/trunk/src/zope/index/nbest.py (from rev 28607, Zope3/branches/jim-index-restructure-2004-12/src/zope/index/nbest.py)
Property changes on: Zope3/trunk/src/zope/index/nbest.py
___________________________________________________________________
Name: cvs2svn:cvs-rev
+ 1.1
Name: svn:eol-style
+ native
Copied: Zope3/trunk/src/zope/index/tests.py (from rev 28607, Zope3/branches/jim-index-restructure-2004-12/src/zope/index/tests.py)
Property changes on: Zope3/trunk/src/zope/index/tests.py
___________________________________________________________________
Name: cvs2svn:cvs-rev
+ 1.1
Name: svn:eol-style
+ native
Modified: Zope3/trunk/src/zope/index/text/__init__.py
===================================================================
--- Zope3/trunk/src/zope/index/text/__init__.py 2004-12-09 20:53:47 UTC (rev 28609)
+++ Zope3/trunk/src/zope/index/text/__init__.py 2004-12-09 20:56:05 UTC (rev 28610)
@@ -1 +1 @@
-from zope.index.text.textindexwrapper import TextIndexWrapper as TextIndex
+from zope.index.text.textindex import TextIndex
Modified: Zope3/trunk/src/zope/index/text/baseindex.py
===================================================================
--- Zope3/trunk/src/zope/index/text/baseindex.py 2004-12-09 20:53:47 UTC (rev 28609)
+++ Zope3/trunk/src/zope/index/text/baseindex.py 2004-12-09 20:56:05 UTC (rev 28610)
@@ -21,35 +21,21 @@
from zope.interface import implements
from BTrees.IOBTree import IOBTree
-from BTrees.IIBTree import IIBTree, IITreeSet
-from BTrees.IIBTree import intersection, difference
+from BTrees.IFBTree import IFBTree, IFTreeSet
+from BTrees.IFBTree import intersection, difference
from BTrees import Length
-from zope.index.interfaces import IInjection, IStatistics, IExtendedQuerying
+from zope.index.interfaces import IInjection, IStatistics
+
+from zope.index.text.interfaces import IExtendedQuerying
from zope.index.text import widcode
from zope.index.text.setops import mass_weightedIntersection, \
mass_weightedUnion
-# Instead of storing floats, we generally store scaled ints. Binary pickles
-# can store those more efficiently. The default SCALE_FACTOR of 1024
-# is large enough to get about 3 decimal digits of fractional info, and
-# small enough so that scaled values should almost always fit in a signed
-# 16-bit int (we're generally storing logs, so a few bits before the radix
-# point goes a long way; on the flip side, for reasonably small numbers x
-# most of the info in log(x) is in the fractional bits, so we do want to
-# save a lot of those).
-SCALE_FACTOR = 1024.0
-
-def scaled_int(f, scale=SCALE_FACTOR):
- # We expect only positive inputs, so "add a half and chop" is the
- # same as round(). Surprising, calling round() is significantly more
- # expensive.
- return int(f * scale + 0.5)
-
def unique(L):
"""Return a list of the unique elements in L."""
- return IITreeSet(L).keys()
+ return IFTreeSet(L).keys()
class BaseIndex(Persistent):
implements(IInjection, IStatistics, IExtendedQuerying)
@@ -76,7 +62,7 @@
# Different indexers have different notions of doc weight, but we
# expect each indexer to use ._docweight to map docids to its
# notion of what a doc weight is.
- self._docweight = IIBTree()
+ self._docweight = IFBTree()
# docid -> WidCode'd list of wids
# Used for un-indexing, and for phrase search.
@@ -125,8 +111,8 @@
new_wids = self._lexicon.sourceToWordIds(text)
new_wid2w, new_docw = self._get_frequencies(new_wids)
- old_widset = IITreeSet(old_wid2w.keys())
- new_widset = IITreeSet(new_wid2w.keys())
+ old_widset = IFTreeSet(old_wid2w.keys())
+ new_widset = IFTreeSet(new_wid2w.keys())
in_both_widset = intersection(old_widset, new_widset)
only_old_widset = difference(old_widset, in_both_widset)
@@ -192,13 +178,13 @@
cleaned_wids = self._remove_oov_wids(wids)
if len(wids) != len(cleaned_wids):
# At least one wid was OOV: can't possibly find it.
- return IIBTree()
+ return IFBTree()
scores = self._search_wids(wids)
hits = mass_weightedIntersection(scores)
if not hits:
return hits
code = widcode.encode(wids)
- result = IIBTree()
+ result = IFBTree()
for docid, weight in hits.items():
docwords = self._docwords[docid]
if docwords.find(code) >= 0:
@@ -209,8 +195,8 @@
return filter(self._wordinfo.has_key, wids)
# Subclass must override.
- # The workhorse. Return a list of (IIBucket, weight) pairs, one pair
- # for each wid t in wids. The IIBucket, times the weight, maps D to
+ # The workhorse. Return a list of (IFBucket, weight) pairs, one pair
+ # for each wid t in wids. The IFBucket, times the weight, maps D to
# TF(D,t) * IDF(t) for every docid D containing t. wids must not
# contain any OOV words.
def _search_wids(self, wids):
@@ -231,24 +217,24 @@
def _add_wordinfo(self, wid, f, docid):
# Store a wordinfo in a dict as long as there are less than
- # DICT_CUTOFF docids in the dict. Otherwise use an IIBTree.
+ # DICT_CUTOFF docids in the dict. Otherwise use an IFBTree.
# The pickle of a dict is smaller than the pickle of an
- # IIBTree, substantially so for small mappings. Thus, we use
+ # IFBTree, substantially so for small mappings. Thus, we use
# a dictionary until the mapping reaches DICT_CUTOFF elements.
# The cutoff is chosen based on the implementation
# characteristics of Python dictionaries. The dict hashtable
# always has 2**N slots and is resized whenever it is 2/3s
# full. A pickled dict with 10 elts is half the size of an
- # IIBTree with 10 elts, and 10 happens to be 2/3s of 2**4. So
+ # IFBTree with 10 elts, and 10 happens to be 2/3s of 2**4. So
# choose 10 as the cutoff for now.
- # The IIBTree has a smaller in-memory representation than a
+ # The IFBTree has a smaller in-memory representation than a
# dictionary, so pickle size isn't the only consideration when
# choosing the threshold. The pickle of a 500-elt dict is 92%
- # of the size of the same IIBTree, but the dict uses more
- # space when it is live in memory. An IIBTree stores two C
+ # of the size of the same IFBTree, but the dict uses more
+ # space when it is live in memory. An IFBTree stores two C
# arrays of ints, one for the keys and one for the values. It
# holds up to 120 key-value pairs in a single bucket.
doc2score = self._wordinfo.get(wid)
@@ -257,13 +243,13 @@
self.wordCount.change(1)
else:
# _add_wordinfo() is called for each update. If the map
- # size exceeds the DICT_CUTOFF, convert to an IIBTree.
+ # size exceeds the DICT_CUTOFF, convert to an IFBTree.
# Obscure: First check the type. If it's not a dict, it
# can't need conversion, and then we can avoid an expensive
- # len(IIBTree).
+ # len(IFBTree).
if (isinstance(doc2score, type({})) and
len(doc2score) == self.DICT_CUTOFF):
- doc2score = IIBTree(doc2score)
+ doc2score = IFBTree(doc2score)
doc2score[docid] = f
self._wordinfo[wid] = doc2score # not redundant: Persistency!
@@ -286,7 +272,7 @@
new_word_count += 1
elif (isinstance(doc2score, dicttype) and
len(doc2score) == self.DICT_CUTOFF):
- doc2score = IIBTree(doc2score)
+ doc2score = IFBTree(doc2score)
doc2score[docid] = weight
self._wordinfo[wid] = doc2score # not redundant: Persistency!
self.wordCount.change(new_word_count)
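Note on the removed scaled_int helper: the deleted comment explains that weights were stored as ints scaled by 1024, which keeps roughly three decimal digits of the fraction. The sketch below reproduces the removed function to show the rounding error this implies, which is the precision the switch to float-valued IFBTree storage no longer gives up. The unscale helper is hypothetical and was not part of the original module.

```python
SCALE_FACTOR = 1024.0


def scaled_int(f, scale=SCALE_FACTOR):
    # For non-negative f, adding a half and truncating rounds to nearest.
    return int(f * scale + 0.5)


def unscale(n, scale=SCALE_FACTOR):
    # Hypothetical inverse, for illustration only.
    return n / scale


weight = 0.3
stored = scaled_int(weight)                 # integer that an IIBTree could hold
error = abs(unscale(stored) - weight)       # information lost by the scaling
```

The round-trip error is bounded by half a scaled unit, that is 0.5 / 1024.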
Modified: Zope3/trunk/src/zope/index/text/cosineindex.py
===================================================================
--- Zope3/trunk/src/zope/index/text/cosineindex.py 2004-12-09 20:53:47 UTC (rev 28609)
+++ Zope3/trunk/src/zope/index/text/cosineindex.py 2004-12-09 20:56:05 UTC (rev 28610)
@@ -17,11 +17,10 @@
"""
import math
-from BTrees.IIBTree import IIBucket
+from BTrees.IFBTree import IFBucket
from zope.interface import implements
from zope.index.text.baseindex import BaseIndex, inverse_doc_frequency
-from zope.index.text.baseindex import scaled_int, SCALE_FACTOR
class CosineIndex(BaseIndex):
@@ -76,8 +75,8 @@
idf = inverse_doc_frequency(len(d2w), N) # an unscaled float
#print "idf = %.3f" % idf
if isinstance(d2w, DictType):
- d2w = IIBucket(d2w)
- L.append((d2w, scaled_int(idf)))
+ d2w = IFBucket(d2w)
+ L.append((d2w, idf))
return L
def query_weight(self, terms):
@@ -89,7 +88,7 @@
for wid in self._remove_oov_wids(wids):
wt = inverse_doc_frequency(len(self._wordinfo[wid]), N)
sum += wt ** 2.0
- return scaled_int(math.sqrt(sum))
+ return math.sqrt(sum)
def _get_frequencies(self, wids):
d = {}
@@ -105,16 +104,16 @@
#print "W = %.3f" % W
for wid, weight in d.items():
#print i, ":", "%.3f" % weight,
- d[wid] = scaled_int(weight / W)
+ d[wid] = weight / W
#print "->", d[wid]
- return d, scaled_int(W)
+ return d, W
# The rest are helper methods to support unit tests
def _get_wdt(self, d, t):
wid, = self._lexicon.termToWordIds(t)
map = self._wordinfo[wid]
- return map.get(d, 0) * self._docweight[d] / SCALE_FACTOR
+ return map.get(d, 0) * self._docweight[d]
def _get_Wd(self, d):
return self._docweight[d]
@@ -126,7 +125,7 @@
def _get_wt(self, t):
wid, = self._lexicon.termToWordIds(t)
map = self._wordinfo[wid]
- return scaled_int(math.log(1 + len(self._docweight) / float(len(map))))
+ return math.log(1 + len(self._docweight) / float(len(map)))
def doc_term_weight(count):
"""Return the doc-term weight for a term that appears count times."""
Modified: Zope3/trunk/src/zope/index/text/htmlsplitter.py
===================================================================
--- Zope3/trunk/src/zope/index/text/htmlsplitter.py 2004-12-09 20:53:47 UTC (rev 28609)
+++ Zope3/trunk/src/zope/index/text/htmlsplitter.py 2004-12-09 20:56:05 UTC (rev 28610)
@@ -19,10 +19,8 @@
from zope.interface import implements
-from zope.index.interfaces.splitter import ISplitter
-from zope.index.text.pipelinefactory import element_factory
+from zope.index.text.interfaces import ISplitter
-
class HTMLWordSplitter(object):
implements(ISplitter)
@@ -45,10 +43,6 @@
text = re.sub(pat, " ", text)
return re.findall(wordpat, text)
-element_factory.registerFactory('Word Splitter',
- 'HTML aware splitter',
- HTMLWordSplitter)
-
if __name__ == "__main__":
import sys
splitter = HTMLWordSplitter()
Copied: Zope3/trunk/src/zope/index/text/interfaces.py (from rev 28607, Zope3/branches/jim-index-restructure-2004-12/src/zope/index/text/interfaces.py)
Property changes on: Zope3/trunk/src/zope/index/text/interfaces.py
___________________________________________________________________
Name: svn:keywords
+ Id
Name: svn:eol-style
+ native
Modified: Zope3/trunk/src/zope/index/text/lexicon.py
===================================================================
--- Zope3/trunk/src/zope/index/text/lexicon.py 2004-12-09 20:53:47 UTC (rev 28609)
+++ Zope3/trunk/src/zope/index/text/lexicon.py 2004-12-09 20:56:05 UTC (rev 28610)
@@ -24,10 +24,9 @@
from persistent import Persistent
-from zope.index.interfaces.lexicon import ILexicon
+from zope.index.text.interfaces import ILexicon
from zope.index.text.stopdict import get_stopdict
from zope.index.text.parsetree import QueryError
-from zope.index.text.pipelinefactory import element_factory
class Lexicon(Persistent):
@@ -175,23 +174,11 @@
result += self.rxGlob.findall(s)
return result
-element_factory.registerFactory('Word Splitter',
- 'Whitespace splitter',
- Splitter)
-
class CaseNormalizer(object):
def process(self, lst):
return [w.lower() for w in lst]
-element_factory.registerFactory('Case Normalizer',
- 'Case Normalizer',
- CaseNormalizer)
-
-element_factory.registerFactory('Stop Words',
- ' Don\'t remove stop words',
- None)
-
class StopWordRemover(object):
dict = get_stopdict().copy()
@@ -206,16 +193,8 @@
def process(self, lst):
return self._process(self.dict, lst)
-element_factory.registerFactory('Stop Words',
- 'Remove listed stop words only',
- StopWordRemover)
-
class StopWordAndSingleCharRemover(StopWordRemover):
dict = get_stopdict().copy()
for c in range(255):
dict[chr(c)] = None
-
-element_factory.registerFactory('Stop Words',
- 'Remove listed and single char words',
- StopWordAndSingleCharRemover)
Deleted: Zope3/trunk/src/zope/index/text/nbest.py
===================================================================
--- Zope3/trunk/src/zope/index/text/nbest.py 2004-12-09 20:53:47 UTC (rev 28609)
+++ Zope3/trunk/src/zope/index/text/nbest.py 2004-12-09 20:56:05 UTC (rev 28610)
@@ -1,79 +0,0 @@
-##############################################################################
-#
-# Copyright (c) 2002 Zope Corporation and Contributors.
-# All Rights Reserved.
-#
-# This software is subject to the provisions of the Zope Public License,
-# Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution.
-# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED
-# WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
-# WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS
-# FOR A PARTICULAR PURPOSE
-#
-##############################################################################
-"""NBest
-
-An NBest object remembers the N best-scoring items ever passed to its
-.add(item, score) method. If .add() is called M times, the worst-case
-number of comparisons performed overall is M * log2(N).
-
-$Id$
-"""
-
-from bisect import bisect_left as bisect
-
-from zope.index.interfaces.nbest import INBest
-from zope.interface import implements
-
-class NBest(object):
- implements(INBest)
-
- def __init__(self, N):
- "Build an NBest object to remember the N best-scoring objects."
-
- if N < 1:
- raise ValueError("NBest() argument must be at least 1")
- self._capacity = N
-
- # This does a very simple thing with sorted lists. For large
- # N, a min-heap can be unboundedly better in terms of data
- # movement time.
- self._scores = []
- self._items = []
-
- def __len__(self):
- return len(self._scores)
-
- def capacity(self):
- return self._capacity
-
- def add(self, item, score):
- self.addmany([(item, score)])
-
- def addmany(self, sequence):
- scores, items, capacity = self._scores, self._items, self._capacity
- n = len(scores)
- for item, score in sequence:
- # When we're in steady-state, the usual case is that we're filled
- # to capacity, and that an incoming item is worse than any of
- # the best-seen so far.
- if n >= capacity and score <= scores[0]:
- continue
- i = bisect(scores, score)
- scores.insert(i, score)
- items.insert(i, item)
- if n == capacity:
- del items[0], scores[0]
- else:
- n += 1
- assert n == len(scores)
-
- def getbest(self):
- result = zip(self._items, self._scores)
- result.reverse()
- return result
-
- def pop_smallest(self):
- if self._scores:
- return self._items.pop(0), self._scores.pop(0)
- raise IndexError("pop_smallest() called on empty NBest object")
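Note on the NBest module moved above: its own comment observes that the sorted-list approach is simple and that a min-heap can move far less data for large N. The sketch below shows what that alternative could look like using the standard heapq module. The class name HeapNBest is hypothetical, this is not the code that was moved to zope.index.nbest, and the ordering of items with equal scores is not guaranteed to match the original.

```python
import heapq
from itertools import count


class HeapNBest:
    """Remember the N best-scoring items using a min-heap."""

    def __init__(self, N):
        if N < 1:
            raise ValueError("HeapNBest() argument must be at least 1")
        self._capacity = N
        self._heap = []            # min-heap of (score, tiebreak, item)
        self._tiebreak = count()   # ensures items themselves are never compared

    def __len__(self):
        return len(self._heap)

    def capacity(self):
        return self._capacity

    def add(self, item, score):
        entry = (score, next(self._tiebreak), item)
        if len(self._heap) < self._capacity:
            heapq.heappush(self._heap, entry)
        elif score > self._heap[0][0]:
            # Full: replace the current worst only if the new score beats it.
            heapq.heapreplace(self._heap, entry)

    def addmany(self, sequence):
        for item, score in sequence:
            self.add(item, score)

    def getbest(self):
        # Best score first, as (item, score) pairs.
        ordered = sorted(self._heap, key=lambda e: e[0], reverse=True)
        return [(item, score) for score, _, item in ordered]
```

Each add() costs O(log N) without shifting list elements, at the price of sorting once in getbest().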
Modified: Zope3/trunk/src/zope/index/text/okapiindex.py
===================================================================
--- Zope3/trunk/src/zope/index/text/okapiindex.py 2004-12-09 20:53:47 UTC (rev 28609)
+++ Zope3/trunk/src/zope/index/text/okapiindex.py 2004-12-09 20:56:05 UTC (rev 28610)
@@ -191,10 +191,10 @@
$Id$
"""
-from BTrees.IIBTree import IIBucket
+from BTrees.IFBTree import IFBucket
from zope.index.text.baseindex import BaseIndex
-from zope.index.text.baseindex import inverse_doc_frequency, scaled_int
+from zope.index.text.baseindex import inverse_doc_frequency
class OkapiIndex(BaseIndex):
@@ -234,12 +234,11 @@
self._totaldoclen -= self._docweight.get(docid, 0)
BaseIndex.unindex_doc(self, docid)
- # The workhorse. Return a list of (IIBucket, weight) pairs, one pair
- # for each wid t in wids. The IIBucket, times the weight, maps D to
+ # The workhorse. Return a list of (IFBucket, weight) pairs, one pair
+ # for each wid t in wids. The IFBucket, times the weight, maps D to
# TF(D,t) * IDF(t) for every docid D containing t.
- # As currently written, the weights are always 1, and the IIBucket maps
- # D to TF(D,t)*IDF(t) directly, where the product is computed as a float
- # but stored as a scaled_int.
+ # As currently written, the weights are always 1, and the IFBucket maps
+ # D to TF(D,t)*IDF(t) directly, where the product is computed as a float.
# NOTE: This may be overridden below, by a function that computes the
# same thing but with the inner scoring loop in C.
def _search_wids(self, wids):
@@ -261,11 +260,11 @@
for t in wids:
d2f = self._wordinfo[t] # map {docid -> f(docid, t)}
idf = inverse_doc_frequency(len(d2f), N) # an unscaled float
- result = IIBucket()
+ result = IFBucket()
for docid, f in d2f.items():
lenweight = B_from1 + B * docid2len[docid] / meandoclen
tf = f * K1_plus1 / (f + K1 * lenweight)
- result[docid] = scaled_int(tf * idf)
+ result[docid] = tf * idf
L.append((result, 1))
return L
@@ -305,7 +304,7 @@
for t in wids:
d2f = self._wordinfo[t] # map {docid -> f(docid, t)}
idf = inverse_doc_frequency(len(d2f), N) # an unscaled float
- result = IIBucket()
+ result = IFBucket()
score(result, d2f.items(), docid2len, idf, meandoclen)
L.append((result, 1))
return L
@@ -325,7 +324,7 @@
sum = 0
for t in self._remove_oov_wids(wids):
idf = inverse_doc_frequency(len(self._wordinfo[t]), N)
- sum += scaled_int(idf * tfmax)
+ sum += idf * tfmax
return sum
def _get_frequencies(self, wids):
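Note on the Okapi scoring loop above: with scaled_int gone, each per-document score is simply the float tf * idf. The sketch below works through that inner loop on plain dicts. Several things are assumptions rather than facts from this diff: the constants K1 = 1.2 and B = 0.75, the reading of B_from1 as 1 - B, and the log(1 + N / n) form of inverse_doc_frequency (modelled on the expression in _get_wt). The function name okapi_scores is hypothetical.

```python
import math

K1 = 1.2    # assumed tuning constant, not shown in this diff
B = 0.75    # assumed tuning constant, not shown in this diff


def inverse_doc_frequency(term_count, num_items):
    # Assumed form: log(1 + N / n).
    return math.log(1.0 + float(num_items) / term_count)


def okapi_scores(d2f, docid2len, num_docs):
    """Map docid -> tf * idf for one term, as unscaled floats.

    d2f maps docid -> term frequency; docid2len maps docid -> doc length.
    """
    meandoclen = sum(docid2len.values()) / float(len(docid2len))
    idf = inverse_doc_frequency(len(d2f), num_docs)
    result = {}
    for docid, f in d2f.items():
        # Longer-than-average documents get a larger length weight.
        lenweight = (1.0 - B) + B * docid2len[docid] / meandoclen
        tf = f * (K1 + 1.0) / (f + K1 * lenweight)
        result[docid] = tf * idf
    return result
```

For two documents with the same term frequency, the shorter one receives the higher score, which is the length normalization the lenweight term provides.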
Modified: Zope3/trunk/src/zope/index/text/parsetree.py
===================================================================
--- Zope3/trunk/src/zope/index/text/parsetree.py 2004-12-09 20:53:47 UTC (rev 28609)
+++ Zope3/trunk/src/zope/index/text/parsetree.py 2004-12-09 20:56:05 UTC (rev 28610)
@@ -15,10 +15,11 @@
$Id$
"""
-from BTrees.IIBTree import difference
+from BTrees.IFBTree import difference
-from zope.index.interfaces.queryparsetree import IQueryParseTree
-from zope.index.text.setops import mass_weightedIntersection, mass_weightedUnion
+from zope.index.text.interfaces import IQueryParseTree
+from zope.index.text.setops import mass_weightedIntersection
+from zope.index.text.setops import mass_weightedUnion
from zope.interface import implements
Deleted: Zope3/trunk/src/zope/index/text/pipelinefactory.py
===================================================================
--- Zope3/trunk/src/zope/index/text/pipelinefactory.py 2004-12-09 20:53:47 UTC (rev 28609)
+++ Zope3/trunk/src/zope/index/text/pipelinefactory.py 2004-12-09 20:56:05 UTC (rev 28610)
@@ -1,55 +0,0 @@
-##############################################################################
-#
-# Copyright (c) 2002 Zope Corporation and Contributors.
-# All Rights Reserved.
-#
-# This software is subject to the provisions of the Zope Public License,
-# Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution.
-# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED
-# WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
-# WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS
-# FOR A PARTICULAR PURPOSE
-#
-##############################################################################
-"""Pipeline Element Factory
-
-$Id$
-"""
-from zope.index.interfaces.pipelineelementfactory import IPipelineElementFactory
-from zope.interface import implements
-
-class PipelineElementFactory(object):
-
- implements(IPipelineElementFactory)
-
- def __init__(self):
- self._groups = {}
-
- def registerFactory(self, group, name, factory):
- if self._groups.has_key(group) and \
- self._groups[group].has_key(name):
- raise ValueError('ZCTextIndex lexicon element "%s" '
- 'already registered in group "%s"'
- % (name, group))
-
- elements = self._groups.get(group)
- if elements is None:
- elements = self._groups[group] = {}
- elements[name] = factory
-
- def getFactoryGroups(self):
- groups = self._groups.keys()
- groups.sort()
- return groups
-
- def getFactoryNames(self, group):
- names = self._groups[group].keys()
- names.sort()
- return names
-
- def instantiate(self, group, name):
- factory = self._groups[group][name]
- if factory is not None:
- return factory()
-
-element_factory = PipelineElementFactory()
Modified: Zope3/trunk/src/zope/index/text/queryparser.py
===================================================================
--- Zope3/trunk/src/zope/index/text/queryparser.py 2004-12-09 20:53:47 UTC (rev 28609)
+++ Zope3/trunk/src/zope/index/text/queryparser.py 2004-12-09 20:56:05 UTC (rev 28610)
@@ -59,7 +59,7 @@
import re
from zope.interface import implements
-from zope.index.interfaces.queryparser import IQueryParser
+from zope.index.text.interfaces import IQueryParser
from zope.index.text import parsetree
# Create unique symbols for token types.
Modified: Zope3/trunk/src/zope/index/text/setops.py
===================================================================
--- Zope3/trunk/src/zope/index/text/setops.py 2004-12-09 20:53:47 UTC (rev 28609)
+++ Zope3/trunk/src/zope/index/text/setops.py 2004-12-09 20:56:05 UTC (rev 28610)
@@ -15,18 +15,17 @@
$Id$
"""
-from BTrees.IIBTree import \
- IIBucket, weightedIntersection, weightedUnion
+from BTrees.IFBTree import IFBucket, weightedIntersection, weightedUnion
-from zope.index.text.nbest import NBest
+from zope.index.nbest import NBest
def mass_weightedIntersection(L):
- "A list of (mapping, weight) pairs -> their weightedIntersection IIBucket."
+ "A list of (mapping, weight) pairs -> their weightedIntersection IFBucket."
L = [(x, wx) for (x, wx) in L if x is not None]
if len(L) < 2:
return _trivial(L)
# Intersect with smallest first. We expect the input maps to be
- # IIBuckets, so it doesn't hurt to get their lengths repeatedly
+ # IFBuckets, so it doesn't hurt to get their lengths repeatedly
# (len(Bucket) is fast; len(BTree) is slow).
L.sort(lambda x, y: cmp(len(x[0]), len(y[0])))
(x, wx), (y, wy) = L[:2]
@@ -36,7 +35,7 @@
return result
def mass_weightedUnion(L):
- "A list of (mapping, weight) pairs -> their weightedUnion IIBucket."
+ "A list of (mapping, weight) pairs -> their weightedUnion IFBucket."
if len(L) < 2:
return _trivial(L)
# Balance unions as closely as possible, smallest to largest.
@@ -57,8 +56,8 @@
# pair, we may still need to multiply the mapping by its weight.
assert len(L) <= 1
if len(L) == 0:
- return IIBucket()
+ return IFBucket()
[(result, weight)] = L
if weight != 1:
- dummy, result = weightedUnion(IIBucket(), result, 0, weight)
+ dummy, result = weightedUnion(IFBucket(), result, 0, weight)
return result
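Note on the setops helpers above: mass_weightedIntersection filters out None mappings, sorts the remaining (mapping, weight) pairs smallest first, and folds them together so intermediate results stay small. The sketch below illustrates that strategy with plain dicts. The helper weighted_intersection is a simplified stand-in that returns only the combined mapping, whereas the BTrees function returns a (weight, mapping) pair; both function names here are hypothetical.

```python
def weighted_intersection(x, y, wx=1, wy=1):
    # Stand-in: keep keys present in both mappings, combine values linearly.
    return {k: x[k] * wx + y[k] * wy for k in x if k in y}


def mass_weighted_intersection(pairs):
    """Fold a list of (mapping, weight) pairs into one weighted intersection."""
    pairs = [(m, w) for (m, w) in pairs if m is not None]
    if not pairs:
        return {}
    # Intersect with the smallest mapping first to shrink the work early.
    pairs.sort(key=lambda p: len(p[0]))
    first, weight = pairs[0]
    result = {k: v * weight for k, v in first.items()}
    for mapping, w in pairs[1:]:
        # The running result already carries its weights, so it enters as 1.
        result = weighted_intersection(result, mapping, 1, w)
    return result
```

Only docids present in every mapping survive, and each surviving score is the weighted sum of its per-term scores.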
Modified: Zope3/trunk/src/zope/index/text/tests/queryhtml.py
===================================================================
--- Zope3/trunk/src/zope/index/text/tests/queryhtml.py 2004-12-09 20:53:47 UTC (rev 28609)
+++ Zope3/trunk/src/zope/index/text/tests/queryhtml.py 2004-12-09 20:56:05 UTC (rev 28610)
@@ -43,7 +43,7 @@
return "http://www.python.org" + p[i:]
from Products.PluginIndexes.TextIndex.TextIndex import And, Or
-from zope.index.text.nbest import NBest
+from zope.index.nbest import NBest
def main(rt):
index = rt["index"]
Deleted: Zope3/trunk/src/zope/index/text/tests/test_nbest.py
===================================================================
--- Zope3/trunk/src/zope/index/text/tests/test_nbest.py 2004-12-09 20:53:47 UTC (rev 28609)
+++ Zope3/trunk/src/zope/index/text/tests/test_nbest.py 2004-12-09 20:56:05 UTC (rev 28610)
@@ -1,100 +0,0 @@
-##############################################################################
-#
-# Copyright (c) 2002 Zope Corporation and Contributors.
-# All Rights Reserved.
-#
-# This software is subject to the provisions of the Zope Public License,
-# Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution.
-# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED
-# WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
-# WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS
-# FOR A PARTICULAR PURPOSE.
-#
-##############################################################################
-"""N-Best index tests
-
-$Id$
-"""
-from unittest import TestCase, main, makeSuite
-
-from zope.index.text.nbest import NBest
-
-class NBestTest(TestCase):
-
- def testConstructor(self):
- self.assertRaises(ValueError, NBest, 0)
- self.assertRaises(ValueError, NBest, -1)
-
- for n in range(1, 11):
- nb = NBest(n)
- self.assertEqual(len(nb), 0)
- self.assertEqual(nb.capacity(), n)
-
- def testOne(self):
- nb = NBest(1)
- nb.add('a', 0)
- self.assertEqual(nb.getbest(), [('a', 0)])
-
- nb.add('b', 1)
- self.assertEqual(len(nb), 1)
- self.assertEqual(nb.capacity(), 1)
- self.assertEqual(nb.getbest(), [('b', 1)])
-
- nb.add('c', -1)
- self.assertEqual(len(nb), 1)
- self.assertEqual(nb.capacity(), 1)
- self.assertEqual(nb.getbest(), [('b', 1)])
-
- nb.addmany([('d', 3), ('e', -6), ('f', 5), ('g', 4)])
- self.assertEqual(len(nb), 1)
- self.assertEqual(nb.capacity(), 1)
- self.assertEqual(nb.getbest(), [('f', 5)])
-
- def testMany(self):
- import random
- inputs = [(-i, i) for i in range(50)]
-
- reversed_inputs = inputs[:]
- reversed_inputs.reverse()
-
- # Test the N-best for a variety of n (1, 6, 11, ... 50).
- for n in range(1, len(inputs)+1, 5):
- expected = inputs[-n:]
- expected.reverse()
-
- random_inputs = inputs[:]
- random.shuffle(random_inputs)
-
- for source in inputs, reversed_inputs, random_inputs:
- # Try feeding them one at a time.
- nb = NBest(n)
- for item, score in source:
- nb.add(item, score)
- self.assertEqual(len(nb), n)
- self.assertEqual(nb.capacity(), n)
- self.assertEqual(nb.getbest(), expected)
-
- # And again in one gulp.
- nb = NBest(n)
- nb.addmany(source)
- self.assertEqual(len(nb), n)
- self.assertEqual(nb.capacity(), n)
- self.assertEqual(nb.getbest(), expected)
-
- for i in range(1, n+1):
- self.assertEqual(nb.pop_smallest(), expected[-i])
- self.assertRaises(IndexError, nb.pop_smallest)
-
- def testAllSameScore(self):
- inputs = [(i, 0) for i in range(10)]
- for n in range(1, 12):
- nb = NBest(n)
- nb.addmany(inputs)
- outputs = nb.getbest()
- self.assertEqual(outputs, inputs[:len(outputs)])
-
-def test_suite():
- return makeSuite(NBestTest)
-
-if __name__=='__main__':
- main(defaultTest='test_suite')
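
Note on the deleted test above: it documents the NBest contract (capacity of at least 1, best-first results, ties kept in insertion order, pop_smallest raising IndexError when empty). The following is a minimal sketch that satisfies those assertions. The class name NBestSketch and its internals are assumptions for illustration, not the zope.index.nbest implementation.

```python
from bisect import bisect_left


class NBestSketch:
    """Keep the n highest-scoring (item, score) pairs seen so far."""

    def __init__(self, n):
        if n < 1:
            raise ValueError("n must be at least 1")
        self._capacity = n
        self._scores = []  # ascending scores
        self._items = []   # items, parallel to self._scores

    def __len__(self):
        return len(self._scores)

    def capacity(self):
        return self._capacity

    def add(self, item, score):
        full = len(self._scores) >= self._capacity
        if full and score <= self._scores[0]:
            return  # no better than the current worst; earlier items win ties
        # bisect_left puts a tied newcomer below older entries, so the
        # reversed view in getbest() keeps insertion order among ties.
        i = bisect_left(self._scores, score)
        self._scores.insert(i, score)
        self._items.insert(i, item)
        if full:
            del self._scores[0]
            del self._items[0]

    def addmany(self, pairs):
        for item, score in pairs:
            self.add(item, score)

    def getbest(self):
        """Return (item, score) pairs, highest score first."""
        return list(zip(reversed(self._items), reversed(self._scores)))

    def pop_smallest(self):
        if not self._scores:
            raise IndexError("pop_smallest() called on empty NBestSketch")
        return self._items.pop(0), self._scores.pop(0)
```

Each add costs a list insertion, which is fine for the small n typical of result batching. A heap would be the usual alternative for large n.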
Deleted: Zope3/trunk/src/zope/index/text/tests/test_pipelinefactory.py
===================================================================
--- Zope3/trunk/src/zope/index/text/tests/test_pipelinefactory.py 2004-12-09 20:53:47 UTC (rev 28609)
+++ Zope3/trunk/src/zope/index/text/tests/test_pipelinefactory.py 2004-12-09 20:56:05 UTC (rev 28610)
@@ -1,53 +0,0 @@
-##############################################################################
-#
-# Copyright (c) 2002 Zope Corporation and Contributors.
-# All Rights Reserved.
-#
-# This software is subject to the provisions of the Zope Public License,
-# Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution.
-# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED
-# WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
-# WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS
-# FOR A PARTICULAR PURPOSE
-#
-##############################################################################
-"""Pipeline Factory tests
-
-$Id$
-"""
-from unittest import TestCase, main, makeSuite
-from zope.index.interfaces.pipelineelement import IPipelineElement
-from zope.index.text.pipelinefactory import PipelineElementFactory
-from zope.interface import implements
-
-class NullPipelineElement(object):
- implements(IPipelineElement)
-
- def process(source):
- pass
-
-class PipelineFactoryTest(TestCase):
-
- def setUp(self):
- self.huey = NullPipelineElement()
- self.dooey = NullPipelineElement()
- self.louie = NullPipelineElement()
- self.daffy = NullPipelineElement()
-
- def testPipeline(self):
- pf = PipelineElementFactory()
- pf.registerFactory('donald', 'huey', self.huey)
- pf.registerFactory('donald', 'dooey', self.dooey)
- pf.registerFactory('donald', 'louie', self.louie)
- pf.registerFactory('looney', 'daffy', self.daffy)
- self.assertRaises(ValueError, pf.registerFactory,'donald', 'huey',
- self.huey)
- self.assertEqual(pf.getFactoryGroups(), ['donald', 'looney'])
- self.assertEqual(pf.getFactoryNames('donald'),
- ['dooey', 'huey', 'louie'])
-
-def test_suite():
- return makeSuite(PipelineFactoryTest)
-
-if __name__=='__main__':
- main(defaultTest='test_suite')
Modified: Zope3/trunk/src/zope/index/text/tests/test_queryengine.py
===================================================================
--- Zope3/trunk/src/zope/index/text/tests/test_queryengine.py 2004-12-09 20:53:47 UTC (rev 28609)
+++ Zope3/trunk/src/zope/index/text/tests/test_queryengine.py 2004-12-09 20:56:05 UTC (rev 28610)
@@ -17,7 +17,7 @@
"""
import unittest
-from BTrees.IIBTree import IIBucket
+from BTrees.IFBTree import IFBucket
from zope.index.text.queryparser import QueryParser
from zope.index.text.parsetree import QueryError
@@ -26,7 +26,7 @@
class FauxIndex(object):
def search(self, term):
- b = IIBucket()
+ b = IFBucket()
if term == "foo":
b[1] = b[3] = 1
elif term == "bar":
Modified: Zope3/trunk/src/zope/index/text/tests/test_queryparser.py
===================================================================
--- Zope3/trunk/src/zope/index/text/tests/test_queryparser.py 2004-12-09 20:53:47 UTC (rev 28609)
+++ Zope3/trunk/src/zope/index/text/tests/test_queryparser.py 2004-12-09 20:56:05 UTC (rev 28610)
@@ -19,8 +19,8 @@
from zope.interface.verify import verifyClass
-from zope.index.interfaces.queryparser import IQueryParser
-from zope.index.interfaces.queryparsetree import IQueryParseTree
+from zope.index.text.interfaces import IQueryParser
+from zope.index.text.interfaces import IQueryParseTree
from zope.index.text.queryparser import QueryParser
from zope.index.text.parsetree import ParseError, ParseTreeNode
Modified: Zope3/trunk/src/zope/index/text/tests/test_setops.py
===================================================================
--- Zope3/trunk/src/zope/index/text/tests/test_setops.py 2004-12-09 20:53:47 UTC (rev 28609)
+++ Zope3/trunk/src/zope/index/text/tests/test_setops.py 2004-12-09 20:56:05 UTC (rev 28610)
@@ -17,7 +17,7 @@
"""
from unittest import TestCase, main, makeSuite
-from BTrees.IIBTree import IIBTree, IIBucket
+from BTrees.IFBTree import IFBTree, IFBucket
from zope.index.text.setops import mass_weightedIntersection
from zope.index.text.setops import mass_weightedUnion
@@ -29,8 +29,8 @@
self.assertEqual(len(mass_weightedUnion([])), 0)
def testIdentity(self):
- t = IIBTree([(1, 2)])
- b = IIBucket([(1, 2)])
+ t = IFBTree([(1, 2)])
+ b = IFBucket([(1, 2)])
for x in t, b:
for func in mass_weightedUnion, mass_weightedIntersection:
result = func([(x, 1)])
@@ -38,9 +38,9 @@
self.assertEqual(list(result.items()), list(x.items()))
def testScalarMultiply(self):
- t = IIBTree([(1, 2), (2, 3), (3, 4)])
+ t = IFBTree([(1, 2), (2, 3), (3, 4)])
allkeys = [1, 2, 3]
- b = IIBucket(t)
+ b = IFBucket(t)
for x in t, b:
self.assertEqual(list(x.keys()), allkeys)
for func in mass_weightedUnion, mass_weightedIntersection:
@@ -51,11 +51,11 @@
self.assertEqual(x[key] * factor, result[key])
def testPairs(self):
- t1 = IIBTree([(1, 10), (3, 30), (7, 70)])
- t2 = IIBTree([(3, 30), (5, 50), (7, 7), (9, 90)])
+ t1 = IFBTree([(1, 10), (3, 30), (7, 70)])
+ t2 = IFBTree([(3, 30), (5, 50), (7, 7), (9, 90)])
allkeys = [1, 3, 5, 7, 9]
- b1 = IIBucket(t1)
- b2 = IIBucket(t2)
+ b1 = IFBucket(t1)
+ b2 = IFBucket(t2)
for x in t1, t2, b1, b2:
for key in x.keys():
self.assertEqual(key in allkeys, 1)
@@ -87,12 +87,12 @@
def testMany(self):
import random
- N = 15 # number of IIBTrees to feed in
+ N = 15 # number of IFBTrees to feed in
L = []
commonkey = N * 1000
allkeys = {commonkey: 1}
for i in range(N):
- t = IIBTree()
+ t = IFBTree()
t[commonkey] = i
for j in range(N-i):
key = i + j
Modified: Zope3/trunk/src/zope/index/text/tests/test_textindexwrapper.py
===================================================================
--- Zope3/trunk/src/zope/index/text/tests/test_textindexwrapper.py 2004-12-09 20:53:47 UTC (rev 28609)
+++ Zope3/trunk/src/zope/index/text/tests/test_textindexwrapper.py 2004-12-09 20:56:05 UTC (rev 28610)
@@ -18,119 +18,9 @@
import unittest
-from zope.index.text.textindexwrapper import TextIndexWrapper
-from zope.index.text import parsetree
-
-class TextIndexWrapperTest(unittest.TestCase):
-
- def setUp(self):
- w = TextIndexWrapper()
- doc = u"the quick brown fox jumps over the lazy dog"
- w.index_doc(1000, [doc])
- doc = u"the brown fox and the yellow fox don't need the retriever"
- w.index_doc(1001, [doc])
- self.wrapper = w
-
- def test_clear(self):
- self.wrapper.clear()
- self.assertEqual(self.wrapper.documentCount(), 0)
- self.assertEqual(self.wrapper.wordCount(), 0)
-
- def testCounts(self):
- w = self.wrapper
- self.assertEqual(self.wrapper.documentCount(), 2)
- self.assertEqual(self.wrapper.wordCount(), 12)
- doc = u"foo bar"
- w.index_doc(1002, [doc])
- self.assertEqual(self.wrapper.documentCount(), 3)
- self.assertEqual(self.wrapper.wordCount(), 14)
-
- def testOne(self):
- matches, total = self.wrapper.query(u"quick fox", 0, 10)
- self.assertEqual(total, 1)
- [(docid, rank)] = matches # if this fails there's a problem
- self.assertEqual(docid, 1000)
-
- def testDefaultBatch(self):
- matches, total = self.wrapper.query(u"fox", 0)
- self.assertEqual(total, 2)
- self.assertEqual(len(matches), 2)
- matches, total = self.wrapper.query(u"fox")
- self.assertEqual(total, 2)
- self.assertEqual(len(matches), 2)
- matches, total = self.wrapper.query(u" fox", 1)
- self.assertEqual(total, 2)
- self.assertEqual(len(matches), 1)
-
- def testGlobbing(self):
- matches, total = self.wrapper.query("fo*")
- self.assertEqual(total, 2)
- self.assertEqual(len(matches), 2)
-
- def testLatin1(self):
- w = self.wrapper
- doc = u"Fran\xe7ois"
- w.index_doc(1002, [doc])
- matches, total = self.wrapper.query(doc, 0, 10)
- self.assertEqual(total, 1)
- [(docid, rank)] = matches # if this fails there's a problem
- self.assertEqual(docid, 1002)
-
- def testUnicode(self):
- w = self.wrapper
- # Verbose, but easy to debug
- delta = u"\N{GREEK SMALL LETTER DELTA}"
- delta += u"\N{GREEK SMALL LETTER EPSILON}"
- delta += u"\N{GREEK SMALL LETTER LAMDA}"
- delta += u"\N{GREEK SMALL LETTER TAU}"
- delta += u"\N{GREEK SMALL LETTER ALPHA}"
- self.assert_(delta.islower())
- emdash = u"\N{EM DASH}"
- self.assert_(not emdash.isalnum())
- alpha = u"\N{GREEK SMALL LETTER ALPHA}"
- self.assert_(alpha.islower())
- lamda = u"\N{GREEK SMALL LETTER LAMDA}"
- lamda += u"\N{GREEK SMALL LETTER ALPHA}"
- self.assert_(lamda.islower())
- doc = delta + emdash + alpha
- w.index_doc(1002, [doc])
- for word in delta, alpha:
- matches, total = self.wrapper.query(word, 0, 10)
- self.assertEqual(total, 1)
- [(docid, rank)] = matches # if this fails there's a problem
- self.assertEqual(docid, 1002)
- self.assertRaises(parsetree.ParseError,
- self.wrapper.query, emdash, 0, 10)
- matches, total = self.wrapper.query(lamda, 0, 10)
- self.assertEqual(total, 0)
-
- def testNone(self):
- matches, total = self.wrapper.query(u"dalmatian", 0, 10)
- self.assertEqual(total, 0)
- self.assertEqual(len(matches), 0)
-
- def testAll(self):
- matches, total = self.wrapper.query(u"brown fox", 0, 10)
- self.assertEqual(total, 2)
- self.assertEqual(len(matches), 2)
- matches.sort()
- self.assertEqual(matches[0][0], 1000)
- self.assertEqual(matches[1][0], 1001)
-
- def testBatching(self):
- matches1, total = self.wrapper.query(u"brown fox", 0, 1)
- self.assertEqual(total, 2)
- self.assertEqual(len(matches1), 1)
- matches2, total = self.wrapper.query(u"brown fox", 1, 1)
- self.assertEqual(total, 2)
- self.assertEqual(len(matches2), 1)
- matches = matches1 + matches2
- matches.sort()
- self.assertEqual(matches[0][0], 1000)
- self.assertEqual(matches[1][0], 1001)
-
def test_suite():
- return unittest.makeSuite(TextIndexWrapperTest)
-
+ from zope.testing import doctest
+ return doctest.DocFileSuite("../textindex.txt")
+
if __name__=='__main__':
unittest.main(defaultTest='test_suite')
Copied: Zope3/trunk/src/zope/index/text/textindex.py (from rev 28607, Zope3/branches/jim-index-restructure-2004-12/src/zope/index/text/textindex.py)
Property changes on: Zope3/trunk/src/zope/index/text/textindex.py
___________________________________________________________________
Name: cvs2svn:cvs-rev
+ 1.3
Name: svn:keywords
+ Id
Name: svn:eol-style
+ native
Copied: Zope3/trunk/src/zope/index/text/textindex.txt (from rev 28607, Zope3/branches/jim-index-restructure-2004-12/src/zope/index/text/textindex.txt)
Property changes on: Zope3/trunk/src/zope/index/text/textindex.txt
___________________________________________________________________
Name: svn:eol-style
+ native
Deleted: Zope3/trunk/src/zope/index/text/textindexwrapper.py
===================================================================
--- Zope3/trunk/src/zope/index/text/textindexwrapper.py 2004-12-09 20:53:47 UTC (rev 28609)
+++ Zope3/trunk/src/zope/index/text/textindexwrapper.py 2004-12-09 20:56:05 UTC (rev 28610)
@@ -1,90 +0,0 @@
-##############################################################################
-#
-# Copyright (c) 2002 Zope Corporation and Contributors.
-# All Rights Reserved.
-#
-# This software is subject to the provisions of the Zope Public License,
-# Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution.
-# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED
-# WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
-# WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS
-# FOR A PARTICULAR PURPOSE.
-#
-##############################################################################
-"""Text index wrapper.
-
-This exists to implement IInjection and IQuerying.
-
-$Id$
-"""
-
-from persistent import Persistent
-from zope.interface import implements
-
-from zope.index.text.okapiindex import OkapiIndex
-from zope.index.text.lexicon import Lexicon
-from zope.index.text.lexicon import Splitter, CaseNormalizer, StopWordRemover
-from zope.index.text.queryparser import QueryParser
-from zope.index.text.nbest import NBest
-
-from zope.index.interfaces import \
- IInjection, IQuerying, IStatistics
-
-class TextIndexWrapper(Persistent):
-
- implements(IInjection, IQuerying, IStatistics)
-
- def __init__(self, lexicon=None, index=None):
- """Provisional constructor.
-
- This creates the lexicon and index if not passed in."""
- if lexicon is None:
- lexicon = Lexicon(Splitter(), CaseNormalizer(), StopWordRemover())
- if index is None:
- index = OkapiIndex(lexicon)
- self.lexicon = lexicon
- self.index = index
-
- # Methods implementing IInjection
-
- def index_doc(self, docid, text):
- self.index.index_doc(docid, text)
-
- def unindex_doc(self, docid):
- self.index.unindex_doc(docid)
-
- def clear(self):
- self.index.clear()
-
- # Methods implementing IQuerying
-
- def query(self, querytext, start=0, count=None):
- parser = QueryParser(self.lexicon)
- tree = parser.parseQuery(querytext)
- results = tree.executeQuery(self.index)
- if not results:
- return [], 0
- if count is None:
- count = max(0, len(results) - start)
- chooser = NBest(start + count)
- chooser.addmany(results.items())
- batch = chooser.getbest()
- batch = batch[start:]
- if batch:
- qw = self.index.query_weight(tree.terms())
- # Hack to avoid ZeroDivisionError
- if qw == 0:
- qw = batch[0][1] or 1
- qw *= 1.0
- batch = [(docid, score/qw) for docid, score in batch]
- return batch, len(results)
-
- # Methods implementing IStatistics
-
- def documentCount(self):
- """Return the number of documents in the index."""
- return self.index.documentCount()
-
- def wordCount(self):
- """Return the number of words in the index."""
- return self.index.wordCount()
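
Note on the deleted wrapper above: its query() method combined batching with score normalization, work that the log message says belongs in applications calling the index. The following is a minimal sketch of that application-side step. The function name batch_results and the query_weight parameter are hypothetical, and heapq.nlargest stands in for NBest.

```python
import heapq


def batch_results(results, query_weight, start=0, count=None):
    """Return (batch, total) for a {docid: score} mapping.

    batch holds (docid, normalized_score) pairs for positions
    start .. start+count-1 of the best-first ranking; total is the
    number of matching documents.
    """
    if not results:
        return [], 0
    if count is None:
        count = max(0, len(results) - start)
    # Rank only as deep as the requested batch needs.
    best = heapq.nlargest(start + count, results.items(),
                          key=lambda pair: pair[1])
    batch = best[start:]
    if batch:
        qw = query_weight
        if qw == 0:
            # Same guard as the deleted code: avoid dividing by zero.
            qw = batch[0][1] or 1
        qw = float(qw)
        batch = [(docid, score / qw) for docid, score in batch]
    return batch, len(results)


if __name__ == "__main__":
    scores = {1000: 4.0, 1001: 2.0}
    print(batch_results(scores, 2.0, start=0, count=1))
```

One difference from the deleted code: a request with start=0 and count=0 returns an empty batch here, whereas NBest(0) would raise ValueError.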
Modified: Zope3/trunk/src/zope/index/topic/filter.py
===================================================================
--- Zope3/trunk/src/zope/index/topic/filter.py 2004-12-09 20:53:47 UTC (rev 28609)
+++ Zope3/trunk/src/zope/index/topic/filter.py 2004-12-09 20:56:05 UTC (rev 28610)
@@ -16,7 +16,7 @@
$Id$
"""
from BTrees.IIBTree import IISet
-from zope.index.interfaces import ITopicFilteredSet
+from zope.index.topic.interfaces import ITopicFilteredSet
from zope.interface import implements
class FilteredSetBase(object):
Modified: Zope3/trunk/src/zope/index/topic/index.py
===================================================================
--- Zope3/trunk/src/zope/index/topic/index.py 2004-12-09 20:53:47 UTC (rev 28609)
+++ Zope3/trunk/src/zope/index/topic/index.py 2004-12-09 20:56:05 UTC (rev 28610)
@@ -23,7 +23,8 @@
from types import ListType, TupleType, StringTypes
from zope.interface import implements
-from zope.index.interfaces import IInjection, ITopicQuerying
+from zope.index.interfaces import IInjection
+from zope.index.topic.interfaces import ITopicQuerying
class TopicIndex(Persistent):
Copied: Zope3/trunk/src/zope/index/topic/interfaces.py (from rev 28607, Zope3/branches/jim-index-restructure-2004-12/src/zope/index/topic/interfaces.py)
Property changes on: Zope3/trunk/src/zope/index/topic/interfaces.py
___________________________________________________________________
Name: cvs2svn:cvs-rev
+ 1.8
Name: svn:keywords
+ Id
Name: svn:eol-style
+ native