[Zope-CMF] Building a query using NOT

Tres Seaver tseaver@digicool.com
Mon, 21 May 2001 07:54:59 -0400


Flynt wrote:

> I was not so happy with the *Search Interface*: first, there is a
> bug in the form; second, if you don't use the Date metadata the time
> stuff makes no sense at all (I made it depend on the modification
> time); and third and most irritating, if you leave the subject field
> empty, you get no hits at all. I would expect that leaving all fields
> empty returns all documents, as is the case when you click the *Go*
> button with an empty field.
> 
> So I introduced a python script at the top of the *search* method to
> modify the query characteristics. As I read your posting, I thought I'd
> also try to include at least a very simple *NOT* functionality, and this
> might be of interest to you as a starting point for what you would like
> to do.

This is an ideal customization strategy.

> However, the *NOT* I introduced had to fight with the existence of
> whitespace or even empty tuples in the subject field of documents.
> Whitespace keywords are introduced in two ways: first, if you put
> nothing into the subject field of e.g. a newsitem (look at the subject
> entry of a newsitem in the portal_catalog), and second, by users hitting
> newlines and doing other *invisible* stuff whenever they enter a
> category into the subject field of their document. So even if you
> exclude documents under certain categories from the search with *NOT*,
> they will show up nevertheless, **because besides the category they
> were given, they participate in the category *''* as well**, even
> without you being aware of it. Furthermore, documents which were created
> automatically (e.g. the default index_html in each member directory, or
> favourites) have a *()* in their subject entry in the catalog. All this
> made the playing around and testing remarkably interesting and
> time-consuming.

Hmm, the '()' should be a clue to the Subject keyword index not to
include that document at all.  We could, of course, assign a default
Subject to those pages (maybe '("home page", member's ID)');  or we
could quit generating default content for the member, supplying them
instead with a "customization folder" which lets them display topics,
local content, etc.  I would actually prefer the latter:  I don't like
the behavior of Documents as 'index_html' at all.
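
Independent of either option, the blank keywords could be filtered out
before they ever reach the catalog.  A minimal sketch in plain Python,
assuming the subject arrives either as one string with a keyword per
line or as a sequence of strings;  'clean_subject' is a hypothetical
helper name, not an existing CMF method:

```python
def clean_subject(raw):
    """Return a tuple of stripped, non-blank, de-duplicated keywords.

    Hypothetical helper: 'raw' is assumed to be None, a single string
    with one keyword per line, or a sequence of keyword strings.
    """
    if raw is None:
        return ()
    if isinstance(raw, str):
        # Assumption: the edit form delivers one keyword per line.
        raw = raw.splitlines()
    cleaned = []
    for keyword in raw:
        keyword = keyword.strip()
        # Drop '' and whitespace-only entries; keep first occurrence only.
        if keyword and keyword not in cleaned:
            cleaned.append(keyword)
    return tuple(cleaned)
```

A document whose subject field was left empty would then carry '()',
and stray newlines typed by users would no longer create a '' category.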

> So, the *NOT* that I inserted between the *search_form* and the
> *search* method excludes not only the categories listed in the subject
> field, but also all documents with *''* as a category in their subject
> field in the catalog. I point this out to the user on my customized
> *search_form*.
> 
> Here is the python script (named *search_modifySubject*) which might
> help you as a starting point for what you would like to do. I call it
> just before the line <dtml-let results=portal_catalog> in the search
> method:
> ----SNIP *search* ---------------------------------
> <dtml-var standard_html_header>
> 
> <div class="Desktop">
> 
> <h1> Search Results </h1>
> 
> <dtml-call search_modifySubject>
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> <dtml-let results=portal_catalog>
> 
> <p>Found <dtml-var expr="_.len(results)" thousands_commas>
> items<dtml-if name="SearchableText"> matching
> "&dtml-SearchableText;"</dtml-if>.</p>
> ----SNIP *and so on* --------------------------------
> 
> And here is the script itself; essentially I build a list of all the
> existing categories via the *uniqueValuesFor* method of the catalog and
> then simply remove the items that are given as tokens after the *NOT*
> in the subject field of the *search_form*. You can easily extend it to
> accept the *NOT* anywhere on the line, not just at the beginning as I
> do. The script also changes the behavior of a search with all fields
> (including the subject field) left empty, so that it returns all
> documents.
> 
> Small note: in line 6 of the script body I have to set
> REQUEST( 'Subject' ) to '' and not to *subject*, the list of all unique
> categories I had already set up in lines 3 and 4. Otherwise, I would
> not get the documents with () in the subject field in the
> portal_catalog (hairy beast as this portal_catalog is; it took me quite
> some time to find this out).
> 
> ## Script (Python) "search_modifySubject"
> ##bind container=container
> ##bind context=context
> ##bind namespace=
> ##bind script=script
> ##bind subpath=traverse_subpath
> ##parameters=
> ##title=Modify Subject List for Search
> ##
> if context.REQUEST.has_key('Subject'):
>         subject = []
>         for item in context.portal_catalog.uniqueValuesFor( 'Subject' ):
>                 subject.append( item )
>         if context.REQUEST.get( 'Subject', () ) == []:
>                 context.REQUEST.set( 'Subject', '' )
>         elif context.REQUEST.get( 'Subject', () )[0] == 'NOT':
>                 for item in context.REQUEST.get( 'Subject', () ):
>                         try:
>                                 subject.remove( item )
>                         except ValueError, e:
>                                 pass
>                 try:
>                         subject.remove( '' )
>                 except ValueError, e:
>                         pass
>                 context.REQUEST.set( 'Subject', subject )
>         else:
>                 pass

I like this technique;  I'm still puzzled about the "empty" Subjects
showing up in the index.
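
For anyone who wants to try the logic outside Zope, here is a rough
sketch of the same idea as a pure function.  It is not the script
above:  'apply_not' and its arguments are hypothetical names, it
assumes the form tokens arrive as a list of strings, and it returns
None for an empty field (meaning "no Subject restriction") where the
script sets Subject to ''.  It also strips the tokens, which the
script does not do.

```python
def apply_not(requested, all_subjects):
    """Translate a 'NOT a b' subject query into a list of allowed keywords.

    requested    -- tokens from the search form's subject field (assumed list)
    all_subjects -- every keyword known to the catalog, e.g. the result of
                    uniqueValuesFor('Subject')
    """
    tokens = [t.strip() for t in requested if t.strip()]
    if not tokens:
        # Empty field: the caller should not restrict on Subject at all.
        return None
    if tokens[0] != 'NOT':
        # Ordinary query: search for the given keywords unchanged.
        return tokens
    excluded = set(tokens[1:])
    # As in the script, the empty category '' is always excluded too.
    excluded.add('')
    return [s for s in all_subjects if s not in excluded]
```

Called as apply_not(['NOT', 'news'], ['news', 'zope', '']), it yields
['zope'].  Like the script, it only drops the exact '' keyword, so
whitespace-only keywords in the index would still slip through.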

Tres.
-- 
===============================================================
Tres Seaver                                tseaver@digicool.com
Digital Creations     "Zope Dealers"       http://www.zope.org